yujiepan-work/24h1-sparse-quantized-llm-ov

24h1-sparse-quantized-llm-ov

setup

Install PyTorch, then install the pinned dependencies:

pip install tabulate transformers==4.35 optimum-intel[openvino]==1.13.0 nncf==2.7.0
pip install deepsparse-nightly[llm]==1.6.0.20231120
pip install openvino==2023.3.0
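
Because every dependency above is pinned, it can help to confirm that the active environment matches before running any benchmark. The following is a minimal sketch, not part of the repository: the helper names `normalize` and `check_pins` are hypothetical, and the comparison only ignores trailing `.0` segments (so `4.35` matches `4.35.0`), which is an assumption about how loosely the pins should be read.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above.
PINNED = {
    "transformers": "4.35",
    "optimum-intel": "1.13.0",
    "nncf": "2.7.0",
    "deepsparse-nightly": "1.6.0.20231120",
    "openvino": "2023.3.0",
}


def normalize(v):
    """Drop trailing '.0' segments so '4.35' and '4.35.0' compare equal."""
    parts = v.split(".")
    while len(parts) > 1 and parts[-1] == "0":
        parts.pop()
    return ".".join(parts)


def check_pins(pinned, get_version=version):
    """Return {name: (wanted, installed_or_None, matches)} for each pin."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None
        ok = installed is not None and normalize(installed) == normalize(wanted)
        report[name] = (wanted, installed, ok)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in check_pins(PINNED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: wanted {wanted}, installed {installed} [{status}]")
```

A package that is not installed is reported with `installed None` rather than raising, so the script can be run before or after the install step.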

benchmark

  • Reproduce the NM paper results: deepsparse_reproduce.bash
  • Export IR models: export_ir.bash, or export_ir_w_mask.bash for sparse models
  • OpenVINO benchmark_app: see benchmarkapp_*.bash, or run_ov_benchmark_app.bash (more involved; the code is not well organized).
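
For orientation, the sketch below shows how a basic `benchmark_app` invocation on an exported IR can be assembled. It is an illustration only and is not taken from the repository's scripts: the helper `benchmark_app_cmd`, the model path, and the chosen defaults are hypothetical, and the actual bash files may pass different options. LLM IRs with dynamic input shapes usually also need explicit shape arguments, which are omitted here.

```python
import shlex


def benchmark_app_cmd(model_xml, device="CPU", hint="latency", seconds=30):
    """Build an argument list for OpenVINO's benchmark_app CLI.

    -m     path to the IR .xml file
    -d     target device
    -hint  performance hint (e.g. latency or throughput)
    -t     benchmark duration in seconds
    """
    return [
        "benchmark_app",
        "-m", str(model_xml),
        "-d", device,
        "-hint", hint,
        "-t", str(seconds),
    ]


# Hypothetical path; substitute the location of an exported IR model.
cmd = benchmark_app_cmd("models/example/openvino_model.xml")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`; `shlex.join` is used only to print a shell-safe version of the same command.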
