24h1-sparse-quantized-llm-ov

setup

Install PyTorch, then install the pinned dependencies:

pip install tabulate transformers==4.35 "optimum-intel[openvino]==1.13.0" nncf==2.7.0
pip install "deepsparse-nightly[llm]==1.6.0.20231120"
pip install openvino==2023.3.0
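
To confirm that the environment matches the pins above before running any benchmark, a small check along these lines may help. It is a sketch and not part of the repository: the helper name `find_version_mismatches` is hypothetical, and the prefix comparison is a simplification that is looser than pip's own `==` matching.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above.
EXPECTED = {
    "transformers": "4.35",
    "optimum-intel": "1.13.0",
    "nncf": "2.7.0",
    "deepsparse-nightly": "1.6.0.20231120",
    "openvino": "2023.3.0",
}


def find_version_mismatches(expected):
    """Return {package: installed version, or None if missing} for every mismatch."""
    mismatches = {}
    for name, pinned in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            mismatches[name] = None
            continue
        # Simplified prefix comparison (assumption): not a full PEP 440 check.
        if not installed.startswith(pinned):
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    for name, installed in find_version_mismatches(EXPECTED).items():
        found = installed or "not installed"
        print(f"{name}: expected {EXPECTED[name]}, found {found}")
```

An empty result means every pinned package was found with a matching version prefix.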

benchmark

Reproduce the Neural Magic (NM) paper results: deepsparse_reproduce.bash
Export IR models: export_ir.bash, or export_ir_w_mask.bash for sparse models
Run OpenVINO benchmark_app: see benchmarkapp_*.bash, or run_ov_benchmark_app.bash (more complex; the code is not well organized)
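
For orientation, a single benchmark_app run can be driven from Python roughly as below. This is a sketch under assumptions and does not reproduce the repository's scripts: the model path, device, hint, and duration are placeholders, and `build_benchmark_cmd` is a hypothetical helper. The flags `-m`, `-d`, `-hint`, and `-t` are standard benchmark_app options. Models with dynamic input shapes typically need additional shape arguments, which are omitted here.

```python
import shlex
import subprocess


def build_benchmark_cmd(model_xml, device="CPU", hint="latency", seconds=30):
    """Build a benchmark_app command line for one IR model (placeholder defaults)."""
    return [
        "benchmark_app",
        "-m", str(model_xml),   # path to the exported IR .xml file
        "-d", device,           # target device
        "-hint", hint,          # performance hint: "latency" or "throughput"
        "-t", str(seconds),     # benchmark duration in seconds
    ]


if __name__ == "__main__":
    # Hypothetical path; replace with an IR file produced by the export step.
    cmd = build_benchmark_cmd("models/example/openvino_model.xml")
    print(shlex.join(cmd))
    # Uncomment to run; requires openvino to be installed and the IR file to exist.
    # subprocess.run(cmd, check=True)
```

Building the command as a list keeps each argument separate, so paths containing spaces need no extra quoting when passed to `subprocess.run`.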

Repository files:

.gitignore
README.md
benchmarkapp_fp32.bash
benchmarkapp_w8a8.bash
benchmarkapp_w8a8_sparse70.bash
deepsparse_reproduce.bash
export_ir.bash
export_ir.py
export_ir_w_mask.bash
export_ir_w_mask.py
export_w8a8_sparse_model.py
read_neuralmagic_ref_onnx.py
run_ov_benchmark_app.bash
run_ov_benchmark_app.py
show_ir_weights.py
tld0.6.json

yujiepan-work/24h1-sparse-quantized-llm-ov
