This is the official PyTorch implementation of "LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models".
benchmark
deployment
tool
evaluation
falcon
pruning
llama
quantization
opt
post-training-quantization
awq
ptq
large-language-models
llm
smoothquant
internlm
llama2
internlm2
llama3
omniquant
Updated Jul 20, 2024 - Python