- Create conda environment
conda create -n "coquant" python=3.12.0
conda activate coquant
- Install requirements
pip install -r requirements.txt
- Install the fast-hadamard-transform library from its GitHub repository
git clone git@github.com:Dao-AILab/fast-hadamard-transform.git
cd fast-hadamard-transform
pip install .
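To confirm the build succeeded before moving on, a quick import check can help. This is a minimal sketch: `check_module` is a hypothetical helper introduced here, not part of the repository, and it assumes the package is importable as `fast_hadamard_transform` and that `python` resolves to the interpreter of the active environment.

```shell
# Hypothetical helper (not part of this repository): report whether a
# Python module can be imported by the current interpreter.
check_module() {
  local module="$1"
  if "${PYTHON:-python}" -c "import ${module}" 2>/dev/null; then
    echo "ok: ${module}"
  else
    echo "missing: ${module}"
  fi
}

# Assumption: the library installs under the module name fast_hadamard_transform.
check_module fast_hadamard_transform
```

If this prints `missing`, rerun `pip install .` inside the cloned directory and read the build log for errors.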
The following example shows how to quantize Llama-3.2-1B with CoQuant and with the ResQ baseline. The two runs below differ only in the value of BASIS_COV_MODE.
- Get projection matrices
BASIS_COV_MODE=wa_cov bash 0_get_basis_4bit.sh
- Quantize and evaluate the model
BASIS_COV_MODE=wa_cov bash 0_eval_ptq_4bit.sh
- Get projection matrices
BASIS_COV_MODE=a_cov bash 0_get_basis_4bit.sh
- Quantize and evaluate the model
BASIS_COV_MODE=a_cov bash 0_eval_ptq_4bit.sh
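The four commands above follow one pattern: for a given `BASIS_COV_MODE`, run the basis script and then the evaluation script. A minimal sketch of a wrapper for that pattern follows, assuming both scripts are in the current directory. `run_mode` and `DRY_RUN` are hypothetical names introduced here, not part of the repository.

```shell
# Hypothetical wrapper (not part of this repository): run both stages
# for one covariance mode, stopping if either stage fails.
run_mode() {
  local mode="$1"
  local script
  for script in 0_get_basis_4bit.sh 0_eval_ptq_4bit.sh; do
    # Print the command so the log shows which stage is running.
    echo "+ BASIS_COV_MODE=${mode} bash ${script}"
    # With DRY_RUN=1 the command is only printed, not executed.
    if [ "${DRY_RUN:-0}" != "1" ]; then
      BASIS_COV_MODE="${mode}" bash "${script}" || return 1
    fi
  done
}
```

Calling `run_mode wa_cov` and then `run_mode a_cov` reproduces the sequence above. Prefixing a call with `DRY_RUN=1` prints the commands without running them.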
Our implementation is developed based on the official implementation of ResQ. We sincerely thank the authors for making their code publicly available.