CoQuant: Joint Weight-Activation Subspace Projection for Mixed-Precision LLMs

Building environment

Create conda environment

conda create -n "coquant" python=3.12.0
conda activate coquant

Install requirements

pip install -r requirements.txt

Install the fast-hadamard-transform library from its GitHub repository

git clone git@github.com:Dao-AILab/fast-hadamard-transform.git
cd fast-hadamard-transform
pip install .

Running code

The examples below show how to quantize llama3.2-1b with CoQuant and with the ResQ baseline. The two runs use the same scripts and differ only in the value of BASIS_COV_MODE.

CoQuant

Get projection matrices

BASIS_COV_MODE=wa_cov bash 0_get_basis_4bit.sh

Quantize and evaluate the model

BASIS_COV_MODE=wa_cov bash 0_eval_ptq_4bit.sh

ResQ

Get projection matrices

BASIS_COV_MODE=a_cov bash 0_get_basis_4bit.sh

Quantize and evaluate the model

BASIS_COV_MODE=a_cov bash 0_eval_ptq_4bit.sh
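
To compare CoQuant with the ResQ baseline in one pass, the four commands above can be wrapped in a small loop. This is a minimal sketch, not part of the repository: the `run_pipeline` helper and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and it assumes the script is launched from the directory that contains `0_get_basis_4bit.sh` and `0_eval_ptq_4bit.sh`. By default it only prints the commands it would run, so the plan can be checked before any GPU time is spent.

```shell
# Hypothetical helper (not in the repository): run both pipeline stages
# for one covariance mode, in the order used in this README.
run_pipeline() {
  mode="$1"
  for script in 0_get_basis_4bit.sh 0_eval_ptq_4bit.sh; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Preview only: print the command instead of executing it.
      echo "BASIS_COV_MODE=${mode} bash ${script}"
    else
      # Stop this mode if a stage fails, so evaluation never runs
      # after a failed basis computation.
      BASIS_COV_MODE="${mode}" bash "${script}" || return 1
    fi
  done
}

# wa_cov is the CoQuant setting and a_cov the ResQ baseline,
# matching the commands shown above.
for mode in wa_cov a_cov; do
  run_pipeline "${mode}" || break
done
```

Setting `DRY_RUN=0` in the environment before running the sketch executes the scripts instead of printing them. Whether the two modes write their outputs to separate locations depends on the scripts themselves, so check that before running both back to back.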

Acknowledgements

Our implementation builds on the official implementation of ResQ. We sincerely thank the authors for making their code publicly available.
