[ICML 2026] Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs
This is the official code for "Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs".
TL;DR: SRR improves quantization error reconstruction (QER) and quantized parameter-efficient fine-tuning (QPEFT) by splitting the rank budget between preserving dominant structure and reconstructing quantization error, guided by a principled, cheap selection criterion.
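Schematically, for a weight matrix $W$ and a total rank budget $r$, part of the budget goes to a low-rank term that preserves the dominant structure of $W$ before quantization, and the rest to a low-rank term that reconstructs the residual quantization error. A minimal sketch in our own notation (not necessarily the paper's):

$$
W \;\approx\; L_1 + Q\!\big(W - L_1\big) + L_2,
\qquad \operatorname{rank}(L_1) + \operatorname{rank}(L_2) = r,
$$

where $Q(\cdot)$ is the quantizer, $L_1$ captures the dominant structure of $W$, and $L_2$ is fit to the remaining quantization error $(W - L_1) - Q(W - L_1)$.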
```bash
unzip icml_srr-main.zip
cd icml_srr-main
conda env create -f environment.yml
conda activate srr
pip install -r requirements.txt
```

lm-eval-harness is installed automatically via `pip install -r requirements.txt` (`lm_eval==0.4.7`).
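To sanity-check the installation (optional; these commands are a generic suggestion, not part of the repo's documented setup):

```bash
# Verify the pinned lm-eval-harness version and that PyTorch sees a GPU.
pip show lm_eval                                            # should report Version: 0.4.7
python -c "import torch; print(torch.cuda.is_available())"  # expect True
```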
To run the PTQ experiments:

```bash
conda activate srr
./experiments/ptq/run_srr_3bit_32rank.sh
./experiments/ptq/run_srr_3bit_64rank.sh
```

By default, this runs PTQ with SRR using `qera-exact` scaling on the LLaMA-2 7B model.
- Select GPU: Edit `export CUDA_VISIBLE_DEVICES=0` in the `.sh` scripts to choose the GPU ID.
- Enable zero-shot evaluation: Remove `--disable-lm-eval` from the default settings.
- Randomized SVD: Add `--apply-rand-svd` to use `torch.svd_lowrank` instead of full SVD during SRR initialization. This speeds up the SVD computation for large weight matrices with minimal accuracy loss. Only applicable when `lr_initializer` is set to `srr` in the config. A sketch of these edits is shown after this list.
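For example, an edited run script might look like the following (a hypothetical excerpt; the actual layout of `run_srr_3bit_32rank.sh` may differ):

```bash
# experiments/ptq/run_srr_3bit_32rank.sh (hypothetical excerpt)
export CUDA_VISIBLE_DEVICES=1   # run on GPU 1 instead of the default GPU 0

# In the script's launch command:
#   - remove --disable-lm-eval to enable zero-shot evaluation via lm-eval-harness;
#   - add --apply-rand-svd to use torch.svd_lowrank during SRR initialization
#     (only applicable when lr_initializer is set to srr in the config).
```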
Results are saved automatically to the ./checkpoints directory.
To run the QPEFT experiments, first activate the environment:

```bash
conda activate srr
```

Navigate to the specific task directory you want to run. For example, to run a GLUE task:

```bash
cd experiments/qpeft/glue
```

Then execute:

```bash
./srr.sh
```

By default, this runs QPEFT with SRR using 4-bit MXINT quantization.
- GLUE: Adjust `task_list` in `srr.sh` and `learning_rate_list` in `adapt_and_glue_train.sh`.
- GSM8K and SlimPajama: Select `model` and `quant_bits` in `srr.sh`, then modify `learning_rate_list` in the corresponding training script (`adapt_and_gsm8k_train.sh` or `adapt_and_clm_train.sh`). See the sketch after this list for an example.
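As an illustration, the variables above might be edited like this (hypothetical values and syntax; check the actual scripts for the exact format):

```bash
# experiments/qpeft/glue/srr.sh (hypothetical excerpt)
task_list=("rte" "mrpc")            # GLUE tasks to run

# experiments/qpeft/glue/adapt_and_glue_train.sh (hypothetical excerpt)
learning_rate_list=(1e-4 5e-5)      # learning rates to sweep

# For GSM8K / SlimPajama, srr.sh exposes (hypothetical values):
# model="meta-llama/Llama-2-7b-hf"  # base model
# quant_bits=4                      # quantization bit-width (4-bit MXINT default)
```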
Experiment results are saved in the ./checkpoints directory.
This codebase is built on top of QERA.
If you use this code, please cite:

```bibtex
@inproceedings{cho2026preserve,
  title={Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs},
  author={Cho, Yoonjun and Jeon, Dongjae and Kim, Soeun and Jeon, Moongyu and No, Albert},
  booktitle={International Conference on Machine Learning},
  year={2026}
}
```