ProjQ

This repository contains code for the ICML 2026 paper ProjQ: Project-and-Quantize for Adapter-Aware LLM Compression.

Introduction

We propose ProjQ, a framework that constrains quantization noise to a low-rank manifold via orthogonal subspace projection. We derive an efficient alternating algorithm that shapes the quantization noise into a low-rank structure, offloading the dominant error components to the subsequent adapter while minimizing the residual error in the orthogonal "uncorrectable" subspace. The algorithm has two phases: (1) subspace-aware quantization and (2) error compensation with LoRA adapter initialization. The current release includes the following features:

Phase 1, iterative projection with the GPTQ quantizer: gptqmodel/quantization/projq_gptq.py.
Phase 2, low-rank error compensation and LoRA adapter initialization: gptqmodel/eora/lordq.py.
LoRA fine-tuning tasks, including GSM8K, WikiText-2, and Commonsense Reasoning: peft/.
Evaluation of quantized models on several zero-shot tasks: eval_acc.py.
Datasets for language model evaluation: datautils.py.
Perplexity evaluation of quantized models on several language generation tasks is built into the main execution script; see the details below.
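
To make the alternating idea concrete, here is a minimal NumPy sketch. It is not the repository's implementation: it swaps the GPTQ quantizer for a plain per-row round-to-nearest quantizer and uses a LoftQ-style alternation (quantize the weight minus the current low-rank term, then refit the low-rank term to the new residual by truncated SVD). The function names quantize_rtn, low_rank, and alternate are hypothetical and exist only for this illustration.

```python
import numpy as np


def quantize_rtn(w, bits):
    """Per-row uniform round-to-nearest quantizer (stand-in for GPTQ)."""
    levels = 2 ** bits - 1
    lo = w.min(axis=1, keepdims=True)
    hi = w.max(axis=1, keepdims=True)
    scale = (hi - lo) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant rows
    codes = np.clip(np.round((w - lo) / scale), 0, levels)
    return codes * scale + lo  # dequantized weights


def low_rank(e, rank):
    """Best rank-`rank` approximation of e in Frobenius norm."""
    u, s, vt = np.linalg.svd(e, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def alternate(w, bits=2, rank=4, iterations=5):
    """Alternate between quantizing and refitting the low-rank term.

    Assumes iterations >= 1. Returns (q, l) with w ~= q + l.
    """
    l = np.zeros_like(w)
    for _ in range(iterations):
        q = quantize_rtn(w - l, bits)  # quantize what the adapter cannot absorb
        l = low_rank(w - q, rank)      # absorb the dominant error directions
    return q, l


rng = np.random.default_rng(0)
w = rng.standard_normal((32, 48))
q, l = alternate(w, bits=2, rank=4, iterations=5)
print("error without low-rank term:", np.linalg.norm(w - q))
print("error with low-rank term:   ", np.linalg.norm(w - q - l))
```

Because the low-rank term is the truncated SVD of the final residual, the second printed error can never exceed the first. How much it shrinks depends on how much of the quantization noise falls in the top singular directions, which is the quantity Phase 1 is designed to increase.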

Installation

git clone https://github.com/yourname/ProjQ.git
cd ProjQ
pip install -r requirements.txt

Quantization Error Compensation

The code has been tested primarily on Llama 2, Qwen2.5-Instruct, and Qwen3 models. Because the implementation is adapted from GPTQModel, refer to the GPTQModel instructions and documentation when running it on other models. --rank is the target rank, which sets the dimensionality of the subspace used to shape the quantization noise during Phase 1. --iteration is the number of alternating iterations; the example uses 5. The following command runs 2-bit quantization.

python main.py \
    --model_id /path/to/model \
    --bits 2 \
    --group_size 128 \
    --quant_method PROJQ \
    --rank 16 \
    --iteration 5 \
    --save_dir /path/to/quantized_model

After obtaining the quantized model, run the following command to perform error compensation, which also yields the initial adapter. Here, --comp_rank denotes the adapter rank in Phase 2.

python comp_train.py \
    --model_id /path/to/model \
    --quantized_model_dir /path/to/quantized_model \
    --comp_rank 64 \
    --comp_method lordq
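
As a rough picture of what a rank-limited error compensation step produces, the sketch below factors the quantization residual into two LoRA-shaped matrices with a plain truncated SVD. This is an assumption-laden illustration, not the lordq method itself: the function name init_adapter_from_error is hypothetical, np.round stands in for a real quantizer, and the repository's compensation may weight or project the error differently.

```python
import numpy as np


def init_adapter_from_error(w, w_q, comp_rank):
    """Return LoRA-style factors (b, a) such that b @ a ~= w - w_q.

    b has shape (out_features, comp_rank); a has shape (comp_rank, in_features).
    """
    u, s, vt = np.linalg.svd(w - w_q, full_matrices=False)
    root = np.sqrt(s[:comp_rank])        # split singular values evenly
    b = u[:, :comp_rank] * root          # scale the columns of U
    a = root[:, None] * vt[:comp_rank]   # scale the rows of V^T
    return b, a


rng = np.random.default_rng(0)
w = rng.standard_normal((64, 96))
w_q = np.round(w)  # crude stand-in for the quantized weights

for comp_rank in (4, 16, 64):
    b, a = init_adapter_from_error(w, w_q, comp_rank)
    residual = np.linalg.norm(w - w_q - b @ a)
    print(comp_rank, b.shape, a.shape, residual)
```

The residual is non-increasing in comp_rank and vanishes once comp_rank reaches the smaller matrix dimension, so a larger adapter rank trades extra parameters for a closer initial match to the full-precision weights.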

Fine-tuning for Downstream Tasks

The LoRA fine-tuning code is located in peft/ and covers three types of tasks: GSM8K, WikiText-2, and Commonsense Reasoning. peft/gsm8k_ft.py and peft/wiki_ft.py perform LoRA fine-tuning; peft/gsm8k_eval.py and peft/wiki_eval.py perform the corresponding evaluation. peft/cs_ft.py includes both training and evaluation.

Example fine-tuning invocations are in script/run.sh. Below is an example of fine-tuning and evaluation on the GSM8K task. Here, --rank must match the adapter rank used in Phase 2 (--comp_rank).

python gsm8k_ft.py \
    --model_id /path/to/model \
    --quantized_model_dir /path/to/quantized_model_with_adapter \
    --rank 64 \
    --bits 2 \
    --lora_alpha 16 \
    --learning_rate 5e-5 \
    --seed 11 \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --lr_scheduler_type "cosine" \
    --weight_decay 0.1 \
    --warmup_ratio 0.03 \
    --logging_steps 10 \
    --output_dir /path/to/gsm8k_lora \
    --remove_unused_columns False

python gsm8k_eval.py \
    --model_name_or_path /path/to/model \
    --quantized_model_dir /path/to/quantized_model_with_adapter \
    --batch_size 16
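
For reading the GSM8K fine-tuning flags above, two derived quantities are often useful: the LoRA scaling factor and the effective batch size. The sketch below assumes the script follows the standard LoRA convention of scaling the adapter update by lora_alpha / rank and the usual gradient-accumulation arithmetic; both helper names are hypothetical and not part of the repository.

```python
def lora_scaling(lora_alpha: float, rank: int) -> float:
    """Standard LoRA scaling applied to the adapter update (assumed convention)."""
    return lora_alpha / rank


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices


# Values taken from the GSM8K example command, on a single device.
print(lora_scaling(lora_alpha=16, rank=64))                   # 0.25
print(effective_batch_size(per_device=4, grad_accum=4))       # 16
```

Under these assumptions, changing --rank without adjusting --lora_alpha also changes the adapter's effective scale.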

Acknowledgements

This project is based on and modified from GPTQModel and LoftQ. Sincere thanks for their efforts.
