[ICLR2026] RCPU: Rotation-Constrained Error Compensation for Structured Pruning of Large Language Models
This repository provides the reference code for RCPU. arXiv link
We conducted our experiments on an NVIDIA A100 GPU with CUDA 12.6.
conda create -n rcpu python=3.10.16
conda activate rcpu
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install transformers==4.57.1 datasets==4.6.0
The following command starts pruning and perplexity (PPL) evaluation:
CUDA_VISIBLE_DEVICES=0 python main.py --method rcpu --unstr --nsamples 128 --pruning_ratio 0.1
Note: Minor numerical differences may occur depending on library versions and hardware configurations.
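For reference, the reported PPL is the exponential of the average per-token negative log-likelihood. A minimal illustration in pure Python (the token probabilities below are made-up values for demonstration, not outputs of main.py):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood over tokens)."""
    nll = [-math.log(p) for p in token_probs]  # per-token negative log-likelihood
    return math.exp(sum(nll) / len(nll))

# Made-up per-token probabilities for illustration.
probs = [0.25, 0.5, 0.125]
print(perplexity(probs))  # → 4.0 (reciprocal of the geometric mean probability)
```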
@inproceedings{haruta2026rcpu,
title={RCPU: Rotation-Constrained Error Compensation for Structured Pruning of Large Language Models},
author={Haruta, Shuichiro and Matsumoto, Kazunori and Li, Zhi and Wang, Yanan and Kurokawa, Mori},
booktitle={International Conference on Learning Representations (ICLR)},
year={2026}
}
- In our experiments, benchmark evaluations are performed with lm-evaluation-harness.
- This project is built on FLAP.
Thanks!