🎉 PSOFT is now officially integrated into the 🤗 HuggingFace PEFT library !!
🎉 PSOFT is accepted to ICLR 2026 !! See you in Rio de Janeiro !!
PSOFT preserves the geometric structure of pre-trained weight columns—a key principle of Orthogonal Fine-Tuning (OFT)—while achieving a balanced trade-off between parameter, computation, and memory efficiency.
Unlike sparsity-based OFT variants (e.g., OFTv1/OFTv2, BOFT, GOFT), PSOFT adopts a low-rank principal subspace formulation that bridges LoRA and OFT. By restricting orthogonal transformations to a principal subspace, PSOFT provides theoretical guarantees through orthogonality constraints, while maintaining practical flexibility via two lightweight scaling vectors.
Extensive experiments across 35 NLP and CV tasks on four representative models demonstrate that PSOFT delivers strong semantic preservation, expressiveness, and multi-dimensional efficiency in PEFT.
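The core idea — rotating only within a low-rank principal subspace of the pre-trained weights — can be illustrated with a small NumPy sketch. This is a conceptual toy, not the PEFT implementation: the actual PSOFT parametrization also learns the rotation and the two scaling vectors, which are omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 8

# Pre-trained weight and its principal subspace (top-r left singular vectors).
W = rng.standard_normal((d, d))
U, _, _ = np.linalg.svd(W)
U_r = U[:, :r]  # d x r basis of the principal subspace

# An orthogonal r x r rotation from a skew-symmetric A via the Cayley transform.
A = rng.standard_normal((r, r)) * 0.01
A = A - A.T                                          # skew-symmetric
R = np.linalg.solve(np.eye(r) - A, np.eye(r) + A)    # (I - A)^{-1}(I + A), orthogonal

# Rotate only inside the principal subspace; the orthogonal complement is untouched.
Q = np.eye(d) + U_r @ (R - np.eye(r)) @ U_r.T
W_new = Q @ W

# Q is orthogonal, so the column geometry (norms, pairwise angles) of W is preserved.
print(np.allclose(Q.T @ Q, np.eye(d), atol=1e-8))
```

Because the transformation is orthogonal and confined to an r-dimensional subspace, it preserves the column structure of `W` (the OFT guarantee) while needing only O(r²) trainable parameters (the LoRA-like efficiency).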
import torch
from peft import PsoftConfig, get_peft_model
from transformers import AutoTokenizer, AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset
model_name = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token_id = tokenizer.eos_token_id
psoft_config = PsoftConfig(
r=32,
psoft_alpha=32,
)
peft_model = get_peft_model(model, psoft_config)
peft_model.print_trainable_parameters()
dataset = load_dataset("imdb", split="train[:1%]")
training_args = SFTConfig(dataset_text_field="text", max_length=128)
trainer = SFTTrainer(
model=peft_model,
args=training_args,
train_dataset=dataset,
processing_class=tokenizer,
)
trainer.train()
peft_model.save_pretrained("psoft-opt-125m")
For more details, please refer to the package_reference, examples, and method_comparison sections of the 🤗 HuggingFace PEFT library.
Tip
- Rank Choice: Smaller ranks (e.g., 32–128) work well for simpler tasks, while larger ranks (e.g., 64–256) increase expressiveness at the cost of additional parameters and computation.
- Scaling Factor: In our experiments, the scaling factor is typically set to `r`.
- Learning Rate: Standard learning rates (e.g., `1e-4` to `5e-3`) generally provide stable training.
- SVD Initialization: The `lowrank` option is more memory- and compute-efficient than `full`, making it preferable for large models.
- Cayley–Neumann Approximation: For large ranks, enabling the Cayley–Neumann approximation improves efficiency. A small number of Neumann terms (typically 5) usually offers a good balance between accuracy and speed.
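As a sanity check on the last tip, the sketch below (plain NumPy, independent of the PEFT code) compares the exact Cayley transform with its Neumann-series approximation truncated at 5 terms:

```python
import numpy as np

rng = np.random.default_rng(1)
r, k = 32, 5                         # subspace rank, number of Neumann terms

A = rng.standard_normal((r, r))
A = 0.005 * (A - A.T)                # skew-symmetric with small spectral norm

I = np.eye(r)
R_exact = np.linalg.solve(I - A, I + A)   # exact Cayley transform (orthogonal)

# Neumann series: (I - A)^{-1} ≈ I + A + A^2 + ... + A^k, valid for ||A|| < 1.
inv_approx = I.copy()
term = I.copy()
for _ in range(k):
    term = term @ A
    inv_approx = inv_approx + term
R_approx = inv_approx @ (I + A)

# The truncation error shrinks like ||A||^(k+1), so a few terms suffice.
print(np.max(np.abs(R_exact - R_approx)))
```

The approximation replaces a matrix inverse with k matrix multiplications, which is why it pays off at larger ranks; the price is a small, controllable deviation from exact orthogonality.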
The experiments are organized as follows:
- 1-NLU: Fine-tuning and evaluation on the GLUE benchmarks.
- 2-Vision: Fine-tuning and evaluation on the VTAB-1K benchmarks.
- 3-Math: Fine-tuning on MetaMathQA-40K and evaluation on the GSM-8K and MATH datasets.
- 4-Commonsense: Fine-tuning on Commonsense-15K and evaluation on the Commonsense Reasoning benchmarks.
Replace the `prefix: /home/[yourworkspace]/anaconda3/envs/psoft` entry in the last line of `psoft.yml` with the path to your local workspace.
conda env create -f psoft.yml
conda activate psoft
find . -name "*.sh" -exec chmod +x {} \;
Some models and datasets require access permissions. Please log in to your Hugging Face account using an access token.
Generate your Access Token from settings/tokens and log in.
huggingface-cli login
Access Tokens: [Copy and paste your Access Tokens]
Fine-tune and evaluate using the DeBERTaV3-base model:
cd NLU/script/
./deberta_v3_base_psoft-cola.sh
Fine-tune and evaluate using the ViT-base/16 model:
cd script/
./vit_base-psoft.sh
Fine-tune using the Llama-3.2-3B model:
cd Math/script/
./llama-3-3b-psoft.sh
Before running the script, change the path in eval_all.sh to match the results directory:
cd ../
./eval_all.sh
Fine-tune using the Llama-3.1-8B model:
cd Commonsense/script/
./llama-3-8b-psoft.sh
Prepare the datasets before evaluation:
cd /PSOFT/..
git clone https://github.com/AGI-Edgerunners/LLM-Adapters.git
cd LLM-Adapters
mkdir -p ../PSOFT/Commonsense/dataset
cp -r dataset/* ../PSOFT/Commonsense/dataset
Edit the path in eval_all.sh to match the results directory:
cd /PSOFT/Commonsense/
./eval_all.sh
Please cite our paper if PSOFT provides insights or inspiration for your work:
@inproceedings{wu2026efficient,
title={Efficient Orthogonal Fine-Tuning with Principal Subspace Adaptation},
author={Wu, Fei and Hu, Jia and Min, Geyong and Wang, Shiqiang},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=FSHrinMArK}
}