# FVAE-LoRA

Official implementation of FVAE-LoRA, introduced in our NeurIPS 2025 paper: "Latent Space Factorization in LoRA".

FVAE-LoRA uses a Variational Autoencoder (VAE) to split the LoRA latent space into two parts:

- 🎯 Task-salient features: Dedicated to your specific downstream task.
- 🌪️ Residual information: Captures the remaining variance.

The result? Better performance across text, audio, and image tasks compared to standard LoRA. 🚀
- [Overview](#overview)
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Image Classification Experiments](#image-classification-experiments)
- [Repository Structure](#repository-structure)
- [PEFT Library Modifications](#peft-library-modifications)
- [Citation](#citation)
- [Contact](#contact)
- [License](#license)
## Overview

FVAE-LoRA is a Parameter-Efficient Fine-Tuning (PEFT) method that enhances LoRA's expressiveness through latent space factorization. This repository includes:

- 🛠️ Modified 🤗 PEFT Library: An extended version of Hugging Face PEFT with built-in FVAE-LoRA support.
- 🖼️ Image Classification Suite: Everything you need to reproduce our results on ViT.
## Quick Start

FVAE-LoRA is designed as a drop-in replacement for standard PEFT methods. If you know how to use the Hugging Face PEFT API, you already know how to use FVAE-LoRA.
```python
from peft import FVAEPEFTConfig, get_peft_model
from transformers import AutoConfig, AutoModelForImageClassification

# 1. Define your FVAE-LoRA config
fvae_peft_config = FVAEPEFTConfig(
    peft_type="FVAE_PEFT",
    # Latent space
    latent_dim=16,                        # dimensionality of the factorized latent space
    latent_fusion="concat",               # fuse the two latents by concatenation
    # Encoder
    enc_num_of_layer=1,
    enc_hidden_layer=16,
    enc_dropout=0.1,
    encoder_use_common_hidden_layer=True,
    # Decoder
    dec_num_of_layer=3,
    dec_hidden_layer=128,
    # z2 prior and regularization
    z2_latent_mean=1.5,
    z2_latent_std=1,
    z1z2_orthogonal_reg=0,
    # Loss weights (see the lambda tuning notes below)
    lambda_downstream=1000,
    lambda_reconstruction=1,
    lambda_z2_l2=1,
    lambda_z1_l2=1,
    lambda_kl_z1=1,
    # Standard PEFT options
    target_modules=["query", "value"],
    modules_to_save=["classifier"],
)
# 2. Load any model supported by HF Transformers and PEFT
num_labels = 42
model_name_or_path = "google/vit-base-patch16-224-in21k"
config = AutoConfig.from_pretrained(
    model_name_or_path,
    num_labels=num_labels,
)
model = AutoModelForImageClassification.from_pretrained(
    model_name_or_path,
    config=config,
)
# 3. Convert to FVAE-LoRA 💪
model = get_peft_model(model, fvae_peft_config)
model.print_trainable_parameters()
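# 4. Train as usual! For example, with the 🤗 Trainer — a minimal sketch;
# `train_ds` and the TrainingArguments below are illustrative assumptions,
# not part of this repo:
#
#   from transformers import Trainer, TrainingArguments
#   trainer = Trainer(
#       model=model,
#       args=TrainingArguments(output_dir="out"),
#       train_dataset=train_ds,
#   )
#   trainer.train()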
```

## Installation

```bash
git clone https://github.com/idiap/FVAE-LoRA.git
cd FVAE-LoRA
conda env create -f env.yaml
conda activate fvae-lora
pip install -r requirements.txt
```

You must install the local version of PEFT included in this repo:

```bash
pip install -e ./peft
```

Update `path_constants.py` with your local directories.
💡 Tip: This is required for reproducing the paper's image experiments but optional for the custom usage described in [Quick Start](#quick-start).
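For reference, a minimal sketch of what `path_constants.py` might contain. Only `LARGE_MODELS_PATH` is referenced in the steps below; the other names are illustrative assumptions, not necessarily the repo's actual constants:

```python
# path_constants.py — a sketch; point these at your local directories.
LARGE_MODELS_PATH = "/path/to/large_models"  # pretrained backbones (used below)
DATA_PATH = "/path/to/datasets"              # hypothetical: dataset cache root
EXP_PATH = "exp"                             # hypothetical: experiment outputs
```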
## Image Classification Experiments

To run the image experiments:

- Download the ViT model from `google/vit-base-patch16-224-in21k` (Hugging Face).
- Inside your `LARGE_MODELS_PATH` directory, create a folder named `vit-base-patch16-224-in21k`.
- Place the downloaded model files inside that folder (one way to do this is sketched after this list).
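One way to fetch the checkpoint into that layout is with `huggingface_hub` (a sketch; it assumes `LARGE_MODELS_PATH` is importable from `path_constants.py`):

```python
from pathlib import Path

from huggingface_hub import snapshot_download

from path_constants import LARGE_MODELS_PATH

# Download the ViT checkpoint into the folder layout expected by the scripts.
snapshot_download(
    repo_id="google/vit-base-patch16-224-in21k",
    local_dir=Path(LARGE_MODELS_PATH) / "vit-base-patch16-224-in21k",
)
```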
We provide scripts to replicate image classification results on multiple benchmark datasets.
The following datasets are supported (automatically downloaded from Hugging Face 🤗):
- DTD - `tanganke/dtd`
- EuroSAT - `tanganke/eurosat`
- GTSRB - `tanganke/gtsrb`
- RESISC45 - `tanganke/resisc45`
- SUN397 - `tanganke/sun397`
- SVHN - `ufldl-stanford/svhn`
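Since these are standard Hub datasets, you can sanity-check any of them with the `datasets` library (a quick preview, not a repo requirement):

```python
from datasets import load_dataset

# Quick sanity check: pull DTD from the Hub and inspect one example.
# The "train" split name is an assumption; adjust if the dataset differs.
dtd = load_dataset("tanganke/dtd", split="train")
print(dtd[0])
```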
Run the full suite across 3 seeds (1, 2, 42):
```bash
bash scripts/train_image_fvae_lora.sh
```

Important:

- SLURM: The scripts default to SLURM. If running locally, remove the submission commands from the `*.sh` files (see the sketch after this list).
- Project Name: Replace `<your-project>` in the scripts with your actual project name.
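If you strip the SLURM wrapper, the script body conceptually reduces to something like the sketch below. The `--seed` flag and the trailing arguments are assumptions about the script's internals, not documented options:

```bash
# Hypothetical local equivalent of the submission loop; flag names are
# assumptions — check scripts/train_image_fvae_lora.sh for the real ones.
for seed in 1 2 42; do
  python image_main.py --seed "$seed"  # ...plus the dataset/lambda flags from the script
done
```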
FVAE-LoRA uses several loss components controlled by lambda hyperparameters:

- `--fvae_lambda_downstream`: Weight for the downstream task loss (default: 1000)
- `--fvae_lambda_reconstruction`: Weight for the reconstruction loss (default: 1)
- `--fvae_lambda_kl_z1`: Weight for the KL divergence on z1 (default: 1)
- `--fvae_lambda_z2_l2`: L2 regularization on z2 (default: 1)
- `--fvae_lambda_z1_l2`: L2 regularization on z1 (default: 1)
The secret sauce is in the lambda weights. For new tasks, in addition to the defaults, we recommend starting from these settings:

- `(1000, 0.1, 1, 1, 1)`
- `(1000, 0.1, 10, 1, 1)`
Refer to Section G in the paper's appendix for a detailed practical guide on tuning these values.
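For concreteness, here is how the first recommended setting would look in code, assuming the tuple order matches the flag list above (a sketch, not an officially tuned recipe):

```python
# (1000, 0.1, 1, 1, 1) expressed as FVAEPEFTConfig loss weights.
# Assumes the tuple follows the flag order listed above; other FVAE
# fields are left at their defaults for brevity.
fvae_peft_config = FVAEPEFTConfig(
    peft_type="FVAE_PEFT",
    lambda_downstream=1000,
    lambda_reconstruction=0.1,
    lambda_kl_z1=1,
    lambda_z2_l2=1,
    lambda_z1_l2=1,
    target_modules=["query", "value"],
)
```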
For comparison, scripts are provided for other PEFT methods:
```bash
# Standard LoRA
# Also supports pissa, rslora, dora, and olora:
# change fine_tuning_method="peft"  # peft, pissa, rslora, dora, olora — inside the bash script.
bash scripts/train_image_peft.sh

# Full fine-tuning
bash scripts/train_image_full_ft.sh
```

Aggregate your results into a clean summary:
```bash
python prepare_results_images.py \
    --max-depth 2 \
    --exp-base exp/exp_image/fvae_peft/vit-base-patch16-224-in21k/
```

Use `--max-depth 1` for experiments other than FVAE-LoRA.
## Repository Structure

```
.
├── peft/                       # 🛠️ Modified PEFT library (core logic)
├── scripts/                    # 📜 Bash scripts for training & baselines
│   ├── train_image_fvae_lora.sh  # FVAE-LoRA training
│   ├── train_image_peft.sh       # LoRA and variants training
│   └── train_image_full_ft.sh    # Full fine-tuning baseline
├── image_main.py               # 🚀 Main entry point for image experiments
├── image_model.py              # 🧩 Model wrapper with PEFT integration
├── image_datamodule.py         # 📚 PyTorch Lightning data module
├── prepare_results_images.py   # 📊 Results analysis script
├── path_constants.py           # ⚙️ Path configuration
├── requirements.txt            # Python dependencies
├── env.yaml                    # Conda environment specification
└── README.md                   # README
```
## PEFT Library Modifications

The included PEFT library is based on Hugging Face's PEFT with the following additions:

- `FVAEPEFTConfig`: Configuration class for FVAE-LoRA parameters
- FVAE-LoRA implementation with a factorized latent space
- Support for variational inference in the LoRA framework

See the [peft/](peft/) directory for the complete modified library.
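Because FVAE-LoRA plugs into the standard PEFT plumbing, the usual adapter save/load flow should apply. A sketch, reusing the Quick Start variables and assuming FVAE-LoRA adapters round-trip like any other PEFT adapter:

```python
from peft import PeftModel
from transformers import AutoModelForImageClassification

# Save only the adapter weights (standard PEFT behavior; assumes
# FVAE-LoRA supports the usual save/load round-trip).
model.save_pretrained("fvae_lora_adapter")

# Later: reload the adapter onto a freshly loaded base model.
base_model = AutoModelForImageClassification.from_pretrained(
    model_name_or_path, config=config
)
model = PeftModel.from_pretrained(base_model, "fvae_lora_adapter")
```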
## Citation

If you use this code or find our work helpful, please cite us:

```bibtex
@misc{kumar2025latentspacefactorizationlora,
      title={Latent Space Factorization in LoRA},
      author={Shashi Kumar and Yacouba Kaloga and John Mitros and Petr Motlicek and Ina Kodrasi},
      year={2025},
      eprint={2510.19640},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.19640},
}
```

## Contact

📧 Questions? Open an issue or reach out to:
- Shashi Kumar (shashi.kumar@idiap.ch)
- Yacouba Kaloga (yacouba.kaloga@idiap.ch)
## License

This project is released under the MIT License. See the LICENSES/MIT.txt file for details.

The modified PEFT library retains its original Apache 2.0 License; see peft/LICENSE.

Third-party dependencies retain their respective licenses.

Built with ❤️ at the Idiap Research Institute.