LoRA in LoRA: Towards Parameter-Efficient Architecture Expansion for Continual Visual Instruction Tuning
LiLoRA is a parameter-efficient architecture expansion method for continual visual instruction tuning. It shares LoRA components across tasks while preserving task-specific adaptation through a lightweight low-rank branch.
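To make the idea concrete, here is a minimal sketch of how a shared LoRA pair plus a lightweight task-specific low-rank branch could compose. This is an assumption about the structure (class name, branch placement, and ranks are illustrative), not the repository's actual implementation:

```python
import numpy as np

class LiLoRALayer:
    """Hedged sketch of 'LoRA in LoRA': one LoRA pair (A, B) shared
    across tasks, plus a smaller task-specific low-rank branch (C_t, D_t)
    added on top. Names and ranks are illustrative assumptions."""

    def __init__(self, d_in, d_out, r=8, r_task=2, seed=0):
        self.rng = np.random.default_rng(seed)
        # Shared LoRA: A is randomly initialized, B starts at zero,
        # so the adapter's initial contribution is zero (standard LoRA init).
        self.A = self.rng.normal(scale=0.01, size=(d_in, r))
        self.B = np.zeros((r, d_out))
        self.r_task = r_task
        self.task_branches = {}  # task_id -> (C_t, D_t)

    def add_task(self, task_id, d_in, d_out):
        # Each new task only adds a rank-r_task branch, keeping the
        # per-task parameter cost far below a full LoRA pair.
        C = self.rng.normal(scale=0.01, size=(d_in, self.r_task))
        D = np.zeros((self.r_task, d_out))
        self.task_branches[task_id] = (C, D)

    def delta(self, x, task_id):
        # Adapter output = shared LoRA path + task-specific branch.
        C, D = self.task_branches[task_id]
        return x @ self.A @ self.B + x @ C @ D
```

Because both `B` and `D` start at zero, a freshly added task contributes nothing until it is trained, while earlier tasks' branches are left untouched.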
The CVIT benchmark constructed by SMoLoRA encompasses 10 datasets along with their corresponding instruction sets.
You can download the instruction-tuning files of the CVIT benchmark from the CVIT benchmark link.
All datasets used in the benchmark are publicly available. You can download the corresponding images directly from each dataset's official website.
Please download the pretrained language model vicuna-7b-v1.5, the alignment module, and the CLIP vision tower in advance.
git clone https://github.com/chanceche/LiLoRA.git
cd LiLoRA
conda create -n lilora python=3.10 -y
conda activate lilora
pip install --upgrade pip
pip install -e .

Edit the path configuration block at the top of the scripts before training or evaluation:
scripts/LiLoRA/Train/Train_all.sh
scripts/LiLoRA/Eval/eval_all.sh

Run continual visual instruction tuning:
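As an illustration, the path configuration block at the top of each script typically looks something like the following. The variable names and paths here are hypothetical placeholders for this repository, so match them against the actual variables in the scripts:

```shell
# Hypothetical path configuration block -- adjust to your local setup.
MODEL_PATH=/path/to/vicuna-7b-v1.5             # pretrained language model
VISION_TOWER=/path/to/clip-vision-tower        # CLIP vision tower
MM_ADAPTER=/path/to/alignment-module           # alignment module weights
DATA_PATH=/path/to/CVIT/instructions           # CVIT instruction-tuning files
IMAGE_FOLDER=/path/to/CVIT/images              # images downloaded per dataset
OUTPUT_DIR=./checkpoints/lilora                # where checkpoints are written
```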
bash scripts/LiLoRA/Train/Train_all.sh

Run evaluation:
bash scripts/LiLoRA/Eval/eval_all.sh

Evaluate a single task by passing a checkpoint path:
bash scripts/LiLoRA/Eval/1_eval_sqa.sh \
<checkpoint_path>

Our project is based on LLaVA, PEFT, and SMoLoRA. We sincerely thank them for their outstanding contributions.
This project is released under the Apache License 2.0. See LICENSE for details.
