This is the official code repository for the paper "Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation". GreenTrainer adaptively selects the trainable tensors for backpropagation to achieve a user-specified reduction in finetuning FLOPs while retaining accuracy. GreenTrainer extends our previous work ElasticTrainer to LLMs.
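Conceptually, freezing a tensor removes its gradient computation from backpropagation, which is where the FLOPs savings come from. The snippet below is only a minimal sketch of that mechanism, not the selection algorithm implemented in this repository; the small stand-in model and the "last k layers" rule are hypothetical.

from transformers import AutoModelForCausalLM

# Small stand-in model for illustration; GreenTrainer targets larger LLMs such as OPT-2.7B.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Hypothetical selection rule: keep only the last k decoder layers trainable.
# GreenTrainer instead selects tensors adaptively to meet the FLOPs objective.
num_layers = len(model.model.decoder.layers)
k = 4
trainable = tuple(f"model.decoder.layers.{i}." for i in range(num_layers - k, num_layers))
for name, param in model.named_parameters():
    param.requires_grad = name.startswith(trainable)  # frozen tensors skip backprop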
All the experiments were run on Lambda Cloud instances. To install all the dependencies, run:
bash requirements.sh
For decoder structures, navigate to the decoder_lm folder. Run the following commands to finetune:
bash opt_scitldr.sh # finetune OPT-2.7B model on scitldr dataset
bash opt_dialogsum.sh # finetune OPT-2.7B model on dialogsum dataset
or pass specific configurations to main.py:
# GreenTrainer-0.5
python3 main.py --model_name facebook/opt-2.7b \
--dataset_name scitldr \
--scheme green_trainer \
--max_input_length 512 \
--max_output_length 64 \
--batch_size 4 \
--rho 0.4
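The --rho flag controls GreenTrainer's FLOPs objective (see the paper for its exact definition). As a further illustration, the command below is our own variant, not taken from the paper: it assumes dialogsum is accepted as a --dataset_name, as the script names suggest, and reuses the remaining flags from the example above.

# Illustrative variant on DialogSum (values assumed)
python3 main.py --model_name facebook/opt-2.7b \
--dataset_name dialogsum \
--scheme green_trainer \
--max_input_length 512 \
--max_output_length 64 \
--batch_size 4 \
--rho 0.7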
For encoder-decoder structures, navigate to the encoder_decoder_lm folder and follow the same steps as above.
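As a sketch of what such a run might look like, the command below assumes that the encoder_decoder_lm main.py accepts the same flags as the decoder version and that FLAN-T5-3B corresponds to the google/flan-t5-xl checkpoint:

# Assumed to mirror the decoder_lm interface
python3 main.py --model_name google/flan-t5-xl \
--dataset_name scitldr \
--scheme green_trainer \
--max_input_length 512 \
--max_output_length 64 \
--batch_size 4 \
--rho 0.4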
Most of the experiments in our paper were run on a Lambda Cloud instance with a single NVIDIA H100 80GB GPU and 24 vCPUs. If you select other configurations, the wall-clock time measurements will not match the results in our paper. Please run the following scripts to reproduce the main results for OPT-2.7B, BLOOMZ-3B, and FLAN-T5-3B, respectively:
Navigate to the decoder_lm folder:
# finetuning OPT-2.7B on SciTLDR and DialogSum datasets
bash opt_scitldr.sh
bash opt_dialogsum.sh
# finetuning BLOOMZ-3B on SciTLDR and DialogSum datasets
bash bloom_scitldr.sh
bash bloom_dialogsum.sh
# finetuning OPT-2.7B on webquestions and piqa datasets
bash opt_webquestions.sh
bash opt_piqa.sh
Navigate to the encoder_decoder_lm folder:
# finetuning FLAN-T5-3B on SciTLDR and DialogSum datasets
bash flant5_scitldr.sh
bash flant5_dialogsum.sh
Note: For OPT and BLOOMZ, we adopt the prompt structure {src} TL;DR: {summary} on the SciTLDR and DialogSum datasets, question:{q}</s>answer:{a}</s> on the WebQuestions dataset, and goal:{goal}</s>sol1:{sol1}</s>sol2:{sol2}</s>label:{label}</s> on the PIQA dataset. For the summarization datasets, {src} TL;DR: {summary}</s> should give better results, but we did not adopt it when writing this paper.
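For clarity, the templates above correspond to string formatting along the lines of the sketch below; the function and argument names are ours, not identifiers from this repository.

def build_summarization_example(src, summary):
    # SciTLDR / DialogSum: "{src} TL;DR: {summary}"
    return f"{src} TL;DR: {summary}"

def build_webquestions_example(q, a):
    # WebQuestions: "question:{q}</s>answer:{a}</s>"
    return f"question:{q}</s>answer:{a}</s>"

def build_piqa_example(goal, sol1, sol2, label):
    # PIQA: "goal:{goal}</s>sol1:{sol1}</s>sol2:{sol2}</s>label:{label}</s>"
    return f"goal:{goal}</s>sol1:{sol1}</s>sol2:{sol2}</s>label:{label}</s>"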
@article{huang2023towards,
title={Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation},
author={Huang, Kai and Yin, Hanyun and Huang, Heng and Gao, Wei},
journal={arXiv preprint arXiv:2309.13192},
year={2023}
}