<a href="https://colab.research.google.com/github/MakovChen/LLMs-Development-Kit/blob/main/Parameter-Efficient%20Fine-Tuning/GPT_2%2BLoRA_with_peft_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Parameter-Efficient Fine-Tuning Example
This notebook shows how to attach an adapter to a base model and demonstrates the Hugging Face `transformers` API for training, exporting, and deploying an LLM. The code is meant to run on a Colab T4 GPU, so the small GPT-2 model is used for the demo.


In [1]:
!pip install transformers
!pip install peft
!pip install datasets



### Load related resources

In [2]:
import torch, transformers, peft, datasets
import numpy as np

#### Get the base model and its corresponding tokenizer from the Hugging Face Hub
Many open-source pretrained base models are hosted on Hugging Face. You can browse them at https://huggingface.co/models and load one directly by its repository name.
* The tokenizer encodes text into the token IDs that the base model expects and decodes generated token IDs back into text.

In [3]:
model_name = "gpt2"
base_model = transformers.AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

#### Get the instruction set for training
`cnn_dailymail` is a news summarization dataset in which each article is paired with its highlights; it serves as the training data for GPT-2 here. You can also choose another dataset (https://huggingface.co/datasets?sort=downloads); just remember to convert its content into the text format you want GPT-2 to learn.
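
As a rough illustration of such a format conversion, the sketch below joins an article and its highlights into one training string. The helper name `format_example` and the `TL;DR:` separator are assumptions made for illustration; the notebook itself tokenizes the two fields separately.

```python
# Hypothetical helper: turn one cnn_dailymail-style record into a single prompt string.
def format_example(example: dict, separator: str = "\n\nTL;DR: ") -> str:
    # Strip stray whitespace so the separator is the only boundary between the fields.
    article = example["article"].strip()
    highlights = example["highlights"].strip()
    return f"{article}{separator}{highlights}"

# Made-up record that uses the same field names as the dataset.
sample = {
    "article": "The city opened a new bridge on Monday. ",
    "highlights": "New bridge opens.",
}
print(format_example(sample))
```

Keeping the conversion in one small function makes it easy to swap the separator or the field names when a different dataset is used.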

In [4]:
dataset = datasets.load_dataset("cnn_dailymail", '3.0.0')
dataset




DatasetDict({
    train: Dataset({
        features: ['article', 'highlights', 'id'],
        num_rows: 287113
    })
    validation: Dataset({
        features: ['article', 'highlights', 'id'],
        num_rows: 13368
    })
    test: Dataset({
        features: ['article', 'highlights', 'id'],
        num_rows: 11490
    })
})

### Environmental settings and data pre-processing

#### Adding an adapter to the base model
`peft` wraps the base model with the adapter described by the config (here a `LoraConfig`) and makes sure that only the adapter weights are updated during training. See the peft documentation for the adapter types that are compatible with each model
(https://github.com/huggingface/peft)

* The cell output shows that only about 0.24% of the parameters are trainable, roughly 99.76% fewer than when fine-tuning the entire LLM directly. This greatly reduces the memory needed for gradients and optimizer state and makes custom training feasible in a personal environment.
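
To make the idea concrete, here is a minimal pure-Python sketch of the LoRA update for a single linear layer: the frozen weight `W` is left untouched and a low-rank product `B @ A`, scaled by `lora_alpha / r`, is added to it. The tiny matrices and the helper names `matmul` and `lora_effective_weight` are illustrative assumptions, not peft internals.

```python
def matmul(x, y):
    # Plain nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def lora_effective_weight(W, A, B, r, lora_alpha):
    # Effective weight = W + (lora_alpha / r) * (B @ A); W itself is never modified.
    scale = lora_alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

# Toy layer: 3 outputs, 4 inputs, rank r = 2 (the notebook uses r=8, lora_alpha=16).
W = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
A = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]  # shape (r, in_features)
B_init = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]      # shape (out_features, r), zero in the usual LoRA initialization

# With B at zero the adapter contributes nothing, so the wrapped layer starts out equal to the base layer.
print(lora_effective_weight(W, A, B_init, r=2, lora_alpha=4) == W)

# Once training moves B away from zero, the effective weight changes while W stays frozen.
B_trained = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
print(lora_effective_weight(W, A, B_trained, r=2, lora_alpha=4)[0])
```

Only `A` and `B` would receive gradients, which is why the trainable share of parameters is so small.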

In [5]:
# Freeze the base weights and prepare the model for adapter training.
# Note: newer peft releases replace this helper with peft.prepare_model_for_kbit_training.
model = peft.prepare_model_for_int8_training(base_model)
lora_config = peft.LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
model = peft.get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Save a copy of the LoRA adapter weights before training so they can be compared after training
params_dict = {name: params.detach().cpu().numpy().copy() for name, params in model.named_parameters() if "lora" in name}



trainable params: 294912 || all params: 124734720 || trainable%: 0.23643136409814364
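
The numbers in this output can be reproduced by hand. The sketch below assumes that peft's default LoRA target for GPT-2 is the fused attention projection `c_attn` (768 inputs, 2304 outputs) in each of the 12 transformer blocks; that assumption is not stated in the notebook, but it is consistent with the reported count.

```python
# Assumed GPT-2 (small) shapes; only r comes from the LoraConfig above.
hidden_size = 768
c_attn_out = 3 * hidden_size   # query, key and value are fused into one projection
num_layers = 12
r = 8

# Each adapted layer gets A with shape (r, in_features) and B with shape (out_features, r).
lora_per_layer = r * hidden_size + c_attn_out * r
lora_total = lora_per_layer * num_layers

all_params = 124734720         # "all params" as printed by print_trainable_parameters()
base_params = all_params - lora_total

print("LoRA params:", lora_total)
print("base params:", base_params)
print("trainable %:", 100 * lora_total / all_params)
```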


#### Convert the original dataset into tokens with the tokenizer
Because the texts in the dataset have different lengths, each one is padded with the end-of-sequence token (`<|endoftext|>` for GPT-2) up to the maximum length supported by the GPT-2 model, and truncated if it is longer.
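
The padding and truncation rule can be sketched in plain Python as follows. This is a simplified stand-in for what the tokenizer does, with assumed values: 50256 as the ID of GPT-2's EOS token, left-side padding, and truncation that keeps the first `max_length` tokens. `pad_or_truncate` is a hypothetical helper, not a tokenizer method.

```python
EOS_ID = 50256  # assumed ID of GPT-2's <|endoftext|> token

def pad_or_truncate(ids, max_length, pad_id=EOS_ID):
    # Too long (or exactly full): keep only the first max_length tokens.
    if len(ids) >= max_length:
        return ids[:max_length]
    # Too short: fill the left side with the padding token.
    return [pad_id] * (max_length - len(ids)) + ids

print(pad_or_truncate([10, 11, 12], max_length=5))
print(pad_or_truncate([10, 11, 12, 13, 14, 15, 16], max_length=5))
```

Every sequence comes out with exactly `max_length` entries, which is what lets the samples be stacked into fixed-size batches.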


In [6]:
# GPT-2 has no dedicated padding token, so reuse the EOS token and pad on the left.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

def preprocess(example):
  # Pad or truncate both fields to the model's maximum length.
  # Note: the labels are tokenized separately from the inputs, so they line up with input_ids only by position.
  example["input_ids"] = tokenizer(example["article"], truncation=True, padding="max_length", return_tensors="pt").input_ids
  example["labels"] = tokenizer(example["highlights"], truncation=True, padding="max_length", return_tensors="pt").input_ids
  return example

# Only the first 10 samples of the training and validation subsets are used in this notebook as a demonstration
train_dataset = dataset["train"].select(range(10)).map(preprocess)
val_dataset = dataset["validation"].select(range(10)).map(preprocess)




### Execute training procedures

#### Check the model's output before training
Define a helper function that tokenizes the input text, passes the token IDs to the model to generate new tokens, and decodes the result back into text, so that it can be reused in later cells.

In [7]:
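def generate(text, model):
  # Tokenize the prompt into a tensor of token IDs.
  input_ids = tokenizer(text, return_tensors="pt").input_ids
  # Sampling is not enabled (do_sample defaults to False) and num_beams=1, so decoding is greedy
  # and temperature, top_p and top_k have no effect on the result.
  generation_config = transformers.GenerationConfig(temperature=0, top_p=0.75, top_k=40, num_beams=1, pad_token_id=tokenizer.eos_token_id)
  output_ids = model.generate(input_ids=input_ids, max_length=100, generation_config=generation_config)
  # Decode the first (and only) generated sequence back into text.
  output = tokenizer.batch_decode(output_ids)[0]
  return output

print(generate("what is GPT?", model))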

what is GPT?

GPT is a term used to describe the practice of using a computer program to perform a task. It is used to describe the process of performing a task.

GPT is a term used to describe the practice of using a computer program to perform a task. It is used to describe the process of performing a task. GPT is a term used to describe the process of performing a task.

GPT is a term used to describe the process
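
The repeated sentences above are typical of greedy decoding, where the single most likely next token is chosen at every step. With `num_beams=1` and sampling not enabled, `generate` decodes greedily, so the same context always yields the same continuation. The toy sketch below uses a hand-written score table in place of a real model (an assumption for illustration only) to show how that can fall into a loop.

```python
# Hypothetical next-token scores keyed by the previous token.
# A real language model conditions on the whole context, not just the last token.
scores = {
    "GPT": {"is": 0.9, "was": 0.1},
    "is": {"a": 0.8, "the": 0.2},
    "a": {"term": 0.7, "model": 0.3},
    "term": {".": 0.6, "used": 0.4},
    ".": {"GPT": 0.9, "It": 0.1},
}

def greedy_decode(start, max_new_tokens):
    tokens = [start]
    for _ in range(max_new_tokens):
        candidates = scores[tokens[-1]]
        # Greedy step: always take the highest-scoring candidate.
        tokens.append(max(candidates, key=candidates.get))
    return tokens

print(" ".join(greedy_decode("GPT", 10)))
```

Because no randomness is involved, the sequence revisits the same tokens in the same order once it returns to a token it has already produced.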


#### Configure the trainer and start training
The model is accessible through the `trainer.model` property. Calling `train()` runs the PyTorch training loop and updates the trainable (LoRA) weights of that model in place.

In [8]:
args = transformers.TrainingArguments(output_dir="./results", learning_rate=1e-3, per_device_train_batch_size=2, num_train_epochs=4)
trainer = transformers.Trainer(model = model, train_dataset = train_dataset, eval_dataset = val_dataset, args=args)

In [9]:
trainer.train()





TrainOutput(global_step=20, training_loss=4.531863784790039, metrics={'train_runtime': 12.1959, 'train_samples_per_second': 3.28, 'train_steps_per_second': 1.64, 'total_flos': 20975840133120.0, 'train_loss': 4.531863784790039, 'epoch': 4.0})
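
The `global_step=20` in this output follows from the settings above: 10 training samples, a per-device batch size of 2, and 4 epochs. A quick check, assuming a single GPU and no gradient accumulation:

```python
import math

num_samples = 10         # size of train_dataset
batch_size = 2           # per_device_train_batch_size
num_epochs = 4           # num_train_epochs
train_runtime = 12.1959  # seconds, taken from the TrainOutput above

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

print("total steps:", total_steps)
print("steps per second:", round(total_steps / train_runtime, 2))
print("samples per second:", round(num_samples * num_epochs / train_runtime, 2))
```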

#### Verify that the LoRA parameters have been updated

In [10]:
# Sum the absolute differences between the saved and the current LoRA weights.
# (A signed sum could cancel out to zero even if the weights changed.)
diff = 0.0
for name, params in trainer.model.named_parameters():
  if name in params_dict:
      diff += np.sum(np.abs(params.detach().cpu().numpy() - params_dict[name]))
print('LoRA weights updated:', diff > 0)

# Compare the generation after training. The Trainer moved the model to the GPU, so newly created tensors (such as the tokenized input) must be created on the GPU as well.
torch.set_default_device(torch.device('cuda:0'))
print(generate("what is GPT?", trainer.model))

LoRA weights updated: True
what is GPT?

The first thing I did was to look at the top of the page and see if there was any other information that I could find. I could find. I could find.

I looked at the top of the page and I found the following:

The first thing I did was look at the first thing I could find.


I looked at the first thing I could find.

I looked at the first thing I could find.
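
A note on comparing weight snapshots: a signed sum of differences can come out as zero even though individual values changed, because positive and negative changes cancel. The made-up numbers below illustrate this and show that a sum of absolute differences does not have the problem.

```python
# Made-up "weights" before and after an update; two values move in opposite directions.
before = [0.5, -0.5, 1.0]
after = [0.75, -0.75, 1.0]

signed = sum(a - b for a, b in zip(after, before))         # +0.25 and -0.25 cancel
absolute = sum(abs(a - b) for a, b in zip(after, before))  # every change counts

print("signed sum of differences:", signed)
print("absolute sum of differences:", absolute)
print("weights changed:", absolute > 0)
```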


#### Save the model and tokenizer

In [14]:
torch.save(trainer.model, "lora_gpt2.pth")
tokenizer.save_pretrained('./Tokenzier')

('./Tokenzier/tokenizer_config.json',
 './Tokenzier/special_tokens_map.json',
 './Tokenzier/vocab.json',
 './Tokenzier/merges.txt',
 './Tokenzier/added_tokens.json',
 './Tokenzier/tokenizer.json')
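
`torch.save(trainer.model, ...)` pickles the whole model, base weights included. For a sense of scale, the back-of-the-envelope sketch below compares that with storing only the LoRA weights, assuming 4 bytes per parameter (float32) and ignoring file-format overhead. The parameter counts are taken from the `print_trainable_parameters()` output earlier in the notebook.

```python
BYTES_PER_PARAM = 4  # assumption: float32 weights

all_params = 124734720  # base model plus LoRA weights
lora_params = 294912    # LoRA weights only

full_mib = all_params * BYTES_PER_PARAM / 2**20
lora_mib = lora_params * BYTES_PER_PARAM / 2**20

print(f"full model: about {full_mib:.1f} MiB")
print(f"LoRA weights only: about {lora_mib:.3f} MiB")
```

peft models also provide `save_pretrained`, which writes only the adapter weights, so the adapter can be shared separately from the base model.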

### Deployment

In [19]:
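# The file holds the whole pickled model, so weights_only=False is needed on recent PyTorch versions
# (where weights_only defaults to True). Only load checkpoints you trust.
reload_model = torch.load("lora_gpt2.pth", weights_only=False)
reload_tokenizer = transformers.AutoTokenizer.from_pretrained('./Tokenzier')

input_text = "what is GPT?"

input_ids = reload_tokenizer(input_text, return_tensors="pt").input_ids
# Use the reloaded tokenizer's EOS ID rather than the training-time tokenizer.
output_ids = reload_model.generate(input_ids = input_ids, max_length=100, generation_config=transformers.GenerationConfig(temperature=0, top_p=0.75, top_k=40, num_beams=1, pad_token_id=reload_tokenizer.eos_token_id))
output = reload_tokenizer.batch_decode(output_ids)[0]
print(output)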

what is GPT?


GPT is a term used to describe the process of determining whether a person is a member of the GPT.

GPT is a term used to describe the process of determining whether a person is a member of the GPT.

GPT is a term used to describe the process of determining whether a person is a member of the GPT.
GPT is a term used to describe the process of determining whether a person is a member
