# PEFT Example (LoRA)

In [44]:
import pandas

# all of these are HuggingFace libraries:
import datasets
import peft
import transformers


## Load Pre-Trained Model and Tokenizer

Pick a pre-trained model from the [HuggingFace models hub](https://huggingface.co/models) for the type of task you're interested in. Here I choose a GPT2 variant for causal language modelling (text generation). Each model has an associated tokenizer to preprocess text into the correct format (e.g. correct token ids) for that model.

In [32]:
model_name = "gpt2-medium"
model = transformers.AutoModelForCausalLM.from_pretrained(model_name)

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

We can have a look at the architecture of the model and how many parameters it contains (you can see it has 24 `GPT2Block`s which each contain an attention module, and about 350 million parameters):

In [33]:
print(model)
print("No. parameters: ", model.num_parameters())

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 1024)
    (wpe): Embedding(1024, 1024)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-23): 24 x GPT2Block(
        (ln_1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=1024, out_features=50257, bias=False)
)
No. parameters:  354823168
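
As a sanity check on that figure, here is a minimal sketch that rebuilds the parameter count from the layer shapes in the printout. Two things are assumptions on my part because the printout does not show them: the MLP hidden width (taken to be 4 x 1024, since `Conv1D()` prints no dimensions) and that `lm_head` shares its weights with `wte`, so it is not counted twice.

```python
# Rough parameter count for gpt2-medium, rebuilt from the printed architecture.
vocab_size, n_positions, d_model, n_layers = 50257, 1024, 1024, 24
d_mlp = 4 * d_model  # ASSUMPTION: MLP hidden width, not visible in the printout


def dense(n_in, n_out):
    """Number of parameters in a dense layer with a bias term."""
    return n_in * n_out + n_out


layer_norm = 2 * d_model  # one scale and one shift vector

embeddings = vocab_size * d_model + n_positions * d_model  # wte + wpe
block = (
    layer_norm                      # ln_1
    + dense(d_model, 3 * d_model)   # attn.c_attn (fused query/key/value)
    + dense(d_model, d_model)       # attn.c_proj
    + layer_norm                    # ln_2
    + dense(d_model, d_mlp)         # mlp.c_fc
    + dense(d_mlp, d_model)         # mlp.c_proj
)
# ASSUMPTION: lm_head is tied to wte, so it adds no parameters of its own.
total = embeddings + n_layers * block + layer_norm  # last term is ln_f

print(f"{total:,}")  # 354,823,168
```

Under those assumptions the total agrees with the value reported by `model.num_parameters()`.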


## Example Generations from Base Model

Have the base model generate some text from a blank input prompt. This gives a rough idea of the types of text and topics that were common in the pre-training dataset (it seems to be mostly US news, politics, and entertainment). You can also try different prompts if you like.

In [34]:
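def print_samples(model, tokenizer, prompt="", n_samples=3):
    # device="mps" targets Apple Silicon GPUs; change it (e.g. to "cpu" or "cuda") on other hardware
    pipe = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer, device="mps")
    # generate n_samples independent continuations of the prompt
    samples = pipe(prompt, num_return_sequences=n_samples)
    for s in samples:
        print(s["generated_text"], "\n ------")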


prompt = ""
print_samples(model, tokenizer, prompt)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


In his new book on economics, Adam Smith: A Modern History, Martin Feldmeier points to two main features that helped shape economics. To those who remember his arguments, they will feel comfortable and at ease. The first is Smith's reliance on 
 ------
The following pages provide examples of popular programs that can help you learn the programming language Ruby. Some of the best examples are listed in the Order the Programming Languages, but most tutorials here also demonstrate other languages; those examples are listed as reference, not 
 ------
Mitt Romney. NBC News Photo credit: Nick Ut

Mitt Romney is one of the world's most outspoken and well known feminists — but the American right wing is still waiting for what he'd call a "feminist" candidate. 
 ------


## Load a Dataset of a Celebrity's Tweets

I've downloaded datasets of tweets from a few different celebrities [from Kaggle](https://www.kaggle.com/datasets/ahmedshahriarsakib/top-1000-twitter-celebrity-tweets-embeddings). Here are a few from the actor Anna Kendrick:

In [5]:
celebrity = "annakendrick"
df = pandas.read_csv(f"{celebrity}.csv")
for t in df["tweet"].sample(5):
    print(t, "\n")

@RyBrockington granddaughter of Max 

Go Pats!!!! 

Pinky, are you pondering what I'm pondering? 

My day job. http://t.co/zbz6r3SKEr 

My first time in sales… not sure if I'm doing this right… https://t.co/ZFAkKyIfr1 https://t.co/LEq9a5H2YJ 



Convert the dataframe into a HuggingFace dataset:

In [6]:
dataset = datasets.Dataset.from_pandas(df)
dataset

Dataset({
    features: ['tweet'],
    num_rows: 2468
})

And preprocess the dataset by running it through the tokenizer (removing the original `tweet` feature afterwards as the training will use the created `input_ids` and `attention_mask` features instead):

In [7]:
dataset = dataset.map(lambda sample: tokenizer(sample["tweet"]), remove_columns=dataset.column_names)
dataset

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)



Dataset({
    features: ['input_ids', 'attention_mask'],
    num_rows: 2468
})

## Create a PEFT LoRA Config

To prepare a model for PEFT fine-tuning you need to create a config for the PEFT method you're using, in this case `LoraConfig`, and then call `peft.get_peft_model` with the base model and PEFT config. Here's a default LoRA setup:

In [45]:
peft_config = peft.LoraConfig(
    task_type=peft.TaskType.CAUSAL_LM,
    r = 8,  # LoRA "rank" (small dimension of learnt A, B matrices)
    target_modules=None,  # which layers to apply LoRA to (has model-specific defaults, usually attention queries and values)
)
peft_model = peft.get_peft_model(model, peft_config)



The adapted PEFT model contains additional LoRA modules, for example the (low rank) matrices `A` and `B` which are used to learn the adjustment to the attention weights (you can see that `out_features` of `A` and `in_features` of `B` are both 8 - the LoRA rank specified in the config).

In this case LoRA will fine-tune about 0.2% of the total parameters in the model (fewer than a million out of 350 million parameters).
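
The 0.2% figure can be reproduced with a little arithmetic. This is only a sketch based on the shapes shown in the printout below (LoRA applied to `c_attn` in each of the 24 blocks, with `A` of shape 1024 x 8 and `B` of shape 8 x 3072) plus the base parameter count printed earlier:

```python
# Count the LoRA parameters added by rank-8 adapters on c_attn in every block.
d_in, d_out, rank, n_layers = 1024, 3072, 8, 24
base_params = 354_823_168  # from model.num_parameters() earlier in the notebook

lora_per_layer = d_in * rank + rank * d_out  # A (1024 x 8) + B (8 x 3072)
lora_params = n_layers * lora_per_layer
fraction = lora_params / (base_params + lora_params)

# For comparison: a full-rank update of c_attn's weight in a single layer
full_rank_per_layer = d_in * d_out

print(f"LoRA parameters: {lora_params:,} ({fraction:.2%} of all parameters)")
print(f"Per layer: {lora_per_layer:,} (LoRA) vs {full_rank_per_layer:,} (full-rank update)")
```

So each layer learns 32,768 values instead of the roughly 3.1 million a full-rank update to the same weight matrix would need.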

In [46]:
print(peft_model)
peft_model.print_trainable_parameters()

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): GPT2LMHeadModel(
      (transformer): GPT2Model(
        (wte): Embedding(50257, 1024)
        (wpe): Embedding(1024, 1024)
        (drop): Dropout(p=0.1, inplace=False)
        (h): ModuleList(
          (0-23): 24 x GPT2Block(
            (ln_1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            (attn): GPT2Attention(
              (c_attn): Linear(
                in_features=1024, out_features=3072, bias=True
                (lora_dropout): ModuleDict(
                  (default): Identity()
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=1024, out_features=8, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=8, out_features=3072, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
      

## Fine-Tune the PEFT model on the tweets

Training the PEFT model should now work the same way as training any other HuggingFace model with the HuggingFace `Trainer` class:

In [9]:
args = transformers.TrainingArguments(
    "tmp",
    num_train_epochs=0.1, # increase num_train_epochs to at least a few whole epochs if you want to see a clear difference in the fine-tuned model
    save_strategy="no",
)

In [10]:
trainer = transformers.Trainer(
    peft_model,
    args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
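
To give a rough picture of what the `DataCollatorForLanguageModeling(mlm=False)` collator does to each batch, here is a plain-Python sketch of my understanding: sequences are padded to a common length, and the labels are a copy of the input ids with padding positions replaced by -100 so the loss ignores them. The function name `collate_for_causal_lm` and the token ids are made up for illustration; the real collator works on tensors and its details may differ.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def collate_for_causal_lm(batch, pad_token_id):
    """Pad token-id lists to equal length and build labels for causal LM training.

    Hypothetical helper for illustration only, not part of transformers.
    """
    max_len = max(len(ids) for ids in batch)
    input_ids, attention_mask, labels = [], [], []
    for ids in batch:
        n_pad = max_len - len(ids)
        padded = ids + [pad_token_id] * n_pad  # right-pad to the longest sequence
        input_ids.append(padded)
        attention_mask.append([1] * len(ids) + [0] * n_pad)
        # Labels copy the inputs; every pad-token position is masked out of the loss.
        labels.append([IGNORE_INDEX if t == pad_token_id else t for t in padded])
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


# Two toy "tweets" of different lengths; 50256 is GPT-2's eos id, used here as the pad id.
batch = collate_for_causal_lm([[10, 11, 12], [20, 21]], pad_token_id=50256)
print(batch["input_ids"])  # [[10, 11, 12], [20, 21, 50256]]
print(batch["labels"])     # [[10, 11, 12], [20, 21, -100]]
```

The labels are not shifted here because, as far as I know, the causal LM model shifts them by one position internally when computing the loss. One side effect of using the eos token as the pad token is that any genuine eos tokens in the data would be masked out of the loss as well.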

In [11]:
trainer.train()

You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


Step,Training Loss


TrainOutput(global_step=31, training_loss=2.728060814642137, metrics={'train_runtime': 23.1772, 'train_samples_per_second': 10.648, 'train_steps_per_second': 1.338, 'total_flos': 22492285698048.0, 'train_loss': 2.728060814642137, 'epoch': 0.1})
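
The `global_step=31` in that output can be reproduced by hand. This sketch assumes the default `per_device_train_batch_size` of 8, a single device, and no gradient accumulation, none of which are set explicitly in the notebook:

```python
import math

n_examples = 2468        # rows in the tokenized dataset
batch_size = 8           # ASSUMPTION: TrainingArguments default, single device
num_train_epochs = 0.1   # as set in TrainingArguments above

steps_per_epoch = math.ceil(n_examples / batch_size)
total_steps = math.ceil(num_train_epochs * steps_per_epoch)

print(steps_per_epoch, total_steps)  # 309 31
```

With only 31 steps and a default logging interval of 500 steps, no intermediate losses are logged, which would explain why the loss table above has no rows.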

## Save the LoRA weights

Saving the PEFT model stores only the LoRA weights, which take up a few megabytes (see the size of `adapter_model.bin` below) rather than the roughly 1.5 GB of the original model:

In [51]:
peft_model.save_pretrained(f"{celebrity}")

!ls -lh {celebrity}

total 6200
-rw-r--r--@ 1 jroberts  staff    88B Sep 25 14:49 README.md
-rw-r--r--@ 1 jroberts  staff   416B Sep 25 14:49 adapter_config.json
-rw-r--r--@ 1 jroberts  staff   3.0M Sep 25 14:49 adapter_model.bin
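
The adapter file size is consistent with a back-of-the-envelope estimate. The sketch below assumes the weights are stored as 32-bit floats and ignores any file-format overhead:

```python
BYTES_PER_FLOAT32 = 4  # ASSUMPTION: weights saved in float32

lora_params = 24 * (1024 * 8 + 8 * 3072)  # A and B matrices for c_attn in 24 blocks
base_params = 354_823_168                 # from model.num_parameters()

adapter_mib = lora_params * BYTES_PER_FLOAT32 / 2**20
base_gib = base_params * BYTES_PER_FLOAT32 / 2**30

print(f"adapter: {adapter_mib:.1f} MiB, base model parameters: {base_gib:.2f} GiB")
```

The adapter estimate matches the 3.0M reported by `ls`. The base model estimate covers parameters only, so a checkpoint file on disk may be somewhat larger if it also stores non-parameter buffers.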


## Load the fine-tuned model

`AutoPeftModelForCausalLM.from_pretrained` performs two steps in the background:

- Load the appropriate base model (defined in the saved PEFT config)
- Load the LoRA weights to adapt the base model

At this point you could also merge the LoRA weights into the base weights (computing $W_0 + BA$, where $B$ and $A$ are the learnt low-rank matrices), in which case using the LoRA-adjusted model adds no extra latency (see the [PEFT documentation](https://huggingface.co/docs/peft/index) for how to do this).
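
To see why merging changes nothing about the model's outputs, here is a toy sketch in plain Python. It uses the column-vector convention of the LoRA paper, where the update is the product of the `d_out x r` matrix `B` and the `r x d_in` matrix `A`, and it assumes the LoRA scaling factor (alpha / r) is 1. The tiny matrices are made up for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matadd(X, Y):
    """Add two matrices of the same shape element-wise."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


# Toy sizes: d_in = 3, d_out = 2, rank r = 1 (integers keep the comparison exact)
W0 = [[1, 0, 2], [0, 1, -1]]  # frozen base weight, d_out x d_in
B = [[2], [1]]                # LoRA B, d_out x r
A = [[1, -1, 0]]              # LoRA A, r x d_in
x = [[3], [4], [5]]           # input as a column vector

# Unmerged: base path plus a separate low-rank path (two extra small matmuls)
h_unmerged = matadd(matmul(W0, x), matmul(B, matmul(A, x)))

# Merged: fold the update into the weight once, then use a single matmul
W_merged = matadd(W0, matmul(B, A))
h_merged = matmul(W_merged, x)

print(h_unmerged, h_merged)  # [[11], [-2]] [[11], [-2]]
```

After merging, inference costs the same as the original model because the low-rank path no longer has to be evaluated separately. The trade-off is that the adapter can no longer be swapped out without keeping a copy of the original weights.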

Generating some more samples with a blank input prompt, you can see that our model fine-tuned on Anna Kendrick's tweets is more likely to talk about films, TV, or general day-to-day things than the original (news-heavy) base model:

In [13]:
loaded_peft_model = peft.AutoPeftModelForCausalLM.from_pretrained(celebrity)
print_samples(loaded_peft_model, tokenizer, "")

The model 'PeftModelForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'LlamaForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'FalconForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MptForCausalLM', 'MusicgenForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'PegasusForCausalLM', 'PLBartForCausalLM', 'ProphetNetForCausalLM', 'QDQBertLMHeadModel', 'Refor

Will Smith in 'Life of Pi'? It's a film no one seems to remember! Read on for the details. Read my other stories from this crazy first weekend: 1. http://twothirdsandthings.com/2014/08 
 ------
A former executive with Twitter has spoken out against the company's latest decision to lock down data belonging to Twitter, arguing that it has given users the false impression that the company is controlling their data and that they are being sold data "in return" 
 ------
The only way to tell people I'm making some bullshit joke is to tell other people my fucking joke. I know, I know. I have it. No excuses. No apologies. I'm done. So let's move on. #Me 
 ------


## Other Celebrities

In the repo there are LoRA adapters (each trained for 10 epochs) for the tweets of 5 celebrities. Compare the types of text each model generates below. Note that the loop below re-loads the base model on every iteration. If you needed model loading to be faster in deployment, you could instead swap only the LoRA weights in and out each time (again, see the PEFT documentation).

In [39]:
prompt = ""
celebrities = ["annakendrick", "billgates", "eminem", "oprah", "elonmusk"]
for c in celebrities:
    print("\n", "=" * 5, c, "=" * 5)
    loaded_peft_model = peft.AutoPeftModelForCausalLM.from_pretrained(c)
    print_samples(loaded_peft_model, tokenizer, prompt)


 ===== annakendrick =====



Don't get that crazy guy from last show on the island. That's exactly him. #mitchkrollmovies — Alex Malarkey (@alexmmalarkey) February 18, 2013

Just made the mistake of 
 ------
The new world, with its endless streams of money and abundance, is truly a joy. There is so much shit sitting in the trash that you have to watch out for to see it, right? It's a wonderful thing. And just before 
 ------
The latest episode of 'Shark Tank' featured a woman with tattoos that resembled 'Frozen' characters - so why not?

I was shocked and happy to get a sneak peak @SharkTankSBS. This show is way 
 ------

 ===== billgates =====



@magnificentbio You are a wonderful leader in science, and you deserve a thank you. I’m sorry to see you go (but’’we’re thrilled to have you here). You’ 
 ------
The world is watching #TheGreatFireWith #Africa, and @movietaprov is working on ways to help as well: https://t.co/5iXdSQ9V0n https://t. 
 ------
How do we build great infrastructure? A combination of #MITxCoral and #MITxSweden. This conversation @CoralWatch has been really encouraging. https://t.co/4ZjUqNlX4v 
 ------

 ===== eminem =====



In the first major interview since stepping down from his role at Comedy Central's The Daily Show in October, Will Ferrell gives Rolling Stone an inside view of the first season of Mad Max: Fury Road … and he's not impressed. Watch the 
 ------
Shakespearean music that makes you fall from your chair in pain... https://t.co/8uX7s6FpN1 http://t.co/nBhN6b7r1nk http:// 
 ------
Episode #12: T-Shirt Design and Merchandise.
This week, I have reoccurring guest Matt Zoller Seitz, one of the co-lead hosts of the upcoming show @TheRebel. Join as I take 
 ------

 ===== oprah =====



@WMABCTV: @VernonWayland joins us today to share stories about his journey, from the very beginning of his #HairGate story, to the battle to have his @OWNTV interview air on the @OWN 
 ------
Sunday, October 17, 2016
The Story
With the season upon us, it is time to share our favorite "Gosh. I wonder about #TheHinduTribes." Read on for @TheCaucasianNation's exclusive take 
 ------
Just hours after taking responsibility for the death of his sister, Donta'a Williams says there was no malicious intent and that her death was a tragic accident.

On Saturday in Orlando, Florida police officers arrested 18-year-old 
 ------

 ===== elonmusk =====



Heaven is very happy that you have a sweet baby.

Heaven's Little’s 🙁 (@crispywaffles) is a very small baby girl so I know it really likes cake.

Heaven 
 ------
We had a great opportunity last night to tour with a very popular show band as guests on our very first @washington @TheMuskBlog podcast. This time around I took my time to talk with each of them, talk about the show 
 ------
He has been calling for a few years now to have a ban on all weapons that can be modelled after human beings.

If we want a more peaceful world, we should be stopping humans from starting guns. #FreePalestine https 
 ------
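
For reference, swapping adapters on a single copy of the base model might look like the sketch below. It relies on `PeftModel.from_pretrained`, `load_adapter`, and `set_adapter` from PEFT as I understand them; the function name `sample_from_adapters`, the `adapters` mapping, and the generation settings are my own choices for illustration, and I have not run this against the adapters in the repo.

```python
def sample_from_adapters(base_model_name, adapters, prompt="", n_samples=3):
    """Load the base model once, then switch between LoRA adapters by name.

    `adapters` maps an adapter name to the directory it was saved in,
    e.g. {"annakendrick": "annakendrick", "billgates": "billgates"}.
    Hypothetical helper for illustration only.
    """
    # Imported here so that defining the function does not itself load the libraries.
    import peft
    import transformers

    tokenizer = transformers.AutoTokenizer.from_pretrained(base_model_name)
    base = transformers.AutoModelForCausalLM.from_pretrained(base_model_name)

    # Wrap the base model with the first adapter, then attach the rest by name.
    names = list(adapters)
    model = peft.PeftModel.from_pretrained(base, adapters[names[0]], adapter_name=names[0])
    for name in names[1:]:
        model.load_adapter(adapters[name], adapter_name=name)

    results = {}
    for name in names:
        model.set_adapter(name)  # activate this adapter; the base weights stay loaded
        # A blank prompt has no tokens, so fall back to the beginning-of-text token.
        inputs = tokenizer(prompt or tokenizer.bos_token, return_tensors="pt")
        outputs = model.generate(
            **inputs,
            do_sample=True,
            max_new_tokens=50,
            num_return_sequences=n_samples,
            pad_token_id=tokenizer.eos_token_id,
        )
        results[name] = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return results
```

Called as `sample_from_adapters("gpt2-medium", {c: c for c in celebrities})`, this would load the 350 million base parameters once and only the few-megabyte adapters per celebrity.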
