# Prompt Tuning with PEFT

In this example, we will apply prompt tuning with the PEFT library to a pretrained model.

## Introduction to prompt tuning

Prompt tuning is an *additive fine-tuning* technique, which means that we **do not modify any weights of the original model**. Instead, we train a small set of **additional parameters** (the embeddings of a few virtual tokens that are prepended to the input), which is why it is called an additive technique.

In effect, we create a kind of super-prompt: part of the prompt is learned by the model during training. That learned part lives directly in embedding space, so it cannot be translated back into natural language. **It is as if we had learned to express ourselves in embeddings and could write highly effective prompts with them.**

In each training step, the only weights that can change to minimize the loss function are those of the virtual prompt. Since the weights of the pretrained model stay frozen, **the base model itself is unchanged and does not forget anything it has previously learned**.

Training is faster and more cost-effective than full fine-tuning. Moreover, we can train several prompt-tuned models, and at inference time we only need to load one foundational model along with the much smaller trained prompts, because the weights of the original model have not been altered.
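
To make the idea concrete, here is a minimal pure-Python sketch of what "adding trainable virtual tokens" means. It is not how PEFT implements it internally: the toy embedding table, the sizes, and the helper `build_model_input` are all hypothetical and exist only for illustration.

```python
import random

HIDDEN_SIZE = 8          # toy embedding width (real models are much wider)
NUM_VIRTUAL_TOKENS = 4   # number of trainable virtual tokens

random.seed(0)

# Frozen part: a toy embedding table standing in for the pretrained model's.
frozen_embeddings = {
    token_id: [random.uniform(-1, 1) for _ in range(HIDDEN_SIZE)]
    for token_id in range(10)
}

# Trainable part: one randomly initialized vector per virtual token.
soft_prompt = [
    [random.uniform(-1, 1) for _ in range(HIDDEN_SIZE)]
    for _ in range(NUM_VIRTUAL_TOKENS)
]

def build_model_input(token_ids):
    """Prepend the soft prompt to the embeddings of the real tokens."""
    embedded_tokens = [frozen_embeddings[t] for t in token_ids]
    return soft_prompt + embedded_tokens

model_input = build_model_input([3, 1, 4])
trainable_values = NUM_VIRTUAL_TOKENS * HIDDEN_SIZE

print(len(model_input), "vectors of width", len(model_input[0]))
print("trainable values:", trainable_values)
```

Only the numbers in `soft_prompt` would be updated during training; the embedding table, like every other weight of the base model, stays fixed. Swapping in a different `soft_prompt` gives a different "model" on top of the same frozen base.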

## Load the PEFT library

In [None]:
!pip install -qU peft datasets transformers

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

The Bloom family includes small checkpoints that are practical for trying out prompt tuning with the PEFT library. We can choose any model from the Bloom family; this notebook uses `bloomz-560m`.

In [None]:
model_name = 'bigscience/bloomz-560m' # 'bigscience/bloomz-1b1'

NUM_VIRTUAL_TOKENS = 4
NUM_EPOCHS = 6

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_name)
foundational_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True
)

## Inference with the pretrained Bloom model

If we want more varied generations, we can uncomment the following parameters below: `temperature`, `top_p`, and `do_sample`.

In [None]:
def get_outputs(model, inputs, max_new_tokens=100):
    outputs = model.generate(
        input_ids=inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_new_tokens=max_new_tokens,
        #temperature=0.2,
        #top_p=0.95,
        #do_sample=True,
        repetition_penalty=1.5, # avoid repetition
        early_stopping=True,
        eos_token_id=tokenizer.eos_token_id
    )
    return outputs
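
As a rough illustration of what `temperature` and `top_p` do once `do_sample=True` is enabled, the sketch below applies both to a toy list of logits. It is a simplified, pure-Python approximation, not the implementation used by `transformers`; the helper names and the toy logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; temperature < 1 sharpens the distribution."""
    scaled = [value / temperature for value in logits]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - max_scaled) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]

def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for index in ranked:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    return kept

toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four candidate tokens
probs_default = softmax_with_temperature(toy_logits, temperature=1.0)
probs_sharp = softmax_with_temperature(toy_logits, temperature=0.2)

print([round(p, 3) for p in probs_default])
print([round(p, 3) for p in probs_sharp])
print("candidates at T=1.0:", top_p_filter(probs_default, top_p=0.95))
print("candidates at T=0.2:", top_p_filter(probs_sharp, top_p=0.95))
```

With these toy logits, the lower temperature concentrates almost all probability on the first token, so the set of candidates that survives the `top_p` cut shrinks from three tokens to one. Higher temperatures have the opposite effect and give more varied generations.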

In this example, we want to train two different models, so we will create two distinct prompts:
* The first model will be trained with a dataset containing prompts, such as "I want you to act as a motivational coach", and
* The second one with a dataset of motivational sentences, such as "There are two nice things that should matter to you:"

Before doing this, we will collect some results from the model without fine-tuning.

In [None]:
input_prompt = tokenizer(
    'I want you to act as a motivational coach',
    return_tensors='pt'
)
foundational_outputs_prompt = get_outputs(
    foundational_model,
    input_prompt,
    max_new_tokens=50
)
print(tokenizer.batch_decode(foundational_outputs_prompt, skip_special_tokens=True))

In [None]:
# Use separate names so the first prompt is not overwritten;
# input_sentences is reused in the final inference cell
input_sentences = tokenizer(
    'There are two nice things that should matter to you:',
    return_tensors='pt'
)
foundational_outputs_sentences = get_outputs(
    foundational_model,
    input_sentences,
    max_new_tokens=50
)
print(tokenizer.batch_decode(foundational_outputs_sentences, skip_special_tokens=True))

## Prepare the datasets

We will use the following datasets:
* [`awesome-chatgpt-prompts`](https://huggingface.co/datasets/fka/awesome-chatgpt-prompts)
* [`english_quotes`](https://huggingface.co/datasets/Abirate/english_quotes)

In [None]:
import os
from datasets import load_dataset

dataset_prompt = 'fka/awesome-chatgpt-prompts'
data_prompt = load_dataset(dataset_prompt)
data_prompt = data_prompt.map(lambda x: tokenizer(x['prompt']), batched=True)
train_sample_prompt = data_prompt['train'].select(range(50))

In [None]:
train_sample_prompt

In [None]:
train_sample_prompt[0]

In [None]:
dataset_sentences = load_dataset('Abirate/english_quotes')
dataset_sentences = dataset_sentences.map(lambda x: tokenizer(x['quote']), batched=True)
train_sample_sentences = dataset_sentences['train'].select(range(25))
train_sample_sentences = train_sample_sentences.remove_columns(['author', 'tags'])

In [None]:
train_sample_sentences

In [None]:
train_sample_sentences[0]

## Fine-tuning

We can use the same configuration for both models to be trained.

In [None]:
from peft import get_peft_model, PromptTuningConfig, TaskType, PromptTuningInit

generation_config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM, # this type indicates that the model will generate text
    prompt_tuning_init=PromptTuningInit.RANDOM, # the added virtual tokens are initialized with random numbers
    num_virtual_tokens=NUM_VIRTUAL_TOKENS, # number of virtual tokens to be added and trained
    tokenizer_name_or_path=model_name, # the pretrained model
)

Next, we will create two identical prompt tuning models using the same pretrained model and the same configuration.

In [None]:
peft_model_prompt = get_peft_model(foundational_model, generation_config)
peft_model_prompt.print_trainable_parameters()

In [None]:
peft_model_sentences = get_peft_model(foundational_model, generation_config)
peft_model_sentences.print_trainable_parameters()

We see that the trainable parameters are about 0.001% of all the parameters in the model.
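
The order of magnitude can be checked by hand, since prompt tuning trains one embedding vector per virtual token. The arithmetic below is a sketch under two assumptions that do not come from this notebook: an embedding width of 1024 for `bloomz-560m` and a base model of 559,214,592 parameters. The output of `print_trainable_parameters()` is the authoritative figure.

```python
NUM_VIRTUAL_TOKENS = 4       # value set earlier in this notebook
HIDDEN_SIZE = 1024           # assumption: embedding width of bloomz-560m
BASE_PARAMS = 559_214_592    # assumption: parameter count of the frozen base model

# One trainable vector of width HIDDEN_SIZE per virtual token.
trainable_params = NUM_VIRTUAL_TOKENS * HIDDEN_SIZE
all_params = BASE_PARAMS + trainable_params
trainable_pct = 100 * trainable_params / all_params

print(f"trainable params: {trainable_params:,}")
print(f"all params: {all_params:,}")
print(f"trainable%: {trainable_pct:.5f}")
```

Under these assumptions, only a few thousand values are trained, which is also why each saved prompt-tuned model is tiny compared with the base model.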

Next, we will create the training arguments, and use the same configuration in both trainings:

In [None]:
from transformers import TrainingArguments

def create_training_arguments(path, learning_rate=0.0035, epochs=6):
    training_args = TrainingArguments(
        output_dir=path,
        use_cpu=True, # necessary for CPU clusters
        auto_find_batch_size=True, # find a suitable batch size that will fit into memory automatically
        learning_rate=learning_rate, # higher lr than full fine-tuning
        num_train_epochs=epochs
    )
    return training_args

In [None]:
working_dir = './'

# Build the paths of the directories where the models will be stored
output_dir_prompt = os.path.join(working_dir, 'peft_outputs_prompt')
output_dir_sentences = os.path.join(working_dir, 'peft_outputs_sentences')

# Create the directories if they do not exist yet
for directory in (working_dir, output_dir_prompt, output_dir_sentences):
    os.makedirs(directory, exist_ok=True)

In [None]:
training_args_prompt = create_training_arguments(
    output_dir_prompt,
    0.003,
    epochs=NUM_EPOCHS
)
training_args_sentences = create_training_arguments(
    output_dir_sentences,
    0.003,
    epochs=NUM_EPOCHS
)

## Train

In [None]:
from transformers import Trainer, DataCollatorForLanguageModeling

def create_trainer(model, training_args, train_dataset):
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
        # mlm=False disables masked language modeling (we train a causal LM)
    )
    return trainer
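
To give an intuition for what the data collator does with `mlm=False`, here is a simplified pure-Python sketch: it pads the examples in a batch to the same length and builds the labels as a copy of the inputs, with padded positions replaced by `-100` so that they are ignored by the loss. The function `collate_for_causal_lm`, the pad id, and the right-side padding are assumptions made for illustration; the real collator works on tensors and uses the tokenizer's own padding settings.

```python
PAD_TOKEN_ID = 0      # hypothetical pad id for this toy example
IGNORE_INDEX = -100   # label value skipped by the cross-entropy loss

def collate_for_causal_lm(examples, pad_token_id=PAD_TOKEN_ID):
    """Pad a batch to equal length and derive the labels from the inputs."""
    max_length = max(len(ids) for ids in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ids in examples:
        padding = max_length - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * padding)
        batch["attention_mask"].append([1] * len(ids) + [0] * padding)
        # Labels copy the inputs; padded positions are excluded from the loss.
        batch["labels"].append(ids + [IGNORE_INDEX] * padding)
    return batch

batch = collate_for_causal_lm([[11, 12, 13], [21, 22]])
print(batch["input_ids"])
print(batch["attention_mask"])
print(batch["labels"])
```

Because the labels are the inputs themselves, the model is trained to predict each next token of the text, which matches the `CAUSAL_LM` task type chosen in the configuration.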

In [None]:
# Training first model
trainer_prompt = create_trainer(
    peft_model_prompt,
    training_args_prompt,
    train_sample_prompt
)
trainer_prompt.train()

In [None]:
# Training second model
trainer_sentences = create_trainer(
    peft_model_sentences,
    training_args_sentences,
    train_sample_sentences
)
trainer_sentences.train()

## Save models

In [None]:
trainer_prompt.model.save_pretrained(output_dir_prompt)
trainer_sentences.model.save_pretrained(output_dir_sentences)

## Inference

In [None]:
from peft import PeftModel

# Load the trained prompt on top of the unchanged foundational model
loaded_model_prompt = PeftModel.from_pretrained(
    foundational_model,
    output_dir_prompt,
    device_map='auto',
    is_trainable=False
)

In [None]:
loaded_model_prompt_outputs = get_outputs(
    loaded_model_prompt,
    input_prompt
)
tokenizer.batch_decode(
    loaded_model_prompt_outputs,
    skip_special_tokens=True
)

In [None]:
loaded_model_prompt.load_adapter(
    output_dir_sentences,
    adapter_name='quotes'
)
loaded_model_prompt.set_adapter('quotes')

In [None]:
loaded_model_sentences_outputs = get_outputs(
    loaded_model_prompt,
    input_sentences
)
tokenizer.batch_decode(
    loaded_model_sentences_outputs,
    skip_special_tokens=True
)