# Fine-tune a BLOOM-based cover letter generation model using `peft`, `transformers` and `bitsandbytes`

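We can use the `cmagganas/GenAI-job-postings-Dataset-sample` dataset to fine-tune BLOOM to generate cover letters from job postings! The same recipe should carry over to other prompt-to-text datasets, such as [MarketMail-AI](https://huggingface.co/datasets/FourthBrainGenAI/MarketMail-AI) for marketing emails.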

### Overview of PEFT and LoRA:

Using the [`peft` library](https://github.com/huggingface/peft), we can apply parameter-efficient fine-tuning (PEFT) techniques such as LoRA to fine-tune large models far more efficiently.

The overview given at the above link explains it well:

```
Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of
pre-trained language models (PLMs) to various downstream applications without 
fine-tuning all the model's parameters. Fine-tuning large-scale PLMs is often 
prohibitively costly. In this regard, PEFT methods only fine-tune a small 
number of (extra) model parameters, thereby greatly decreasing the 
computational and storage costs. Recent State-of-the-Art PEFT techniques 
achieve performance comparable to that of full fine-tuning.
```
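
To make the idea concrete for LoRA, one PEFT method: the pretrained weight matrix `W` stays frozen and only a low-rank pair of matrices `A` and `B` is trained, so the layer computes `W x + (alpha / r) * B (A x)`. The pure-Python sketch below uses tiny made-up matrices (all values are illustrative, none come from BLOOM) and shows that with `B` initialized to zero the adapted layer starts out identical to the frozen one.

```python
# Minimal LoRA forward pass on plain Python lists (illustrative values only).

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, r):
    """Return W x + (alpha / r) * B (A x); W is frozen, A and B are trainable."""
    base = matvec(W, x)               # frozen pretrained path
    update = matvec(B, matvec(A, x))  # low-rank path: d_in -> r -> d_out
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Hypothetical 3x3 frozen weight with a rank-1 adapter.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0],
     [3.0, 0.0, 1.0]]
A = [[0.5, -0.5, 1.0]]             # shape (r, d_in) = (1, 3)
B_zero = [[0.0], [0.0], [0.0]]     # shape (d_out, r); zero init means no change at start
B_trained = [[1.0], [0.0], [2.0]]  # made-up values standing in for a trained adapter
x = [1.0, 2.0, 3.0]

print(lora_forward(W, A, B_zero, x, alpha=2, r=1))     # [7.0, 2.0, 6.0], same as W x
print(lora_forward(W, A, B_trained, x, alpha=2, r=1))  # [12.0, 2.0, 16.0]
```

Only `A` and `B` would receive gradients, which is why the number of trainable parameters stays small.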

### Install requirements

First, run the cells below to install the requirements:

In [None]:
!pip install -q bitsandbytes datasets accelerate loralib
!pip install -q git+https://github.com/huggingface/peft.git git+https://github.com/huggingface/transformers.git

### Model loading

Let's load the `bloom-1b7` model!

We're also going to load `bigscience/tokenizer`, which is the tokenizer for all of the BLOOM models.

This step will take some time, as we have to download the model weights which are ~3.44GB.
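
As a rough sanity check on that download size, the footprint of the weights is about (number of parameters) × (bytes per parameter). The sketch below assumes the base model has 1,722,408,960 parameters, which is the total count printed later in this notebook minus the LoRA adapter parameters, and that the checkpoint stores its weights in 16-bit precision (an assumption, chosen because it reproduces the ~3.44GB figure).

```python
# Back-of-the-envelope weight-memory estimate (assumed parameter count, see above).
BASE_PARAMS = 1_725_554_688 - 3_145_728  # total params minus LoRA params

def weight_gigabytes(num_params: int, bytes_per_param: int) -> float:
    """Size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

print(f"16-bit: {weight_gigabytes(BASE_PARAMS, 2):.2f} GB")  # assumed checkpoint precision
print(f"32-bit: {weight_gigabytes(BASE_PARAMS, 4):.2f} GB")  # torch.float32, as loaded below
```

Because the notebook loads the model with `torch_dtype=torch.float32`, the weights should take roughly twice the download size in memory under these assumptions.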

In [None]:
import torch
torch.cuda.is_available()

True

In [None]:
import os
os.environ["CUDA_VISIBLE_DEVICES"]="0"
import torch
import torch.nn as nn
import bitsandbytes as bnb
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM

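model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",
    torch_dtype=torch.float32,  # full precision; no quantization in this run
    load_in_8bit=False,         # set to True to load 8-bit weights via bitsandbytes
    device_map='auto',          # let accelerate place the weights on the available device(s)
)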

tokenizer = AutoTokenizer.from_pretrained("bigscience/tokenizer")

### Post-processing on the model

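Finally, we need to apply some post-processing to the model to enable training: we freeze all the base layers and cast the layer-norm parameters to `float32` for stability. We also cast the output of the last layer to `float32` for the same reason. Note that the model above is loaded with `load_in_8bit=False` and `torch_dtype=torch.float32`, so the parameter cast changes nothing here; it matters when the model is loaded in 8-bit or half precision.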

In [None]:
for param in model.parameters():
  param.requires_grad = False  # freeze the model - train adapters later
  if param.ndim == 1:
    # cast the small parameters (e.g. layernorm) to fp32 for stability
    param.data = param.data.to(torch.float32)

model.gradient_checkpointing_enable()  # reduce number of stored activations
model.enable_input_require_grads()

class CastOutputToFloat(nn.Sequential):
  def forward(self, x): return super().forward(x).to(torch.float32)
model.lm_head = CastOutputToFloat(model.lm_head)

### Apply LoRA

Here comes the magic with `peft`! Let's wrap the model in a `PeftModel` and specify that we are going to use low-rank adapters (LoRA), using the `get_peft_model` utility function from `peft`.

In [None]:
def print_trainable_parameters(model):
    """
    Prints the number of trainable parameters in the model.
    """
    trainable_params = 0
    all_param = 0
    for _, param in model.named_parameters():
        all_param += param.numel()
        if param.requires_grad:
            trainable_params += param.numel()
    print(
        f"trainable params: {trainable_params} || all params: {all_param} || trainable%: {100 * trainable_params / all_param}"
    )

In [None]:
from peft import LoraConfig, get_peft_model 

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, config)
print_trainable_parameters(model)

trainable params: 3145728 || all params: 1725554688 || trainable%: 0.18230242262828822
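
The trainable count can be reproduced by hand. LoRA adds two matrices per targeted layer, `A` of shape `(r, d_in)` and `B` of shape `(d_out, r)`, so each adapted layer contributes `r * (d_in + d_out)` parameters. The sketch below assumes `bloom-1b7` has 24 transformer blocks with hidden size 2048 and a fused `query_key_value` projection mapping 2048 to 3 × 2048 features; these architecture figures are assumptions not stated in this notebook, so check them against `model.config`.

```python
def lora_param_count(r: int, d_in: int, d_out: int, num_layers: int) -> int:
    """Trainable parameters added by LoRA: r * (d_in + d_out) per adapted layer."""
    return num_layers * r * (d_in + d_out)

HIDDEN = 2048    # assumed hidden size of bloom-1b7
NUM_LAYERS = 24  # assumed number of transformer blocks

trainable = lora_param_count(r=16, d_in=HIDDEN, d_out=3 * HIDDEN, num_layers=NUM_LAYERS)
total = 1_725_554_688  # "all params" from the output above

print(trainable)  # 3145728
print(f"{100 * trainable / total:.4f}% of all parameters")
print(f"{trainable * 4 / 1e6:.1f} MB if the adapter is stored in float32")
```

Under these assumptions the result matches the `trainable params` value printed above, which suggests the adapter touches only the attention projection in each block.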


### Preprocessing

We can simply load our dataset from the 🤗 Hugging Face Hub with the `load_dataset` function!

In [None]:
import transformers
from datasets import load_dataset, Dataset

dataset_name = "cmagganas/GenAI-job-postings-Dataset-sample"
position_level = "position_level"
use_case = "use_case"
job_title = "job_title"
job_posting = "job_posting"
cover_letter = "cover_letter"

In [None]:
dataset = load_dataset(dataset_name)
print(dataset)
print(dataset['train'][0])

DatasetDict({
    train: Dataset({
        features: ['position_level', 'use_case', 'job_title', 'job_posting', 'cover_letter'],
        num_rows: 10
    })
})
{'position_level': 'mid-level', 'use_case': 'recommender system', 'job_title': 'Machine Learning Engineer', 'job_posting': "Job Title: Machine Learning Engineer - Recommender Systems (Mid-level)\n\nJob Summary:\n\nOur AI company is seeking an experienced and knowledgeable Machine Learning Engineer to join our team. The successful candidate will be responsible for developing and implementing state-of-the-art recommender systems in our AI products. The ideal candidate will have extensive experience in the field and be familiar with the latest developments and trends in machine learning and AI. This is a mid-level position, and the candidate will have the opportunity to work on exciting projects that leverage the power of machine learning to solve complex business problems.\n\nKey Responsibilities:\n\n- Develop and implement innova

We want to put our data in the form:

```
### Position:
POSITION LEVEL
### Use Case:
USE CASE
### Job Title:
JOB TITLE

### Job Posting:
JOB POSTING

### Cover Letter:
OUR COVER LETTER HERE
```

This way, we can prompt our model consistently and receive the responses we want!

This is what fine-tuning and prompt engineering are really all about!

In [None]:
def generate_prompt(position_level: str, use_case: str, job_title: str, job_posting: str, cover_letter: str) -> str:
  prompt = f"### Position:\n{position_level}\n### Use Case:\n{use_case}\n### Job Title:\n{job_title}\n\n### Job Posting:\n{job_posting}\n\n### Cover Letter:\n{cover_letter}\n\n"
  return prompt

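def tokenize_sample(sample):
  # Build the full prompt (including the target cover letter) and tokenize it
  return tokenizer(generate_prompt(sample['position_level'], sample['use_case'], sample['job_title'], sample['job_posting'], sample['cover_letter']))

mapped_dataset = dataset.map(tokenize_sample)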

In [None]:
trainer = transformers.Trainer(
    model=model, 
    train_dataset=mapped_dataset["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=4, 
        gradient_accumulation_steps=4,
        warmup_steps=100,
        max_steps=100, 
        learning_rate=1e-3, 
        fp16=True,
        logging_steps=1, 
        output_dir='outputs'
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False)
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()
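
It helps to see what learning rate these arguments produce. The sketch below re-implements a linear warmup followed by linear decay in plain Python, assuming the `Trainer` uses its default linear schedule (`lr_scheduler_type` is not set above); `linear_schedule_lr` is a hypothetical helper written for illustration, not a `transformers` function. With `warmup_steps=100` and `max_steps=100`, the whole run is warmup: the rate climbs toward `1e-3` and never decays.

```python
def linear_schedule_lr(step: int, base_lr: float, warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Values copied from the TrainingArguments above.
BASE_LR, WARMUP, MAX_STEPS = 1e-3, 100, 100
PER_DEVICE_BATCH, GRAD_ACCUM = 4, 4

for step in (0, 25, 50, 99):
    print(step, linear_schedule_lr(step, BASE_LR, WARMUP, MAX_STEPS))

# Nominal number of sequences per optimizer step on a single GPU.
print("effective batch size:", PER_DEVICE_BATCH * GRAD_ACCUM)
```

If some steps at the full learning rate are wanted, a smaller `warmup_steps` (for example 10) would be one option; that is a suggestion, not a setting the original run used.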

## Share adapters on the 🤗 Hub

In [None]:
HUGGING_FACE_USER_NAME = "cmagganas"

In [None]:
from huggingface_hub import notebook_login
notebook_login()

In [None]:
model_name = "CoverLetter-GenAI-adapter"

model.push_to_hub(f"{HUGGING_FACE_USER_NAME}/{model_name}", use_auth_token=True)

CommitInfo(commit_url='https://huggingface.co/cmagganas/CoverLetter-GenAI-adapter/commit/b6354f839bf7e4def9616ef5b264f7701c8f58f5', commit_message='Upload model', commit_description='', oid='b6354f839bf7e4def9616ef5b264f7701c8f58f5', pr_url=None, pr_revision=None, pr_num=None)

## Load adapters from the Hub

You can also directly load adapters from the Hub using the commands below:

In [None]:
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map='auto')
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the Lora model
model = PeftModel.from_pretrained(model, peft_model_id)

## Inference

You can then use the trained model, or the model that you have loaded from the 🤗 Hub, for inference just as you usually would in `transformers`.

### Take it for a spin!

In [None]:
from IPython.display import display, Markdown

def make_inference(job_posting):
  prompt = f"Write a Cover Letter for this ### Job Posting\n\n```\n{job_posting}\n```"
  # Move the tokenized inputs to the same device as the model before generating
  batch = tokenizer(prompt, return_tensors='pt').to(model.device)

  with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=200)

  display(Markdown((tokenizer.decode(output_tokens[0], skip_special_tokens=True))))
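
The model was fine-tuned on prompts that use `### Job Posting:` and `### Cover Letter:` section headers, so an inference prompt that follows the same layout and stops right after the `### Cover Letter:` header may steer generation more reliably than free-form instructions. The sketch below is one way to build such a prompt; `build_inference_prompt` and the sample values are hypothetical, introduced only for illustration.

```python
def build_inference_prompt(position_level: str, use_case: str,
                           job_title: str, job_posting: str) -> str:
    """Mirror the training template, leaving the cover letter for the model to write."""
    return (
        f"### Position:\n{position_level}\n"
        f"### Use Case:\n{use_case}\n"
        f"### Job Title:\n{job_title}\n\n"
        f"### Job Posting:\n{job_posting}\n\n"
        "### Cover Letter:\n"
    )

# Made-up example values.
prompt = build_inference_prompt(
    position_level="mid-level",
    use_case="recommender system",
    job_title="Machine Learning Engineer",
    job_posting="We are hiring an engineer to build recommender systems.",
)
print(prompt)
```

The resulting string can be tokenized and passed to `model.generate` in the same way as the prompt above.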

In [None]:
#@title my cover letter
my_cover_letter = """
Dear Hiring Manager,

I’m excited to be applying for the position at your company. Most recently I was a member of the Cisco Data Science team. I believe my prior experiences, Machine Learning Engineering education from FourthBrain, recent Google Cloud Certification for Professional Machine Learning Engineer and positive open-minded attitude make me an ideal candidate.

As a member of the Cisco CX BCS Data Science team, I led the initiative to gain observability of the ML lifecycle through ML Ops Continuous Monitoring principles as we migrated to Google Cloud. This was a novel project in a relatively new field. It involved working across multiple departments to align everyone’s needs with business goals. I also created an architectural framework and a how-to-implement industry-best-practice road-map. I helped identify many gaps in the monitoring framework and implemented a dashboard with ML lifecycle performance, stability and operational metrics.

During FourthBrain's Machine Learning Engineer Program, I gained the technical and practical skills necessary to contribute to high-performing AI product teams by building, packaging, and deploying state-of-the-art ML models as containerized web applications in cloud-based production environments. The program was divided into four pillars: Data Centric AI, Machine Learning Modeling, AI Applications, and MLOps. As part of our capstone project, my partner and I built a low-latency end-to-end few-shot keyword spotting (FS-KWS) pipeline for personalization running in real-time on an edge device. We presented our project to potential users, collaborators, employers, and the wider open-source ML community during the demo day as part of graduation.

While working as a Data Analyst for HCL, I collected data, wrote ETL scripts and created BI Dashboards to solve business challenges using Google Cloud technologies. I used Data Studio to convey results of these analyses and told a story to emphasize their importance. With the exponential growth of data, knowledge of cloud-based Big Data platforms are integral for solving real-world problems.

While working as a Data Analyst at Commercial Energy, I used VBA in Excel and SQL to analyze customer usage data, using forecasting tools and performing complex calculations to create savings recommendations. I analyzed the price volatility of wholesale natural gas, reporting it to our Chief Risk Officer and the Risk Management Team to make purchasing decisions; a daily process I was able to reduce from two hours to fifteen minutes. I was able to automate and minimize the time and effort it took to complete each task I was responsible for by researching pertinent information and learning new skills. The urgency of the task, my eagerness to prove to myself, and the passion I have for problem solving were my strongest sources of motivation. It taught me that I thrive when I am put to the test and given responsibility.

During my time at Springboard, I learned Machine Learning with Python specifically Natural Language Processing (NLP). I built an application using Twitter data to predict users by class as well as other projects throughout the course. I have a strong background in the hard sciences (Math, Physics and a BS in Actuarial Science) from the University of California, Santa Barbara. I try to apply a data-driven approach to all aspects of my work and hope to do the same with new challenges.

Thank you for your time and consideration.
I am eager to learn more about this position and demonstrate my skills and fitness.

Sincerely, 
Christos Magganas
"""


In [None]:
#@title job posting
job_posting = """

Job Title: Machine Learning Engineer

Company: FourthBrain

Location: Remote

Job Description:
FourthBrain is seeking a highly motivated and skilled Machine Learning Engineer to join our team. The successful candidate will have a strong background in machine learning and software development, with experience building and deploying ML models as containerized web applications in cloud-based production environments.

Responsibilities:
Collaborate with cross-functional teams to identify opportunities to leverage machine learning for product development or R&D projects.
Build, optimize, package, and deploy state-of-the-art ML models as containerized web applications in cloud-based production environments.
Design and develop software solutions that integrate with machine learning models.
Participate in group capstone projects to demonstrate understanding of MLE software development and its implications.
Present and share completed projects with potential users, collaborators, employers, or the wider open-source ML community.
Connect with professionals and employers via guest speaking events and the final project presentation day.

Requirements:
Bachelor's or Master's degree in computer science, engineering, or a related field.
Strong proficiency in machine learning algorithms and software development.
Experience building and deploying ML models as containerized web applications in cloud-based production environments.
Familiarity with software development tools and practices, such as Git, Linux, and containerization.
Excellent problem-solving and analytical skills.
Strong written and verbal communication skills.

At FourthBrain, we value open collaboration, communication, and lifelong learning.

"""

In [None]:
prompt = f"Adapt my Cover Letter ```\n{my_cover_letter}\n```\nto this Job Posting\n```\n{job_posting}\n```"
# Move the tokenized inputs to the same device as the model before generating
batch = tokenizer(prompt, return_tensors='pt').to(model.device)

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=300)

print(tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace(job_posting,''))
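
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so the `replace` call above removes only the job posting and leaves the rest of the prompt in the printed text. One possible alternative is to slice off the prompt tokens before decoding. The sketch below shows the slicing on plain lists of made-up token ids; `strip_prompt_tokens` is a hypothetical helper.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_length:]

# Made-up token ids: the first four stand in for the prompt.
prompt_ids = [101, 102, 103, 104]
output_ids = prompt_ids + [7, 8, 9]

print(strip_prompt_tokens(output_ids, len(prompt_ids)))  # [7, 8, 9]
```

With the tensors from the cell above, the equivalent would be along the lines of `output_tokens[0][batch['input_ids'].shape[1]:]` before calling `tokenizer.decode`.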