# Fine-Tuning and Evaluation of Phi-2 Model for Ad Generation

This notebook demonstrates the process of fine-tuning the Phi-2 model from Hugging Face for generating advertisements. It includes the following steps:

1. **Environment Setup**:
    - Install required libraries and dependencies.

2. **Dataset Preparation**:
    - Load and preprocess the dataset for training.

3. **Tokenizer Setup**:
    - Configure the tokenizer for the Phi-2 model.

4. **Model Configuration**:
    - Load the Phi-2 model with 4-bit quantization for efficient training.
    - Apply LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning.

5. **Training**:
    - Define training arguments and train the model using the Hugging Face `Trainer`.

6. **Text Generation**:
    - Generate advertisements based on sample prompts using beam search and other decoding strategies.

7. **Evaluation**:
    - Evaluate the generated text using BLEU and ROUGE metrics.

8. **Model Saving and Deployment**:
    - Save the fine-tuned model and tokenizer locally and push them to the Hugging Face Hub.

This notebook is designed to help you understand the end-to-end process of fine-tuning a language model for a specific task, in this case, generating creative advertisements.

```markdown
# Environment Setup

In this section, we will install the necessary libraries and dependencies required for fine-tuning the Phi-2 model. This includes installing the Hugging Face Transformers library, datasets, and other essential tools for model quantization, evaluation, and training.
```

In [3]:
!pip install transformers datasets peft torch accelerate bitsandbytes



In [2]:
!pip install evaluate rouge_score nltk



```markdown
# Dataset Preparation

In this section, we will load and preprocess the dataset required for fine-tuning the Phi-2 model. The dataset will be converted into a format compatible with the Hugging Face `datasets` library. This includes loading the dataset from a JSON file, converting it into a Hugging Face `Dataset` object, and displaying a sample entry for verification.
```

In [3]:
import json
from datasets import Dataset

# Load your dataset
with open('/kaggle/input/ads-list/fixed_ads_list.json', 'r') as f:
    data = json.load(f)

# Convert to Hugging Face Dataset
dataset = Dataset.from_list(data)

# Display a sample
print(dataset[0])

{'prompt': 'Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers.', 'ad_text': '🍏 **Quench Your Thirst, Boost Your Health!**\\n\\nIntroducing **FreshPress**: The Organic Juice that Delivers Taste & Nutrients!\\n\\n✨ *"Tastes amazing and I feel fantastic!"* - Jamie, Health Enthusiast\\n\\n👉 Join the **20,000+ Happy Customers** who’ve transformed their health!\\n\\n✅ **Organic Ingredients**: No additives, just real fruit!\\n✅ **Packed with Nutrition**: Each bottle delivers vitamins & minerals that support your immune system.\\n✅ **Guilt-Free Indulgence**: Enjoy refreshing flavors without the sugar crash!\\n\\n**Hurry, Limited Time Offer!**\\n🌟 Get **20% OFF** your first order! 🌟\\n\\n🛡️ **Risk-Free**: Enjoy our **30-Day Money-Back Guarantee!**\\n\\n**Ready to Revitalize Your Health?**\\n👉 *Click to Order Now!*\\n[Order Your FreshPress Juice Today] \\n\\n✨ *"Best juice ever, I’m hooked!"* - Alex, Repeat Cu

```markdown
# Tokenizer Setup

In this section, we will configure the tokenizer for the Phi-2 model. The tokenizer is responsible for converting text into token IDs that the model can process. We will load the Phi-2 tokenizer, set the padding token to match the end-of-sequence token, and define a function to tokenize the dataset. The tokenized dataset will be used for training and evaluation.
```

In [4]:
from transformers import AutoTokenizer

# Load Phi-2 tokenizer
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")

# Set the padding token
tokenizer.pad_token = tokenizer.eos_token  

def tokenize_function(examples):
    combined_texts = [f"### Prompt: {p} ### Response: {a}" for p, a in zip(examples["prompt"], examples["ad_text"])]
    return tokenizer(combined_texts, truncation=True, padding="max_length", max_length=512)

tokenized_dataset = dataset.map(tokenize_function, batched=True)

Map:   0%|          | 0/485 [00:00<?, ? examples/s]

```markdown
# Model Configuration

In this section, we will load the Phi-2 model with 4-bit quantization for efficient training. The model will be configured using the `BitsAndBytesConfig` class from the `transformers` library. Additionally, we will apply LoRA (Low-Rank Adaptation) to enable parameter-efficient fine-tuning. This setup ensures that the model is optimized for both performance and resource efficiency.
```

In [5]:
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Configure 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type='nf4',
    bnb_4bit_compute_dtype=torch.float16
)

# Load the Phi-2 model with quantization
model = AutoModelForCausalLM.from_pretrained(
    'microsoft/phi-2',
    quantization_config=bnb_config,
    device_map='auto',
    trust_remote_code=True
)

model.safetensors.index.json:   0%|          | 0.00/35.7k [00:00<?, ?B/s]

Downloading shards:   0%|          | 0/2 [00:00<?, ?it/s]

model-00001-of-00002.safetensors:   0%|          | 0.00/5.00G [00:00<?, ?B/s]

model-00002-of-00002.safetensors:   0%|          | 0.00/564M [00:00<?, ?B/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/124 [00:00<?, ?B/s]

In [6]:
# Print model structure to find correct module names
print(model)

PhiForCausalLM(
  (model): PhiModel(
    (embed_tokens): Embedding(51200, 2560)
    (embed_dropout): Dropout(p=0.0, inplace=False)
    (layers): ModuleList(
      (0-31): 32 x PhiDecoderLayer(
        (self_attn): PhiSdpaAttention(
          (q_proj): Linear4bit(in_features=2560, out_features=2560, bias=True)
          (k_proj): Linear4bit(in_features=2560, out_features=2560, bias=True)
          (v_proj): Linear4bit(in_features=2560, out_features=2560, bias=True)
          (dense): Linear4bit(in_features=2560, out_features=2560, bias=True)
          (rotary_emb): PhiRotaryEmbedding()
        )
        (mlp): PhiMLP(
          (activation_fn): NewGELUActivation()
          (fc1): Linear4bit(in_features=2560, out_features=10240, bias=True)
          (fc2): Linear4bit(in_features=10240, out_features=2560, bias=True)
        )
        (input_layernorm): LayerNorm((2560,), eps=1e-05, elementwise_affine=True)
        (resid_dropout): Dropout(p=0.1, inplace=False)
      )
    )
    (final_la

```markdown
# LoRA Configuration

In this section, we will configure and apply Low-Rank Adaptation (LoRA) to the Phi-2 model. LoRA is a parameter-efficient fine-tuning technique that reduces the number of trainable parameters by introducing low-rank matrices. This allows for efficient adaptation of large language models to specific tasks while minimizing computational and memory overhead.

We will define the LoRA configuration, specifying the target modules to adapt, and integrate it with the Phi-2 model. This setup ensures that the model is optimized for fine-tuning on the advertisement generation task.
```

In [7]:
from peft import get_peft_model, LoraConfig, TaskType

# ✅ LoRA Configuration
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    inference_mode=False,
    r=32,  
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "dense", "fc1", "fc2"]
)

model = get_peft_model(model, lora_config)

print(model)

PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): PhiForCausalLM(
      (model): PhiModel(
        (embed_tokens): Embedding(51200, 2560)
        (embed_dropout): Dropout(p=0.0, inplace=False)
        (layers): ModuleList(
          (0-31): 32 x PhiDecoderLayer(
            (self_attn): PhiSdpaAttention(
              (q_proj): lora.Linear4bit(
                (base_layer): Linear4bit(in_features=2560, out_features=2560, bias=True)
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.1, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=2560, out_features=32, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=32, out_features=2560, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
                (lora_magnitude_vector): ModuleDi

```markdown
# Training Configuration and Execution

In this section, we will define the training configuration and execute the fine-tuning process for the Phi-2 model. This includes setting up training arguments, specifying evaluation and logging strategies, and configuring the data collator for causal language modeling. 

We will use the Hugging Face `Trainer` class to manage the training loop, which simplifies the process of fine-tuning by handling tasks such as gradient accumulation, checkpointing, and evaluation. Additionally, we will implement early stopping to terminate training if the model's performance does not improve after a specified number of evaluations.
```

In [8]:
from transformers import Trainer, TrainingArguments, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy="epoch",
    save_strategy="epoch",
    logging_strategy="steps",
    logging_steps=10,
    save_total_limit=2,
    learning_rate=1e-4,  
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,  
    num_train_epochs=6,
    weight_decay=0.01,
    fp16=False,
    bf16=True,
    report_to="none",
    push_to_hub=False,
    load_best_model_at_end=True,  
    metric_for_best_model="loss",  
    greater_is_better=False,  
)

# Data collator for language modeling (for causal LM)
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False
)

from transformers import EarlyStoppingCallback

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    eval_dataset=tokenized_dataset,  
    data_collator=data_collator,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)]  # Stops if no improvement after 2 evals
)


# Start training
trainer.train()



Epoch,Training Loss,Validation Loss
1,11.3596,1.30857
2,9.9311,1.166003
3,9.5123,1.098728
4,8.9684,1.055438
5,8.79,1.026137


TrainOutput(global_step=180, training_loss=9.789615037706163, metrics={'train_runtime': 15133.4838, 'train_samples_per_second': 0.192, 'train_steps_per_second': 0.012, 'total_flos': 2.3395211280384e+16, 'train_loss': 9.789615037706163, 'epoch': 5.823045267489712})

```markdown
# Text Generation

In this section, we will generate advertisements based on sample prompts using the fine-tuned Phi-2 model. The text generation process involves providing a prompt to the model and decoding the output to produce coherent and creative advertisements. 

We will explore different decoding strategies such as greedy decoding, beam search, and temperature sampling to optimize the quality and diversity of the generated text. The generated advertisements will be evaluated for relevance, creativity, and adherence to the given prompts.
```

In [10]:
# Generate text based on a sample prompt
sample_prompt = "Introducing our latest product: "
inputs = tokenizer(sample_prompt, return_tensors='pt').to('cuda')
output = model.generate(**inputs, max_length=100)

# Decode and print the generated text
print(tokenizer.decode(output[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Introducing our latest product: 

**SmartHome Hub** - Control Your Home with Ease!

**"I can't believe how easy it is to manage my home!" - Lisa R.**

🌟 **Join 10,000+ Happy Homeowners!**
🌟 **95% of users report increased convenience!**

**What Makes SmartHome Hub Special?**

• **Voice Control:** Simply say what you want and let


In [11]:
sample_prompt = "Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers."

# Generate text using beam search
inputs = tokenizer(sample_prompt, return_tensors='pt').to('cuda')
output = model.generate(
    **inputs, 
    max_length=256,  
    num_beams=5,  # Beam search for better quality
    temperature=0.7,  # Adds diversity to outputs
    repetition_penalty=1.2  # Reduces word repetition
)

# Decode and print results
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print("🔹 **Prompt:**", sample_prompt)
print("🔹 **Generated Ad:**", generated_text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


🔹 **Prompt:** Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers.
🔹 **Generated Ad:** Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers.
## INPUT
FreshPress Organic Juice Brand
##OUTPUT
**Quench Your Thirst with Freshness!**\n\nIntroducing **FreshPress Organic Juice Brand**: Where Health Meets Taste!\n\n"I can’t get enough of these juices! They’re so delicious and good for you!" - Sarah L., Health Enthusiast\n\n🍊 **Join 10,000+ Health-Conscious Drinkers!**\n🍹 **4.8⭐ Rating on Trustpilot!**\n\n**Why Choose FreshPress?**\n- **100% Organic Ingredients:** No preservatives, just pure goodness!\n- **Vibrant Flavors:** From refreshing citrus to tropical delights!\n- **Nutrient-Dense:** Packed with vitamins and minerals!\n\n**Limited Time Offer:**\n✓ **Free Shipping on Orders Over $20!**\n✓ **30-Day Money-Back Guarantee!**\

In [12]:
sample_prompt = "Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers."

# Format it the same way as during training
formatted_prompt = f"### Prompt: {sample_prompt} ### Response:"

# Tokenize and generate
inputs = tokenizer(formatted_prompt, return_tensors='pt').to('cuda')
output = model.generate(
    **inputs,
    max_length=256,
    num_beams=5,
    temperature=0.7,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id  # important if padding token is needed
)

# Decode and clean
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

# Strip the prefix to get just the response
if "### Response:" in generated_text:
    generated_ad = generated_text.split("### Response:")[1].strip()
else:
    generated_ad = generated_text.strip()

print("🔹 **Prompt:**", sample_prompt)
print("🔹 **Generated Ad:**", generated_ad)

🔹 **Prompt:** Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers.
🔹 **Generated Ad:** 🍊 **Quench Your Thirst with FreshPress!**\n\nIntroducing **FreshPress**: The Organic Juice You Can Trust!\n\n"*I never knew juice could taste this good and be so good for me!*" - Jenna L., Health Enthusiast\n\n🌟 **Join 20,000+ Health-Conscious Drinkers!**\n🌟 **4.8⭐ Rating on Trustpilot!**\n\n**Why Choose FreshPress?**\n\n• **100% Organic Ingredients:** No preservatives, just pure goodness!\n• **Vibrant Flavors:** From refreshing citrus to antioxidant-packed berries!\n• **Nutrient-Rich:** Boost your immune system with every sip!\n\n**Limited Time Offer:**\n✓ **Buy 3 Bottles, Get 1 Free!**\n✓ **30-Day Money-Back Guarantee!**\n\n**Don’t Miss Out


In [16]:
sample_prompt = "Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers."

# Match the training format
formatted_prompt = f"### Prompt: {sample_prompt} ### Response:"

# Tokenize and move to device
inputs = tokenizer(formatted_prompt, return_tensors='pt').to('cuda')

# Generate text
output = model.generate(
    **inputs,
    max_length=256,
    num_beams=5,
    temperature=0.7,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id
)

# Decode output
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)

# 🔍 Extract only the response part
if "### Response:" in decoded_output:
    generated_ad = decoded_output.split("### Response:")[1]
else:
    generated_ad = decoded_output

def clean_output(text):
    # Remove leading prompt echoes or structured markers
    for marker in ["### Prompt:", "##INPUT", "##OUTPUT"]:
        if marker in text:
            text = text.split(marker)[-1]
    # Convert literal \n to actual newlines
    text = text.replace("\\n", "\n").replace("\n\n", "\n")
    # Strip leading/trailing whitespace
    return text.strip()

# Final clean ad
final_ad = clean_output(generated_ad)

print("🔹 **Prompt:**", sample_prompt)
print("🔹 **Generated Ad:**")
print(final_ad)

🔹 **Prompt:** Create an ad for my new organic juice brand, FreshPress, that emphasizes health benefits and taste, targeting health-conscious consumers.
🔹 **Generated Ad:**
🍊 **Quench Your Thirst with FreshPress!**
Introducing **FreshPress**: The Organic Juice You Can Trust!
"*I never knew juice could taste this good and be so good for me!*" - Jenna L., Health Enthusiast
🌟 **Join 20,000+ Health-Conscious Drinkers!**
🌟 **4.8⭐ Rating on Trustpilot!**
**Why Choose FreshPress?**
• **100% Organic Ingredients:** No preservatives, just pure goodness!
• **Vibrant Flavors:** From refreshing citrus to antioxidant-packed berries!
• **Nutrient-Rich:** Boost your immune system with every sip!
**Limited Time Offer:**
✓ **Buy 3 Bottles, Get 1 Free!**
✓ **30-Day Money-Back Guarantee!**
**Don’t Miss Out


```markdown
# Evaluation of Generated Text

In this section, we will evaluate the quality of the generated advertisements using BLEU and ROUGE metrics. These metrics provide a quantitative measure of the similarity between the generated text and the reference text, helping us assess the relevance, fluency, and overall quality of the model's outputs.

- **BLEU (Bilingual Evaluation Understudy)**: Measures the overlap of n-grams between the generated text and the reference text.
- **ROUGE (Recall-Oriented Understudy for Gisting Evaluation)**: Focuses on recall and measures the overlap of n-grams, word sequences, and word pairs.

The evaluation results will help us understand the performance of the fine-tuned Phi-2 model on the advertisement generation task.
```

In [14]:
import evaluate

# Load BLEU and ROUGE metrics
bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Sample generated vs. ground truth
reference = [dataset[0]["ad_text"]]
candidate = generated_text

# Compute BLEU
bleu_score = bleu.compute(predictions=[candidate], references=[[reference]])
rouge_score = rouge.compute(predictions=[candidate], references=[reference])

print("🔹 BLEU Score:", bleu_score)
print("🔹 ROUGE Score:", rouge_score)

Downloading builder script:   0%|          | 0.00/5.94k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/1.55k [00:00<?, ?B/s]

Downloading extra modules:   0%|          | 0.00/3.34k [00:00<?, ?B/s]

Downloading builder script:   0%|          | 0.00/6.27k [00:00<?, ?B/s]

🔹 BLEU Score: {'bleu': 0.22388790591605398, 'precisions': [0.6563876651982379, 0.3805309734513274, 0.18666666666666668, 0.10714285714285714], 'brevity_penalty': 0.8421423919980932, 'length_ratio': 0.8533834586466166, 'translation_length': 227, 'reference_length': 266}
🔹 ROUGE Score: {'rouge1': 0.49808429118773945, 'rouge2': 0.23166023166023164, 'rougeL': 0.3524904214559387, 'rougeLsum': 0.3524904214559387}
