<a href="https://colab.research.google.com/github/naimdsaiki/Machine-Learning/blob/main/Automated_Email_Content_Generator(1).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1. Installation & Setup
Install the required libraries for fine-tuning and inference.


In [1]:
# Install necessary libraries
# We explicitly update torchvision and torchaudio to match the new torch version
!pip install -q -U torch torchvision torchaudio bitsandbytes transformers peft accelerate datasets trl
print("Libraries installed successfully!")

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m915.7/915.7 MB[0m [31m1.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.2/12.2 MB[0m [31m128.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m139.1/139.1 MB[0m [31m7.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m188.3/188.3 MB[0m [31m3.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m8.1/8.1 MB[0m [31m125.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.9/1.9 MB[0m [31m101.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m60.7/60.7 MB[0m [31m12.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m10.4/10.4 MB[0m [31m129.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

2. Model & Tokenizer Loading
Load the base Llama-2-7b-chat model with 4-bit quantization to fit in memory.

In [2]:
import torch
from datasets import load_dataset
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
    pipeline,
    logging,
)
from trl import SFTTrainer

# 1. Model Configuration
# We use a "Chat" version of Llama or Mistral which is great for following instructions
model_name = "nousresearch/llama-2-7b-chat-hf"

# Quantization Config (Makes the model 4x smaller to fit in memory)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

# 2. Load Base Model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map={"": 0}
)
model.config.use_cache = False # Silence warnings during training
model.config.pretraining_tp = 1

# 3. Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token # Fix padding issues
tokenizer.padding_side = "right" # Fix mixed-precision issues

print("Model and Tokenizer loaded!")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


config.json:   0%|          | 0.00/583 [00:00<?, ?B/s]

model.safetensors.index.json: 0.00B [00:00, ?B/s]

Downloading (incomplete total...): 0.00B [00:00, ?B/s]

Fetching 2 files:   0%|          | 0/2 [00:00<?, ?it/s]

Loading weights:   0%|          | 0/291 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/200 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/746 [00:00<?, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

tokenizer.model:   0%|          | 0.00/500k [00:00<?, ?B/s]

added_tokens.json:   0%|          | 0.00/21.0 [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/435 [00:00<?, ?B/s]

Model and Tokenizer loaded!


3. Data Preparation
Load the dataset and format it into the Prompt structure (Instruction -> Context -> Response).

In [3]:
# 1. Load your CSV file with a fallback encoding
# We add `encoding='cp1252'` which is the standard for Excel CSVs on Windows
try:
    dataset = load_dataset("csv", data_files="synthetic_email_scenarios_60.csv", split="train")
except:
    print("UTF-8 failed. Trying Windows encoding...")
    dataset = load_dataset("csv", data_files="synthetic_email_scenarios_60.csv", split="train", encoding="cp1252")

# 2. Define the Prompt Format
def format_prompt(sample):
    # We use .get() to avoid errors if a column is missing or named slightly differently
    instr = sample.get('instruction', '')
    inp = sample.get('input', '')
    out = sample.get('output', '')

    instruction = f"### Instruction:\n{instr}\n"
    context = f"### Context:\n{inp}\n"
    response = f"### Response:\n{out}"

    full_prompt = instruction + context + response
    return {"text": full_prompt}

# 3. Apply formatting
dataset = dataset.map(format_prompt)

# Show a sample to confirm it looks right
print("\nSample Prompt:")
print(dataset[0]["text"])

Generating train split: 0 examples [00:00, ? examples/s]

UTF-8 failed. Trying Windows encoding...


Generating train split: 0 examples [00:00, ? examples/s]

Map:   0%|          | 0/60 [00:00<?, ? examples/s]


Sample Prompt:
### Instruction:
Write a urgent/direct email for product launch announcement regarding Cloud Storage Solution.
### Context:
Recipient: Sales VP in Digital Marketing. Product: Cloud Storage Solution. Goal: Product Launch Announcement. Tone: Urgent/Direct.
### Response:
Subject: URGENT: Eliminate bottleneck in campaign asset delivery
Hi Alex,
Slow asset delivery is actively costing your team conversions. Today, Acme Corp is officially rolling out our new Cloud Storage Solution built specifically for the heavy bandwidth demands of digital marketing. This instantly resolves the bottleneck of sharing massive video and design files across distributed teams. Do not let outdated storage stall another campaign. Reply directly to get your team migrated this week.
Best,
Jordan


4. LoRA Configuration
Configure the Low-Rank Adaptation (LoRA) settings for efficient fine-tuning.

In [4]:
# LoRA Configuration
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64, # Rank (Higher = smarter but more VRAM. 64 is a good balance)
    bias="none",
    task_type="CAUSAL_LM",
)

5. Training
Set up the SFTTrainer and start training the model.

In [6]:
from trl import SFTTrainer, SFTConfig

# 1. Configuration
sft_config = SFTConfig(
    output_dir="./results",
    dataset_text_field="text",
    max_length=512,
    packing=False,
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",
    save_steps=25,
    logging_steps=5,
    learning_rate=2e-4,
    weight_decay=0.001,
    fp16=False,
    bf16=False,
    max_grad_norm=0.3,
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    lr_scheduler_type="constant",
)

# 2. Initialize the Trainer
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    processing_class=tokenizer,  # <--- CHANGED THIS (Fixes the error)
    args=sft_config,
)

# 3. START TRAINING
print("Starting training...")
trainer.train()
print("Training Complete!")

TypeError: SFTConfig.__init__() got an unexpected keyword argument 'group_by_length'

6. Save & Test Inference
Save the trained adapter and generate a test email.

In [7]:
import torch

# 1. Save the model (Safe to run again)
new_model_name = "llama-2-7b-email-marketer"
trainer.model.save_pretrained(new_model_name)
print(f"Model saved to {new_model_name}")

# ---------------------------------------------------------
# THE FIX: Switch from Training Mode to Inference Mode
# ---------------------------------------------------------
model.config.use_cache = True
model.eval()

# 2. DEFINE THE PROMPT
prompt = "Write a persuasive email for cold outreach regarding AI-powered Chatbot."
context = "Recipient: CEO in Real Estate. Product: AI-powered Chatbot. Goal: Cold Outreach. Tone: Persuasive."

formatted_prompt = f"### Instruction:\n{prompt}\n### Context:\n{context}\n### Response:\n"

# 3. GENERATE (Using model.generate directly is more stable than pipeline here)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to("cuda")

print("\nGenerating email...\n")

with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=300,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,      # Adds creativity
        temperature=0.7,     # Controls randomness (0.7 is good for marketing)
        top_p=0.9,
    )

# 4. DECODE AND PRINT
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Just print the new part (The Email)
print(response.split("### Response:\n")[-1])

NameError: name 'trainer' is not defined

7. Backup & Download
Zip the adapter files and download them.

In [None]:
import shutil
# Zip the folder
shutil.make_archive('llama_email_model', 'zip', 'llama-2-7b-email-marketer')

# Download it
from google.colab import files
files.download('llama_email_model.zip')

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

8. GGUF Conversion (For Local Use)
Convert the adapter to .gguf format for use with LM Studio or llama.cpp.

In [None]:
# 1. Install conversion tool
!git clone https://github.com/ggerganov/llama.cpp
!pip install -q -r llama.cpp/requirements.txt

# 2. Convert the model from your notebook (llama-2-7b-email-marketer)
!python llama.cpp/convert_lora_to_gguf.py llama-2-7b-email-marketer --outfile email_adapter_64.gguf

# 3. Download
from google.colab import files
files.download('email_adapter_64.gguf')

Cloning into 'llama.cpp'...
remote: Enumerating objects: 80136, done.[K
remote: Counting objects: 100% (56/56), done.[K
remote: Compressing objects: 100% (37/37), done.[K
remote: Total 80136 (delta 30), reused 21 (delta 19), pack-reused 80080 (from 2)[K
Receiving objects: 100% (80136/80136), 294.78 MiB | 34.34 MiB/s, done.
Resolving deltas: 100% (57911/57911), done.
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m61.0/61.0 kB[0m [31m3.3 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.0/44.0 kB[0m [31m4.6 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.3/1.3 MB[0m [31m72.6 MB/s[0m eta [36m0:00:00[0m
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.7/12.7 MB[0m [31m48.8 MB/s[0m eta [36m0:00:00[0m
[?25h  Preparing metadata (setup.py) ... [?25l[?25hdone
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m18.0/18.0 MB[0m [31m97.1

<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

In [8]:
import torch
import gc

# 1. Delete all possible variables holding the model
try:
    del model
    del base_model
    del trainer
    del tokenizer
    del pipeline
except NameError:
    pass

# 2. Force Python's Garbage Collector
gc.collect()

# 3. Clear PyTorch's VRAM Cache
torch.cuda.empty_cache()

# 4. Verify Memory Status
print(f"Memory Cleared. Current VRAM usage: {torch.cuda.memory_allocated() / 1024**3:.2f} GB")

Memory Cleared. Current VRAM usage: 0.00 GB


In [None]:
import torch
import gc

# 1. Delete every specific variable we created (including functions & UI)
variables_to_delete = [
    "model", "base_model", "trainer", "tokenizer",
    "pipe", "demo", "inputs", "outputs",
    "generate_email", "sft_config", "peft_config"
]

for var in variables_to_delete:
    if var in globals():
        del globals()[var]

# 2. Force Python to release memory (Run twice to catch circular references)
gc.collect()
gc.collect()

# 3. Clear the GPU Cache
torch.cuda.empty_cache()

# 4. Check the result
print(f"Memory Status: {torch.cuda.memory_allocated() / 1024**3:.2f} GB used")

Memory Status: 2.35 GB used


In [None]:
import os
print("Restarting runtime to force-clear RAM...")
os.kill(os.getpid(), 9)

In [1]:
# Force-install compatible versions
!pip install -q -U torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 transformers==4.46.0 peft==0.13.2 bitsandbytes==0.44.1 gradio
print("✅ Libraries fixed.")
print("⚠️ NOW: Go to the top menu -> Runtime -> Restart Session.")
print("⚠️ THEN: Run the 'Step 2' block below.")


[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.1/44.1 kB[0m [31m2.3 MB/s[0m eta [36m0:00:00[0m
Reason for being yanked: This version unfortunately does not work with 3.8 but we did not drop the support yet[0m[33m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m906.4/906.4 MB[0m [31m989.6 kB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m7.2/7.2 MB[0m [31m89.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.4/3.4 MB[0m [31m78.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m10.0/10.0 MB[0m [31m59.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m320.7/320.7 kB[0m [31m26.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m122.4/122.4 MB[0m [31m7.4 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

9. Gradio Web Interface
The complete code to launch the web interface with the trained model.

In [1]:
import os
import json
import torch
import gradio as gr
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# ==========================================
# 1. FIX THE ADAPTER CONFIG (The "Sanitizer")
# ==========================================
adapter_path = "llama-2-7b-email-marketer"
config_file = os.path.join(adapter_path, "adapter_config.json")

print(f"🔧 Checking config file: {config_file}...")

if os.path.exists(config_file):
    with open(config_file, "r") as f:
        config_data = json.load(f)

    # Remove keys that crash older PEFT versions
    keys_to_remove = ["alora_invocation_tokens", "megatron_config", "megatron_core"]
    modified = False

    for key in keys_to_remove:
        if key in config_data:
            print(f"   - Removing incompatible key: {key}")
            del config_data[key]
            modified = True

    if modified:
        with open(config_file, "w") as f:
            json.dump(config_data, f, indent=2)
        print("✅ Config file repaired.")
    else:
        print("✅ Config file looks good.")
else:
    print("⚠️ Config file not found. If loading fails, please re-upload zip.")

# ==========================================
# 2. LOAD MODEL (Safe Mode)
# ==========================================
print("\n🚀 Loading AI Brain...")

base_model_name = "nousresearch/llama-2-7b-chat-hf"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# Load Base
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map={"": 0},
    low_cpu_mem_usage=True
)

# Load Adapter (Now safe!)
try:
    model = PeftModel.from_pretrained(base_model, adapter_path)
    model.eval()
    print("✅ Custom Email Style Loaded!")
except Exception as e:
    print(f"⚠️ Warning: Could not load custom style ({e}). Using Base Model.")
    model = base_model

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
tokenizer.pad_token = tokenizer.eos_token

# ==========================================
# 3. DEFINE APP
# ==========================================
def generate_email(recipient, product, goal, tone, instruction):
    context = f"Recipient: {recipient}. Product: {product}. Goal: {goal}. Tone: {tone}."
    formatted_prompt = f"### Instruction:\n{instruction}\n### Context:\n{context}\n### Response:\n"

    inputs = tokenizer(formatted_prompt, return_tensors="pt").to("cuda")

    with torch.no_grad():
        outputs = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=350,
            pad_token_id=tokenizer.eos_token_id,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
        )

    full_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    if "### Response:\n" in full_text:
        return full_text.split("### Response:\n")[1].split("### Explanation")[0]
    else:
        return full_text

# ==========================================
# 4. LAUNCH UI (Fixed Arguments)
# ==========================================
# Removed 'theme' from Blocks and 'show_copy_button' from Textbox to fix errors
with gr.Blocks() as demo:
    gr.Markdown("# 📧 AI Email Specialist")
    gr.Markdown("Built for **Naim Uddin Shuvo**")

    with gr.Row():
        with gr.Column(scale=1):
            recipient = gr.Textbox(label="Recipient Role", value="CEO of a Fintech Startup")
            product = gr.Textbox(label="Product/Service", value="AI Content Generator")
            goal = gr.Dropdown(["Cold Outreach", "Follow Up", "Meeting Request", "Product Launch"], label="Goal", value="Cold Outreach")
            tone = gr.Dropdown(["Persuasive", "Professional", "Urgent/Direct", "Friendly/Casual"], label="Tone", value="Persuasive")
            instruction = gr.Textbox(label="Specific Instructions", placeholder="Focus on ROI...", lines=3)
            btn = gr.Button("✨ Generate Email", variant="primary")

        with gr.Column(scale=1):
            # Removed 'show_copy_button=True' to prevent crash
            output_box = gr.Textbox(label="AI Output", lines=18, interactive=False)

    btn.click(generate_email, [recipient, product, goal, tone, instruction], output_box)

print("🔗 Creating Public Link...")
demo.launch(share=True)

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

🔧 Checking config file: llama-2-7b-email-marketer/adapter_config.json...
✅ Config file looks good.

🚀 Loading AI Brain...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

🔗 Creating Public Link...
Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://5d838b3886282ee2a9.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




In [None]:
import shutil
from google.colab import files

# 1. Zip the adapter folder
folder_name = "llama-2-7b-email-marketer"
zip_name = "my_email_model_backup"

print(f"📦 Zipping '{folder_name}'...")
shutil.make_archive(zip_name, 'zip', folder_name)

# 2. Trigger Download
print(f"⬇️ Downloading {zip_name}.zip...")
files.download(f"{zip_name}.zip")

📦 Zipping 'llama-2-7b-email-marketer'...
⬇️ Downloading my_email_model_backup.zip...


<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>