# Fine-Tuning Workshop: Training a Custom AI Assistant for Axiomcart

In this workshop, we'll learn how to fine-tune a language model on `CPU` to create a custom AI assistant for customer support. 

We'll be using the `Qwen2-0.5B` model and fine-tuning it with LoRA (Low-Rank Adaptation) to answer Axiomcart-specific questions.

## Phase 1: Verify how the base model performs and observe the gaps

### Step 1: Import Required Libraries

First, let's import all the necessary libraries for our fine-tuning process.

#### Understanding the Key Libraries

Before we dive into the code, let's understand what each library does and why we need it:

| Library | Purpose | Why We Need It |
|---------|---------|----------------|
| **torch** | PyTorch deep learning framework | Provides the foundation for tensor operations, neural network components, and GPU acceleration for training and inference |
| **datasets.Dataset** | Hugging Face datasets library | Helps us create and manage training datasets in a format optimized for machine learning workflows |
| **transformers.AutoModelForCausalLM** | Pre-trained model loader | Automatically loads pre-trained causal language models (like GPT-style models) that predict the next token in a sequence |
| **transformers.AutoTokenizer** | Text tokenization | Handles converting text to numbers (tokens) that the model can understand and process |
| **transformers.TrainingArguments** | Training configuration | Defines all training parameters like learning rate, batch size, number of epochs, etc. |
| **trl.SFTTrainer** | Supervised fine-tuning trainer | Specialized trainer designed for instruction-following tasks with proper formatting and optimization |
| **peft.LoraConfig** | LoRA configuration | Sets up Low-Rank Adaptation parameters for efficient fine-tuning |
| **peft.get_peft_model** | PEFT model wrapper | Applies Parameter-Efficient Fine-Tuning (PEFT) techniques to existing models |

#### The Fine-Tuning Pipeline

These libraries work together in our fine-tuning pipeline:

1. **torch** → Provides the computational foundation
2. **AutoTokenizer** → Converts text to model-readable format
3. **AutoModelForCausalLM** → Loads our base language model
4. **Dataset** → Organizes our training data efficiently
5. **LoraConfig + get_peft_model** → Makes fine-tuning memory and compute efficient
6. **TrainingArguments + SFTTrainer** → Manages the actual training process

Now let's import these libraries and start building our custom AI assistant!

In [1]:
# Enable hf_transfer for faster downloads from Hugging Face Hub.
# Set this before importing the Hugging Face libraries, because they read the
# variable at import time. It also requires the optional `hf_transfer` package.
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

# Import necessary libraries
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig, get_peft_model

# Ensure CPU usage (you can change this to "cuda" if you have a GPU)
device = torch.device("cpu")
print(f"Using device: {device}")

# Suppress warnings which can be ignored to reduce noise in the output
import warnings
warnings.filterwarnings("ignore")
warnings.filterwarnings("ignore", category=UserWarning, module="tqdm")
warnings.filterwarnings("ignore", message=".*IProgress not found.*")
warnings.filterwarnings("ignore", message=".*Trainer.tokenizer is now deprecated.*")

Using device: cpu


### Step 2: Load the Base Model and Tokenizer

We'll use Qwen2-0.5B, a compact yet capable small language model that is well suited to fine-tuning on modest hardware.

**Note**: Common warnings (TqdmWarning, deprecation warnings) have been suppressed in the import section to keep the output clean during the workshop.

In [None]:
def load_model_and_tokenizer(model_name, device):
    """
    Load a causal language model and its tokenizer.

    Args:
        model_name (str): The name/path of the model to load
        device (torch.device): The device to load the model on (CPU/GPU)
        
    Returns:
        tuple: (tokenizer, model) - The loaded tokenizer and model
    """
    print(f"Loading model: {model_name}")
    
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # device_map="auto" is not needed here because we place the model explicitly with .to(device)
    model = AutoModelForCausalLM.from_pretrained(model_name, use_safetensors=True).to(device)
    
    print("Model loaded successfully!")
    print(f"Model parameters: {model.num_parameters():,}")
    print()
    
    return tokenizer, model

# Load the Qwen 0.5B model and tokenizer
BASE_MODEL_NAME = "Qwen/Qwen2-0.5B"
tokenizer, base_model = load_model_and_tokenizer(BASE_MODEL_NAME, device)

Loading model: Qwen/Qwen2-0.5B
Model loaded successfully!
Model parameters: 494,032,768



### Step 3: Evaluate Base Model Performance

Before fine-tuning, let's see how the base model performs on our Axiomcart-specific questions. 
This will help us understand the improvement after fine-tuning.

We create a **structured prompt** that includes three key components:
  - **SystemPrompt**: Defines the AI's role, personality, and behavior guidelines
  - **Knowledge Base**: Provides the factual information the model should reference
  - **UserQuery**: The actual customer question we want answered
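
The three components can be assembled with a small helper. The sketch below is illustrative only: `build_prompt` is a hypothetical name that does not appear in the workshop code, and the inputs are shortened stand-ins. It uses `textwrap.dedent` so the model does not receive long runs of leading spaces from indented triple-quoted strings.

```python
import textwrap

def build_prompt(system_prompt: str, knowledge_base: str, question: str) -> str:
    """Assemble the SystemPrompt, Knowledge Base, and UserQuery sections into one prompt."""
    sections = [
        "SystemPrompt:",
        textwrap.dedent(system_prompt).strip(),
        "",
        textwrap.dedent(knowledge_base).strip(),
        "",
        "UserQuery:",
        question.strip(),
        "",
        "Response:",
    ]
    return "\n".join(sections)

# Hypothetical, shortened inputs for illustration only
demo_prompt = build_prompt(
    system_prompt="You are a customer service AI assistant for Axiomcart.",
    knowledge_base="""
        **COMPANY KNOWLEDGE BASE:**
        - Contact Methods: email support@axiomcart.com or use live chat.
    """,
    question="What email address should I use to contact support?",
)
print(demo_prompt)
```

Ending the prompt with `Response:` gives a base (non-chat) model a clear place to continue from.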




In [12]:
KNOWLEDGE_BASE = """
    **COMPANY KNOWLEDGE BASE:**

    **Account Management:**
    - Account Creation: Customers can create accounts by navigating to our comprehensive sign-up page where they will need to carefully fill in all their personal details including their full name, valid email address, and a secure password that meets our security requirements. After completing the registration form and submitting all required information, customers must verify their email address by clicking the verification link sent to their email inbox to fully activate their account and gain access to all platform features.

    **Payment Methods:**
    - Accepted payments: Axiomcart proudly accepts a wide variety of payment methods to ensure maximum convenience for our valued customers, including all major credit cards such as Visa, MasterCard, and American Express, as well as popular digital payment solutions like PayPal, and traditional bank transfer options for those who prefer direct banking transactions.

    **Order Management:**
    - Order Tracking: Once your order has been carefully processed by our fulfillment team and handed over to our trusted shipping partners, you will automatically receive a detailed tracking number via email notification. This tracking number can be used to monitor your package's journey in real-time either through our comprehensive order tracking system on our website or by visiting the carrier's official tracking portal for the most up-to-date delivery information.
    - Order Changes/Cancellations: Customers have the flexibility to cancel or modify their orders within a 24-hour window from the time of initial placement, provided that the order has not yet been processed by our fulfillment center and moved to the shipping preparation stage. Once an order has entered the processing phase, customers will need to contact our dedicated customer service team who will do their best to accommodate any changes or cancellation requests.

    **Returns & Exchanges:**
    - Return Policy: Axiomcart maintains a customer-friendly 30-day return policy that allows customers to return items that are in their original, unused condition with all original tags and packaging intact. To initiate a return, customers must first contact our customer service team to obtain proper return authorization and receive detailed instructions on the return process.

    **Security & Privacy:**
    - Data Protection: At Axiomcart, we take your privacy and data security extremely seriously. We employ industry-standard encryption technologies and robust security protocols to safeguard all personal information provided by our customers. We maintain strict policies regarding data sharing and absolutely do not share, sell, or distribute customer data to any third parties without explicit customer consent, except where required by law.

    **Shipping:**
    - International Shipping: Axiomcart is proud to offer comprehensive international shipping services to customers in over 50 countries worldwide. Please note that shipping rates, delivery timeframes, and available shipping options may vary significantly depending on your specific geographic location, local customs requirements, and the size and weight of your order.

    **Customer Support:**
    - Contact Methods: Our dedicated customer support team is available to assist you through multiple convenient channels including direct email communication at support@axiomcart.com, or through our real-time live chat feature readily accessible on our website for immediate assistance.
    - Issue Resolution: If you encounter any problems or concerns regarding your order, please don't hesitate to contact our customer service team with your complete order number and a detailed description of the issue you're experiencing. Our trained representatives will work diligently to investigate and resolve your concern promptly.

    **Promotions:**
    - First-time Customer Discount: As a special welcome offer for new customers joining the Axiomcart family, we are pleased to provide an exclusive 10% discount on your very first purchase. Simply use the promotional code 'FIRST10' during checkout to take advantage of this limited-time offer.
"""

def test_model_responses(model, tokenizer, test_questions, model_name="Model"):
    """
    Test a model with a list of questions and print the responses.
    
    Args:
        model: The language model to test
        tokenizer: The tokenizer associated with the model
        test_questions: List of questions to ask the model
        model_name: Name to display for the model (for identification)
    """
    print(f"🧪 TESTING {model_name.upper()}")
    print("=" * 80)
    
    for question in test_questions:
        input_text = f"""                    
                        SystemPrompt: 
                            You are a helpful and professional customer service AI assistant for Axiomcart, an e-commerce platform. 
                            Your role is to provide comprehensive, detailed, and thorough responses to customer inquiries based on the company's policies and procedures. 
                            You are very spontaneous and humorous, always maintaining a friendly and professional tone. 
                            You provide concise and accurate answers, ensuring that customers feel valued and understood.

                            {KNOWLEDGE_BASE}

                        UserQuery:
                            {question}

                        Response:
                      """
        
        # Convert text to model tokens and move to device (CPU)
        inputs = tokenizer(input_text, return_tensors="pt").to(device)

        # Generate response without computing gradients (inference mode)   
        with torch.no_grad():
            outputs = model.generate(
                **inputs, 
                temperature=0.1,                     # Control randomness (lower = closer to greedy decoding)
                max_new_tokens=200,                  # Limit response length
                do_sample=True,                      # Enable sampling for varied responses
                pad_token_id=tokenizer.eos_token_id  # Handle sequence padding
            )
        
        # Keep only the newly generated tokens (everything after the prompt)
        prompt_length = inputs["input_ids"].shape[1]
        generated_tokens = outputs[0][prompt_length:]

        # Convert the generated tokens back to text
        response = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()

        print(f"Question: {question}")
        print(f"Response: {response}")
        print("-" * 80)

# Test with multiple questions using the base model
TEST_QUESTIONS = [
    "What email address should I use to contact support?",
    "Can I use a credit card for payment and store it?",
    "How many days do I have to return an item I don't want?",
    "Is there a discount for new customers?"
]

tokenizer, base_model = load_model_and_tokenizer(BASE_MODEL_NAME, device)

# Test the base model
test_model_responses(base_model, tokenizer, TEST_QUESTIONS, "Base Model")

Loading model: Qwen/Qwen2-0.5B
Model loaded successfully!
Model parameters: 494,032,768

🧪 TESTING BASE MODEL
Question: What email address should I use to contact support?
Response: "You can use any email address you prefer, but it's recommended to use a professional email address that is easy to remember and distinguishable from your personal email address. Additionally, it's a good idea to use a domain name that is easy to remember and pronounce, and that is not too long or too short. You can also use a professional email address that is unique to your company and that is easy to remember and distinguishable from your personal email address. It's also a good idea to use a domain name that is easy to remember and pronounce, and that is not too long or too short. You can also use a professional email address that is unique to your company and that is easy to remember and distinguishable from your personal email address. It's also a good idea to use a domain name that is easy to remembe

Because an LLM is a probabilistic model, you may notice that the base model's responses:
1. are sometimes factually incorrect, even though the knowledge is provided in the prompt.
2. vary in length and level of detail, so we don't always get concise answers.
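
To make the effect of the `temperature` setting concrete, the sketch below applies temperature scaling to a softmax over three made-up logits. The numbers are hypothetical and are not taken from Qwen2. The point is only that a low temperature such as `0.1` concentrates probability on the top-scoring token, while sampling can still occasionally pick another one.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.5, 0.5]  # hypothetical scores for three candidate tokens

for t in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```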

## Phase 2: Fine-tune the base model with custom data

### Step 4: Prepare Training Data

Let's create a dataset of FAQ questions and answers specific to Axiomcart. This will be used to train our model to respond like a customer support assistant.

Our training dataset follows a simple but effective structure for instruction-following fine-tuning:

#### Data Structure
Each training example contains two key components:
```json
{
    "instruction": "Customer question or request",
    "response": "Desired AI assistant response"
}
```

#### Key Characteristics of Our Training Data

| Aspect | Description & Example |
|--------|----------------------|
| **Format** | Question-answer pairs in JSON format: `{"instruction": "How do I...", "response": "Here's how..."}` |
| **Tone** | Friendly, humorous, and professional using emojis, casual language, clever analogies |
| **Consistency** | All responses follow Axiomcart's brand voice - every response maintains the same enthusiastic, helpful tone |
| **Coverage** | Comprehensive FAQ topics: Account, Payment, Orders, Returns, Support, Shipping, Promotions, Security, Products |
| **Length** | Concise but complete responses that are informative yet digestible |
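
Before training, it can help to confirm that every record follows this structure. The sketch below is one possible check, not part of the workshop code: `validate_records` and the sample records are hypothetical.

```python
def validate_records(records):
    """Return (index, problem) tuples for records that break the expected schema."""
    problems = []
    for i, record in enumerate(records):
        # Both fields must be present and contain non-blank text
        for key in ("instruction", "response"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((i, f"missing or empty '{key}'"))
        # Flag misspelled or unexpected field names
        extra = set(record) - {"instruction", "response"}
        if extra:
            problems.append((i, f"unexpected keys: {sorted(extra)}"))
    return problems

sample_records = [
    {"instruction": "How can I track my order?", "response": "Use the tracking number we email you."},
    {"instruction": "Do you ship internationally?", "response": ""},   # empty response
    {"instruction": "Is there a discount?", "reply": "Use FIRST10."},  # wrong key name
]

for index, problem in validate_records(sample_records):
    print(f"record {index}: {problem}")
```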

#### Sample Training Examples

Here are some examples from our training data showing the **Axiomcart brand personality**:

```json
{
    "instruction": "I forgot my password, how can I reset it?",
    "response": "Happens to the best of us! 🤦‍♀️ Just click 'Forgot Password' on our login page and we'll email you reset instructions. Follow the link to create a new secure password - maybe avoid 'password123' this time! 😉🔐"
}
```

```json
{
    "instruction": "What payment methods does Axiomcart accept?",
    "response": "We're the Swiss Army knife of payments! 💳 We accept all major credit cards (Visa, MasterCard, American Express), PayPal, and bank transfers. Basically, we've got more payment options than a food court has restaurants! 🍕💰"
}
```

```json
{
    "instruction": "How can I track my order?",
    "response": "The eternal 'where's my stuff?' question! 📦 Once processed, you'll get a tracking number via email automatically. Use it on our website or the carrier's portal to follow your package's journey - it's like GPS for goodies! 🗺️✨"
}
```

In [5]:
faq_data = [
    # Account Management
    {"instruction": "How do I create an account on Axiomcart?", "response": "Creating an account is super easy! 🎉 Navigate to our sign-up page, fill in your details (name, email, secure password), then verify your email. Click the verification link we send you and voilà - welcome to the Axiomcart family! 🚀"},
    
    {"instruction": "I forgot my password, how can I reset it?", "response": "Happens to the best of us! 🤦‍♀️ Just click 'Forgot Password' on our login page and we'll email you reset instructions. Follow the link to create a new secure password - maybe avoid 'password123' this time! 😉🔐"},
    
    # Payment Methods
    {"instruction": "What payment methods does Axiomcart accept?", "response": "We're the Swiss Army knife of payments! 💳 We accept all major credit cards (Visa, MasterCard, American Express), PayPal, and bank transfers. Basically, we've got more payment options than a food court has restaurants! 🍕💰"},
    
    {"instruction": "Is it safe to save my credit card information?", "response": "Absolutely! Your payment info is locked up tighter than Fort Knox! 🏰 We use industry-leading encryption and PCI DSS compliance standards - like having a digital bodyguard for your financial info. We take security more seriously than a sommelier takes wine! 🍷🛡️"},
    
    # Order Management
    {"instruction": "How can I track my order?", "response": "The eternal 'where's my stuff?' question! 📦 Once processed, you'll get a tracking number via email automatically. Use it on our website or the carrier's portal to follow your package's journey - it's like GPS for goodies! 🗺️✨"},
    
    {"instruction": "How long does shipping usually take?", "response": "We're faster than your morning coffee delivery! ⏰ Domestic orders: 3-5 days standard, 1-2 days express. International shipping takes 7-14 days depending on location and customs - time for your package to collect passport stamps! 🌍✈️"},
    
    {"instruction": "Can I change or cancel my order after placing it?", "response": "Changed your mind? We totally get it! 🎭 You have 24 hours to modify or cancel, unless it's already processing. After that, contact our customer service team - we're basically order-modification wizards! 🪄⚡"},
    
    # Returns & Refunds
    {"instruction": "What is your return policy?", "response": "Got buyer's remorse? It happens! 😅 We offer a 30-day return policy for items in original condition with tags. Contact customer service for return authorization and step-by-step instructions - we won't judge your shopping decisions! 🛍️💭"},
    
    {"instruction": "How do I return a defective item?", "response": "A defective item is totally unacceptable! 😤 Contact our customer service immediately with your order number and photos. We'll arrange free return shipping and send a replacement or refund - defective items get VIP treatment! 📦✨"},
    
    # Customer Support
    {"instruction": "How can I contact customer support?", "response": "We're easier to reach than your favorite pizza place! 📞🍕 Email us at support@axiomcart.com or use our live chat on the website. We're standing by like customer service superheroes, coffee in hand, ready to help! ☕🦸‍♀️"},
    
    {"instruction": "What are your customer service hours?", "response": "We're practically nocturnal! 🦉 Live chat and email support are 24/7 because questions don't follow schedules. Phone support: Monday-Friday 8 AM-8 PM EST, weekends 10 AM-6 PM EST. We're here more than your favorite coffee shop! ☕⏰"},
    
    # Shipping & International
    {"instruction": "Does Axiomcart ship internationally?", "response": "Around the world in 50+ countries! 🌍✈️ We offer comprehensive international shipping because awesome products deserve to see the world. Rates and timeframes vary by location - we haven't figured out teleportation yet! 🚀📦"},
    
    {"instruction": "Do I have to pay customs fees for international orders?", "response": "Ah, the customs question! 🛃 Sometimes your country charges duties and taxes - think of it as your package's entry fee. These fees are determined by your local customs authority and are the customer's responsibility - international shopping's adventure tax! 🌍💰"},
    
    # Promotions & Discounts
    {"instruction": "Are there any discounts for first-time customers?", "response": "Welcome to the party! 🎉 New customers get an exclusive 10% discount on their first purchase. Just use code 'FIRST10' at checkout - it's like a secret handshake for savings! 💰✨"},
    
    {"instruction": "Do you have a loyalty program?", "response": "You bet! 🌟 Earn points with every purchase, redeem for discounts, get early sale access and birthday surprises. The more you shop, the more perks you unlock - like leveling up in a game with useful rewards! 🎮🎁"},
    
    # Security & Privacy
    {"instruction": "How secure is my personal information on Axiomcart?", "response": "Your data is more secure than the Crown Jewels! 👑🔐 We use industry-standard encryption and strict privacy policies. We don't share, sell, or distribute your data to third parties without consent - your secrets are safe with us! 🤝✨"},
    
    # Product & Inventory
    {"instruction": "How do I know if an item is in stock?", "response": "Our inventory updates faster than small-town gossip! 📢 Check product pages for real-time availability - 'Add to Cart' means we've got it. Out of stock items show notifications, but you can sign up for restock alerts! 📦✅"}
]

# Create balanced train/test split
import random
random.seed(42)

# Copy and shuffle the data so each split covers a mix of topics
all_data = faq_data.copy()
random.shuffle(all_data)

# Split: 80% for training, 20% held out for evaluation
train_size = int(len(all_data) * 0.8)
train_data = all_data[:train_size]
test_data = all_data[train_size:]

train_dataset = Dataset.from_list(train_data)
eval_dataset = Dataset.from_list(test_data)

print(f"📚 Enhanced Dataset Statistics:")
print(f"- Training dataset: {len(train_dataset)} examples")
print(f"- Evaluation dataset: {len(eval_dataset)} examples")
print(f"- Total: {len(train_dataset) + len(eval_dataset)} examples")
print(f"- Coverage: Account Management, Payments, Orders, Returns, Support, Shipping, Promotions, Security, Products")

print(f"\n🔍 Sample training data point:")
print(f"Instruction: {train_dataset[0]['instruction']}")
print(f"Response: {train_dataset[0]['response'][:100]}...")

print(f"\n🧪 Sample evaluation data point:")
print(f"Instruction: {eval_dataset[0]['instruction']}")
print(f"Response: {eval_dataset[0]['response'][:100]}...")

📚 Enhanced Dataset Statistics:
- Training dataset: 13 examples
- Evaluation dataset: 4 examples
- Total: 17 examples
- Coverage: Account Management, Payments, Orders, Returns, Support, Shipping, Promotions, Security, Products

🔍 Sample training data point:
Instruction: What is your return policy?
Response: Got buyer's remorse? It happens! 😅 We offer a 30-day return policy for items in original condition w...

🧪 Sample evaluation data point:
Instruction: How can I track my order?
Response: The eternal 'where's my stuff?' question! 📦 Once processed, you'll get a tracking number via email a...


### Step 5: Set Up Fine-Tuning Configuration

Now we'll configure LoRA (Low-Rank Adaptation) for efficient fine-tuning. LoRA allows us to fine-tune large models efficiently by only updating a small number of parameters.

**What is LoRA?**  
LoRA is a parameter-efficient fine-tuning method. Instead of updating all the weights of a large model, it injects small trainable matrices (of low rank) into certain layers, drastically reducing the number of trainable parameters. This makes fine-tuning faster and less resource-intensive.
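
As a rough illustration of the idea, the sketch below applies a LoRA-style update to a tiny weight matrix in plain Python. The frozen weight `W` is left untouched and the adapter contributes `(alpha / r) * B @ A`. All matrices are made up for readability, and the 896 x 896 layer size used in the count at the end is an assumption about one Qwen2-0.5B projection, not something printed by this notebook.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, r):
    """Compute W x + (alpha / r) * B (A x); W is frozen, only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy 3x3 frozen weight with a rank-1 adapter (r = 1)
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
A = [[0.5, 0.5, 0.0]]                # shape (r, d_in)
B_zero = [[0.0], [0.0], [0.0]]       # shape (d_out, r); LoRA starts with B = 0
B_trained = [[0.2], [0.0], [-0.2]]   # hypothetical values after some training
x = [1.0, 2.0, 3.0]

out_initial = lora_forward(W, A, B_zero, x, alpha=2, r=1)     # same as W x at initialization
out_trained = lora_forward(W, A, B_trained, x, alpha=2, r=1)  # adapter shifts the output
print(out_initial)
print(out_trained)

# Parameter count for one d_out x d_in layer versus its rank-r adapter
d_in, d_out, r = 896, 896, 16
full_params = d_in * d_out
lora_params = r * (d_in + d_out)
print(f"full layer: {full_params:,} params, LoRA adapter: {lora_params:,} params")
```

Because `B` starts at zero, the adapted model behaves exactly like the base model before training, and only the small `A` and `B` matrices need gradients.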

**Configuration:**  
| Setting | What It Does (In Simple Terms) | Our Value | Why This Value? |
|---------|--------------------------------|-----------|-----------------|
| **r** | How much "learning capacity" to add. Like adding more memory slots. | `16` | Good balance - not too little, not too much |
| **lora_alpha** | How strongly the new learning affects the model. Like volume control. | `32` | Medium strength - lets new learning show through clearly |
| **lora_dropout** | Prevents the model from memorizing too strictly. Like adding some randomness. | `0.1` | Small amount (10%) helps avoid overfitting |
| **bias** | Whether to train certain basic settings. We're keeping it simple. | `"none"` | Focuses training on the main parts only |
| **task_type** | Tells LoRA what kind of task we're doing. | `"CAUSAL_LM"` | "Text generation" - predicting what comes next |
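| **modules_to_save** | Which full modules (outside the LoRA matrices) stay trainable and are saved with the adapter. | `["lm_head", "embed_token"]` | Keeps the output layer flexible. Note that Qwen2 names its input embedding `embed_tokens`, so the `embed_token` entry matches no module and only `lm_head` is affected |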

**Think of it like this:** 
- LoRA is like adding a small "learning module" to an existing brain
- These settings control how big that module is and how much influence it has
- We want it powerful enough to learn our brand voice, but not so powerful it forgets everything else! 🧠✨

**What is PEFT?**

`peft` stands for `Parameter-Efficient Fine-Tuning`. 

Instead of retraining the entire massive model (which would be like rebuilding a whole car to change the radio), PEFT techniques like LoRA only update a small fraction of the model's parameters. 

This makes fine-tuning much faster, uses less memory, and is more cost-effective while still achieving great results! 🚀💡

In [6]:
# LoRA configuration
peft_config = LoraConfig(
    r=16,                    # Rank of adaptation
    lora_alpha=32,           # LoRA scaling parameter
    lora_dropout=0.1,        # LoRA dropout
    bias="none",             # Bias type
    task_type="CAUSAL_LM",   # Task type
    modules_to_save=["lm_head", "embed_token"],  # Modules trained in full (not via LoRA)
)

print("LoRA configuration:")
print(f"- Rank (r): {peft_config.r}")
print(f"- Alpha: {peft_config.lora_alpha}")
print(f"- Dropout: {peft_config.lora_dropout}")

# Apply LoRA to the model
model = get_peft_model(base_model, peft_config)

# Print trainable parameters
model.print_trainable_parameters()

LoRA configuration:
- Rank (r): 16
- Alpha: 32
- Dropout: 0.1
trainable params: 137,216,000 || all params: 631,248,768 || trainable%: 21.7372
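
The printed count can be reproduced by hand, which also shows where the 21.7% comes from. The sketch below assumes the published Qwen2-0.5B dimensions (hidden size 896, 24 layers, 2 key/value heads of size 64, a vocabulary of 151,936 entries) and assumes that PEFT falls back to the `q_proj` and `v_proj` attention projections when `target_modules` is not set. Neither assumption is printed by the notebook, so treat this as a consistency check rather than an authoritative breakdown.

```python
# Assumed Qwen2-0.5B dimensions (not printed by the notebook)
hidden_size = 896
num_layers = 24
kv_dim = 2 * 64          # 2 key/value heads x head size 64
vocab_size = 151_936
r = 16                   # LoRA rank from peft_config

# LoRA adds A (r x d_in) and B (d_out x r) to each targeted projection
q_proj_lora = r * (hidden_size + hidden_size)
v_proj_lora = r * (hidden_size + kv_dim)
lora_params = num_layers * (q_proj_lora + v_proj_lora)

# modules_to_save keeps a full trainable copy of lm_head (vocab_size x hidden_size)
lm_head_params = vocab_size * hidden_size

trainable = lora_params + lm_head_params
base_params = 494_032_768  # from the model loading output
total = base_params + trainable

print(f"LoRA adapter params: {lora_params:,}")
print(f"lm_head copy params: {lm_head_params:,}")
print(f"trainable params:    {trainable:,}")
print(f"trainable%:          {100 * trainable / total:.4f}")
```

Under these assumptions, almost all of the trainable parameters come from the full `lm_head` copy, while the LoRA matrices account for roughly 1.1 million. The totals line up only if `embed_token` matches no module, which is consistent with Qwen2 naming its embedding layer `embed_tokens`.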


### Step 6: Configure Training Arguments

Let's set up the training parameters. These arguments control the training loop, checkpointing, and evaluation for the fine-tuning process:

**Training Settings:**

| Setting | What It Does | Value | Why This Matters |
|---------|--------------|-------|------------------|
| **output_dir** | Where to save our trained model files | `"./output/training_result"` | Like choosing a folder to save your work |
| **per_device_train_batch_size** | How many examples to learn from at once | `4` | Like studying 4 flashcards at a time instead of 1 |
| **gradient_accumulation_steps** | How many mini-batches to process before each weight update | `4` | Like taking notes from 4 pages before writing a summary |
| **num_train_epochs** | How many times to go through all training data | `40` | Like reading the same textbook 40 times to really learn it |
| **learning_rate** | How big a step to take at each update | `0.0001` | Small steps = careful learning (won't overshoot) |
| **fp16** | Use faster but less precise 16-bit math | `False` | fp16 mixed precision targets GPUs, and we are training on CPU |
| **save_steps** | How often to save a checkpoint (in optimizer steps) | `2` | Like hitting "Save" after every 2 updates while writing |
| **logging_steps** | How often to log the training loss (in optimizer steps) | `3` | Like checking your score every 3 questions on a test |
| **remove_unused_columns** | Keep all data columns | `False` | Don't throw away any information from our dataset |
| **eval_strategy** | When to test how well we're doing | `"epoch"` | Check progress after each complete round of training |

**Think of it like this:** 
- We're teaching the AI by showing it examples over and over (like flashcards)
- These settings control how fast we flip through the cards, how often we take breaks, and how many times we review everything
- The goal is steady, careful learning - not rushing through! 📚✨

These settings are passed to our trainer to manage the entire learning process automatically.
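
It is worth working out what these numbers mean for our small dataset. The sketch below is a back-of-the-envelope calculation, assuming a single device and that a partially filled accumulation window still counts as one update. The exact rounding inside the trainer may differ between library versions.

```python
import math

num_examples = 13                 # training set size reported in Step 4
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_train_epochs = 40
logging_steps = 3

# Examples contributing to each weight update
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps

# Mini-batches per pass over the data, then weight updates per pass
batches_per_epoch = math.ceil(num_examples / per_device_train_batch_size)
updates_per_epoch = max(1, math.ceil(batches_per_epoch / gradient_accumulation_steps))
total_updates = updates_per_epoch * num_train_epochs

print(f"Effective batch size: {effective_batch_size}")
print(f"Mini-batches per epoch: {batches_per_epoch}")
print(f"Weight updates per epoch: {updates_per_epoch}")
print(f"Total weight updates: {total_updates}")
print(f"First training loss logged after update {logging_steps}")
```

With only 13 examples, the effective batch of 16 covers the whole dataset, so each epoch amounts to a single weight update. That would explain why the training log shows `No log` for the first two epochs: the first training loss is only recorded at the third update.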

In [7]:
# Training arguments
training_args = TrainingArguments(
    output_dir="./output/training_result",      # Directory to save results
    per_device_train_batch_size=4,              # Batch size per device
    gradient_accumulation_steps=4,              # Steps to accumulate gradients
    num_train_epochs=40,                        # Number of training epochs
    learning_rate=0.0001,                       # Learning rate
    fp16=False,                                 # Disable FP16 for CPU
    save_steps=2,                               # Save model every N steps
    logging_steps=3,                            # Log every N steps
    remove_unused_columns=False,                # Keep all columns
    eval_strategy="epoch",                      # Evaluation strategy
)

print("Training configuration:")
print(f"- Epochs: {training_args.num_train_epochs}")
print(f"- Batch size: {training_args.per_device_train_batch_size}")
print(f"- Learning rate: {training_args.learning_rate}")
print(f"- Gradient accumulation steps: {training_args.gradient_accumulation_steps}")

Training configuration:
- Epochs: 40
- Batch size: 4
- Learning rate: 0.0001
- Gradient accumulation steps: 4


### Step 7: Initialize the Trainer

We'll use the SFTTrainer (Supervised Fine-Tuning Trainer) from the TRL library, which is specifically designed for fine-tuning language models.

In [8]:
# Define formatting function for SFTTrainer
def formatting_func(data):
    return f"Instruction: {data['instruction']}\nResponse: {data['response']}{tokenizer.eos_token}"

# SFTTrainer for supervised fine-tuning
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    formatting_func=formatting_func,  # Function to format the data
    processing_class=tokenizer,       # Tokenizer (replaces the deprecated `tokenizer` argument)
    args=training_args
)

print("Trainer initialized successfully!")
print(f"Training dataset size: {len(train_dataset)}")
print(f"EOS token: {tokenizer.eos_token}")

No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.


Trainer initialized successfully!
Training dataset size: 13
EOS token: <|endoftext|>
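
To see what the trainer actually learns from, it helps to render one formatted example. The sketch below reuses the same template as `formatting_func`, with a stand-in EOS string so it runs without loading the tokenizer. `format_example` is a hypothetical helper name and the response text is shortened.

```python
EOS_TOKEN = "<|endoftext|>"  # value printed by the cell above; stand-in for tokenizer.eos_token

def format_example(record, eos_token=EOS_TOKEN):
    """Render one instruction/response pair in the same layout as formatting_func."""
    return f"Instruction: {record['instruction']}\nResponse: {record['response']}{eos_token}"

example = {
    "instruction": "Are there any discounts for first-time customers?",
    "response": "New customers get an exclusive 10% discount with code 'FIRST10'.",  # shortened
}
print(format_example(example))
```

The trailing EOS token marks where a response ends, which gives the model a signal to stop generating instead of rambling on.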


### Step 8: Start Training

Now let's start the fine-tuning process. This will take a few minutes depending on your hardware.

During training, you'll see two important metrics that help us understand how well our model is learning:

**Training Loss:**
- Measures how well the model is learning from the training data
- **Lower is better** - shows the model is getting better at predicting the correct responses
- Should generally decrease over time as the model learns

**Validation Loss:**
- Measures how well the model performs on data it hasn't seen during training
- **Lower is better** - indicates the model can generalize to new questions
- Helps detect if the model is overfitting (memorizing instead of learning)

**Think of it like studying for an exam:**
- Training loss = how well you do on practice questions you've seen before
- Validation loss = how well you do on new practice questions
- You want to do well on both! 🎯

The trainer automatically saves checkpoints during training (every `save_steps` updates), so we can go back to an earlier checkpoint if a later one performs worse.

In [9]:
# Start training
print("Starting fine-tuning...")
print("This may take a few minutes depending on your hardware...")
print("You could reduce the number of epochs if you want to speed up the process. However, note that this may affect the quality of the model.")

trainer.train()

print("\nTraining completed!")

Starting fine-tuning...
This may take a few minutes depending on your hardware...
You could reduce the number of epochs if you want to speed up the process. However, note that this may affect the quality of the model.


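| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1 | No log | 3.217379 |
| 2 | No log | 3.126614 |
| 3 | 2.951400 | 3.05617 |
| 4 | 2.951400 | 3.003002 |
| 5 | 2.951400 | 2.964922 |
| 6 | 2.197100 | 2.940566 |
| 7 | 2.197100 | 2.929518 |
| 8 | 2.197100 | 2.929784 |
| 9 | 1.689500 | 2.937353 |
| 10 | 1.689500 | 2.948858 |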



Training completed!
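
In the log above, training loss keeps falling while validation loss stops improving partway through, which is the overfitting pattern described earlier. A small sketch of how we could read this off the numbers, using the values copied from the log (the helper names `best_epoch` and `epochs_where_val_loss_rose` are our own, for illustration only):

```python
# (epoch, training_loss or None when not logged, validation_loss), copied from the log above.
history = [
    (1, None, 3.217379),
    (2, None, 3.126614),
    (3, 2.951400, 3.05617),
    (4, 2.951400, 3.003002),
    (5, 2.951400, 2.964922),
    (6, 2.197100, 2.940566),
    (7, 2.197100, 2.929518),
    (8, 2.197100, 2.929784),
    (9, 1.689500, 2.937353),
    (10, 1.689500, 2.948858),
]

def best_epoch(history):
    """Return (epoch, validation_loss) for the lowest validation loss."""
    epoch, _, val_loss = min(history, key=lambda row: row[2])
    return epoch, val_loss

def epochs_where_val_loss_rose(history):
    """Epochs whose validation loss is higher than the previous epoch's."""
    return [cur[0] for prev, cur in zip(history, history[1:]) if cur[2] > prev[2]]

print("Best epoch:", best_epoch(history))
print("Validation loss rose at epochs:", epochs_where_val_loss_rose(history))
```

Here validation loss is lowest at epoch 7 and rises in epochs 8 to 10, so training for fewer epochs (or keeping the epoch 7 checkpoint, if one was saved) would be a reasonable thing to try.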


### Step 9: Save the Fine-Tuned Model

Let's save our fine-tuned model so we can use it later.

In [None]:
# Save the fine-tuned model
model_save_path = "./output/fine-tuned-qwen-0.5b"
trainer.save_model(model_save_path)

print(f"Model saved to: {model_save_path}")

Model saved to: ./output/fine-tuned-qwen-0.5b


After completing the fine-tuning process, several important files are saved in the `./output/fine-tuned-qwen-0.5b` directory. Let's understand what each file contains and its purpose:

#### 📁 **Fine-Tuned Model Directory Structure**

```
./output/fine-tuned-qwen-0.5b/
├── adapter_config.json          # LoRA adapter configuration
├── adapter_model.safetensors    # Fine-tuned LoRA weights 🎯
├── added_tokens.json            # Tokens defined on top of the base vocabulary
├── chat_template.jinja          # Conversation formatting template
├── merges.txt                   # BPE tokenizer merge rules
├── README.md                    # Model card documentation
├── special_tokens_map.json      # Special token mappings
├── tokenizer.json               # Complete tokenizer configuration
├── tokenizer_config.json        # Tokenizer settings
├── training_args.bin            # Training configuration used
└── vocab.json                   # Vocabulary mappings
```

#### 🔍 **What Each File Does**

| Filename | Purpose |
|----------|---------|
| **`adapter_config.json`** | LoRA settings (rank, alpha, target modules, base model name) that tell the loader how to attach the adapter to the base model |
| **`adapter_model.safetensors`** | The trained LoRA weights: the small set of parameters that carry what the model learned from the Axiomcart examples. The base model weights are not stored here |
| **`tokenizer.json`** | Full tokenizer definition used to convert text to token IDs and back |
| **`tokenizer_config.json`** | Tokenizer settings that control how text is read and processed |
| **`vocab.json`** | Mapping from each token (word piece) to its numeric ID |
| **`merges.txt`** | BPE merge rules that decide how words are split into smaller pieces |
| **`special_tokens_map.json`** | Names of the special tokens, such as the end-of-sequence (EOS) and padding tokens |
| **`training_args.bin`** | Serialized record of the training arguments used for this run |
| **`README.md`** | Auto-generated model card describing the model (like a user manual) |
| **`added_tokens.json`** | Tokens defined on top of the base vocabulary, typically the tokenizer's special tokens rather than Axiomcart-specific words |
| **`chat_template.jinja`** | Jinja template that formats conversation turns into the prompt layout the model expects |
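
To check these files yourself, the following sketch reads the main LoRA settings from `adapter_config.json` and lists file sizes. It uses only the standard library; `summarize_adapter` is a hypothetical helper of ours, and the keys it reads (`base_model_name_or_path`, `r`, `lora_alpha`, `target_modules`) are the ones PEFT normally writes, so treat missing keys as a sign your version differs.

```python
import json
from pathlib import Path

def summarize_adapter(save_dir):
    """Return key LoRA settings and file sizes (in KB) for a saved adapter directory."""
    save_dir = Path(save_dir)
    config = json.loads((save_dir / "adapter_config.json").read_text())
    wanted = ("base_model_name_or_path", "r", "lora_alpha", "target_modules")
    settings = {key: config.get(key) for key in wanted}  # None if a key is absent
    sizes_kb = {
        p.name: round(p.stat().st_size / 1024, 1)
        for p in sorted(save_dir.iterdir())
        if p.is_file()
    }
    return settings, sizes_kb

save_dir = Path("./output/fine-tuned-qwen-0.5b")  # path used in Step 9
if (save_dir / "adapter_config.json").exists():
    settings, sizes_kb = summarize_adapter(save_dir)
    print(settings)
    for name, kb in sizes_kb.items():
        print(f"{name:32s} {kb:>10.1f} KB")
else:
    print("Adapter not found; run Step 9 first.")
```

The adapter weights file should be much smaller than the base model, because only the LoRA parameters are saved.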


## Phase 3 : Evaluate the fine-tuned model.

### Step 10: Test the Fine-Tuned Model

Now let's test our fine-tuned model and compare it with the base model performance.

In [13]:
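# Load the saved LoRA adapter. With `peft` installed, transformers reads
# adapter_config.json, loads the base model it names, and attaches the adapter.
fine_tuned_model = AutoModelForCausalLM.from_pretrained(model_save_path).to(device)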

test_model_responses(fine_tuned_model, tokenizer, TEST_QUESTIONS, "finetuned-model")

🧪 TESTING FINETUNED-MODEL
Question: What email address should I use to contact support?
Response: 'You can contact our customer support team at support@axiomcart.com'
--------------------------------------------------------------------------------
Question: Can I use credit card for payment and store it ?
Response: 'Yes, we accept all major credit cards like American Express, MasterCard, andVisa. We're the Swiss Army knife of payments! Just click 'Add to Cart' and we'll show you step-by-step how to create a secure, secure, and secure shopping cart with your favorite payment options. After placing your order, we'll email you step-by-step instructions on how to sign up for Axiomcart's secure shopping cart and create a secure, secure, and secure shopping cart with your favorite payment options. After placing your order, we'll email you step-by-step instructions on how to sign up for Axiomcart's secure shopping cart and create a secure, secure, and secure shopping cart with your favorite p
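
Notice that the second response loops on the same phrases before it is cut off. One rough way to put a number on this, as a sketch rather than a standard metric, is to count how many word n-grams repeat. The helper name `repeated_ngram_fraction` and the choice of `n=4` are our own assumptions for illustration.

```python
from collections import Counter

def repeated_ngram_fraction(text, n=4):
    """Fraction of word n-grams that repeat an earlier n-gram in the same text."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(ngrams)

short_answer = "You can contact our customer support team at support@axiomcart.com"

# Excerpt from the second response above, which repeats this passage.
looping_answer = (
    "After placing your order, we'll email you step-by-step instructions "
    "on how to sign up for Axiomcart's secure shopping cart "
) * 2

print(f"short answer:   {repeated_ngram_fraction(short_answer):.2f}")
print(f"looping answer: {repeated_ngram_fraction(looping_answer):.2f}")
```

A score near zero means little repetition; a high score flags responses worth inspecting by hand.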

## 🎉 Workshop Summary

Congratulations! You have successfully:

1. ✅ Loaded a pre-trained language model (Qwen2-0.5B)
2. ✅ Prepared custom training data for Axiomcart
3. ✅ Tested the base model performance
4. ✅ Configured LoRA for efficient fine-tuning
5. ✅ Fine-tuned the model on domain-specific data
6. ✅ Saved the fine-tuned model for future use
7. ✅ Compared base vs fine-tuned model performance

### Key Takeaways:
- **LoRA** enables efficient fine-tuning by updating only a small fraction of parameters
- **Domain-specific training** improves answers to domain questions, such as the support email above, though a small dataset can still leave some responses repetitive
- **Proper data formatting** is crucial for successful fine-tuning
- **Comparison testing** helps validate the effectiveness of fine-tuning
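
To see why LoRA touches only a small fraction of parameters, here is a back-of-the-envelope sketch for a single weight matrix. LoRA replaces the update to a `d_out x d_in` matrix with two thin matrices of rank `r`. The numbers are assumptions for illustration: 896 is taken as the hidden size of Qwen2-0.5B, and `r=8` is a hypothetical rank that may differ from the one configured in this workshop.

```python
def lora_param_counts(d_out, d_in, r):
    """Parameters in a full weight matrix vs. its rank-r LoRA update (B @ A)."""
    full = d_out * d_in
    lora = r * (d_out + d_in)  # B is d_out x r, A is r x d_in
    return full, lora

# Assumed values: hidden size 896, hypothetical LoRA rank 8.
full, lora = lora_param_counts(d_out=896, d_in=896, r=8)

print(f"full matrix: {full:,} parameters")
print(f"LoRA update: {lora:,} parameters ({lora / full:.2%} of the full matrix)")
```

Doubling `r` doubles the number of trainable LoRA parameters, which is the trade-off to keep in mind when experimenting with different ranks.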

### Next Steps:
- Experiment with different LoRA configurations (r, alpha, dropout) and training arguments (epochs, batch size)
- Try larger datasets or different domains
- Explore evaluation metrics for more systematic comparison
- Explore Azure Foundry to see how it simplifies many of these steps and what options it provides