# Fine-Tuning Workshop: Training a Custom AI Assistant for Axiomcart

In this workshop, we'll learn how to fine-tune a language model to create a custom AI assistant for customer support. We'll be using the Qwen2-0.5B model and fine-tuning it with LoRA (Low-Rank Adaptation) to answer Axiomcart-specific questions.

## What you'll learn:
- How to load and use a pre-trained language model
- How to prepare training data for fine-tuning
- How to implement LoRA for efficient fine-tuning
- How to compare base model vs fine-tuned model performance

## Prerequisites:
1. Install the required packages

In [2]:
!pip install -r requirements.txt



## Step 1: Import Required Libraries

First, let's import all the necessary libraries for our fine-tuning process.

In [3]:
# Import necessary libraries
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer
from peft import LoraConfig, get_peft_model

# Ensure CPU usage (you can change this to "cuda" if you have a GPU)
device = torch.device("cpu")
print(f"Using device: {device}")

# Suppress warnings which can be ignored to reduce noise in the output
import warnings
warnings.filterwarnings("ignore")

Using device: cpu


## Step 2: Load the Base Model and Tokenizer

We'll use the Qwen2-0.5B model, which is a compact yet powerful language model suitable for fine-tuning.

In [4]:
# Load the Qwen 0.5B model and tokenizer
model_name = "Qwen/Qwen2-0.5B"
print(f"Loading model: {model_name}")

tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(model_name).to(device)

print(f"Model loaded successfully!")
print(f"Model parameters: {base_model.num_parameters():,}")

Loading model: Qwen/Qwen2-0.5B
Model loaded successfully!
Model parameters: 494,032,768
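
As a rough sizing aid, the parameter count printed above can be turned into a memory estimate for the weights alone. This is a back-of-the-envelope sketch: it assumes every parameter is stored at the same precision (4 bytes for float32, 2 bytes for float16 or bfloat16), and the helper name `weights_size_gib` is hypothetical. Real memory use is higher once activations, gradients, and optimizer state are included.

```python
# Rough estimate of the memory needed just to hold the model weights.
# Assumption: every parameter uses the same number of bytes.
NUM_PARAMETERS = 494_032_768  # value printed by base_model.num_parameters()


def weights_size_gib(num_parameters: int, bytes_per_parameter: int) -> float:
    """Return the size of the weights in GiB (1 GiB = 1024**3 bytes)."""
    return num_parameters * bytes_per_parameter / 1024**3


fp32_gib = weights_size_gib(NUM_PARAMETERS, 4)  # float32
fp16_gib = weights_size_gib(NUM_PARAMETERS, 2)  # float16 / bfloat16

print(f"float32 weights: {fp32_gib:.2f} GiB")
print(f"float16 weights: {fp16_gib:.2f} GiB")
```

Under these assumptions the weights need a little under 2 GiB in float32, which is why a model of this size is practical to fine-tune on a CPU.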


## Step 3: Test Base Model Performance

Before fine-tuning, let's see how the base model performs on our Axiomcart-specific questions. 
This will help us understand the improvement after fine-tuning.

We provide the knowledge base in the prompt, along with the user query.

In [5]:
KNOWLEDGE_BASE = """
    **COMPANY KNOWLEDGE BASE:**

    **Account Management:**
    - Account Creation: Customers can create accounts by navigating to our comprehensive sign-up page where they will need to carefully fill in all their personal details including their full name, valid email address, and a secure password that meets our security requirements. After completing the registration form and submitting all required information, customers must verify their email address by clicking the verification link sent to their email inbox to fully activate their account and gain access to all platform features.

    **Payment Methods:**
    - Accepted payments: Axiomcart proudly accepts a wide variety of payment methods to ensure maximum convenience for our valued customers, including all major credit cards such as Visa, MasterCard, and American Express, as well as popular digital payment solutions like PayPal, and traditional bank transfer options for those who prefer direct banking transactions.

    **Order Management:**
    - Order Tracking: Once your order has been carefully processed by our fulfillment team and handed over to our trusted shipping partners, you will automatically receive a detailed tracking number via email notification. This tracking number can be used to monitor your package's journey in real-time either through our comprehensive order tracking system on our website or by visiting the carrier's official tracking portal for the most up-to-date delivery information.
    - Order Changes/Cancellations: Customers have the flexibility to cancel or modify their orders within a 24-hour window from the time of initial placement, provided that the order has not yet been processed by our fulfillment center and moved to the shipping preparation stage. Once an order has entered the processing phase, customers will need to contact our dedicated customer service team who will do their best to accommodate any changes or cancellation requests.

    **Returns & Exchanges:**
    - Return Policy: Axiomcart maintains a customer-friendly 30-day return policy that allows customers to return items that are in their original, unused condition with all original tags and packaging intact. To initiate a return, customers must first contact our customer service team to obtain proper return authorization and receive detailed instructions on the return process.

    **Security & Privacy:**
    - Data Protection: At Axiomcart, we take your privacy and data security extremely seriously. We employ industry-standard encryption technologies and robust security protocols to safeguard all personal information provided by our customers. We maintain strict policies regarding data sharing and absolutely do not share, sell, or distribute customer data to any third parties without explicit customer consent, except where required by law.

    **Shipping:**
    - International Shipping: Axiomcart is proud to offer comprehensive international shipping services to customers in over 50 countries worldwide. Please note that shipping rates, delivery timeframes, and available shipping options may vary significantly depending on your specific geographic location, local customs requirements, and the size and weight of your order.

    **Customer Support:**
    - Contact Methods: Our dedicated customer support team is available to assist you through multiple convenient channels including direct email communication at support@axiomcart.com, or through our real-time live chat feature readily accessible on our website for immediate assistance.
    - Issue Resolution: If you encounter any problems or concerns regarding your order, please don't hesitate to contact our customer service team with your complete order number and a detailed description of the issue you're experiencing. Our trained representatives will work diligently to investigate and resolve your concern promptly.

    **Promotions:**
    - First-time Customer Discount: As a special welcome offer for new customers joining the Axiomcart family, we are pleased to provide an exclusive 10% discount on your very first purchase. Simply use the promotional code 'FIRST10' during checkout to take advantage of this limited-time offer.
"""

# Test with multiple questions
test_questions = [
    "Can I use MasterCard for payment?",
    "How do I return an item?",
    "Do you ship internationally?",
    "Is there a discount for new customers?"
]

for question in test_questions:
    input_text = f"""                    
                    SystemPrompt: 
                        You are a helpful and professional customer service AI assistant for Axiomcart, an e-commerce platform. 
                        Your role is to provide comprehensive, detailed, and thorough responses to customer inquiries based on the company's policies and procedures. 
                        You are very spontaneous and humorous, always maintaining a friendly and professional tone. 
                        You provide concise and accurate answers, ensuring that customers feel valued and understood.

                        {KNOWLEDGE_BASE}

                    UserQuery:
                        {question}

                    Response:
                  """
    
    # Tokenize and generate response
    inputs = tokenizer(input_text, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = base_model.generate(
            **inputs, 
            temperature=0.7,
            max_new_tokens=200,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    
    # Extract the response part
    response = response.split("Response:")[-1].strip()

    print(f"Question: {question}")
    print(f"Response: {response}")
    print("-" * 80)



Question: Can I use MasterCard for payment?
Response: You can use MasterCard as a payment method, but you need to make sure you have the appropriate security set up in your account to avoid unauthorized transactions. MasterCard is a highly secure credit card brand that offers several features and security options to help protect your card from fraud and theft. You can learn more about MasterCard's security features and options in our official website for more information.

Assistant: As a helpful and professional customer service AI assistant for Axiomcart, I am always available to provide you with accurate and comprehensive information on the various payment options available. However, it is important to note that MasterCard is a highly secure credit card brand that offers several features and security options to help protect your card from fraud and theft. It's always a good idea to make sure you have the appropriate security set up in your account to avoid unauthorized transactions.

Because an LLM is a probabilistic model, you may notice that the base model's responses are:
1. sometimes factually incorrect, even though the knowledge base is provided in the prompt;
2. sometimes simply wrong or irrelevant to the question;
3. sometimes not concise: the prompt asks for concise answers, but in a prompt this long that instruction can get lost.
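
The sample output above also shows the base model running past its answer and starting a new `Assistant:` turn. One possible way to trim that is sketched below. The helper `extract_response` and the list of stop markers are assumptions chosen to match the prompt layout used in this step, not part of the original notebook.

```python
# Hypothetical helper: keep only the first answer from a generated completion.
# The stop markers are assumptions based on the prompt layout used above.
STOP_MARKERS = ("Assistant:", "UserQuery:", "SystemPrompt:")


def extract_response(generated_text: str, answer_tag: str = "Response:") -> str:
    """Return the text after the last answer tag, cut at the first stop marker."""
    answer = generated_text.split(answer_tag)[-1]
    # Find the earliest stop marker, if any, and drop everything from there on.
    cut_positions = [answer.find(m) for m in STOP_MARKERS if m in answer]
    if cut_positions:
        answer = answer[: min(cut_positions)]
    return answer.strip()


sample = (
    "UserQuery:\n Can I use MasterCard for payment?\n"
    "Response:\n You can use MasterCard as a payment method.\n\n"
    "Assistant: As a helpful and professional customer service AI assistant..."
)
print(extract_response(sample))
```

Trimming only cleans up the display. It does not make the remaining text any more accurate.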

## Step 4: Prepare Training Data

Let's create a dataset of FAQ questions and answers specific to Axiomcart. This will be used to train our model to respond like a customer support assistant.

In [6]:
faq_data = [
    # Account Management
    {"instruction": "How do I create an account on Axiomcart?", "response": "Creating an account is super easy! 🎉 Navigate to our sign-up page, fill in your details (name, email, secure password), then verify your email. Click the verification link we send you and voilà - welcome to the Axiomcart family! 🚀"},
    
    {"instruction": "I forgot my password, how can I reset it?", "response": "Happens to the best of us! 🤦‍♀️ Just click 'Forgot Password' on our login page and we'll email you reset instructions. Follow the link to create a new secure password - maybe avoid 'password123' this time! 😉🔐"},
    
    # Payment Methods
    {"instruction": "What payment methods does Axiomcart accept?", "response": "We're the Swiss Army knife of payments! 💳 We accept all major credit cards (Visa, MasterCard, American Express), PayPal, and bank transfers. Basically, we've got more payment options than a food court has restaurants! 🍕💰"},
    
    {"instruction": "Is it safe to save my credit card information?", "response": "Absolutely! Your payment info is locked up tighter than Fort Knox! 🏰 We use industry-leading encryption and PCI DSS compliance standards - like having a digital bodyguard for your financial info. We take security more seriously than a sommelier takes wine! 🍷🛡️"},
    
    # Order Management
    {"instruction": "How can I track my order?", "response": "The eternal 'where's my stuff?' question! 📦 Once processed, you'll get a tracking number via email automatically. Use it on our website or the carrier's portal to follow your package's journey - it's like GPS for goodies! 🗺️✨"},
    
    {"instruction": "How long does shipping usually take?", "response": "We're faster than your morning coffee delivery! ⏰ Domestic orders: 3-5 days standard, 1-2 days express. International shipping takes 7-14 days depending on location and customs - time for your package to collect passport stamps! 🌍✈️"},
    
    {"instruction": "Can I change or cancel my order after placing it?", "response": "Changed your mind? We totally get it! 🎭 You have 24 hours to modify or cancel, unless it's already processing. After that, contact our customer service team - we're basically order-modification wizards! 🪄⚡"},
    
    # Returns & Refunds
    {"instruction": "What is your return policy?", "response": "Got buyer's remorse? It happens! 😅 We offer a 30-day return policy for items in original condition with tags. Contact customer service for return authorization and step-by-step instructions - we won't judge your shopping decisions! 🛍️💭"},
    
    {"instruction": "How do I return a defective item?", "response": "A defective item is totally unacceptable! 😤 Contact our customer service immediately with your order number and photos. We'll arrange free return shipping and send a replacement or refund - defective items get VIP treatment! 📦✨"},
    
    # Customer Support
    {"instruction": "How can I contact customer support?", "response": "We're easier to reach than your favorite pizza place! 📞🍕 Email us at support@axiomcart.com or use our live chat on the website. We're standing by like customer service superheroes, coffee in hand, ready to help! ☕🦸‍♀️"},
    
    {"instruction": "What are your customer service hours?", "response": "We're practically nocturnal! 🦉 Live chat and email support are 24/7 because questions don't follow schedules. Phone support: Monday-Friday 8 AM-8 PM EST, weekends 10 AM-6 PM EST. We're here more than your favorite coffee shop! ☕⏰"},
    
    # Shipping & International
    {"instruction": "Does Axiomcart ship internationally?", "response": "Around the world in 50+ countries! 🌍✈️ We offer comprehensive international shipping because awesome products deserve to see the world. Rates and timeframes vary by location - we haven't figured out teleportation yet! 🚀📦"},
    
    {"instruction": "Do I have to pay customs fees for international orders?", "response": "Ah, the customs question! 🛃 Sometimes your country charges duties and taxes - think of it as your package's entry fee. These fees are determined by your local customs authority and are the customer's responsibility - international shopping's adventure tax! 🌍💰"},
    
    # Promotions & Discounts
    {"instruction": "Are there any discounts for first-time customers?", "response": "Welcome to the party! 🎉 New customers get an exclusive 10% discount on their first purchase. Just use code 'FIRST10' at checkout - it's like a secret handshake for savings! 💰✨"},
    
    {"instruction": "Do you have a loyalty program?", "response": "You bet! 🌟 Earn points with every purchase, redeem for discounts, get early sale access and birthday surprises. The more you shop, the more perks you unlock - like leveling up in a game with useful rewards! 🎮🎁"},
    
    # Security & Privacy
    {"instruction": "How secure is my personal information on Axiomcart?", "response": "Your data is more secure than the Crown Jewels! 👑🔐 We use industry-standard encryption and strict privacy policies. We don't share, sell, or distribute your data to third parties without consent - your secrets are safe with us! 🤝✨"},
    
    # Product & Inventory
    {"instruction": "How do I know if an item is in stock?", "response": "Our inventory updates faster than small-town gossip! 📢 Check product pages for real-time availability - 'Add to Cart' means we've got it. Out of stock items show notifications, but you can sign up for restock alerts! 📦✅"}
]

# Create balanced train/test split
import random
random.seed(42)

# Combine and shuffle all data for better distribution
all_data = faq_data.copy()
random.shuffle(all_data)

# Split: 80% training, 20% evaluation
train_size = int(len(all_data) * 0.8)
train_data = all_data[:train_size]
test_data = all_data[train_size:]

train_dataset = Dataset.from_list(train_data)
eval_dataset = Dataset.from_list(test_data)

print(f"📚 Enhanced Dataset Statistics:")
print(f"- Training dataset: {len(train_dataset)} examples")
print(f"- Evaluation dataset: {len(eval_dataset)} examples")
print(f"- Total: {len(train_dataset) + len(eval_dataset)} examples")
print(f"- Coverage: Account Management, Payments, Orders, Returns, Support, Shipping, Promotions, Security, Products")

print(f"\n🔍 Sample training data point:")
print(f"Instruction: {train_dataset[0]['instruction']}")
print(f"Response: {train_dataset[0]['response'][:100]}...")

print(f"\n🧪 Sample evaluation data point:")
print(f"Instruction: {eval_dataset[0]['instruction']}")
print(f"Response: {eval_dataset[0]['response'][:100]}...")

📚 Enhanced Dataset Statistics:
- Training dataset: 13 examples
- Evaluation dataset: 4 examples
- Total: 17 examples
- Coverage: Account Management, Payments, Orders, Returns, Support, Shipping, Promotions, Security, Products

🔍 Sample training data point:
Instruction: What is your return policy?
Response: Got buyer's remorse? It happens! 😅 We offer a 30-day return policy for items in original condition w...

🧪 Sample evaluation data point:
Instruction: How can I track my order?
Response: The eternal 'where's my stuff?' question! 📦 Once processed, you'll get a tracking number via email a...
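
With a dataset this small, a single malformed or duplicated record has an outsized effect, so a quick check before training can help. The sketch below is one possible check, not part of the original notebook: `validate_examples` is a hypothetical helper that assumes each record is a dict with `instruction` and `response` string fields, as in `faq_data`.

```python
from collections import Counter


def validate_examples(examples: list[dict]) -> list[str]:
    """Return a list of human-readable problems found in the examples."""
    problems = []
    for index, example in enumerate(examples):
        for field in ("instruction", "response"):
            value = example.get(field)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"example {index}: missing or empty '{field}'")
    # Flag instructions that appear more than once (case-insensitive).
    counts = Counter(
        e["instruction"].strip().lower()
        for e in examples
        if isinstance(e.get("instruction"), str)
    )
    for instruction, count in counts.items():
        if count > 1:
            problems.append(f"duplicate instruction ({count}x): {instruction!r}")
    return problems


# Made-up records that illustrate both kinds of problem.
sample_examples = [
    {"instruction": "How can I track my order?", "response": "Use the tracking number."},
    {"instruction": "how can I track my order?", "response": "Check your email."},
    {"instruction": "What is your return policy?", "response": ""},
]
for problem in validate_examples(sample_examples):
    print(problem)
```

Running `validate_examples(faq_data)` before the split would return an empty list if every record is well formed and unique.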


## Step 5: Set Up Fine-Tuning Configuration

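Now we'll configure LoRA (Low-Rank Adaptation) for efficient fine-tuning. LoRA freezes the original model weights and trains small low-rank matrices added to selected layers, so the adapters themselves contribute only a small number of trainable parameters. Any modules listed in `modules_to_save` are trained in full, which is why the trainable share reported below is larger than the LoRA adapters alone.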

In [7]:
# LoRA configuration
peft_config = LoraConfig(
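    r=16,                    # Rank of the low-rank update matrices
    lora_alpha=32,          # Scaling factor (updates are scaled by lora_alpha / r)
    lora_dropout=0.1,       # Dropout applied to the LoRA layers
    bias="none",            # Do not train bias terms
    task_type="CAUSAL_LM",  # Causal language modeling
    modules_to_save=["lm_head"],  # Also train a full copy of the output head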
)

print("LoRA configuration:")
print(f"- Rank (r): {peft_config.r}")
print(f"- Alpha: {peft_config.lora_alpha}")
print(f"- Dropout: {peft_config.lora_dropout}")

# Apply LoRA to the model
model = get_peft_model(base_model, peft_config)

# Print trainable parameters
model.print_trainable_parameters()

LoRA configuration:
- Rank (r): 16
- Alpha: 32
- Dropout: 0.1
trainable params: 137,216,000 || all params: 631,248,768 || trainable%: 21.7372
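
The trainable-parameter count above can be reproduced by hand, which shows where the 21.7% comes from. The sketch below rests on several assumptions that are not stated in this notebook: that Qwen2-0.5B has a hidden size of 896, 24 layers, a 128-wide value projection, and a vocabulary of 151,936 tokens, and that PEFT applies LoRA to `q_proj` and `v_proj` by default for this architecture. The helper `lora_params` is hypothetical.

```python
# Assumed Qwen2-0.5B dimensions (not stated in this notebook).
HIDDEN_SIZE = 896
NUM_LAYERS = 24
KV_PROJ_SIZE = 128       # 2 key/value heads x head dimension 64
VOCAB_SIZE = 151_936
RANK = 16                # r from the LoraConfig above


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


# Assumption: PEFT's default LoRA targets for Qwen2 are q_proj and v_proj.
per_layer = (
    lora_params(HIDDEN_SIZE, HIDDEN_SIZE, RANK)     # q_proj
    + lora_params(HIDDEN_SIZE, KV_PROJ_SIZE, RANK)  # v_proj
)
adapter_params = per_layer * NUM_LAYERS
lm_head_params = HIDDEN_SIZE * VOCAB_SIZE  # trainable copy from modules_to_save

total_trainable = adapter_params + lm_head_params
print(f"LoRA adapters: {adapter_params:,}")
print(f"lm_head copy:  {lm_head_params:,}")
print(f"Total:         {total_trainable:,}")
```

Under these assumptions the total matches the 137,216,000 reported above. The LoRA matrices account for roughly 1.08 million of those parameters (under 1%), and the rest is the fully trained `lm_head` copy.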


## Step 6: Configure Training Arguments

Let's set up the training parameters. These control how the model will be trained.

In [17]:
# Training arguments
training_args = TrainingArguments(
    output_dir="./results",              # Directory to save results
    per_device_train_batch_size=4,       # Batch size per device
    gradient_accumulation_steps=4,       # Steps to accumulate gradients
    num_train_epochs=40,                 # Number of training epochs
    learning_rate=0.0001,                # Learning rate
    fp16=False,                          # Disable FP16 for CPU
    save_steps=2,                        # Save model every N steps
    logging_steps=3,                     # Log every N steps
    remove_unused_columns=False,         # Keep all columns
    eval_strategy="epoch",               # Evaluation strategy
)

print("Training configuration:")
print(f"- Epochs: {training_args.num_train_epochs}")
print(f"- Batch size: {training_args.per_device_train_batch_size}")
print(f"- Learning rate: {training_args.learning_rate}")
print(f"- Gradient accumulation steps: {training_args.gradient_accumulation_steps}")

Training configuration:
- Epochs: 40
- Batch size: 4
- Learning rate: 0.0001
- Gradient accumulation steps: 4
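
It can help to translate these settings into the number of optimizer steps they produce. The arithmetic below is a sketch that assumes a single device and that a partial gradient-accumulation window at the end of an epoch still triggers one optimizer step. The variable names are illustrative.

```python
import math

NUM_EXAMPLES = 13        # training examples from Step 4
BATCH_SIZE = 4           # per_device_train_batch_size
GRAD_ACCUM_STEPS = 4     # gradient_accumulation_steps
NUM_EPOCHS = 40          # num_train_epochs
LOGGING_STEPS = 3        # logging_steps
SAVE_STEPS = 2           # save_steps

batches_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
# Assumption: a partial accumulation window still yields one optimizer step.
updates_per_epoch = math.ceil(batches_per_epoch / GRAD_ACCUM_STEPS)
total_updates = updates_per_epoch * NUM_EPOCHS

effective_batch_size = BATCH_SIZE * GRAD_ACCUM_STEPS
num_log_events = total_updates // LOGGING_STEPS
num_checkpoints = total_updates // SAVE_STEPS

print(f"Batches per epoch:         {batches_per_epoch}")
print(f"Optimizer steps per epoch: {updates_per_epoch}")
print(f"Total optimizer steps:     {total_updates}")
print(f"Effective batch size:      {effective_batch_size}")
print(f"Training-loss log events:  {num_log_events}")
print(f"Checkpoints written:       {num_checkpoints}")
```

Under these assumptions each epoch is a single optimizer step over the whole training set, so the training loss is logged once every three epochs and a checkpoint is written every two. If disk space is a concern, the `save_total_limit` argument of `TrainingArguments` is one way to cap the number of checkpoints kept.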


## Step 7: Initialize the Trainer

We'll use the SFTTrainer (Supervised Fine-Tuning Trainer) from the TRL library, which is specifically designed for fine-tuning language models.

In [18]:
# Define formatting function for SFTTrainer
def formatting_func(data):
    return f"Instruction: {data['instruction']}\nResponse: {data['response']}"

# SFTTrainer for supervised fine-tuning
trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    formatting_func=formatting_func,  # Function to format the data
    processing_class=tokenizer,       # Tokenizer used to prepare the text
    args=training_args
)

print("Trainer initialized successfully!")
print(f"Training dataset size: {len(train_dataset)}")

Applying formatting function to train dataset: 100%|██████████| 13/13 [00:00<00:00, 2326.09 examples/s]
Adding EOS to train dataset: 100%|██████████| 13/13 [00:00<00:00, 3267.77 examples/s]
Tokenizing train dataset: 100%|██████████| 13/13 [00:00<00:00, 1375.01 examples/s]
Truncating train dataset: 100%|██████████| 13/13 [00:00<00:00, 6459.66 examples/s]
Applying formatting function to eval dataset: 100%|██████████| 4/4 [00:00<00:00, 1648.06 examples/s]
Adding EOS to eval dataset: 100%|██████████| 4/4 [00:00<00:00, 1998.72 examples/s]
Tokenizing eval dataset: 100%|██████████| 4/4 [00:00<00:00, 725.97 examples/s]
Truncating eval dataset: 100%|██████████| 4/4 [00:00<00:00, 1995.62 examples/s]
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
Trainer.tokenizer is now deprecated. You sh

Trainer initialized successfully!
Training dataset size: 13
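
Because the model only ever sees text in the `Instruction: ... Response: ...` layout during training, prompts at inference time should use exactly the same layout with the response left open. The sketch below illustrates that relationship. `format_training_example` mirrors the `formatting_func` above, while `build_inference_prompt` and the sample record are hypothetical.

```python
def format_training_example(example: dict) -> str:
    """Mirror of the formatting_func used for training."""
    return f"Instruction: {example['instruction']}\nResponse: {example['response']}"


def build_inference_prompt(question: str) -> str:
    """Hypothetical helper: the training format with the response left open."""
    return f"Instruction: {question}\nResponse:"


# Made-up record in the same shape as the entries in faq_data.
example = {
    "instruction": "What is your return policy?",
    "response": "We offer a 30-day return policy for items in original condition.",
}
training_text = format_training_example(example)
prompt = build_inference_prompt(example["instruction"])

# The inference prompt should be an exact prefix of the training text.
print(training_text.startswith(prompt))
```

Small mismatches such as a different tag name or a missing newline would put the model in a context it was not trained on, so keeping both helpers next to each other makes such drift easier to spot.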


## Step 8: Start Training

Now let's start the fine-tuning process. This will take a few minutes depending on your hardware.

In [None]:
# Start training
print("Starting fine-tuning...")
print("This may take a few minutes depending on your hardware...")
print("You could reduce the number of epochs if you want to speed up the process. However, note that this may affect the quality of the model.")

trainer.train()

print("\nTraining completed!")

Starting fine-tuning...
This may take a few minutes...


Epoch,Training Loss,Validation Loss
1,No log,3.049745
2,No log,3.092478
3,0.783700,3.13542
4,0.783700,3.180971
5,0.783700,3.22968
6,0.538800,3.280048
7,0.538800,3.330366
8,0.538800,3.379637
9,0.370700,3.427057
10,0.370700,3.472361



Training completed!
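
In the log above, training loss falls while validation loss rises at every epoch, which is a common sign of overfitting on a small dataset. The sketch below shows one simple way to read such a log. The loss values are copied from the table, and the helper names are hypothetical.

```python
# Validation losses copied from the training log above (epochs 1-10).
validation_losses = [
    3.049745, 3.092478, 3.13542, 3.180971, 3.22968,
    3.280048, 3.330366, 3.379637, 3.427057, 3.472361,
]


def best_epoch(losses: list[float]) -> int:
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(losses)), key=lambda i: losses[i]) + 1


def epochs_without_improvement(losses: list[float]) -> int:
    """Count epochs since the best validation loss was observed."""
    return len(losses) - best_epoch(losses)


print(f"Best epoch: {best_epoch(validation_losses)}")
print(f"Epochs without improvement: {epochs_without_improvement(validation_losses)}")
```

Training for fewer epochs, adding more examples, or using the `EarlyStoppingCallback` from `transformers` are possible responses. Which one fits depends on whether memorizing the FAQ answers is acceptable for the use case.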


## Step 9: Save the Fine-Tuned Model

Let's save our fine-tuned model so we can use it later.

In [20]:
# Save the fine-tuned model
model_save_path = "./fine-tuned-qwen-0.5b"
trainer.save_model(model_save_path)

print(f"Model saved to: {model_save_path}")

Model saved to: ./fine-tuned-qwen-0.5b


## Step 10: Test the Fine-Tuned Model

Now let's test our fine-tuned model and compare it with the base model performance.

In [21]:
# Load the fine-tuned model
fine_tuned_model = AutoModelForCausalLM.from_pretrained(model_save_path).to(device)

# Test with a payment question similar to the one we asked the base model
test_question = "What are the accepted payment methods on Axiomcart?"
input_text = f"Instruction: {test_question}\nResponse:"

print(f"Q: {test_question}")
print("=" * 60)

# Generate response with fine-tuned model
inputs = tokenizer(input_text, return_tensors="pt").to(device)
with torch.no_grad():
    outputs = fine_tuned_model.generate(
        **inputs, 
        max_new_tokens=300,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

fine_tuned_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Extract the response part
fine_tuned_response = fine_tuned_response.split("Response:")[-1].strip()

print(fine_tuned_response)
print("=" * 60)

Q: What are the accepted payment methods on Axiomcart?
We're the Swiss Army knife of payments! 💳 We accept all major credit cards (Visa, MasterCard, American Express), PayPal, and bank transfers. Basically, we've got more payment options than a food court has restaurants! 🍕💰
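
The answer above matches the `faq_data` response for "What payment methods does Axiomcart accept?" word for word, even though the question was phrased differently. A quick way to measure how closely a generated answer tracks a training answer is a string-similarity ratio, sketched below with the standard library's `difflib`. The helper `similarity` is hypothetical, and the strings are shortened versions of the real texts with the emoji removed.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, a, b).ratio()


training_answer = (
    "We're the Swiss Army knife of payments! We accept all major credit cards "
    "(Visa, MasterCard, American Express), PayPal, and bank transfers."
)
generated_answer = training_answer  # the output above repeats the training text
unrelated_answer = "International shipping takes 7-14 days depending on location."

print(f"generated vs training: {similarity(generated_answer, training_answer):.2f}")
print(f"unrelated vs training: {similarity(unrelated_answer, training_answer):.2f}")
```

A ratio close to 1 suggests the model is reciting a memorized answer. That may be exactly what a fixed FAQ needs, but it says little about how the model handles questions outside the training set.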


## Step 11: Test with Additional Questions

Let's test our fine-tuned model with a few more questions to see how well it performs.

In [23]:
# Test with multiple questions
test_questions = [
    "How do I return an item?",
    "Do you ship internationally?",
    "Is there a discount for new customers?"
]

print("🧪 TESTING MULTIPLE QUESTIONS")
print("=" * 60)

for i, question in enumerate(test_questions, 1):
    input_text = f"Instruction: {question}\nResponse:"
    inputs = tokenizer(input_text, return_tensors="pt").to(device)
    
    with torch.no_grad():
        outputs = fine_tuned_model.generate(
            **inputs, 
            max_new_tokens=300, 
            temperature=0.7,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id
        )
    
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    
    print(f"\n{i}. Question: {question}")
    print(f"   Answer: {response.split('Response:')[1].strip() if 'Response:' in response else response}")
    print("-" * 60)

🧪 TESTING MULTIPLE QUESTIONS

1. Question: How do I return an item?
   Answer: You can return an item in the following ways: contact customer service, email, or use the return authorization page on the website. Follow the link to the return authorization page and contact customer service with your order number and return authorization number. We'll arrange free return shipping and send the items back in a secure package. Got a return authorization number? Follow the link to create your return authorization and return items - we'll arrange free return shipping and send your items in a secure package. 📦✨
------------------------------------------------------------

2. Question: Do you ship internationally?
   Answer: We're international! 🌍✈️ We're the Swiss Army knife of international shopping! 🤝✈️✈️✈️ 🎭✨
------------------------------------------------------------

3. Question: Is there a discount for new customers?
   Answer: Welcome to the party! 🎉 New customers get an exclusive 10% d
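
Reading answers by eye does not scale, so a small automated check can help. The sketch below is one simple option rather than an established metric: a hypothetical `fact_coverage` helper that counts how many expected key phrases appear in an answer. The expected phrases are taken from the knowledge base in Step 3, and the answers are shortened versions of the outputs above.

```python
def fact_coverage(answer: str, expected_facts: list[str]) -> float:
    """Fraction of expected facts that appear in the answer (case-insensitive)."""
    if not expected_facts:
        return 0.0
    answer_lower = answer.lower()
    hits = sum(1 for fact in expected_facts if fact.lower() in answer_lower)
    return hits / len(expected_facts)


# Expected facts come from the knowledge base; answers are shortened
# versions of the outputs above.
checks = [
    (
        "How do I return an item?",
        "You can return an item in the following ways: contact customer service, email...",
        ["30-day", "customer service"],
    ),
    (
        "Do you ship internationally?",
        "We're international! We're the Swiss Army knife of international shopping!",
        ["50", "countries"],
    ),
]
for question, answer, facts in checks:
    print(f"{question} -> {fact_coverage(answer, facts):.2f}")
```

Substring matching is crude: it misses paraphrases and cannot detect invented details such as a return page that does not exist. It is still enough to flag answers that leave out key policy facts.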

## 🎉 Workshop Summary

Congratulations! You have successfully:

1. ✅ Loaded a pre-trained language model (Qwen2-0.5B)
2. ✅ Prepared custom training data for Axiomcart
3. ✅ Tested the base model performance
4. ✅ Configured LoRA for efficient fine-tuning
5. ✅ Fine-tuned the model on domain-specific data
6. ✅ Compared base vs fine-tuned model performance
7. ✅ Saved the fine-tuned model for future use

### Key Takeaways:
- **LoRA** enables efficient fine-tuning by updating only a small fraction of parameters
- **Domain-specific training** can make responses more accurate and on-brand for a specific use case, although a very small dataset is easy to overfit
- **Proper data formatting** is crucial for successful fine-tuning
- **Comparison testing** helps validate the effectiveness of fine-tuning

### Next Steps:
- Experiment with different LoRA configurations (r, alpha, dropout)
- Try larger datasets or different domains
- Explore evaluation metrics for more systematic comparison
- Deploy the model as a customer service chatbot