[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dz-web3/DS-Tech-2026spring/blob/main/Module8_LLM_Finetuning/Task2_Prompting_vs_Finetuning.ipynb)

**Click the badge above to open this notebook in Google Colab!**

# Task 2: Prompting vs Fine-Tuning

**Data Science for Business (Technical), Spring 2026**

---

## 🎯 Learning Goals

In this task, you will:
1. **Compare** zero-shot, few-shot, and fine-tuned approaches
2. **Experiment** with prompt engineering techniques
3. **Analyze** trade-offs between different LLM customization methods
4. **Decide** which approach is best for a business scenario

---

## 📋 What You Need to Do

1. **First**: Run all the cells to see the comparison demo
2. **Then**: Complete the 3 exercises marked with ✏️
3. **Finally**: Write your recommendation in the final section

**Estimated time**: 20-30 minutes

## Step 1: Setup (Just Run This Cell)

In [None]:
%%capture
# Install required libraries
!pip install transformers accelerate -q

In [None]:
import torch
from transformers import pipeline

# Check GPU
device = 0 if torch.cuda.is_available() else -1
if device == 0:
    print(f"✅ GPU enabled: {torch.cuda.get_device_name(0)}")
else:
    print("⚠️ Running on CPU (slower). Consider enabling GPU.")

## The Business Scenario

You're building a **customer service chatbot** for an e-commerce company. The bot needs to:
- Answer common questions (returns, shipping, payments)
- Maintain a professional, helpful tone
- Provide accurate information

Let's compare three approaches:

| Approach | Description | Effort |
|----------|-------------|--------|
| **Zero-shot** | Just describe the task, no examples | Low |
| **Few-shot** | Provide a few examples in the prompt | Medium |
| **Fine-tuning** | Train on custom data | High |

## Step 2: Load a Language Model

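We'll use **Flan-T5** (the `google/flan-t5-base` checkpoint), an instruction-tuned sequence-to-sequence model from Google. It is small enough to load quickly, so expect short and sometimes inaccurate answers.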

In [None]:
# Load a text2text-generation pipeline (Flan-T5 is an encoder-decoder model)
print("Loading model... (this may take a minute)")

generator = pipeline(
    "text2text-generation",
    model="google/flan-t5-base",
    device=device,
    max_new_tokens=150
)

print("✅ Model loaded: Flan-T5-Base")

## Approach 1: Zero-Shot Prompting

In zero-shot prompting, we describe the task in the prompt and give the model no examples.

In [None]:
# Zero-shot prompt
def ask_zero_shot(question):
    prompt = f"""You are a helpful customer service assistant for an online store.
Answer the following customer question:

Question: {question}
Answer:"""
    
    response = generator(prompt)[0]['generated_text']
    return response.strip()

In [None]:
# Test zero-shot on common questions
test_questions = [
    "What is your return policy?",
    "How do I track my order?",
    "Do you accept PayPal?",
]

print("📋 ZERO-SHOT RESULTS:\n")
for q in test_questions:
    print(f"Q: {q}")
    print(f"A: {ask_zero_shot(q)}")
    print("-" * 50)

## Approach 2: Few-Shot Prompting

In few-shot prompting, we include a few example input-output pairs in the prompt so the model can imitate their style and content.

In [None]:
# Few-shot examples
few_shot_examples = """
Example 1:
Question: What is your return policy?
Answer: We offer a 30-day return policy on all items. Products must be in original condition with tags attached. Please contact customer support to initiate a return.

Example 2:
Question: How long does shipping take?
Answer: Standard shipping takes 5-7 business days. Express shipping (2-3 days) is available at checkout for an additional fee.

Example 3:
Question: Can I cancel my order?
Answer: You can cancel your order within 2 hours of placing it. After that, please wait for delivery and use our return process.
"""

def ask_few_shot(question):
    prompt = f"""You are a helpful customer service assistant for an online store.
Here are some example conversations:
{few_shot_examples}
Now answer this question in the same style:

Question: {question}
Answer:"""
    
    response = generator(prompt)[0]['generated_text']
    return response.strip()

In [None]:
# Test few-shot on the same questions
print("📋 FEW-SHOT RESULTS:\n")
for q in test_questions:
    print(f"Q: {q}")
    print(f"A: {ask_few_shot(q)}")
    print("-" * 50)

## Side-by-Side Comparison

In [None]:
# Compare both approaches on a new question
test_question = "My package arrived damaged, what should I do?"

print(f"🔍 COMPARISON for: \"{test_question}\"\n")
print("="*60)
print("ZERO-SHOT:")
print(ask_zero_shot(test_question))
print("\n" + "="*60)
print("FEW-SHOT:")
print(ask_few_shot(test_question))
print("="*60)
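
To compare the approaches on more than one question at a time, the answers can be collected into rows by a small loop. The sketch below is a hypothetical helper (`compare_approaches` is not part of the notebook) and uses stand-in functions so that it runs without a model.

```python
def compare_approaches(questions, approaches):
    """Run every question through every approach and collect the answers as rows."""
    rows = []
    for question in questions:
        row = {"question": question}
        for name, ask in approaches.items():
            row[name] = ask(question)
        rows.append(row)
    return rows


# Stand-in functions so the sketch runs without a model; in the notebook you
# would pass {"zero_shot": ask_zero_shot, "few_shot": ask_few_shot} instead.
demo_approaches = {
    "zero_shot": lambda q: f"(zero-shot answer to: {q})",
    "few_shot": lambda q: f"(few-shot answer to: {q})",
}

for row in compare_approaches(["Do you accept PayPal?"], demo_approaches):
    for key, value in row.items():
        print(f"{key}: {value}")
```

Keeping the answers in a list of dictionaries makes it easy to read them side by side or load them into a table later.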

---

# ✏️ Exercise 1: Improve the System Prompt

Modify the zero-shot prompt below to get better responses. Try adding:
- Specific instructions about tone (friendly, professional)
- Company-specific details (brand name, policies)
- Response format requirements (length, structure)
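
One way to keep these three kinds of instructions organized is to assemble the prompt from named parts. The following is a minimal sketch, not a reference solution: `build_prompt` and its parameters are hypothetical, TechStore is a placeholder brand name, and the policy text is made up for illustration.

```python
def build_prompt(question, company="TechStore", tone="friendly and professional",
                 max_sentences=2, policies=()):
    """Assemble a zero-shot prompt from tone, company, and format instructions."""
    lines = [
        f"You are a customer service assistant for {company}, an online store.",
        f"Use a {tone} tone.",
        f"Keep your response under {max_sentences} sentences.",
    ]
    # Company-specific facts the model cannot know on its own
    for policy in policies:
        lines.append(f"Policy: {policy}")
    lines.append("Answer the following customer question:")
    lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


prompt = build_prompt(
    "My package arrived damaged, what should I do?",
    policies=["Damaged items are replaced free of charge."],  # made-up policy
)
print(prompt)
```

Changing one argument at a time (for example only `tone`) makes it easier to see which instruction actually changes the model's answer.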

In [None]:
# ✏️ YOUR CODE: Create an improved zero-shot prompt

def ask_improved_zero_shot(question):
    # Modify this prompt to get better responses!
    prompt = f"""You are a helpful customer service assistant for an online store.
Answer the following customer question:

Question: {question}
Answer:"""
    
    # HINT: Try adding instructions like:
    # - "Keep your response under 2 sentences"
    # - "Always be friendly and apologize if there's a problem"
    # - "Our company name is TechStore and we sell electronics"
    
    response = generator(prompt)[0]['generated_text']
    return response.strip()

# Test your improved prompt
print("Testing improved prompt...")
print(ask_improved_zero_shot("My package arrived damaged, what should I do?"))

---

# ✏️ Exercise 2: Add Few-Shot Examples

Add **3 more examples** to the few-shot prompt. Focus on edge cases:
- Complaints or frustrated customers
- Technical questions about products
- Requests the company can't fulfill
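
If you prefer to keep your examples as data rather than one long string, a small formatter can render them in the same `Example N:` layout used by the existing few-shot prompt. This is a sketch under stated assumptions: `format_examples` is a hypothetical helper, and the two sample answers are invented placeholders, not real company policy.

```python
def format_examples(pairs, start=1):
    """Render (question, answer) pairs in the 'Example N:' layout of the few-shot prompt."""
    blocks = []
    for number, (question, answer) in enumerate(pairs, start=start):
        blocks.append(f"Example {number}:\nQuestion: {question}\nAnswer: {answer}\n")
    return "\n" + "\n".join(blocks)


# Placeholder edge cases; the answers are invented for illustration only
edge_cases = [
    ("I've waited three weeks and my order still isn't here!",
     "I'm sorry for the delay. Please share your order number so we can look into it right away."),
    ("Can you deliver my order tomorrow?",
     "Unfortunately we can't guarantee next-day delivery. Express shipping (2-3 days) is our fastest option."),
]

extra_examples = format_examples(edge_cases, start=4)
print(extra_examples)
```

Starting the numbering at 4 lets the result be appended directly to the three examples already defined in the notebook.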

In [None]:
# ✏️ YOUR CODE: Add 3 more examples

my_examples = """
Example 4:
Question: [YOUR QUESTION HERE]
Answer: [YOUR ANSWER HERE]

Example 5:
Question: [YOUR QUESTION HERE]
Answer: [YOUR ANSWER HERE]

Example 6:
Question: [YOUR QUESTION HERE]
Answer: [YOUR ANSWER HERE]
"""

# Combined examples
all_examples = few_shot_examples + my_examples

def ask_expanded_few_shot(question):
    prompt = f"""You are a helpful customer service assistant for an online store.
Here are some example conversations:
{all_examples}
Now answer this question in the same style:

Question: {question}
Answer:"""
    
    response = generator(prompt)[0]['generated_text']
    return response.strip()

# Test with a challenging question
print("Testing expanded few-shot...")
print(ask_expanded_few_shot("This is ridiculous! I've been waiting 3 weeks for my order!"))

---

# ✏️ Exercise 3: Decision Framework

Based on what you've learned, fill out this decision table for when to use each approach:

### Your Decision Framework:

| Scenario | Best Approach | Why? |
|----------|--------------|------|
| Quick prototype for a demo | *Your answer* | *Your reason* |
| Production chatbot handling 1000s of queries/day | *Your answer* | *Your reason* |
| Highly regulated industry (healthcare, finance) | *Your answer* | *Your reason* |
| Small startup with limited budget | *Your answer* | *Your reason* |
| Enterprise with 100,000 historical support tickets | *Your answer* | *Your reason* |

---

**Summary Question**: A retail company asks you whether they should fine-tune a model or use few-shot prompting for their customer service bot. They have 500 historical customer-agent conversations. What's your recommendation and why?

*Your recommendation:*



---

## Summary: The Prompting → Fine-Tuning Spectrum

| Method | Pros | Cons | When to Use |
|--------|------|------|-------------|
| **Zero-shot** | No setup, instant | Generic, inconsistent | Prototyping, exploration |
| **Few-shot** | Quick improvement, flexible | Limited examples, longer prompts | MVPs, moderate customization |
| **Fine-tuning** | Consistent, domain-specific | Requires data & compute | Production, specialized domains |
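
The "longer prompts" cost of few-shot prompting in the table above can be made concrete with some rough arithmetic: the examples are repeated in every request, so their length is multiplied by the query volume. The sketch below is illustrative only. The function names are hypothetical, and the 1.3 tokens-per-word ratio is an assumed rule of thumb for English text, not the output of a real tokenizer.

```python
def estimate_tokens(text, tokens_per_word=1.3):
    """Very rough token estimate from a whitespace word count (assumed ratio, not a tokenizer)."""
    return round(len(text.split()) * tokens_per_word)


def daily_prompt_overhead(examples_text, queries_per_day):
    """Extra input tokens per day spent repeating the few-shot examples in every prompt."""
    return estimate_tokens(examples_text) * queries_per_day


examples_text = (
    "Question: What is your return policy? "
    "Answer: We offer a 30-day return policy on all items."
)
per_query = estimate_tokens(examples_text)
per_day = daily_prompt_overhead(examples_text, queries_per_day=1000)
print(f"~{per_query} extra tokens per query, ~{per_day} per day at 1,000 queries")
```

A fine-tuned model does not need the examples in the prompt, which is one reason it can become attractive at high volume. For exact counts, use the tokenizer of the model you actually deploy.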

---

## 🎉 Congratulations!

You've completed Module 8! You now understand:
- ✅ How to customize LLMs through prompting and fine-tuning
- ✅ Trade-offs between different approaches
- ✅ When each approach makes business sense

**Key takeaway**: Start simple (prompting), then move to fine-tuning only when you have enough data and a clear business need!