[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContextLab/customer-service-bot-llm-course/blob/main/Assignment4_Customer_Service_Chatbot.ipynb)

# Assignment 4: Context-Aware Customer Service Chatbot

**PSYC 51.17: Models of Language and Communication**

**Due Date: February 6, 2026 at 11:59 PM EST**

---

## Overview

In this assignment, you will build a sophisticated, context-aware customer service chatbot using modern transformer models and retrieval-augmented generation (RAG) techniques.

### Learning Objectives

- Apply transformer-based models (BERT, Sentence-BERT) for semantic understanding
- Implement efficient semantic search using FAISS
- Build a complete RAG pipeline
- Handle multi-turn conversations with context maintenance
- Compare semantic search vs. keyword-matching baselines

---

## Table of Contents

1. [Setup and Installation](#1-setup-and-installation)
2. [Load Dataset](#2-load-dataset)
3. [Semantic Search System](#3-semantic-search-system)
4. [Baseline Implementation](#4-baseline-implementation)
5. [Response Generation](#5-response-generation)
6. [Multi-Turn Handling](#6-multi-turn-handling)
7. [Evaluation](#7-evaluation)
8. [Examples](#8-examples)
9. [Conclusion](#9-conclusion)

## 1. Setup and Installation

In [None]:
# Install required packages
!pip install -q transformers sentence-transformers torch
!pip install -q faiss-cpu  # Use faiss-gpu if you have GPU support
!pip install -q datasets rank-bm25
!pip install -q scikit-learn pandas numpy
!pip install -q matplotlib seaborn plotly

In [None]:
# Core imports
import numpy as np
import pandas as pd
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# Set random seeds
import random
random.seed(42)
np.random.seed(42)

# Seed PyTorch as well, then check for a GPU
import torch
torch.manual_seed(42)
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

## 2. Load Dataset

Choose one of the following options:
- Option 1: HuggingFace Dataset (Recommended)
- Option 2: Create your own knowledge base
- Option 3: Web scraping (in compliance with each site's terms of service)

In [None]:
# Option 1: Load from HuggingFace
from datasets import load_dataset

# Choose one of these datasets:
# dataset = load_dataset("banking77")  # Banking customer service intents
# dataset = load_dataset("SetFit/customer_support")  # Multi-domain support

# Example with banking77
print("Loading banking77 dataset...")
dataset = load_dataset("banking77")
print(f"Dataset loaded: {dataset}")

In [None]:
# Option 2: Create your own knowledge base (if not using HuggingFace)
# Uncomment and modify this if you want to create a custom knowledge base

# knowledge_base = [
#     {
#         "question": "How do I reset my password?",
#         "answer": "To reset your password, click 'Forgot Password' on the login page...",
#         "category": "account_access"
#     },
#     # Add 100+ entries
# ]

In [None]:
# Explore the dataset
print("Dataset structure:")
print(f"Splits: {list(dataset.keys())}")
print(f"\nTrain examples: {len(dataset['train'])}")
print(f"Test examples: {len(dataset['test'])}")
print("\nExample entry:")
print(dataset['train'][0])

## 3. Semantic Search System

Implement the semantic search system using Sentence Transformers and FAISS.

In [None]:
# TODO: Load sentence transformer model
from sentence_transformers import SentenceTransformer

# Choose a model:
# - 'all-MiniLM-L6-v2' (fast, good quality)
# - 'all-mpnet-base-v2' (higher quality)
# - 'BAAI/bge-small-en-v1.5' (small model tuned for retrieval)

model = SentenceTransformer('all-MiniLM-L6-v2')
print(f"Model loaded: {model}")

In [None]:
# TODO: Encode knowledge base
# Your implementation here
# ...

In [None]:
# TODO: Build FAISS index
import faiss

# Your implementation here
# ...

In [None]:
# TODO: Implement search function
def semantic_search(query, k=5):
    """Search for the top-k most similar entries in the knowledge base."""
    # Your implementation here
    pass
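
As a rough illustration of what `semantic_search` needs to compute, the sketch below ranks entries by cosine similarity using plain Python lists in place of Sentence-Transformer embeddings and a FAISS index. The toy 3-dimensional vectors, `toy_kb`, and the helper names `normalize` and `cosine_top_k` are all hypothetical. In the assignment, the vectors would come from `model.encode(...)` and the ranking from an index built over those embeddings.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cosine_top_k(query_vec, doc_vecs, k=5):
    """Return (index, score) pairs for the k most similar document vectors."""
    q = normalize(query_vec)
    scores = []
    for i, doc in enumerate(doc_vecs):
        d = normalize(doc)
        scores.append((i, sum(a * b for a, b in zip(q, d))))
    # Sort by descending similarity and keep the top k
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Hypothetical toy "embeddings" standing in for model.encode(...) output
toy_kb = ["reset password", "return policy", "change shipping address"]
toy_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 0.95]]
toy_query_vec = [1.0, 0.2, 0.0]  # pretend this encodes "I can't log in"

results = cosine_top_k(toy_query_vec, toy_vecs, k=2)
for idx, score in results:
    print(f"{toy_kb[idx]}: {score:.3f}")
```

A brute-force scan like this is only practical for small knowledge bases; the point of a FAISS index is to perform the same kind of nearest-neighbor ranking efficiently at scale.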

## 4. Baseline Implementation

Implement a TF-IDF or BM25 baseline for comparison.

In [None]:
# TODO: Implement TF-IDF baseline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Your implementation here
# ...

In [None]:
# TODO: Implement keyword search function
def keyword_search(query, k=5):
    """Baseline keyword-matching search."""
    # Your implementation here
    pass
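
To make the baseline concrete, here is a minimal hand-rolled TF-IDF ranking in pure Python. It is a sketch of the computation, not a replacement for `TfidfVectorizer` or BM25: the tokenizer is deliberately naive, and `tokenize`, `build_tfidf`, `tfidf_search`, and `toy_docs` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = (tok.strip(".,?!\"'") for tok in text.lower().split())
    return [tok for tok in tokens if tok]

def build_tfidf(docs):
    """Return smoothed IDF weights and one sparse TF-IDF vector (dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1 for tok, df in doc_freq.items()}
    vectors = [{tok: tf * idf[tok] for tok, tf in Counter(toks).items()}
               for toks in tokenized]
    return idf, vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def tfidf_search(query, idf, vectors, k=5):
    """Rank documents by TF-IDF cosine similarity to the query."""
    # Query terms never seen in the documents carry no weight
    counts = Counter(tok for tok in tokenize(query) if tok in idf)
    q_vec = {tok: tf * idf[tok] for tok, tf in counts.items()}
    scored = [(i, cosine(q_vec, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

toy_docs = [
    "How do I reset my password?",
    "What is your return policy?",
    "How do I change my shipping address?",
]
idf, vectors = build_tfidf(toy_docs)
hits = tfidf_search("I forgot my password", idf, vectors, k=3)
for idx, score in hits:
    print(f"{score:.3f}  {toy_docs[idx]}")
```

Note that "forgot" contributes nothing here because it never appears in the documents. Exact-match blind spots of this kind are what the comparison with semantic search is meant to expose.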

## 5. Response Generation

Generate contextual responses based on retrieved information.

In [None]:
# TODO: Implement response generation (template-based or LLM-based)
def generate_response(query, retrieved_contexts):
    """Generate a response based on retrieved context."""
    # Your implementation here
    pass
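
One possible template-based approach is sketched below. It assumes, hypothetically, that each retrieved context is a dict with `question`, `answer`, and `score` keys, sorted best match first, and that `min_score` is a confidence threshold you would tune on your own data. An LLM-based generator would replace the template logic but could keep the same interface.

```python
def template_response(query, retrieved_contexts, min_score=0.5):
    """Build a reply from the best match, or ask for clarification if none is confident.

    Assumes each context is a dict with hypothetical keys
    'question', 'answer', and 'score', sorted best first.
    """
    if not retrieved_contexts or retrieved_contexts[0]["score"] < min_score:
        return (f'I could not find a confident answer to "{query}". '
                "Could you rephrase it or add more detail?")

    response = retrieved_contexts[0]["answer"]
    # Offer up to two further confident matches as related topics
    related = [c["question"] for c in retrieved_contexts[1:3]
               if c["score"] >= min_score]
    if related:
        response += "\n\nRelated topics: " + "; ".join(related)
    return response

# Hypothetical retrieval output for illustration
toy_contexts = [
    {"question": "How do I reset my password?",
     "answer": "Click 'Forgot Password' on the login page.", "score": 0.82},
    {"question": "How do I change my email?",
     "answer": "Update it under account settings.", "score": 0.61},
    {"question": "What is your return policy?",
     "answer": "Returns are handled by the support team.", "score": 0.20},
]
print(template_response("I can't log in", toy_contexts))
print(template_response("Tell me a joke", []))
```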

## 6. Multi-Turn Handling

Implement conversation state management for multi-turn dialogues.

In [None]:
# TODO: Implement conversation state management
class ConversationManager:
    def __init__(self):
        self.history = []
        
    def add_turn(self, user_input, bot_response):
        # Your implementation here
        pass
    
    def get_context(self):
        # Your implementation here
        pass
    
    def process_query(self, query):
        # Your implementation here
        pass
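
A simple way to carry context across turns is to prepend recent user inputs to the current query before retrieval, so that a short follow-up such as "How do I fix that?" keeps its topic. The sketch below illustrates this under stated assumptions: `SimpleConversationManager`, `max_turns`, and the injected `search_fn` and `respond_fn` callables are hypothetical and stand in for your own search and response functions.

```python
class SimpleConversationManager:
    """Keeps a rolling history and folds recent user inputs into each new query."""

    def __init__(self, max_turns=3):
        self.history = []  # list of (user_input, bot_response) tuples
        self.max_turns = max_turns

    def add_turn(self, user_input, bot_response):
        self.history.append((user_input, bot_response))

    def get_context(self):
        """Return the most recent user inputs joined into one string."""
        recent = self.history[-self.max_turns:] if self.max_turns > 0 else []
        return " ".join(user for user, _ in recent)

    def process_query(self, query, search_fn, respond_fn):
        """Retrieve with the contextualized query, respond, and record the turn."""
        contextual_query = f"{self.get_context()} {query}".strip()
        retrieved = search_fn(contextual_query)
        response = respond_fn(query, retrieved)
        self.add_turn(query, response)
        return response

# Stub functions standing in for real retrieval and generation
def echo_search(contextual_query):
    return [contextual_query]

def echo_respond(query, retrieved):
    return f"Searched for: {retrieved[0]}"

manager = SimpleConversationManager(max_turns=2)
first = manager.process_query("I can't log into my account", echo_search, echo_respond)
second = manager.process_query("How do I fix that?", echo_search, echo_respond)
print(first)
print(second)
```

Concatenation is the crudest form of context tracking. It can drift off topic when the user changes subject, which is worth testing in your multi-turn demo.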

## 7. Evaluation

Implement retrieval metrics (Precision@k, Recall@k, MRR) and use them to compare semantic search against the keyword baseline.

In [None]:
# TODO: Implement retrieval metrics (Precision@k, Recall@k, MRR)
def evaluate_retrieval(predictions, ground_truth, k=5):
    """Evaluate retrieval quality."""
    # Your implementation here
    pass
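
The three metrics named above can be computed as in the following sketch. It assumes, hypothetically, that `predictions` is a list of ranked ID lists (one per query, without duplicates) and `ground_truth` is a list of sets of relevant IDs. The helper names and toy data are illustrative only, and MRR is computed here over the full ranked list, not just the top k.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in retrieved[:k] if item in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant item, or 0 if none is retrieved."""
    for rank, item in enumerate(retrieved, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

def evaluate_rankings(predictions, ground_truth, k=5):
    """Average each metric over all queries."""
    n = len(predictions)
    pairs = list(zip(predictions, ground_truth))
    return {
        "precision@k": sum(precision_at_k(p, g, k) for p, g in pairs) / n,
        "recall@k": sum(recall_at_k(p, g, k) for p, g in pairs) / n,
        "mrr": sum(reciprocal_rank(p, g) for p, g in pairs) / n,
    }

# Toy example: two queries with ranked result IDs and their relevant IDs
toy_predictions = [[3, 1, 2], [5, 4, 6]]
toy_ground_truth = [{1}, {6, 7}]
metrics = evaluate_rankings(toy_predictions, toy_ground_truth, k=2)
print(metrics)
```

For the first query the relevant item sits at rank 2, so its reciprocal rank is 1/2; for the second the first relevant item is at rank 3, giving 1/3.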

In [None]:
# TODO: Compare semantic search vs. baseline
# Your comparison here
# ...

## 8. Examples

Demonstrate the system with at least 10 diverse examples.

In [None]:
# TODO: Interactive demo
example_queries = [
    "I can't log into my account",
    "What is your return policy?",
    "How do I change my shipping address?",
    # Add more examples...
]

for query in example_queries:
    print(f"\nQuery: {query}")
    # Your response generation here
    # ...

In [None]:
# TODO: Multi-turn conversation demo
print("Multi-turn Conversation Demo")
print("=" * 40)

# Your multi-turn demo here
# ...

## 9. Conclusion

*Summarize your findings, discuss limitations, and suggest improvements.*

TODO: Your conclusion here...