# Building a Customer Service RAG System

## Overview
This notebook demonstrates a Retrieval Augmented Generation (RAG) system for customer service applications. The system pairs a vector database, which retrieves relevant knowledge base passages, with a large language model, which writes the reply from those passages.


## System Components

### 1. Package Installation
```python
!pip install langchain chromadb sentence-transformers openai faker python-dotenv
```
These packages serve the following purposes:
- `langchain`: Framework for developing applications powered by language models
- `chromadb`: Vector database for storing and retrieving embeddings
- `sentence-transformers`: For creating text embeddings
- `openai`: Interface with OpenAI's language models
- `faker`: Generate synthetic data
- `python-dotenv`: Manage environment variables
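
Before moving on, it can help to confirm that the installs succeeded. The sketch below uses only the standard library. It assumes the usual import names for these packages (`sentence_transformers` for `sentence-transformers`, `dotenv` for `python-dotenv`), and `missing_packages` is a hypothetical helper, not part of any library.

```python
from importlib.util import find_spec
from typing import List

# Assumed import names for the pip packages listed above
# (the pip name differs from the import name for two of them).
REQUIRED_MODULES = [
    "langchain",
    "chromadb",
    "sentence_transformers",  # pip: sentence-transformers
    "openai",
    "faker",
    "dotenv",                 # pip: python-dotenv
]


def missing_packages(module_names: List[str]) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_MODULES)
    print("Missing:", missing if missing else "none")
```

An empty result suggests the environment is ready; any name in the list points at a package to reinstall.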


In [1]:
# Install required packages
!pip install langchain chromadb sentence-transformers openai faker python-dotenv




In [2]:
!pip install --upgrade langchain langchain-community



In [3]:
import os
import json
from typing import List, Dict
from faker import Faker
from datetime import datetime, timedelta
import random
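# Third-party integrations live in langchain_community (installed in the previous cell);
# importing them from the top-level langchain package is deprecated.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.prompts import PromptTemplate
from langchain_community.chat_models import ChatOpenAI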

# 1. Generate Synthetic Knowledge Base Data
  

In [4]:
def generate_knowledge_base() -> List[Dict]:
    """Generate synthetic customer service knowledge base articles"""
    fake = Faker()

    # Define specific content for common questions
    predefined_articles = [
        {
            'title': 'How to Cancel Your Subscription',
            'category': 'Account Management',
            'content': """
            Canceling Your Subscription

            To cancel your subscription, follow these steps:
            1. Log in to your account dashboard
            2. Go to 'Subscription Management'
            3. Click on 'Cancel Subscription'
            4. Select your reason for cancellation
            5. Confirm cancellation

            Important Notes:
            - Cancellation will take effect at the end of your current billing period
            - You'll have access to all features until the end of the paid period
            - You can reactivate your subscription at any time before the period ends
            - Any unused portion of a prepaid subscription may be refundable according to our policy

            Need help? Contact our support team for assistance with cancellation.
            """
        },
        {
            'title': 'Login Troubleshooting Guide',
            'category': 'Technical Support',
            'content': """
            Resolving Login Issues

            If you can't log into your account, try these steps:
            1. Double-check your username and password
            2. Clear your browser cache and cookies
            3. Try resetting your password using the 'Forgot Password' link
            4. Ensure you're using a supported browser (Chrome, Firefox, Safari)
            5. Disable VPN if you're using one

            Common login issues:
            - Caps Lock is enabled
            - Browser autofill using old credentials
            - Account locked after multiple failed attempts
            """
        },
        {
            'title': 'Accepted Payment Methods',
            'category': 'Billing',
            'content': """
            Available Payment Methods

            We accept the following payment methods:
            1. Credit Cards (Visa, MasterCard, American Express)
            2. PayPal
            3. Bank Transfer (ACH)
            4. Digital Wallets (Apple Pay, Google Pay)

            Processing times:
            - Credit Cards: Instant
            - PayPal: Instant
            - Bank Transfer: 2-3 business days
            - Digital Wallets: Instant
            """
        }
    ]

    # Add predefined articles with proper metadata
    articles = []
    for article in predefined_articles:
        articles.append({
            'id': fake.uuid4(),
            'title': article['title'],
            'category': article['category'],
            'content': article['content'],
            'last_updated': datetime.now().strftime('%Y-%m-%d')
        })

    return articles

# 2. Create Vector Store


### 3. Vector Store Creation
```python
def create_vector_store(articles: List[Dict]) -> Chroma:
```
This component handles document processing and embedding:
- Uses HuggingFace's sentence transformers for embeddings
- Implements text chunking for better context management
- Stores embeddings in Chroma vector database
- Maintains metadata relationships

Key considerations:
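- Chunk size (500 characters) and overlap (50 characters) are reasonable starting values for short customer service articles; tune them against your own content
- The splitter tries its separators in order (paragraph breaks, line breaks, then `.` and `!`), so chunks tend to end at natural boundaries
- Each chunk carries its article's `id`, `title`, and `category`, which enables source tracking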
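
To make the chunk size and overlap settings concrete, here is a minimal fixed-window chunker in plain Python. It is a simplification for illustration only and is not how `RecursiveCharacterTextSplitter` works (that class splits on the configured separators first). `chunk_text` and `chunk_with_metadata` are hypothetical helpers.

```python
from typing import Dict, List, Tuple


def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so text near a
    boundary appears in both neighbouring chunks.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def chunk_with_metadata(
    article: Dict[str, str], chunk_size: int = 500, chunk_overlap: int = 50
) -> Tuple[List[str], List[Dict[str, str]]]:
    """Return the chunks plus one metadata dict per chunk, in the same order."""
    chunks = chunk_text(article["content"], chunk_size, chunk_overlap)
    metadatas = [
        {"id": article["id"], "title": article["title"], "category": article["category"]}
        for _ in chunks
    ]
    return chunks, metadatas
```

Keeping the two lists the same length and in the same order is what lets the vector store report which article a retrieved chunk came from.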


In [5]:
def create_vector_store(articles: List[Dict]) -> Chroma:
    """Create a vector store from knowledge base articles"""
    # Initialize embeddings
    embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

    # Prepare documents for indexing
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=500,
        chunk_overlap=50,
        separators=["\n\n", "\n", ".", "!"]  # Tried in order: paragraph break, line break, then sentence-ending punctuation
    )

    # Extract text from articles and create chunks
    texts = []
    metadatas = []

    for article in articles:
        chunks = text_splitter.split_text(article['content'])
        texts.extend(chunks)
        for chunk in chunks:
            metadatas.append({
                'id': article['id'],
                'title': article['title'],
                'category': article['category']
            })

    # Create and return vector store
    # Remove any existing Chroma database to prevent duplicates
    import shutil
    if os.path.exists("./chroma_db"):
        shutil.rmtree("./chroma_db")

    return Chroma.from_texts(
        texts=texts,
        embedding=embeddings,
        metadatas=metadatas,
        persist_directory="./chroma_db"
    )



# 3. RAG Query System


### 4. RAG Query System
```python
class CustomerServiceRAG:
```
The core RAG implementation includes:

#### Initialization
- Sets up the language model (GPT-3.5-turbo)
- Configures the vector store
- Defines the prompt template

#### Prompt Template Design
The prompt is designed to:
- Maintain customer service tone
- Use context effectively
- Handle missing information gracefully
- Provide clear, actionable responses
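
As a rough illustration of these goals, the sketch below fills a shortened version of the prompt with plain `str.format`. The `build_prompt` helper and the placeholder wording for an empty context are assumptions made for this example and are not part of the notebook's `CustomerServiceRAG` class.

```python
from typing import List

# Shortened version of the prompt, keeping the same two placeholders.
PROMPT_TEMPLATE = (
    "You are a helpful customer service AI assistant. "
    "Use the following context to answer the customer's question.\n"
    "If you cannot answer the question based on the context, say so politely "
    "and suggest escalating to a human agent.\n\n"
    "Context:\n{context}\n\n"
    "Customer Question: {question}\n\n"
    "Assistant Response:"
)

# Assumed wording, shown to the model when retrieval returns nothing usable.
NO_CONTEXT_PLACEHOLDER = "(no relevant articles were found)"


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Fill the template, handling missing context explicitly."""
    # Drop empty chunks and strip indentation left over from triple-quoted strings.
    cleaned = [chunk.strip() for chunk in context_chunks if chunk.strip()]
    context = "\n\n".join(cleaned) if cleaned else NO_CONTEXT_PLACEHOLDER
    return PROMPT_TEMPLATE.format(context=context, question=question)


if __name__ == "__main__":
    print(build_prompt("What payment methods do you accept?", ["We accept PayPal."]))
```

Stating the empty-context case in the prompt gives the model an explicit cue to fall back on the escalation instruction.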

#### Response Generation
The `get_response` method:
1. Retrieves relevant documents using similarity search
2. Combines context meaningfully
3. Generates human-like responses
4. Maintains consistency with knowledge base



In [6]:
class CustomerServiceRAG:
    def __init__(self, vector_store: Chroma, openai_api_key: str):
        self.vector_store = vector_store
        self.llm = ChatOpenAI(api_key=openai_api_key,
                             model_name="gpt-3.5-turbo",
                             temperature=0.7)

        self.prompt_template = PromptTemplate(
            input_variables=["context", "question"],
            template="""You are a helpful customer service AI assistant. Use the following context to answer the customer's question.
            If you cannot answer the question based on the context, say so politely and suggest escalating to a human agent.

            Context:
            {context}

            Customer Question: {question}

            Please provide a detailed and helpful response based on the context above. If the context contains relevant information,
            make sure to include specific steps or details from it. If the context doesn't contain enough information to fully
            answer the question, acknowledge what you know and suggest speaking with a customer service representative for more assistance.

            Assistant Response:"""
        )

    def get_response(self, question: str) -> str:
        # Retrieve relevant documents
        docs = self.vector_store.similarity_search(question, k=2)

        # Print retrieved contexts for debugging
        print("\nRetrieved contexts:")
        for i, doc in enumerate(docs):
            print(f"\nContext {i+1}:")
            print(doc.page_content)
            print("\nMetadata:", doc.metadata)

        context = "\n\n".join([doc.page_content for doc in docs])

        # Generate response using LLM
        prompt = self.prompt_template.format(context=context, question=question)
        response = self.llm.invoke(prompt)

        return response.content


# Example Usage


## System Output Analysis

### Response Types

1. **Direct Information Retrieval**
```
Question: What payment methods do you accept?
```
- Retrieves exact matching content
- Provides structured, complete information
- Includes relevant details (processing times)

2. **Context-Based Inference**
```
Question: How do I update my credit card information?
```
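- Retrieves the closest available chunks even though no article covers the topic directly (here: subscription cancellation and payment methods)
- Relies on the prompt's instruction to acknowledge gaps and suggest a human agent rather than invent steps
- Should be reviewed manually, since weak context raises the risk of hallucination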

3. **Problem Resolution**
```
Question: I can't log into my account. What should I do?
```
- Provides step-by-step troubleshooting
- Includes common issues
- Suggests escalation paths
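
For questions the knowledge base does not cover, such as the credit card example, similarity search still returns its nearest chunks. One possible guard is a distance cut-off applied before the prompt is built. The sketch assumes `(chunk text, distance)` pairs where a lower distance means a closer match, which is the kind of output a scored search such as Chroma's `similarity_search_with_score` can provide. The threshold value and the `filter_by_distance` and `needs_escalation` helpers are hypothetical, and the distances in the usage example are made up for illustration.

```python
from typing import List, Tuple

ScoredChunk = Tuple[str, float]  # (chunk text, distance); lower distance = closer match


def filter_by_distance(results: List[ScoredChunk], max_distance: float) -> List[str]:
    """Keep only the chunks whose distance to the query is within max_distance."""
    return [text for text, distance in results if distance <= max_distance]


def needs_escalation(results: List[ScoredChunk], max_distance: float) -> bool:
    """True when no retrieved chunk is close enough to answer from."""
    return not filter_by_distance(results, max_distance)


if __name__ == "__main__":
    # Illustrative distances only, not measured output.
    scored = [("Available Payment Methods ...", 0.42), ("Canceling Your Subscription ...", 1.37)]
    print(filter_by_distance(scored, max_distance=1.0))
    print(needs_escalation(scored, max_distance=0.3))
```

A suitable cut-off depends on the embedding model and distance metric, so it would need to be chosen by inspecting scores for known good and bad matches.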

### Understanding Retrieved Contexts
The system shows retrieved contexts for transparency:
```
Retrieved contexts:
Context 1: [Content]
Metadata: {'category': 'Category', 'id': 'ID', 'title': 'Title'}
```
This helps in:
- Debugging retrieval accuracy
- Understanding system decisions
- Validating response quality
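
Debugging retrieval accuracy can be made repeatable with a small check that compares the titles in the retrieved metadata against the article each question is expected to hit. This is a sketch under assumptions: `retrieval_hit_rate` and its two input dictionaries are hypothetical, and only the `title` metadata key comes from the notebook.

```python
from typing import Dict, List


def retrieval_hit_rate(
    expected_titles: Dict[str, str],
    retrieved_metadata: Dict[str, List[Dict[str, str]]],
) -> float:
    """Fraction of questions whose expected article title appears among the retrieved chunks.

    expected_titles maps question -> title of the article that should be retrieved.
    retrieved_metadata maps question -> metadata dicts of the chunks actually retrieved.
    """
    if not expected_titles:
        return 0.0
    hits = 0
    for question, title in expected_titles.items():
        retrieved_titles = {meta.get("title") for meta in retrieved_metadata.get(question, [])}
        if title in retrieved_titles:
            hits += 1
    return hits / len(expected_titles)
```

The metadata lists could be collected from `doc.metadata` for each document returned by the similarity search.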


In [7]:
if __name__ == "__main__":
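    # Read the OpenAI API key from the environment instead of hard-coding it
    openai_api_key = os.environ.get("OPENAI_API_KEY", "")
    if not openai_api_key:
        raise ValueError("Set the OPENAI_API_KEY environment variable before running this cell.")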

    # 1. Generate synthetic data
    print("Generating knowledge base...")
    articles = generate_knowledge_base()

    # 2. Create vector store
    print("Creating vector store...")
    vector_store = create_vector_store(articles)

    # 3. Initialize RAG system
    print("Initializing RAG system...")
    rag_system = CustomerServiceRAG(vector_store, openai_api_key)

    # 4. Example queries
    example_questions = [
        "How do I update my credit card information?",
        "I can't log into my account. What should I do?",
        "What payment methods do you accept?",
        "How do I cancel my subscription?"
    ]

    print("\nTesting RAG system with example questions:")
    for question in example_questions:
        print(f"\nQuestion: {question}")
        response = rag_system.get_response(question)
        print(f"Response: {response}")


Generating knowledge base...
Creating vector store...


Initializing RAG system...



Testing RAG system with example questions:

Question: How do I update my credit card information?

Retrieved contexts:

Context 1:
Canceling Your Subscription
            
            To cancel your subscription, follow these steps:
            1. Log in to your account dashboard
            2. Go to 'Subscription Management'
            3. Click on 'Cancel Subscription'
            4. Select your reason for cancellation
            5. Confirm cancellation
            
            Important Notes:
            - Cancellation will take effect at the end of your current billing period

Metadata: {'category': 'Account Management', 'id': 'b63f7216-2f70-4b18-b4d7-5e77e9a56a3c', 'title': 'How to Cancel Your Subscription'}

Context 2:
Available Payment Methods
            
            We accept the following payment methods:
            1. Credit Cards (Visa, MasterCard, American Express)
            2. PayPal
            3. Bank Transfer (ACH)
            4. Digital Wallets (Apple Pay, Googl