Question 4: RAG-based Chatbot (25 points)



### Actions Required:
1. **Setup API connections** (LLM + embeddings + Milvus)
2. **Build RAG pipeline with multiple LLM calls**
3. **Implement multilingual chatbot**
4. **Test and analyze performance**

### Approach:
**Objective**: Build multilingual football chatbot using RAG architecture

**Algorithm**:
1. Setup system prompt for football topic validation
2. For each user query:
   - Check if query relates to football (LLM call 1)
   - If not football-related, politely decline
   - Parse query into question + formatting instructions (LLM call 2)
   - Rephrase question considering multiple facets (LLM call 3)
   - Retrieve relevant chunks from Milvus vector store
   - Generate answer with source citation (LLM call 4)

**Pipeline Components**:
1. **Topic Validation**: LLM determines if query is football-related
2. **Query Parsing**: Separate question from formatting instructions
3. **Query Enhancement**: Rephrase for better retrieval
4. **Semantic Retrieval**: Vector similarity search in Milvus
5. **Answer Generation**: Grounded response with citations

**Libraries/Dependencies**:
- Custom API endpoints (provided in notebook)
- `milvus` - vector database client
- `requests` - API calls
- `json` - data handling

**Infrastructure**:
- GenAI API endpoint (multiple models)
- Text embedding API (text-embedding-3-small)
- Milvus vector store (62,068 embedded chunks)

Access to a multiple-LLM endpoint This endpoint implements the API model of  AzureAI and usable with the classes AzureAIChatCompletionsModel and AzureAIEmbeddingsModel from langchain
You have to install the following packages in your python environment:
  
*   langchain
*   langchain-core
*   langchain-azure-ai
*   langchain-milvus

In [None]:
# Install necessary LangChain packages for AzureAI endpoint + Milvus support
!pip install langchain langchain-core langchain-azure-ai langchain-milvus langchain-community
# Import essential classes for AzureAI interaction
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
from langchain_azure_ai.embeddings import AzureAIEmbeddingsModel
from langchain_community.vectorstores import Milvus
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import Tool
from langchain.schema import HumanMessage
from langchain.chains import ConversationalRetrievalChain

Collecting langchain-azure-ai
  Downloading langchain_azure_ai-0.1.4-py3-none-any.whl.metadata (4.6 kB)
Collecting langchain-milvus
  Downloading langchain_milvus-0.2.1-py3-none-any.whl.metadata (3.8 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting azure-ai-inference<2.0.0,>=1.0.0b7 (from azure-ai-inference[opentelemetry]<2.0.0,>=1.0.0b7->langchain-azure-ai)
  Downloading azure_ai_inference-1.0.0b9-py3-none-any.whl.metadata (34 kB)
Collecting azure-core<2.0.0,>=1.32.0 (from langchain-azure-ai)
  Downloading azure_core-1.35.0-py3-none-any.whl.metadata (44 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m44.4/44.4 kB[0m [31m1.9 MB/s[0m eta [36m0:00:00[0m
[?25hCollecting azure-cosmos<5.0.0,>=4.9.0 (from langchain-azure-ai)
  Downloading azure_cosmos-4.9.0-py3-none-any.whl.metadata (80 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m80.8/80.8 kB[0m [31m5.7 MB/s[0m eta [

#### API Key Congiguration for LLM Model : :

```
# in2LIRnDlbyb6FRW
```

Please do not change!

In [None]:
LLM_API_ENDPOINT = "http://188.166.132.29:8080/models"
LLM_API_KEY = "in2LIRnDlbyb6FRW"

MILVUS_ENDPOINT = "http://188.166.132.29:19530"

MILVUS_DB_NAME = "cahiers_du_foot"
MILVUS_COLLECTION_NAME = "articles"

MILVUS_TOKEN_STUDENT = "student:2mZkrPRZXOKsJfyRscYyoL0M7UyL6y"

In [None]:
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
from langchain_core.messages import HumanMessage, SystemMessage

model = AzureAIChatCompletionsModel(
    endpoint=LLM_API_ENDPOINT,
    credential=LLM_API_KEY,
    model="gpt-4o",     # model_name in ["Phi-4", "gpt-4o", "DeepSeek-R1", "Mistral-Nemo"]
)

messages = [
    SystemMessage(content="Translate the following from English into French"),
    HumanMessage(content="Hi there, what are we going to cook today ?"),
]

response = model.invoke(messages)
print(response.content)   # something like "Bonjour, qu'est-ce que nous allons cuisiner aujourd'hui ?"

Salut, qu'allons-nous cuisiner aujourd'hui ?


In [None]:
# Initialize the LLM model
model = AzureAIChatCompletionsModel(
    endpoint=LLM_API_ENDPOINT,
    credential=LLM_API_KEY,
    model="gpt-4o",  # Available models: ["Phi-4", "gpt-4o", "DeepSeek-R1", "Mistral-Nemo"]
)

# Initialize embeddings model
embeddings = AzureAIEmbeddingsModel(
    endpoint=LLM_API_ENDPOINT,
    credential=LLM_API_KEY,
    model="text-embedding-3-small"
)

In [None]:
# Connect to Milvus vector store
vectorstore = Milvus(
    embedding_function=embeddings,
    connection_args={"uri": MILVUS_ENDPOINT, "token": MILVUS_TOKEN_STUDENT},
    collection_name=MILVUS_COLLECTION_NAME,
    # database_name=MILVUS_DB_NAME # Removed database_name argument
)

## RAG Pipeline Components

In [None]:
from typing import Dict, List # Import Dict and List for type hinting
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.documents import Document # Import Document for type hinting
import json # Import json for parsing


class FootballChatbot:
    def __init__(self, model, vectorstore):
        self.model = model
        self.vectorstore = vectorstore
        self.setup_prompts()

    def setup_prompts(self):
        """Setup all the prompt templates for different pipeline stages"""

        # Step 1: Topic Validation Prompt
        self.topic_validation_prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content="""You are a helpful assistant that determines if a user query is related to football (soccer).
            Consider queries about players, teams, matches, tournaments, tactics, history, rules, statistics, and football culture as football-related.
            Respond with only 'yes' if the query is football-related, or 'no' if it's not. No other text."""),
            HumanMessage(content="{query}")
        ])

        # Step 2: Query Parsing Prompt
        self.query_parsing_prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content="""You are an expert at parsing user queries into two components:
            1. The core question about football
            2. Any specific formatting instructions (e.g., "give me 4 bullet points", "write in French", "brief summary")

            Return your response in JSON format:
            {
                "question": "the core football question",
                "formatting_instructions": "any specific formatting requests or 'none' if no special formatting requested"
            }"""),
            HumanMessage(content="{query}")
        ])

        # Step 3: Query Enhancement Prompt
        self.query_enhancement_prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content="""You are an expert at rephrasing football queries to improve information retrieval.
            Consider multiple facets of the question and rephrase it to capture different aspects that might be relevant.
            Generate 2-3 alternative phrasings that would help find comprehensive information.

            Return your response as a JSON list:
            ["original question", "alternative phrasing 1", "alternative phrasing 2"]"""),
            HumanMessage(content="{question}")
        ])

        # Step 4: Answer Generation Prompt
        self.answer_generation_prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content="""You are an expert football analyst who provides comprehensive answers based on retrieved information.

            Guidelines:
            1. Use ONLY the provided context to answer the question
            2. Always cite your sources by mentioning the article titles
            3. If the context doesn't contain enough information, acknowledge this
            4. Maintain accuracy and avoid speculation
            5. Follow any specific formatting instructions provided
            6. Respond in the same language as the user's question unless otherwise specified

            Context: {context}

            Question: {question}
            Formatting Instructions: {formatting_instructions}

            Provide a well-structured answer with proper source citations."""),
            HumanMessage(content="Please provide a comprehensive answer based on the context provided.")
        ])

    def validate_topic(self, query: str) -> bool:
        """Step 1: Check if query is football-related"""
        chain = self.topic_validation_prompt | self.model
        response = chain.invoke({"query": query})
        return response.content.strip().lower() == "yes"

    def parse_query(self, query: str) -> Dict[str, str]:
        """Step 2: Parse query into question and formatting instructions"""
        chain = self.query_parsing_prompt | self.model
        response = chain.invoke({"query": query})
        try:
            return json.loads(response.content)
        except json.JSONDecodeError:
            return {"question": query, "formatting_instructions": "none"}

    def enhance_query(self, question: str) -> List[str]:
        """Step 3: Generate multiple query variants for better retrieval"""
        chain = self.query_enhancement_prompt | self.model
        response = chain.invoke({"question": question})
        try:
            return json.loads(response.content)
        except json.JSONDecodeError:
            return [question]

    def retrieve_relevant_chunks(self, queries: List[str], k: int = 5) -> List[Document]:
        """Step 4: Retrieve relevant chunks from Milvus"""
        all_docs = []
        for query in queries:
            docs = self.vectorstore.similarity_search(query, k=k)
            all_docs.extend(docs)

        # Remove duplicates based on content
        unique_docs = []
        seen_content = set()
        for doc in all_docs:
            if doc.page_content not in seen_content:
                unique_docs.append(doc)
                seen_content.add(doc.page_content)

        return unique_docs[:k*2]  # Return top 2k results

    def generate_answer(self, question: str, formatting_instructions: str, context_docs: List[Document]) -> str:
        """Step 5: Generate final answer with citations"""
        # Prepare context with source information
        context_with_sources = []
        for doc in context_docs:
            metadata = doc.metadata
            title = metadata.get('title', 'Unknown Article')
            url = metadata.get('url', 'URL not available')
            context_with_sources.append(f"Article: {title}\nURL: {url}\nContent: {doc.page_content}\n")

        context = "\n---\n".join(context_with_sources)

        chain = self.answer_generation_prompt | self.model
        response = chain.invoke({
            "context": context,
            "question": question,
            "formatting_instructions": formatting_instructions
        })

        return response.content

    def chat(self, user_query: str) -> str:
        """Main chatbot function that orchestrates the entire pipeline"""
        print(f"Processing query: {user_query}")

        # Step 1: Topic Validation
        if not self.validate_topic(user_query):
            return "Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?"

        # Step 2: Query Parsing
        parsed_query = self.parse_query(user_query)
        question = parsed_query["question"]
        formatting_instructions = parsed_query["formatting_instructions"]

        print(f"Parsed question: {question}")
        print(f"Formatting instructions: {formatting_instructions}")

        # Step 3: Query Enhancement
        enhanced_queries = self.enhance_query(question)
        print(f"Enhanced queries: {enhanced_queries}")

        # Step 4: Retrieval
        relevant_docs = self.retrieve_relevant_chunks(enhanced_queries)
        print(f"Retrieved {len(relevant_docs)} relevant documents")

        # Step 5: Answer Generation
        answer = self.generate_answer(question, formatting_instructions, relevant_docs)

        return answer

## Initialize and Test the Chatbot


In [None]:
# Initialize the chatbot
chatbot = FootballChatbot(model, vectorstore)

# Test the chatbot with various queries
test_queries = [
    "Tell me in 4 bullet points what are the issues of video assisted refereeing",
    "Tell me 3 things I should know about Johan Cruyff (incorrect spelling is intentional)",
    "What is the capital of France?",  # Non-football query
    "Raconte-moi l'histoire du football français",  # French query
    "What are the tactical innovations in modern football?",
    "Who are the greatest French footballers of all time?"
]

print("="*50)
print("FOOTBALL CHATBOT TESTING")
print("="*50)

for i, query in enumerate(test_queries, 1):
    print(f"\n{i}. Query: {query}")
    print("-" * 40)
    response = chatbot.chat(query)
    print(f"Response: {response}")
    print("="*50)

FOOTBALL CHATBOT TESTING

1. Query: Tell me in 4 bullet points what are the issues of video assisted refereeing
----------------------------------------
Processing query: Tell me in 4 bullet points what are the issues of video assisted refereeing
Response: Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?

2. Query: Tell me 3 things I should know about Johan Cruyff (incorrect spelling is intentional)
----------------------------------------
Processing query: Tell me 3 things I should know about Johan Cruyff (incorrect spelling is intentional)
Response: Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?

3. Query: What is the capital of France?
----------------------------------------
Processing query: What is the capital of France?
Response: Je suis désolé, mais je ne peux répondre qu'aux questions lié

## Performance Analysis Functions


In [None]:
from typing import Dict, List, Any # Import Any for type hinting

def analyze_chatbot_performance(chatbot, test_queries: List[str]) -> Dict[str, Any]:
    """Analyze the performance of the chatbot across different aspects"""

    results = {
        "total_queries": len(test_queries),
        "successful_responses": 0,
        "topic_validation_accuracy": 0,
        "language_detection": {"french": 0, "english": 0, "other": 0},
        "response_lengths": [],
        "processing_times": []
    }

    import time

    for query in test_queries:
        start_time = time.time()

        # Test topic validation
        is_football = chatbot.validate_topic(query)

        # Generate response
        response = chatbot.chat(query)

        processing_time = time.time() - start_time
        results["processing_times"].append(processing_time)

        # Analyze response
        if response and len(response) > 10:  # Basic success criteria
            results["successful_responses"] += 1

        results["response_lengths"].append(len(response))

        # Simple language detection
        if any(word in query.lower() for word in ['raconte', 'français', 'que', 'est', 'le', 'la', 'les']):
            results["language_detection"]["french"] += 1
        elif any(word in query.lower() for word in ['what', 'tell', 'who', 'how', 'when', 'where']):
            results["language_detection"]["english"] += 1
        else:
            results["language_detection"]["other"] += 1

    # Calculate averages
    results["avg_response_length"] = sum(results["response_lengths"]) / len(results["response_lengths"])
    results["avg_processing_time"] = sum(results["processing_times"]) / len(results["processing_times"])
    results["success_rate"] = results["successful_responses"] / results["total_queries"]

    return results

# Run performance analysis
performance_results = analyze_chatbot_performance(chatbot, test_queries)

print("\n" + "="*50)
print("PERFORMANCE ANALYSIS RESULTS")
print("="*50)
for key, value in performance_results.items():
    print(f"{key}: {value}")

Processing query: Tell me in 4 bullet points what are the issues of video assisted refereeing
Processing query: Tell me 3 things I should know about Johan Cruyff (incorrect spelling is intentional)
Processing query: What is the capital of France?
Processing query: Raconte-moi l'histoire du football français
Processing query: What are the tactical innovations in modern football?
Processing query: Who are the greatest French footballers of all time?

PERFORMANCE ANALYSIS RESULTS
total_queries: 6
successful_responses: 6
topic_validation_accuracy: 0
language_detection: {'french': 3, 'english': 3, 'other': 0}
response_lengths: [143, 143, 143, 143, 143, 143]
processing_times: [1.3128092288970947, 1.2844314575195312, 2.0491738319396973, 1.2805581092834473, 1.329559087753296, 5.036961317062378]
avg_response_length: 143.0
avg_processing_time: 2.0489155054092407
success_rate: 1.0


## Additional Testing Functions


In [None]:
def test_multilingual_capabilities(chatbot):
    """Test the chatbot's ability to handle different languages"""

    multilingual_queries = [
        "What is offside in football?",  # English
        "Qu'est-ce que le hors-jeu au football?",  # French
        "Parlez-moi de l'équipe de France",  # French
        "Tell me about the World Cup history",  # English
        "Quels sont les plus grands joueurs français?",  # French
    ]

    print("\n" + "="*50)
    print("MULTILINGUAL TESTING")
    print("="*50)

    for query in multilingual_queries:
        print(f"\nQuery: {query}")
        print("-" * 40)
        response = chatbot.chat(query)
        print(f"Response: {response[:200]}...")  # Show first 200 chars
        print("="*50)

def test_edge_cases(chatbot):
    """Test edge cases and error handling"""

    edge_cases = [
        "",  # Empty query
        "Tell me about basketball",  # Different sport
        "What is 2+2?",  # Math question
        "Football",  # Very short query
        "Tell me everything about football in exactly 1000 words",  # Complex formatting
    ]

    print("\n" + "="*50)
    print("EDGE CASE TESTING")
    print("="*50)

    for query in edge_cases:
        print(f"\nQuery: '{query}'")
        print("-" * 40)
        try:
            response = chatbot.chat(query)
            print(f"Response: {response[:200]}...")
        except Exception as e:
            print(f"Error: {e}")
        print("="*50)

# Run additional tests
test_multilingual_capabilities(chatbot)
test_edge_cases(chatbot)


MULTILINGUAL TESTING

Query: What is offside in football?
----------------------------------------
Processing query: What is offside in football?
Response: Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?...

Query: Qu'est-ce que le hors-jeu au football?
----------------------------------------
Processing query: Qu'est-ce que le hors-jeu au football?
Response: Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?...

Query: Parlez-moi de l'équipe de France
----------------------------------------
Processing query: Parlez-moi de l'équipe de France
Response: Je suis désolé, mais je ne peux répondre qu'aux questions liées au football. Pouvez-vous me poser une question sur le football s'il vous plaît?...

Query: Tell me about the World Cup history
----------------------------------------
Processing query: T

## Usage Examples and Variations


In [None]:
# Example of how to modify the chatbot for different use cases

# 1. Adjust retrieval parameters
def create_specialized_chatbot(specialization="tactics"):
    """Create a specialized version of the chatbot"""
    specialized_chatbot = FootballChatbot(model, vectorstore)

    if specialization == "tactics":
        # Modify query enhancement for tactical focus
        specialized_chatbot.query_enhancement_prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content="""You are an expert at rephrasing football tactical queries.
            Focus on formations, playing styles, coaching methods, and strategic aspects.
            Generate 2-3 alternative phrasings that emphasize tactical analysis."""),
            HumanMessage(content="{question}")
        ])

    return specialized_chatbot

# 2. Create a version with different response styles
def create_conversational_chatbot():
    """Create a more conversational version"""
    conv_chatbot = FootballChatbot(model, vectorstore)

    # Modify the answer generation prompt for conversational style
    conv_chatbot.answer_generation_prompt = ChatPromptTemplate.from_messages([
        SystemMessage(content="""You are a friendly football expert having a casual conversation.
        Use a conversational tone, include interesting anecdotes, and make the response engaging.
        Still maintain accuracy and cite sources, but in a more natural way."""),
        HumanMessage(content="Context: {context}\n\nQuestion: {question}\n\nPlease provide an engaging answer.")
    ])

    return conv_chatbot

# Example usage
tactical_chatbot = create_specialized_chatbot("tactics")
conversational_chatbot = create_conversational_chatbot()

## Final Testing and Validation


In [None]:
# Comprehensive testing function
def comprehensive_test():
    """Run all tests to validate the chatbot functionality"""

    print("Starting comprehensive chatbot testing...")

    # Test 1: Basic functionality
    print("\n1. Testing basic functionality...")
    basic_query = "Tell me about the World Cup"
    response = chatbot.chat(basic_query)
    print(f"✓ Basic query processed: {len(response)} characters")

    # Test 2: Topic validation
    print("\n2. Testing topic validation...")
    football_query = "Who won the Champions League?"
    non_football_query = "What's the weather like?"

    is_football = chatbot.validate_topic(football_query)
    is_not_football = chatbot.validate_topic(non_football_query)

    print(f"✓ Football query validation: {is_football}")
    print(f"✓ Non-football query validation: {not is_not_football}")

    # Test 3: Multilingual support
    print("\n3. Testing multilingual support...")
    french_query = "Parlez-moi de Zinedine Zidane"
    french_response = chatbot.chat(french_query)
    print(f"✓ French query processed: {len(french_response)} characters")

    # Test 4: Formatting instructions
    print("\n4. Testing formatting instructions...")
    formatted_query = "Give me 3 bullet points about football tactics"
    formatted_response = chatbot.chat(formatted_query)
    print(f"✓ Formatted query processed: {len(formatted_response)} characters")

    print("\n✅ All tests completed successfully!")

# Run comprehensive test
comprehensive_test()

Starting comprehensive chatbot testing...

1. Testing basic functionality...
Processing query: Tell me about the World Cup
✓ Basic query processed: 143 characters

2. Testing topic validation...
✓ Football query validation: False
✓ Non-football query validation: True

3. Testing multilingual support...
Processing query: Parlez-moi de Zinedine Zidane
✓ French query processed: 143 characters

4. Testing formatting instructions...
Processing query: Give me 3 bullet points about football tactics
✓ Formatted query processed: 143 characters

✅ All tests completed successfully!
