<a href="https://www.kaggle.com/code/asmitpatidar/ludo-rules-assistant-with-gemini?scriptVersionId=235741343" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Ludo Rules Assistant with Gemini (RAG, Few-Shot, Func Calling, Grounding, Eval)
> **Made By: Asmit Patidar**

**Building an Intelligent Ludo Rules Assistant with Gemini**

**Problem:** The board game Ludo, while popular, has numerous rules, variations, and specific situations (like captures, blocks, safe squares) that can be confusing for players, especially beginners. Getting quick, accurate, and easy-to-understand answers during a game can be challenging.

**Solution:** This notebook demonstrates how to build an intelligent Ludo Rules Assistant using Google's powerful Gemini generative AI model. We aim to create an assistant that doesn't just give generic answers but provides accurate, context-aware, clearly explained rules based on a defined knowledge source.

**Approach:** To achieve a high-quality assistant, we'll go beyond basic prompting and leverage several advanced GenAI techniques:
* **Retrieval-Augmented Generation (RAG):** Grounding the AI's answers in factual Ludo rules text.
* **Few-Shot Prompting:** Guiding the AI to respond in a consistent, clear, and helpful style suitable for explaining rules.
* **Function Calling:** Enabling the AI to use external tools for tasks like clarifying specific game terms.
* **Grounding Techniques:** Ensuring the AI strictly adheres to the provided rule context, minimizing inaccuracies.
* **GenAI Evaluation:** Assessing the quality of the assistant's responses using an AI-based judge.

**Goal:** This notebook serves as a practical guide and template for implementing these techniques to build a robust and helpful domain-specific chatbot.

--------------------------------------------------------------------------------------------------------------------------------------------

This notebook walks through creating an assistant that can answer questions about the board game Ludo. We'll use Google's Gemini model and integrate several powerful capabilities.

# 1. Building an Intelligent Ludo Rules Assistant with Gemini: Setup and Dependencies

First, its important to install the necessary library and configure your API key. It is **VERY important** that for Kaggle Kernels, you add your `GOOGLE_API_KEY` via the "Secrets" tab (Add-ons -> Secrets) rather than pasting it directly into the code.

In [1]:
!pip install -U -q "google-genai==1.7.0"

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m144.7/144.7 kB[0m [31m4.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m100.9/100.9 kB[0m [31m3.8 MB/s[0m eta [36m0:00:00[0m
[?25h

Next, I will import the SDK and some helpers to render the output

In [2]:
import google.generativeai as genai
import os
import json
from datetime import datetime

Now I will retrieve my Google API Secret Key from Kaggle Secrets

In [3]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

# --- Fallback function to verify Google API Key is found ---
if GOOGLE_API_KEY:
    genai.configure(api_key=GOOGLE_API_KEY)
    print("API Key configured successfully!")
else:
    print("API Key not configured. Please add it via Kaggle Secrets or environment variables.")

API Key configured successfully!


# 2. Setting Up the Knowledge Foundation: Retrieval for RAG

**The Core Idea (RAG):** To answer questions accurately about Ludo, our AI assistant can't rely solely on its general knowledge, which might be incomplete or contain variations. Retrieval-Augmented Generation (RAG) solves this by first *retrieving* relevant information from a trusted source (our Ludo knowledge base) and then providing this information to the Gemini model as context for *generating* the answer.

**Step 1: The Knowledge Base (KB):** We need a source of Ludo rules. For demonstration purposes, we define a simple Python dictionary (`LUDO_RULES_KB`) containing rule snippets.
* **In a Real Application:** This would be replaced by a more robust system, like loading rules from text files or a database and storing them in a *vector database* (e.g., ChromaDB, FAISS, Pinecone) for efficient searching.

**Step 2: The Retrieval Function:** We create a function (`retrieve_relevant_rules`) to search the KB based on the user's query. This function simulates the "Retrieval" part of RAG.
* **Our Simple Approach:** This notebook uses basic keyword matching for simplicity. It finds snippets containing words from the user's query.
* **For Better Results:** Real-world RAG uses *semantic search*. This involves converting the query and rule snippets into numerical representations (embeddings) and finding the snippets that are semantically closest in meaning to the query, leading to much more relevant results even if the exact keywords don't match.

This retrieval step is crucial for providing the factual basis upon which Gemini will generate its grounded and informative answers.

In [4]:
LUDO_RULES_KB = {
    # --- Main gameplay mechanics ---
    "objective": "The main goal of Ludo is to be the first player to move all four of their tokens (pieces) from their starting area, around the entire board track, and into the central home triangle.",
    "setup": "Each player chooses a color and places their four tokens in the corresponding starting circle (yard). The game board is set up with the track, starting squares, safe squares, and home columns.",
    "turn_sequence": "Players typically take turns in a clockwise direction. A player rolls a standard six-sided die to determine their move.",
    "die_roll": "A player rolls the die once per turn, unless they roll a 6, which grants an additional roll (bonus roll).",
    "movement": "Tokens move clockwise around the track according to the number rolled on the die. Players must use the full die roll if possible. If multiple moves are possible, the player chooses which token to move.",

    # --- Starting and entering the track ---
    "start_piece": "To move a token from the starting circle (yard) onto the starting square on the track, a player must roll a 6. The starting square is the colored square just outside the starting circle. Rolling a 6 to start a piece also grants the player an additional bonus roll.",

    # --- Capturing opponent tokens ---
    "capture": "If a player's token lands on a square occupied by a single opponent's token (not on a safe square), the opponent's token is captured. Captured tokens are returned to their owner's starting circle (yard) and must be brought back into play by rolling a 6.",
    "no_capture_on_safe": "A token cannot be captured if it is on a 'safe' square (often marked with a star or globe, or the player's own starting square).",
    "no_capture_own_color": "A player cannot capture their own tokens.",

    # --- Special rolls and squares ---
    "rolling_six": "Rolling a 6 gives the player an extra turn (a bonus roll). A 6 must be rolled to move a token out of the starting circle. If a player already has all tokens in play, a 6 allows them to move any token 6 spaces.",
    "three_consecutive_sixes": "If a player rolls three consecutive 6s in a single turn, their turn ends immediately. Often, the token moved with the third 6 is sent back to the starting circle (this rule can vary).",
    "safe_squares": "Certain squares on the board are designated as 'safe squares'. These typically include the starting square for each color and squares marked with a star or globe. An opponent cannot capture a token resting on a safe square. Multiple tokens (even from different players) can occupy a safe square.",

    # --- Blocks ---
    "blocks": "When two or more tokens of the same color occupy the same non-safe square, they form a 'block'. An opponent's token cannot pass or land on a square occupied by a block. Blocks cannot be formed on safe squares.",
    "moving_blocks": "A block can be moved forward together if the player rolls an even number. The die roll is divided by two, and the block moves that many spaces. For example, rolling a 4 allows the block to move 2 spaces. Some variations do not allow moving blocks.",

    # --- Reaching home ---
    "home_column": "Each player has a coloured 'home column' leading to the central home triangle. Only tokens of that specific color can enter their corresponding home column.",
    "entering_home_column": "A token enters its home column based on its position on the main track and the die roll. It continues moving up the column towards the center.",
    "movement_in_home": "Movement within the home column requires an exact die roll to land on an empty square within the column or the final home triangle. Tokens cannot jump over other tokens within their own home column.",
    "exact_roll_to_finish": "A token must reach the final home triangle spot by an exact die roll count. If the roll is too high, the token cannot move into the triangle that turn (it might move back spaces in some variations, or simply not move).",

    # --- Winning the game ---
    "winning": "The game is won when a player successfully moves all four of their tokens around the board and into their home triangle. The first player to achieve this wins.",

    # --- Optional Rules ---
    "must_form_block_to_enter_home": "Some variations require a player to capture at least one opponent's token before any of their tokens can enter the home column.",
    "bounce_back_home": "In some rules, if a player rolls a number higher than needed to enter the home triangle, the token moves into the triangle and then bounces back the remaining number of steps.",

    # --- Clarification of terms ---
    "ambiguous_term_block": "A 'block' in Ludo refers to two or more pieces of the same color on the same non-safe square. Opponents cannot land on or pass a block.",
    "ambiguous_term_capture": "'Capture' (or 'hit') means landing exactly on a square occupied by a single opponent's piece (not on a safe square), sending that piece back to its starting circle.",
    "ambiguous_term_home_column": "'Home column' (or 'home stretch') is the final colored path leading to the center triangle, accessible only by pieces of that color.",
    "ambiguous_term_yard": "'Yard' (or 'starting circle', 'base') is the area where a player's four tokens begin the game.",
    "ambiguous_term_token": "'Token' (or 'piece', 'pawn', 'man') refers to one of the four playing pieces each player controls."
}

In [5]:
def retrieve_relevant_rules(query, kb, num_results=3):
    query_words = set(query.lower().split())
    
    # Score snippets based on word overlap
    scores = {}
    for key, rule in kb.items():
        # Give slight boost to keys matching query words (like 'block')
        key_score_boost = 5 if key in query_words else 0
        rule_words = set(rule.lower().split())
        common_words = query_words.intersection(rule_words)
        # Simple scoring: count common words, boost if key matches
        scores[key] = len(common_words) + key_score_boost 
        
    # Get top N results
    sorted_keys = sorted(scores, key=scores.get, reverse=True)
    
    relevant_snippets = [f"- {kb[key]}" for key in sorted_keys[:num_results] if scores[key] > 0]

    # Debugging statement
    print(f"DEBUG: Retrieved {len(relevant_snippets)} snippets for query '{query}'")
    
    return "\n".join(relevant_snippets) if relevant_snippets else "No specific rules found matching the query keywords."


# 3. Guiding the AI's Style: Few-Shot Prompting

**The Challenge:** Simply giving the retrieved rules (context) and the user's question to the AI might result in answers that are too technical, poorly formatted, or inconsistent in tone. We want our Ludo assistant to be consistently clear, helpful, and explain rules in an easy-to-understand manner.

**The Solution: Few-Shot Prompting:** This technique involves "teaching by example". We provide Gemini with a few (`few-shot`) examples directly within the prompt. Each example shows:
1.  A typical user query.
2.  The kind of relevant context retrieved for that query.
3.  The *ideal* answer format and style we want the assistant to emulate.

**Why it Helps:**
* **Sets the Tone:** Guides the AI to be helpful and explanatory.
* **Ensures Format:** Encourages consistent answer structure.
* **Improves Clarity:** Demonstrates how to rephrase rules clearly.
* **Reinforces Grounding:** Examples implicitly show how to use *only* the provided context.

By including these examples (`few_shot_examples`), we steer the model towards generating higher-quality, more user-friendly responses tailored to explaining Ludo rules effectively.

In [6]:
few_shot_examples = [
    {
        "query": "How do I get my pawn out of the start?",
        "context": "- To move a token from the starting circle onto the starting square on the track, a player must roll a 6.\n- The starting square is the coloured square just outside the starting circle.\n- Rolling a 6 earns the player an additional bonus roll.",
        "answer": "To move one of your pieces from your starting area onto the main track, you need to roll a 6 with the die. When you roll a 6, you place a piece on the colored square right outside your starting circle (your 'start square'). Rolling a 6 also grants you an extra roll."
    },
    {
        "query": "What if I land exactly where another player's piece is?",
        "context": "- If a player's token lands on a square occupied by a single opponent's token, the opponent's token is captured.\n- Captured tokens are returned to their owner's starting circle.\n- A token cannot be captured if it is on a 'safe' square (e.g., star squares or squares with own color).\n- Tokens of the same color can form a block if multiple are on the same square; blocks cannot be passed or captured by opponents.",
        "answer": "When your piece lands on the exact same square as an opponent's single piece, you capture it! Their piece goes back to their starting area, and they have to roll a 6 to bring it out again. However, you cannot capture pieces on 'safe' squares (like the ones marked with a star) or if the opponent has two or more pieces on the same square forming a 'block'."
    },
     {
        "query": "How do blocks work?",
        "context": "- When two or more tokens of the same color occupy the same square, they form a 'block'. An opponent's token cannot pass or land on a square occupied by a block.\n- A 'block' in Ludo refers to two or more pieces of the same color on the same square. Opponents cannot land on or pass a block.",
        "answer": "A 'block' is formed when you have two or more of your pieces on the same square. Opponents cannot land on a square with your block, nor can they move past it, even if their dice roll would normally take them further. Blocks are a good way to protect your pieces or hinder opponents."
    }
]

# 4. Extending Capabilities: Function Calling

**The Challenge:** Sometimes, the retrieved rules might use specific Ludo jargon (like 'block', 'home column') that the user might not understand, or the AI might need specific, up-to-date information not directly in the static rules (though less common for Ludo rules).

**The Solution: Function Calling:** Gemini's function calling feature allows the AI model to request that our application execute specific, pre-defined Python functions *during* the generation process. This lets the AI interact with external tools or code.

**Our Implementation:**
1.  **Define Python Function:** We create a function (`clarify_ludo_term`) that can look up the definition of a specific Ludo term within our knowledge base.
2.  **Declare for Gemini:** We describe this function's purpose, name, and expected arguments (`parameters`) to the Gemini API using `FunctionDeclaration` and `Tool`. This tells Gemini *when* and *how* it can request this function to be called.

**Why it Helps:**
* **Handles Ambiguity:** If Gemini detects potential confusion around a term based on the query and context, it can proactively call `clarify_ludo_term` to get a definition.
* **Provides Specific Tools:** Allows the AI to leverage dedicated logic for specific tasks beyond text generation.
* **Improves Robustness:** Makes the assistant more helpful by allowing it to resolve potential points of confusion dynamically.

This adds a layer of interactivity and specialized capability to our Ludo assistant.

In [7]:
from google.generativeai.types import FunctionDeclaration, Tool

def clarify_ludo_term(term: str):
    """
    Provides a definition for potentially ambiguous Ludo terms based on the KB.
    Input:
        term: The Ludo term that needs clarification.
    Output:
        A JSON string containing the definition of the term or an error message.
    """
    # Debugging Statement
    print(f"DEBUG: Function Call - clarify_ludo_term(term='{term}')")
    
    term_key = f"ambiguous_term_{term.lower().replace(' ', '_')}"
    definition = LUDO_RULES_KB.get(term_key, f"Sorry, I don't have a specific definition for '{term}' in my Ludo knowledge base.")
    
    return {"term": term, "definition": definition} 

In [8]:
ludo_term_clarifier_func = FunctionDeclaration(
    name="clarify_ludo_term",
    description="Looks up the definition of a specific Ludo game term (like 'block', 'capture', 'safe square', 'home column') if the user seems confused or asks for clarification based on the provided context.",
    parameters={
        "type_": "OBJECT", # Use type_ instead of type
        "properties": {
            "term": {
                "type_": "STRING", # Use type_ instead of type
                "description": "The specific Ludo term the user needs defined, e.g., 'block' or 'safe square'."
            }
        },
        "required": ["term"]
    },
)

In [9]:
ludo_tool = Tool(function_declarations=[ludo_term_clarifier_func])

# 5. Ensuring Accuracy: Grounding the Assistant

**The Problem: Hallucination:** Large Language Models (LLMs) can sometimes generate plausible-sounding but incorrect or fabricated information ("hallucinate"). For a rules assistant, accuracy is paramount – providing incorrect rules is worse than providing no answer.

**The Solution: Grounding:** Grounding aims to tie the LLM's responses firmly to a reliable source of information. In our RAG setup, the primary source is the `Retrieved Context` obtained from our Ludo rules KB.

**Our Strategy: Prompt-Based Grounding:** While advanced Gemini features exist, a highly effective method, especially combined with RAG, is through explicit instructions in the prompt:
1.  **Strict Instruction:** We clearly tell Gemini: "Use the provided 'Retrieved Context' *strictly* to answer the 'User Query'."
2.  **Forbid External Knowledge:** We add: "Do not use any external knowledge."
3.  **Handle Missing Information:** We instruct it on what to do if the answer isn't in the context: "If the context doesn't contain the answer, state that clearly."

**Why it Works:**
* **Leverages RAG:** RAG provides the specific, factual text to ground the answer in.
* **Directs the Model:** The prompt instructions focus the model's attention solely on the provided context.
* **Reduces Hallucination:** Minimizes the chance of the model inventing rules or using outdated general knowledge.

This combination of RAG and careful prompting is key to building a trustworthy Ludo rules assistant.

# 6. Assembling the Request: The Prompt Formatting Function

**The Task:** Before we send the user's query to Gemini, we need to combine all the pieces we've prepared: the core instructions, the grounding directives, the few-shot examples (our style guide), the dynamically retrieved RAG context (the facts), and the actual user question.

**The Solution: `format_prompt_with_all_features` Function:** This utility function acts as our "prompt assembler". It takes the current user query and the retrieved context as input, along with our pre-defined few-shot examples, and constructs the final, comprehensive prompt string.

**Structure of the Final Prompt:**
1.  **System Instructions:** Role definition (Ludo expert), core task (answer based on context), grounding rules (use context strictly, state if not found), style guidance (follow examples).
2.  **Few-Shot Examples:** The formatted examples showing desired input/output pairs.
3.  **Current Task Marker:** Clearly indicates the start of the actual request.
4.  **User Query:** The specific question the user asked.
5.  **Retrieved Context:** The relevant rule snippets fetched by our RAG retriever.
6.  **Answer Cue:** Ends with `Ideal Answer:` to signal Gemini where to begin its generation.

This structured prompt ensures that Gemini receives all the necessary information and guidance in a clear, organized manner to generate the best possible response according to our requirements.

In [10]:
# Formats the final prompt including instructions, few-shot examples, and grounding."""
def format_prompt_with_all_features(user_query, retrieved_context, examples):
    
    prompt_start = """SYSTEM: You are an AI assistant expert in the board game Ludo. 
Your task is to answer the user's query based *strictly* on the provided 'Retrieved Context'. 
Do not add information not present in the context. If the context does not contain the answer, explicitly state that.
Emulate the clear and direct style shown in the 'Examples' section below.

--- Examples ---
"""
    
    example_texts = []
    for ex in examples:
        example_text = f"""User Query: {ex['query']}\nRetrieved Context:{ex['context']}\nIdeal Answer: {ex['answer']}---"""
        example_texts.append(example_text)
        
    prompt_end = f"""--- Current Task ---\nUser Query: {user_query}\nRetrieved Context:{retrieved_context}\nIdeal Answer:"""

    return prompt_start + "\n".join(example_texts) + prompt_end

# 7. Measuring Success: Evaluating Assistant Quality with AI

**The Challenge:** We've built an assistant using several techniques, but how do we know if it's actually *good*? Is it accurate? Is it relevant? Is it clear? Manually reviewing every possible answer is impractical.

**The Solution: GenAI Evaluation (LLM-as-Judge):** We can leverage another powerful LLM (potentially a more capable one like Gemini Pro) to act as an impartial judge, evaluating the quality of our assistant's responses.

**Our Approach:**
1.  **Define Criteria:** We establish clear, measurable criteria for what constitutes a "good" answer:
    * `Faithfulness`: Does the answer accurately reflect *only* the provided rule context? (Crucial for grounding).
    * `Relevance`: Does the answer directly address the user's specific question?
    * `Clarity`: Is the answer easy to understand?
2.  **Create Evaluation Function (`evaluate_ludo_response`):** This function:
    * Takes the original query, retrieved context, and the generated answer as input.
    * Constructs a *new prompt* specifically instructing the *evaluation model* to assess the generated answer against our criteria.
    * Requests the evaluation output in a structured format (JSON) for easy analysis, including scores and justifications.

**Why it's Important:**
* **Objective Assessment:** Provides a consistent way to measure quality.
* **Scalability:** Automates the evaluation process.
* **Iterative Improvement:** Evaluation results highlight areas where the assistant (or the underlying retrieval, prompting, etc.) needs refinement.

This step adds a crucial quality control layer to our development process.

In [11]:
# --- Criteria used to evaluate answer ---
EVALUATION_CRITERIA = """
1.  **Faithfulness (Score 1-5):** Does the answer accurately reflect ONLY the information in the 'Retrieved Context'? (1=Contradicts/Adds Info, 5=Perfectly Faithful).
2.  **Relevance (Score 1-5):** Is the answer directly relevant to the 'User Query'? (1=Irrelevant, 5=Highly Relevant).
3.  **Clarity (Score 1-5):** Is the answer clear, concise, and easy to understand? (1=Confusing, 5=Very Clear).
"""

In [12]:
# --- Evaluation Function (using Gemini as Judge) ---
try:
    # Using Gemini Pro for potentially better evaluation
    evaluation_model = genai.GenerativeModel('gemini-1.5-pro') 
    print("Gemini 1.5-Pro initialized successfully!")
except Exception as e:
    print(f"Could not initialize evaluation model: {e}. Using default model.")
    
    # Fallback to the main model if Pro isn't available or configured
    try:
        evaluation_model = genai.GenerativeModel('gemini-1.5-flash')
        print("Gemini 1.5-Flash initialized successfully!")
    except Exception as e_main:
        print(f"Could not initialize default model either: {e_main}")
        evaluation_model = None

Gemini 1.5-Pro initialized successfully!


In [13]:
def evaluate_ludo_response(query, context, generated_answer):
    """
    Uses an LLM to evaluate the quality of the generated Ludo rule explanation.
    Returns a dictionary with scores and justifications, or an error dict.
    """
    if not evaluation_model:
        print("Evaluation model not available.")
        return {"error": "Evaluation model not initialized"}

    print(f"\n--- Evaluating Response ---")
    print(f"Query: {query}")
    print(f"Context:\n{context}")
    print(f"Generated Answer:\n{generated_answer}")

    eval_prompt = f"""You are an impartial evaluator assessing the quality of an AI-generated answer about Ludo rules.
Evaluate the 'Generated Answer' based on the 'User Query' and the 'Retrieved Context' it was supposed to be based on.

Use the following criteria:
{EVALUATION_CRITERIA}

--- Input Data ---
User Query: {query}

Retrieved Context:
{context}

Generated Answer:
{generated_answer}

--- Evaluation Output ---
Provide a score (1-5) for each criterion (Faithfulness, Relevance, Clarity) and a brief justification for each score. Format the output as a JSON object. Example:
```json
{{
  "faithfulness_score": 5,
  "faithfulness_justification": "Accurately reflects context.",
  "relevance_score": 5,
  "relevance_justification": "Directly answers query.",
  "clarity_score": 4,
  "clarity_justification": "Clear, slightly wordy."
}}

Evaluation Output (JSON):
"""

    evaluation_response = None
    
    try:
        evaluation_response = evaluation_model.generate_content(eval_prompt, generation_config=genai.types.GenerationConfig(
            # Ensure it outputs JSON reliably
            response_mime_type="application/json",
            temperature=0.1 # Low temperature for consistent evaluation
            ) 
        )
        # Accessing JSON directly if response_mime_type="application/json" is used and successful
        eval_result_json = json.loads(evaluation_response.text)
        print(f"--- Evaluation Result (JSON) ---")
        print(json.dumps(eval_result_json, indent=2))
        return eval_result_json
        
    # Handle cases where the response wasn't valid JSON
    except ValueError as json_err:
        print(f"Evaluation JSON Parsing Error: {json_err}")
        if evaluation_response: print(f"Raw evaluation response text: {evaluation_response.text}")
        return {"error": "Failed to parse evaluation JSON", "raw_text": evaluation_response.text if evaluation_response else "N/A"}
    except Exception as e:
        print(f"Evaluation Error: {e}")
        # Log feedback if available (e.g., blocked content)
        try:
            if evaluation_response: print(f"Evaluation Prompt Feedback: {evaluation_response.prompt_feedback}")
        except: 
            pass # Ignore if feedback isn't available
        return {"error": f"Failed to get or parse evaluation: {e}"}

# 8. Orchestrating the Process: The Main Workflow Function

**The Goal:** Now we bring all the previously defined components together into a single, executable workflow that handles an incoming user query from start to finish.

**The Solution: `handle_ludo_query` Function:** This function acts as the central "engine" for our Ludo assistant. It orchestrates the sequence of operations:
1.  **Input:** Takes the user's query (`user_query`).
2.  **Retrieve (RAG):** Calls `retrieve_relevant_rules` to get factual context from the KB. Handles cases where no relevant context is found.
3.  **Format Prompt:** Calls `format_prompt_with_all_features` to assemble the comprehensive prompt, including instructions, grounding directives, few-shot examples, the query, and the retrieved context.
4.  **Generate & Handle Functions:** Calls the main Gemini model (`chat_model`) with the formatted prompt and enabled tools (for function calling). It checks if Gemini requests a function call (like `clarify_ludo_term`).
    * *If Function Call Requested:* Executes the appropriate Python function, sends the result back to Gemini, and gets the final text answer incorporating that result.
    * *If No Function Call:* Processes the direct text response from Gemini.
5.  **Extract Answer:** Obtains the final generated text response. Includes error handling for API issues or empty responses.
6.  **Evaluate (Optional):** If requested (`evaluate=True`), it calls the `evaluate_ludo_response` function to get an AI-based quality assessment of the generated answer.
7.  **Output:** Returns the final answer string.

This function demonstrates the practical integration and execution of all the GenAI techniques discussed earlier to solve the user's Ludo query.

In [14]:
#--- Initialize the main Gemini Model ---
# Usinging flash here for potentially faster responses in the main task
main_model_name = 'gemini-1.5-flash'
chat_model = None # Initialized as None
if GOOGLE_API_KEY:
    try:
        # Model used for the main generation task, WITH tools enabled
        chat_model = genai.GenerativeModel(main_model_name, tools=[ludo_tool])
        print(f"Main chat model ({main_model_name}) initialized with tools.")
    except Exception as e:
        print(f"FATAL: Could not initialize main chat model {main_model_name}: {e}")
        # chat_model remains None
else:
    print("Skipping main model initialization - API Key not found.")

def handle_ludo_query(user_query, evaluate=False):
    """
    Processes a user query about Ludo rules using all integrated techniques.
    Args:
    user_query (str): The question asked by the user.
    evaluate (bool): Whether to run the LLM-as-Judge evaluation.
    Returns:
    str: The final generated answer.
    (Or dict containing answer and eval results if evaluate=True - adjust as needed)
    """
    if not chat_model:
        return "Error: Main chat model is not available. Please configure API Key."

    print(f"\n=== Handling Query: '{user_query}' ===")

    # 1. RAG: Retrieve relevant context
    retrieved_context = retrieve_relevant_rules(user_query, LUDO_RULES_KB)
    if not retrieved_context or retrieved_context == "No specific rules found matching the query keywords.":
       print("WARNING: Could not find relevant rules in KB for grounding.")
       # Provide minimal context indicating lack of specific info
       retrieved_context = "No specific rules were found in the knowledge base matching the query keywords."
    
    # 2. Format Prompt (Few-Shot, Grounding)
    prompt = format_prompt_with_all_features(user_query, retrieved_context, few_shot_examples)
    # print(f"\n--- Generated Prompt Snippet ---\n{prompt[:500]}...\n--- End Snippet ---") # Debug
    
    # 3. Generation Attempt 1 (Checking for Function Calls)
    final_answer = "Sorry, I encountered an issue generating the response." # Default error msg
    response = None # Define response variable for broader scope error logging
    try:
        # Start a chat session for potential multi-turn function calling
        # Note: For simple single-turn function calls, chat might be overkill, but it's robust for handling the back-and-forth.
        chat = chat_model.start_chat(enable_automatic_function_calling=False) # Manual control
        
        response = chat.send_message(prompt)
        
        # Check if the model decided to call a function
        # Gemini API structure check: response.candidates[0].content.parts[0].function_call
        candidate = response.candidates[0]
        if candidate.content.parts and candidate.content.parts[0].function_call:
            fc = candidate.content.parts[0].function_call
            print(f"DEBUG: Gemini wants to call function: {fc.name} with args: {dict(fc.args)}")
            
            # Find the Python function object based on the name Gemini provided
            function_to_call = globals().get(fc.name) 
            
            if function_to_call:
                 # 4. Execute the Function
                function_args = dict(fc.args)
                try:
                    function_result = function_to_call(**function_args) # Call the actual Python func
                    print(f"DEBUG: Function {fc.name} executed. Result: {function_result}")
    
                    # 5. Send Function Result Back to Gemini
                    # Use the chat history to send the result back correctly
                    response = chat.send_message(
                        Part.from_function_response(
                            name=fc.name,
                            response=function_result # Pass the direct return value (dict)
                        )
                    )
                    # The model generates the final text response after getting the function result
                    if response.parts:
                        final_answer = response.text
                    else:
                         final_answer = "I received the function result but couldn't generate a final text answer."
                         print(f"Error: No text part after function call. Feedback: {response.prompt_feedback}")
    
                except Exception as func_exec_err:
                    print(f"ERROR executing function {fc.name}: {func_exec_err}")
                    final_answer = f"I tried to use my tool '{fc.name}' but encountered an execution error."
            else:
                print(f"ERROR: Declared function '{fc.name}' not found in Python code.")
                final_answer = f"I tried to use a tool '{fc.name}' but couldn't find it."
        
        # If no function call was requested in the first place
        elif response.parts:
            final_answer = response.text
        else:
            # Handle cases where the initial response might be blocked or empty
            print(f"Error: No function call and no text part in initial response. Feedback: {response.prompt_feedback}")
            final_answer = "Sorry, I couldn't generate a response based on the provided information."
    
    except Exception as e:
        print(f"An error occurred during generation or function handling: {e}")
        # Log feedback if available
        try:
            if response: print(f"Prompt Feedback: {response.prompt_feedback}")
        except: pass
        final_answer = f"Sorry, a technical error occurred: {e}"
    
    # --- Output Final Answer ---
    print(f"\n--- Final Generated Answer ---")
    print(final_answer)
    
    # 6. Evaluation (Optional)
    evaluation_result = None
    if evaluate:
        # Check if the evaluation model is ready before calling
        if evaluation_model:
            evaluation_result = evaluate_ludo_response(user_query, retrieved_context, final_answer)
        else:
            print("Skipping evaluation because the evaluation model is not available.")
            evaluation_result = {"error": "Evaluation model not available"}
        # You might return both answer and evaluation, or just log eval results
        
    # For simplicity, just return the answer string here
    return final_answer # Or potentially return a dict: {'answer': final_answer, 'evaluation': evaluation_result}

Main chat model (gemini-1.5-flash) initialized with tools.


# 9. Putting it to the Test: Running Example Queries

**The Purpose:** Having built the workflow, let's see how our Ludo assistant performs in practice. We'll run several example queries designed to test different aspects of its functionality.

**Test Cases:**
* **Basic Rule Recall:** Simple questions to see if it can retrieve and explain core rules (e.g., "How do I win?").
* **Function Call Trigger:** Queries that might involve ambiguous terms, potentially prompting the AI to use the `clarify_ludo_term` function (e.g., "What does 'capture' mean?").
* **Grounding Limits:** Questions asking for information likely *not* in our knowledge base (e.g., strategic advice) to see if the assistant correctly states it cannot answer from the provided context.
* **Specific Scenarios:** Questions about interactions between rules (e.g., "Can I be captured on a star square?").
* **Complex Interactions:** Queries requiring understanding of multiple rules simultaneously (e.g., blocks affecting safe squares).

**Observation:** By running these diverse queries and examining the generated answers (and optional evaluations), we can gain insights into the assistant's strengths and weaknesses, identifying areas for further refinement of the knowledge base, retrieval, prompting, or function logic.

In [15]:
if chat_model:
    # --- Example 1: Basic Rule Question ---
    query1 = "How do I win Ludo?"
    answer1 = handle_ludo_query(query1, evaluate=True)
    # Enable evaluation for this one
    print("\n" + "="*50 + "\n")

    # --- Example 2: Question likely triggering Function Call ---
    query2 = "What does 'capture' mean exactly in the rules?" 
    # This might trigger clarify_ludo_term if Gemini thinks the context needs supplement
    answer2 = handle_ludo_query(query2, evaluate=False) # Evaluation disabled
    print("\n" + "="*50 + "\n")
    
    # --- Example 3: Question testing Grounding (Answer likely not in KB) ---
    query3 = "What's the best opening strategy?" 
    # KB doesn't contain strategy advice, should state that
    answer3 = handle_ludo_query(query3, evaluate=True)
    print("\n" + "="*50 + "\n")
    
    # --- Example 4: Question about safe squares ---
    query4 = "Can I be captured on a star square?"
    answer4 = handle_ludo_query(query4, evaluate=False)
    print("\n" + "="*50 + "\n")
    
    # --- Example 5: More complex question potentially needing clarification ---
    query5 = "How do blocks affect capturing on safe squares?" 
    # Might retrieve multiple rules, clarity is key
    answer5 = handle_ludo_query(query5, evaluate=True)
    print("\n" + "="*50 + "\n")
else:
    print("Cannot run examples because the main chat model failed to initialize.")


=== Handling Query: 'How do I win Ludo?' ===
DEBUG: Retrieved 1 snippets for query 'How do I win Ludo?'

--- Final Generated Answer ---
The provided text does not explain how to win Ludo.


--- Evaluating Response ---
Query: How do I win Ludo?
Context:
- A block can be moved forward together if the player rolls an even number. The die roll is divided by two, and the block moves that many spaces. For example, rolling a 4 allows the block to move 2 spaces. Some variations do not allow moving blocks.
Generated Answer:
The provided text does not explain how to win Ludo.

--- Evaluation Result (JSON) ---
{
  "faithfulness_score": 5,
  "faithfulness_justification": "The answer accurately reflects the provided context, which does not describe how to win at Ludo.",
  "relevance_score": 5,
  "relevance_justification": "The answer is perfectly relevant to the query as it states that the provided text does not contain the answer.",
  "clarity_score": 5,
  "clarity_justification": "The answer is 

# 10. Conclusion and Key Learnings

**Summary:** This notebook successfully demonstrated the creation of a Ludo Rules Assistant by leveraging Google's Gemini model combined with several powerful techniques. We addressed the core problem – the need for clear, accurate Ludo rule explanations – by implementing:
* **RAG:** To ground answers in a factual knowledge base.
* **Few-Shot Prompting:** To ensure responses are consistently clear and well-formatted.
* **Function Calling:** To handle specific tasks like term clarification, making the assistant more robust.
* **Prompt-Based Grounding:** To minimize hallucinations and ensure reliability.
* **GenAI Evaluation:** To systematically assess the quality of the generated answers.

**Key Learnings:** Building a high-quality, domain-specific AI assistant often requires more than just basic prompting. Integrating techniques like RAG, function calling, and careful prompt engineering (including few-shot examples and grounding) significantly enhances accuracy, reliability, and helpfulness. Evaluation is critical for understanding performance and guiding improvements.

---
This notebook was developed by **Asmit Patidar**.