# Python for Learning AI Week 3: Building AI Assistants with Gemini API

Welcome to Week 3 of our Python development series! This week, we're diving into the exciting world of Large Language Models (LLMs) by building our own AI assistant using Google's Gemini API. By the end of this session, you'll have created your own AI assistant and gained valuable insights into how modern AI systems work.

**Superhero Sidekick Metaphor**: Throughout this notebook, we'll compare working with LLMs to having your own superhero AI sidekick:
- The LLM is like your superhero sidekick with encyclopedic knowledge and special powers
- Prompts are like mission briefings you give to your sidekick
- API parameters are like special equipment you provide (e.g., "be more creative" or "stay focused")
- The sidekick's responses are the intelligence and assistance they provide for your mission

## What You'll Learn
1. Understanding Large Language Models and Google's Gemini
2. Setting up your API credentials securely
3. Making requests to the Gemini API
4. Customizing responses with different parameters
5. Building a simple chatbot with persistent history

## Prerequisites
- Basic Python knowledge ([covered in Week 1](../week1/week1_python_basics.ipynb))
- Understanding of APIs ([covered in Week 2](../week2/001_apis_and_networking.ipynb))
- A Google account (to access the Gemini API)

## 1. Setting Up Your Environment

Before we can start using the Gemini API, we need to set up our environment properly. This includes adding our API key to the `.env` file and installing the necessary packages.

### Installing Required Packages

To set up your environment for using the Gemini API, you'll need to install the necessary packages. Follow these steps:

1. Open a terminal window in VS Code (Terminal → New Terminal)
2. Navigate to the project root directory (where the `pyproject.toml` file is located)
3. Run the following command to install all project dependencies using UV:
   ```bash
   uv pip install -e .
   ```
   This command installs the project in "editable" mode, allowing you to modify the code without reinstalling.

4. Install the Google Genai library (if it's not already in the project dependencies):
   ```bash
   uv pip install google-genai
   ```
   
   The `google-genai` package is Google's latest official Python client library for the Gemini API. It provides:
   - Simple methods to interact with Google's Generative AI models
   - Tools for sending prompts and receiving responses
   - Support for chat-based interactions with conversation history
   - Structured data handling and formatting options with Pydantic integration
   - Authentication and API key management
   - Modern API design with improved developer experience

5. You can also check for the latest versions of packages and update them:
   ```bash
   uv pip install --upgrade google-genai python-dotenv
   ```

### Configuring Your Gemini API Key

To use the Gemini API, you need an API key. Follow these steps:

1. Visit [Google AI Studio](https://aistudio.google.com/) and sign in with your Google account. 
   *Note: Using your personal Gmail ID is perfectly fine for this exercise.*
2. Click "Get API key" in the side navigation bar.
3. Click "Create API key" in the top-right corner.
4. Provide a suitable name for the API key.
5. Create a new project in which the API key will be created.
6. Create your API key.
7. In the table, click the API key you just created and copy it (it starts with `AI`).

Now, add your API key to the existing `.env` file in the project root directory by adding these lines:
```
GEMINI_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.5-flash
```

You can change the model to any of the available models like:
- gemini-2.5-flash
- gemini-2.5-flash-lite
- gemini-2.5-pro
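
To make the `.env` format concrete, here is a minimal sketch of roughly what a loader such as `load_dotenv()` does with those lines. The `parse_env_text` helper is hypothetical and heavily simplified (no multi-line values, no variable expansion); it is not the library's actual implementation.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments (simplified)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


sample_env = """
# Gemini settings
GEMINI_API_KEY=your_api_key_here
GEMINI_MODEL=gemini-2.5-flash
"""

settings = parse_env_text(sample_env)
print(settings["GEMINI_MODEL"])
```

In the notebook itself, `load_dotenv()` places these values into the process environment so that `os.getenv()` can read them.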

⚠️ **Important:** Never share your API key with others or commit it to public repositories!

Let's check if we can load the API key from the `.env` file:

In [None]:
# Load environment variables from .env file and initialize the Google Genai library
import os
from dotenv import load_dotenv
from google import genai

# Load the environment variables from the .env file
load_dotenv()

# Get the API key
api_key = os.getenv('GEMINI_API_KEY')
# Read the model name from .env, falling back to a default if GEMINI_MODEL is not set
GEMINI_MODEL = os.getenv('GEMINI_MODEL', 'gemini-2.5-flash')

print(f"Using Gemini model: {GEMINI_MODEL}")

# Check if the API key exists and is properly formatted
if api_key and api_key.startswith('AI'):
    print("✅ API key loaded successfully!")
    # Show a masked version of the key for verification (first 4 chars and last 4 chars)
    masked_key = api_key[:4] + '*' * (len(api_key) - 8) + api_key[-4:]
    print(f"API Key: {masked_key}")
    
    # Configure the client with our key
    client = genai.Client(api_key=api_key)
    
    # Fetch the model list once, then filter by the action each model supports
    models = list(client.models.list())

    print("List of models that support generateContent:\n")
    for m in models:
        if "generateContent" in (m.supported_actions or []):
            print(m.name)

    print("\nList of models that support embedContent:\n")
    for m in models:
        if "embedContent" in (m.supported_actions or []):
            print(m.name)
else:
    print("❌ API key not found or invalid.")
    print("Please add GEMINI_API_KEY=your_api_key_here to your .env file in the project root directory.")

## 2. Making Simple Requests to Gemini

Now that we've set up our environment and connected to the API, let's start by making a simple request to the Gemini model. We'll create a function that allows us to send a prompt and receive a response.

### Understanding the API Flow

In this notebook, we're interacting with Google's hosted Large Language Models (LLMs) through their Gemini API. Let's understand the basic flow:

```
┌───────────────────┐      HTTP Request       ┌─────────────────┐
│                   │                         │                 │
│    Our Code       │                         │                 │
│    (Python)       │                         │  Google's LLM   │
│                   │                         │  (Cloud-hosted) │
│ ┌─────────────┐   │                         │                 │
│ │ genai.Client│───────────────────────────> │                 │
│ └─────────────┘   │   API Key + Prompt      │                 │
│                   │ <─────────────────────  │                 │
│                   │    Generated Content    │                 │
└───────────────────┘                         └─────────────────┘
```

**About the Gemini API with the google-genai Client**:
- **API Type**: Modern Python client for Google's Gemini models
- **Primary Client Methods**: 
  - `client.models.generate_content()` - For one-off content generation requests
  - `client.chats.create()` - Creates conversation sessions
  - `chat.send_message()` - For multi-turn conversations
- **Authentication**: Uses API key-based authentication via the Client object
- **Request Format**: Structured Python objects converted to API requests
- **Response Format**: Rich response objects with text, metadata, and usage statistics

**What's happening:**
1. We initialize a `genai.Client()` with our API key
2. The client handles connection details, authentication, and request formatting
3. Google's servers process our prompts using their trained LLMs (Gemini models)
4. The client parses the responses into structured objects with helpful methods and properties
5. We access the results through these objects (e.g., `response.text`, `response.usage_metadata`)
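
To illustrate step 2, the sketch below builds (but never sends) the kind of HTTP request the client prepares for us. The endpoint path, the `x-goog-api-key` header, and the body shape follow the public Gemini REST documentation as I understand it. Treat them as assumptions: the real client adds more (retries, extra headers, response parsing), and `build_generate_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_generate_request(api_key, model, prompt):
    """Build, without sending, a generateContent request for the given prompt."""
    # Assumed endpoint shape; the API version segment may differ in practice
    url = f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


request = build_generate_request("your_api_key_here", "gemini-2.5-flash", "Hello, sidekick!")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Nothing is transmitted here, so the sketch runs without a real key. In the notebook, `genai.Client` handles all of this for us.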

In this notebook, we'll be using both the client's `models.generate_content()` method for single requests and the chat functionality (`client.chats.create()` and `chat.send_message()`) for conversation-based interactions to demonstrate different ways of working with the modern Gemini API.

**Superhero Sidekick Metaphor**: This is like giving your superhero sidekick a simple reconnaissance mission—perhaps asking them to scout ahead and report what they see. We're starting with something simple to test their capabilities.

In [None]:
# Create a function for basic interaction with Gemini
def ask_gemini(prompt, model_name=None):
    """
    Send a prompt to the Gemini model and get a response.
    
    Args:
        prompt (str): The prompt to send to the model
        model_name (str, optional): The name of the Gemini model to use, defaults to environment variable
        
    Returns:
        str: The model's response
    """
    # Use the model from environment variable if not specified
    if model_name is None:
        model_name = GEMINI_MODEL
        
    print(f"Using model: {model_name}")
    
    # Initialize client
    client = genai.Client(api_key=api_key)
    
    # Generate a response
    response = client.models.generate_content(
        model=model_name,
        contents=prompt
    )
    
    return response.text

In [None]:
# Let's try a simple prompt
prompt = "Who are some super heroes who can code in Python?"
response = ask_gemini(prompt)

print("Prompt:", prompt)
print("\nResponse:")
print(response)

## 3. Customizing Model Behavior with Parameters

One powerful aspect of using LLMs like Gemini is that you can customize various parameters to control the behavior of the model. Let's explore the most important ones:

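- **Temperature**: Controls randomness (values near 0.0 give focused, nearly deterministic output; values around 1.0 give more varied, creative output; recent Gemini models accept values up to 2.0)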
- **Top-k**: Limits token selection to the k most likely next tokens
- **Top-p**: Limits token selection to a subset with a cumulative probability of p

**Note**: Depending on the specific model you're using (e.g., gemini-2.5-flash vs. gemini-2.5-pro), there might be variations in how these parameters affect the output. Some models may have additional parameters or different optimal ranges. Always refer to the documentation for your chosen model for the most accurate information.
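
To build intuition for these three parameters, here is a small pure-Python sketch that applies them to a made-up set of token scores. The `apply_sampling_params` function, the example logits, and the order of the filtering steps are illustrative assumptions. Gemini's internal sampling is not public, so read this as a textbook-style model rather than a description of the real implementation.

```python
import math
import random


def apply_sampling_params(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    # Temperature rescales the scores: small values sharpen the distribution
    t = max(temperature, 1e-6)
    scaled = {tok: score / t for tok, score in logits.items()}

    # Softmax (shifted by the max score for numerical stability)
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

    # Top-k: keep only the k most likely tokens
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches p
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1
    norm = sum(prob for _, prob in kept)
    return {tok: prob / norm for tok, prob in kept}


# Made-up scores for the next token
example_logits = {"hero": 2.0, "villain": 1.0, "sidekick": 0.5, "banana": -1.0}

cold = apply_sampling_params(example_logits, temperature=0.2)
hot = apply_sampling_params(example_logits, temperature=1.0)
print(f"P(hero) at temperature 0.2: {cold['hero']:.3f}")
print(f"P(hero) at temperature 1.0: {hot['hero']:.3f}")

# Draw one token from the filtered distribution
filtered = apply_sampling_params(example_logits, temperature=1.0, top_k=3, top_p=0.95)
print(random.choices(list(filtered), weights=list(filtered.values()))[0])
```

Lower temperatures concentrate probability on the top token, while `top_k` and `top_p` remove unlikely tokens before the final draw.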

**Superhero Sidekick Metaphor**: These parameters are like configuring your sidekick's approach to the mission:
- Temperature is like adjusting how creative or by-the-book your sidekick should be
- Top-k is like limiting their toolkit to only the most reliable gadgets
- Top-p is like telling them to only use strategies they're confident in

**Want to learn more about these parameters?**
- [Google AI: Gemini API Parameter Guide](https://ai.google.dev/docs/concepts#model_parameters)
- [Understanding Temperature in LLMs](https://www.promptingguide.ai/introduction/settings.en#temperature)
- [OpenAI API Parameters Guide](https://platform.openai.com/docs/api-reference/chat/create#chat-create-temperature) (similar concepts apply across LLMs)

In [None]:
# Enhanced function with customizable parameters
def ask_gemini_custom(prompt, temperature=0.7, top_k=40, top_p=0.95, model_name=None):
    """
    Send a prompt to the Gemini model with customizable parameters.
    
    Args:
        prompt (str): The prompt to send to the model
        temperature (float): Controls randomness (0.0 = deterministic, 1.0 = creative)
        top_k (int): Limits token selection to the k most likely next tokens
        top_p (float): Limits token selection to a subset with cumulative probability of p
        model_name (str, optional): The name of the Gemini model to use
        
    Returns:
        str: The model's response
    """
    # Use the model from environment variable if not specified
    if model_name is None:
        model_name = GEMINI_MODEL
        
    # Configure generation parameters
    generation_config = {
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p,
    }
    
    # Initialize client
    client = genai.Client(api_key=api_key)
    
    # Generate a response with the specified configuration
    response = client.models.generate_content(
        model=model_name,
        contents=prompt,
        config=generation_config
    )
    
    return response.text


In [None]:
# Let's try the same prompt with different temperatures
# Rerun this cell and the next code cell a few times to compare how consistent the responses are

prompt = "Write a very short poem about super heroes using respective movie references."

print("Temperature = 0.2 (More focused, less creative)")
print("-" * 50)
response_cold = ask_gemini_custom(prompt, temperature=0.2)
print(response_cold)

### Experiment Time: Play with Parameters!

Now it's your turn to experiment! Try running the following cells multiple times to see how different temperature values affect the responses. You can also modify the code to:

1. **Try different temperature values** (between 0.0 and 1.0)
2. **Experiment with top_k** (try values like 10, 20, 40, 100) 
3. **Adjust top_p** (try values between 0.5 and 1.0)
4. **Ask different questions** to see how parameters affect various types of requests

The best way to understand these parameters is to experiment with them yourself. Notice how lower temperatures produce more consistent responses, while higher temperatures create more varied and creative outputs.

**Challenge**: After running the cells below, try creating your own cell that uses very different parameter combinations to see what happens!
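
One way to organize such experiments is a small sweep helper that calls a function once per parameter combination. The sketch below is a suggestion rather than part of the original notebook: `sweep_parameters` and `fake_ask` are hypothetical names, and `fake_ask` stands in for the model so the example runs without an API key.

```python
from itertools import product


def sweep_parameters(ask, prompt, temperatures, top_ks, top_ps):
    """Call `ask` once per parameter combination and collect the results."""
    results = []
    for temperature, top_k, top_p in product(temperatures, top_ks, top_ps):
        text = ask(prompt, temperature=temperature, top_k=top_k, top_p=top_p)
        results.append(
            {"temperature": temperature, "top_k": top_k, "top_p": top_p, "response": text}
        )
    return results


def fake_ask(prompt, temperature, top_k, top_p):
    """Stand-in for the model: echoes the settings it was called with."""
    return f"[t={temperature}, k={top_k}, p={top_p}] {prompt}"


runs = sweep_parameters(fake_ask, "Write a haiku about capes.", [0.0, 1.0], [10, 40], [0.9])
for run in runs:
    print(run["response"])
```

In the notebook, passing `ask_gemini_custom` instead of `fake_ask` should work, since it accepts the same keyword arguments. Keep the grid small, because every combination is a separate API call.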

In [None]:
print("Temperature = 1.0 (More creative, more diverse)")
print("-" * 50)
response_hot = ask_gemini_custom(prompt, temperature=1.0)
print(response_hot)

### Understanding Tokens and Monitoring in LLM Platforms

When working with LLMs like Gemini, it's important to understand how tokens affect your usage and costs. Tokens are the basic units that LLMs process - they can be parts of words, words, or punctuation.

**Monitoring in Google AI Studio**

For detailed analysis of your Gemini API usage:

1. Go to [Google AI Studio](https://aistudio.google.com/)
2. Click on "API Keys" in the sidebar
3. Select the API key you're using
4. View the "Usage" tab to see:
   - Token consumption over time
   - API call frequency
   - Model usage distribution
   - Cost estimates (if applicable)

**Alternative: OpenAI Tokenizer Tools**

If you're working with multiple LLM providers or want to analyze token usage outside of Google AI Studio:

1. Use OpenAI's [Tokenizer Tool](https://platform.openai.com/tokenizer) - a free web interface that shows exactly how text is split into tokens

This monitoring is essential for managing costs and optimizing your prompts, especially in production applications where efficiency matters.

Remember that each LLM provider uses its own tokenizer, so counts from another provider's tool are only rough estimates for Gemini. For exact numbers, use the `count_tokens` call shown in the next cell.
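
For a quick offline estimate, a character-based rule of thumb is often good enough. The sketch below assumes roughly 4 characters per token, a commonly quoted approximation for English text. The exact ratio varies by model and language, and `estimate_tokens` is a hypothetical helper, not part of any library.

```python
import math

CHARS_PER_TOKEN = 4  # assumed rule of thumb, not an exact figure


def estimate_tokens(text, chars_per_token=CHARS_PER_TOKEN):
    """Rough token estimate based on character count alone."""
    return math.ceil(len(text) / chars_per_token)


sample_prompt = "Write a short paragraph about how Iron Man's technology works."
print(f"{len(sample_prompt)} characters, roughly {estimate_tokens(sample_prompt)} tokens")
```

An estimate like this is useful for budgeting before a request is sent, but it does not replace the API's own token count.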

In [None]:
# Demonstrating token counting with google-genai
# This shows how to count tokens before sending a request and get usage metadata after

# Initialize client 
client = genai.Client(api_key=api_key)

# Sample prompt
prompt = "Write a short paragraph about how Iron Man's technology works."
print(f"Prompt: \"{prompt}\"")

# Count tokens in the prompt before sending
token_count = client.models.count_tokens(
    model=GEMINI_MODEL, 
    contents=prompt
)
print("\nToken count before generation:")
print(f"Prompt tokens: {token_count.total_tokens}")

# Generate content with the prompt
response = client.models.generate_content(
    model=GEMINI_MODEL,
    contents=prompt
)

# Display the response
print("\nResponse:")
print(response.text[:200] + "..." if len(response.text) > 200 else response.text)

# Show detailed token usage statistics
print("\nDetailed token usage statistics:")
print(f"Prompt tokens: {response.usage_metadata.prompt_token_count}")
print(f"Response tokens: {response.usage_metadata.candidates_token_count}")
print(f"Total tokens: {response.usage_metadata.total_token_count}")
print(f"Token rate: {response.usage_metadata.prompt_token_count / len(prompt):.2f} tokens per character")


In [None]:

# Demonstrate with a more complex prompt
complex_prompt = """
Analyze the following technologies and compare their efficiency:
1. Iron Man's Arc Reactor
2. Black Panther's Vibranium suit
3. Batman's gadgets
4. Wonder Woman's equipment
"""

print("\n\nComplex prompt token analysis:")
complex_token_count = client.models.count_tokens(
    model=GEMINI_MODEL, 
    contents=complex_prompt
)
print(f"Complex prompt tokens: {complex_token_count.total_tokens}")
print(f"Character count: {len(complex_prompt)}")
print(f"Token to character ratio: {complex_token_count.total_tokens / len(complex_prompt):.2f}")

print("\nUnderstanding token counts helps optimize your API usage and costs!")

## 4. Building a Simple Chatbot with History

So far, we've been making one-off requests to the Gemini model. However, for a true chatbot experience, we need to maintain conversation history to provide context for follow-up questions. The Gemini API makes this easy with its chat functionality.
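
Conceptually, a chat session is a growing list of turns that is passed to the model again with every new message. The toy class below sketches that idea with a stand-in model so it runs offline. `TinyChat` and `fake_model` are hypothetical names, and the real chat object in `google-genai` does considerably more than this.

```python
class TinyChat:
    """Toy chat session that hands the whole history to the model on every turn."""

    def __init__(self, generate):
        self.generate = generate  # callable: list of turns -> reply text
        self.history = []

    def send_message(self, message):
        # Record the user's turn, ask the model, then record its reply
        self.history.append({"role": "user", "text": message})
        reply = self.generate(self.history)
        self.history.append({"role": "model", "text": reply})
        return reply


def fake_model(turns):
    """Stand-in for the LLM: reports how many user turns it can see."""
    user_turns = [turn for turn in turns if turn["role"] == "user"]
    return f"I can see {len(user_turns)} user message(s)."


toy_chat = TinyChat(fake_model)
first_reply = toy_chat.send_message("Who is Iron Man?")
second_reply = toy_chat.send_message("What powers his suit?")
print(first_reply)
print(second_reply)
```

Because the second call sees both user messages, a follow-up such as "What powers his suit?" can be answered in context.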

**Superhero Sidekick Metaphor**: This is like having an ongoing mission with your sidekick, where they remember all the previous intel and context, allowing for a more coordinated operation rather than treating each interaction as a brand new mission.

In [None]:
# Let's create a function that prepares a chat session for a superhero-themed conversation
def setup_hero_chat(model_name=None):
    """
    Set up a chat session for a superhero-themed conversation.
    
    Args:
        model_name (str, optional): The name of the Gemini model to use
    
    Returns:
        tuple: The chat session object and the client object
    """
    # Use the model from environment variable if not specified
    if model_name is None:
        model_name = GEMINI_MODEL
        
    print(f"Setting up superhero chat with model: {model_name}")
    
    # Initialize client
    client = genai.Client(api_key=api_key)
    
    # Create a chat session
    chat = client.chats.create(
        model=model_name,
        history=[]  # Start with an empty conversation
    )
    
    print("Superhero chat session initialized and ready for your questions!")
    print("This chat maintains context between questions, like a true sidekick.")
    print("-" * 50)
    
    return chat, client

# Initialize our superhero chat session
hero_chat, hero_client = setup_hero_chat()

In [None]:
# Question 1: Ask about scientist superheroes
question1 = "Who are the most popular superheroes that are also scientists?"
print(f"Question: {question1}\n")

# Send the message and get the response
try:
    response1 = hero_chat.send_message(question1)
    print(f"Gemini: {response1.text}")
except Exception as e:
    print(f"Error: {str(e)}")

In [None]:
# Question 2: Ask about Iron Man's scientific background
question2 = "Can you describe the scientific background of Tony Stark/Iron Man?"
print(f"Question: {question2}\n")

# Send the message and get the response
try:
    response2 = hero_chat.send_message(question2)
    print(f"Gemini: {response2.text}")
except Exception as e:
    print(f"Error: {str(e)}")

In [None]:
# Question 3: Ask about Black Panther's vibranium technology
question3 = "What scientific principles does Black Panther's vibranium technology use?"
print(f"Question: {question3}\n")

# Send the message and get the response
try:
    response3 = hero_chat.send_message(question3)
    print(f"Gemini: {response3.text}")
except Exception as e:
    print(f"Error: {str(e)}")

In [None]:
# Bonus question: Ask for a comparison that references previous answers
bonus_question = "Compare the scientific approaches of the previous hero and Black Panther. Which is more advanced?"
print(f"Question: {bonus_question}\n")

# Send the message and get the response
try:
    bonus_response = hero_chat.send_message(bonus_question)
    print(f"Gemini: {bonus_response.text}")
    print("\nNotice how the model remembers the context from previous questions and can make comparisons!")
except Exception as e:
    print(f"Error: {str(e)}")

In [None]:
# Try your own question!
# Replace this with any superhero-related question you want to ask
your_question = "What would happen if Spider-Man and Batman teamed up?"

print(f"Your Question: {your_question}\n")

# Send the message and get the response
try:
    your_response = hero_chat.send_message(your_question)
    print(f"Gemini: {your_response.text}")
except Exception as e:
    print(f"Error: {str(e)}")

print("\nTry modifying this cell with different questions to see how the conversation continues!")

## 5. Implementing Persistent Chat History

Now let's create a class that can save chat history to a file for future reference.

### Benefits of Persisting Chat History

Saving chat history to disk provides several advantages for AI applications:

1. **Continuity Between Sessions**: Users can pick up conversations where they left off, even after closing the application.

2. **Analytics and Improvement**: 
   - Track common questions and improve responses
   - Identify patterns in user interactions
   - Use historical data to fine-tune future AI models

3. **User Experience**: 
   - Maintain context for longer, more meaningful conversations
   - Provide personalized responses based on past interactions
   - Create a "memory" that builds relationships with users

4. **Debugging and Quality Control**:
   - Review past interactions to identify where the AI might have provided incorrect information
   - Understand how conversations evolve and where they might go off-track

5. **Application Integration**: The saved JSON data can be easily integrated with databases, dashboards, or other tools for further processing.

In our superhero sidekick metaphor, this is like your sidekick keeping a detailed mission log that you can review together later, helping both of you improve your teamwork and tactics for future missions.

In [None]:
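# Class for persistent chat using google-genai
import datetime
import json

from google.genai import types


class PersistentChat:
    def __init__(self, model_name=None, history_file='chat_history.json'):
        """
        Initialize a chat with persistent history.

        This class enables conversations with Gemini that are saved to disk, allowing the
        chat history to persist between sessions and be analyzed later. This is especially
        useful for applications that need to maintain context over time, track user interactions,
        or analyze conversation patterns for further improvements.
        """
        # Use the model from environment variable if not specified
        if model_name is None:
            model_name = GEMINI_MODEL

        # Initialize the client
        self.client = genai.Client(api_key=api_key)
        self.model_name = model_name
        # History files live one directory above the notebook
        self.history_file = os.path.join('../', history_file)
        # Reload any previously saved messages so a new instance can resume the conversation
        self.chat_history = self.load_history()

        # Replay the saved turns into the new session (Gemini names the assistant role "model")
        past_turns = [
            types.Content(
                role="user" if entry["role"] == "user" else "model",
                parts=[types.Part(text=entry["message"])],
            )
            for entry in self.chat_history
        ]
        self.chat = self.client.chats.create(model=model_name, history=past_turns)

    def load_history(self):
        """Return the messages saved in the history file, or an empty list if there are none."""
        try:
            with open(self.history_file, 'r') as f:
                return json.load(f).get("conversations", [])
        except (FileNotFoundError, json.JSONDecodeError):
            return []
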
    def send_message(self, message):
        """Send a message and save the response to history."""
        response = self.chat.send_message(message)
        
        # Add to history
        self.chat_history.append({"role": "user", "message": message})
        self.chat_history.append({"role": "assistant", "message": response.text})
        
        # Save history
        self.save_history()
        
        return response.text
    
    def save_history(self):
        """Save chat history to file."""
        # Create a timestamped record
        history_record = {
            "timestamp": datetime.datetime.now().isoformat(),
            "conversations": self.chat_history
        }
        
        try:
            with open(self.history_file, 'w') as f:
                json.dump(history_record, f, indent=2)
            print(f"Chat history saved to {self.history_file}")
        except Exception as e:
            print(f"Error saving chat history: {e}")


### Continuing a Persistent Chat Session

One of the key benefits of the `PersistentChat` class is that you can easily save your conversation and resume it later. This allows users to maintain context across multiple sessions, creating a more coherent and personalized experience.

Here's how you can continue a conversation from a previous session:

1. **Create a new instance** of `PersistentChat` with the same `history_file` parameter
2. **Send new messages** to continue the conversation where you left off
3. **Access the full conversation history** from the JSON file whenever needed

This enables several powerful use cases:
- Building AI assistants that remember user preferences
- Creating long-running conversations that span days or weeks
- Analyzing conversation patterns over time
- Sharing conversation contexts between different applications

Example of continuing a conversation:
```python
# Start a conversation
chat = PersistentChat(history_file='my_hero_chat.json')
chat.send_message("Tell me about Wonder Woman")

# Later, even after restarting your application
continued_chat = PersistentChat(history_file='my_hero_chat.json') 
continued_chat.send_message("What are her greatest powers?")  # The AI remembers the context
```

The history is stored in a JSON file that includes timestamps and a complete record of the conversation, making it easy to integrate with other systems or analyze later.
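
As a small example of such analysis, the sketch below loads a history file and counts messages per role. It assumes the file has the `timestamp` and `conversations` keys written by `save_history`. The `summarize_history` helper and the demo record (including its timestamp) are made up for illustration.

```python
import json
import os
import tempfile
from collections import Counter


def summarize_history(path):
    """Load a saved history file and count messages per role."""
    with open(path, "r", encoding="utf-8") as f:
        record = json.load(f)
    messages = record.get("conversations", [])
    return {
        "saved_at": record.get("timestamp"),
        "message_count": len(messages),
        "by_role": dict(Counter(message["role"] for message in messages)),
    }


# Demo with a throwaway file in the same shape that PersistentChat writes
demo_record = {
    "timestamp": "2025-01-01T12:00:00",  # made-up timestamp
    "conversations": [
        {"role": "user", "message": "Tell me about Wonder Woman"},
        {"role": "assistant", "message": "(model reply goes here)"},
    ],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "my_hero_chat.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo_record, f)
    summary = summarize_history(demo_path)

print(summary)
```

The same function can be pointed at a real history file once a few messages have been saved.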

In [None]:

# Unlike the previous chat examples, PersistentChat saves all interactions to a file,
# allowing you to review past conversations, analyze responses, or even train future models.
print("\nExample of persistent chat:")
persistent_chat = PersistentChat(history_file='./superhero_chat_demo.json')
response1 = persistent_chat.send_message("Who would win in a fight between Batman and Iron Man, and why?")
print(f"Response: {response1[:150]}...")  # Show just the beginning for brevity

## 6. Conclusion: Your AI Superhero Sidekick Journey

Congratulations! You've successfully:
1. Set up your environment with the Gemini API
2. Made basic requests to generate text
3. Customized model behavior with parameters
4. Built a simple chatbot with conversation history
5. Saved chat history to a JSON file with a persistent chat class

**Next Steps:**
- Experiment with different parameters to see how they affect the responses
- Try creating a specialized assistant for a specific domain (e.g., a superhero encyclopedia or a comic book advisor)
- Explore more advanced prompt engineering techniques
- Try integrating Gemini into a web application or other projects

**Resources:**
- [Google AI Studio](https://aistudio.google.com/)
- [Gemini API Documentation](https://ai.google.dev/docs)
- [Prompt Engineering Guide](https://ai.google.dev/docs/prompt_best_practices)

Remember that LLM capabilities are constantly evolving, so keep exploring and learning as these technologies advance!

Happy coding and AI creating!