# Lecture Q&A Chatbot with Context Caching for Efficient Retrieval
## Background and Purpose
This workflow implements a Q&A chatbot using LlamaIndex and the Gemini API, designed for students who want to analyze lecture materials by uploading a class presentation (PDF) and an audio recording. It uses context caching to store both files once, so repeated queries reference the cache instead of re-uploading and re-processing the files on every request. Multi-turn conversations let students explore the lecture iteratively, extracting key points and clarifying concepts through follow-up questions.

## Use Case
A student attending a machine learning course uploads the lecture slides (PDF) and an audio recording of the class to the chatbot. They ask, "Summarize the key points from the lecture," to get a concise overview of the material. Later, they follow up with, "Explain the main concept discussed in slide 5," to dive deeper into a specific topic, such as neural network architectures. The chatbot uses cached content to efficiently process both queries, providing accurate, context-aware answers without repeatedly uploading files.

## Import Required Libraries
This cell imports necessary libraries, including Google’s generative AI client for file uploads and caching, LlamaIndex’s `GoogleGenAI` for LLM interactions, and components for handling chat messages.

In [1]:
from google import genai
from google.genai.types import CreateCachedContentConfig, Content, Part
from llama_index.llms.google_genai import GoogleGenAI
from llama_index.core.llms import ChatMessage
import time

## Upload and Cache Lecture Files
Defines a function that uploads a PDF presentation and an audio recording to the Gemini API, waits for each file to finish processing, and creates a cache containing both files with a one-hour TTL. The cache lets later queries reuse the lecture content, and its system instruction directs the model to give accurate, detailed answers grounded in the provided materials.

In [2]:
def upload_and_cache_files(pdf_path: str, audio_path: str, api_key: str) -> str:
    client = genai.Client(api_key=api_key)
    
    # Upload PDF
    pdf_file = client.files.upload(file=pdf_path)
    while pdf_file.state.name == "PROCESSING":
        time.sleep(2)
        pdf_file = client.files.get(name=pdf_file.name)
    
    # Upload Audio
    audio_file = client.files.upload(file=audio_path)
    while audio_file.state.name == "PROCESSING":
        time.sleep(2)
        audio_file = client.files.get(name=audio_file.name)
    
    # Create cache with both files
    cache = client.caches.create(
        model="gemini-2.5-flash",
        config=CreateCachedContentConfig(
            display_name="Lecture Q&A Cache",
            system_instruction=(
                "You are an expert lecture assistant. Provide accurate and detailed "
                "answers based on the provided lecture presentation (PDF) and audio recording."
            ),
            contents=[
                Content(
                    role="user",
                    parts=[
                        Part.from_uri(file_uri=pdf_file.uri, mime_type="application/pdf"),
                        Part.from_uri(file_uri=audio_file.uri, mime_type="audio/mp3"),
                    ],
                )
            ],
            ttl="3600s",
        ),
    )
    print("Cache created successfully")
    return cache.name
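
The polling loops above exit as soon as a file leaves the `PROCESSING` state, which includes the case where processing failed, and they never give up if a file stays stuck. The following is a minimal sketch of a more defensive wait. It assumes the file object exposes `name` and `state.name` as used in the function above, and that a successfully processed file reports the state `ACTIVE`. The helper name `wait_until_active`, its `refresh` callback, and the default timeout are hypothetical choices for illustration, not part of the Gemini SDK.

```python
import time
from typing import Any, Callable


def wait_until_active(
    file: Any,
    refresh: Callable[[str], Any],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
) -> Any:
    """Poll until `file` leaves the PROCESSING state, or raise.

    `refresh` receives the file's name and must return an updated file
    object, e.g. `lambda name: client.files.get(name=name)`.
    """
    deadline = time.monotonic() + timeout_seconds
    while file.state.name == "PROCESSING":
        # Give up instead of looping forever on a stuck upload
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{file.name} still processing after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
        file = refresh(file.name)
    # Any terminal state other than ACTIVE is treated as a failure
    if file.state.name != "ACTIVE":
        raise RuntimeError(f"{file.name} ended in state {file.state.name}")
    return file
```

Inside `upload_and_cache_files`, each loop could then be replaced by a call such as `pdf_file = wait_until_active(pdf_file, lambda name: client.files.get(name=name))`, so that a failed upload surfaces before the cache is created.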

## Create Q&A Messages
Constructs a list of `ChatMessage` objects for the conversation, appending a new user query to any existing messages. This supports multi-turn conversations by maintaining context for follow-up questions without directly embedding files in each query, as they are stored in the cache.

In [3]:
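from typing import Optional

def create_qa_messages(query: str, previous_messages: Optional[list] = None) -> list:
    # Copy the history so the caller's list is not mutated in place
    messages = list(previous_messages or [])
    messages.append(ChatMessage(role="user", content=query))
    return messages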
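
A multi-turn exchange needs both sides of each turn in the history: the user query before the call and the assistant reply after it. The sketch below wraps that bookkeeping in one hypothetical helper, `ask`. To keep it runnable without LlamaIndex installed, it uses a stand-in `Msg` dataclass; in the notebook, `ChatMessage` would be passed as `message_cls` instead. It assumes only that the LLM object has a `chat(messages)` method whose result exposes the reply text as `response.message.content`.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Msg:
    """Stand-in for ChatMessage so the sketch runs on its own (hypothetical)."""
    role: str
    content: str


def ask(llm: Any, history: List[Any], query: str, message_cls: Any = Msg) -> str:
    """Send `query` along with prior turns and record both sides of the exchange."""
    history.append(message_cls(role="user", content=query))
    response = llm.chat(history)
    # Store only the reply text, not the role-prefixed string form of the response
    answer = response.message.content
    history.append(message_cls(role="assistant", content=answer))
    return answer
```

With this shape, each turn is a single call such as `ask(llm, history, "Summarize the key points", message_cls=ChatMessage)`, and `history` grows by two messages per turn.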


## Execute Workflow
Runs the chatbot with a sample initial query ("Summarize the key points from the lecture presentation and audio."), a follow-up query ("Explain the main concept discussed in slide 5 of the presentation."), and paths to a PDF and audio file. Replace `sample_lecture_slides.pdf` and `sample_lecture_audio.mp3` with real file paths (e.g., a lecture presentation and recording) to test functionality. Prints the results from both turns.
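
The cell below uses a placeholder string for the API key. As a minimal sketch of one alternative, the key can be read from an environment variable so it never appears in the notebook. The helper name `load_api_key` and the variable name `GOOGLE_API_KEY` are assumptions made for this example; any variable name works as long as it is set before the notebook runs.

```python
import os


def load_api_key(var_name: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in `var_name`, or raise if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before running."
        )
    return key
```

The run cell could then use `api_key = load_api_key()` in place of the hardcoded placeholder.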

In [None]:
if __name__ == "__main__":
    api_key = "your_api_key"  # Replace with your actual API key
    initial_query = "Summarize the key points from the lecture presentation and audio."
    follow_up_query = "Explain the main concept discussed in slide 5 of the presentation."
    pdf_path = "sample_lecture_slides.pdf"  # Replace with a real PDF path
    audio_path = "sample_lecture_audio.mp3"  # Replace with a real audio path

    print("Uploading and caching files...")
    cache_name = upload_and_cache_files(pdf_path, audio_path, api_key)
    llm = GoogleGenAI(
        model="gemini-2.5-flash",
        api_key=api_key,
        cached_content=cache_name
    )
    
    # Initial query
    messages = create_qa_messages(initial_query)
    initial_response = llm.chat(messages)
    print("\nInitial Response:")
    print(initial_response)

    # Add the assistant's reply (text only, without the "assistant: " prefix) for multi-turn
    messages.append(ChatMessage(role="assistant", content=initial_response.message.content))
    
    # Follow-up query
    messages = create_qa_messages(follow_up_query, previous_messages=messages)
    follow_up_response = llm.chat(messages)
    print("\nFollow-up Response:")
    print(follow_up_response)

Uploading and caching files...
Cache created successfully

Initial Response:
assistant: This lecture provides a detailed introduction to Convolutional Neural Networks (CNNs), covering their historical development, core architectural components, and practical applications.

Here are the key points:

**1. Recap of Neural Networks (NNs):**
*   Last time, we discussed basic Neural Networks, which stack linear layers with non-linearities (activation functions).
*   NNs can address problems like the "mode problem" by learning intermediate templates (e.g., different types of cars) and combining them for a final classification score.

**2. A Bit of History:**
*   **Perceptron (1957, Frank Rosenblatt):** The Mark I Perceptron was the first implementation of the perceptron algorithm. It used 20x20 photocells (400 pixels) to recognize letters. It had an update rule similar to backpropagation but lacked a principled backpropagation technique.
*   **Adaline/Madaline (1960, Widrow and Hoff):** Devel