# From RAG to Agents: Building Smart AI Assistants

Videos:

* Part 1: https://www.youtube.com/watch?v=GH3lrOsU3AU
* Part 2: https://www.youtube.com/watch?v=yS_hwnJusDk

In this [workshop](https://github.com/alexeygrigorev/rag-agents-workshop), we:

* Build a RAG application on the FAQ database
* Make it agentic
* Learn about agentic search
* Give tools to our agents
* Use PydanticAI to make it easier

You can learn more about agents in the upcoming [AI Bootcamp course](https://maven.com/alexey-grigorev/from-rag-to-agents). Use code "DTC" to get $99 off.

Based on the code of this workshop, we developed a library, ["Toy AI Kit"](https://github.com/alexeygrigorev/toyaikit). It simplifies interaction with the OpenAI API when developing agents and helps you understand how other agent libraries are implemented.

For this workshop, we will use the following FAQ documents from our free courses:

* Machine Learning Zoomcamp
* Data Engineering Zoomcamp
* MLOps Zoomcamp

## Environment
* For this workshop, all you need is Python with Jupyter.
* I use GitHub Codespaces to run it (see [here](https://www.loom.com/share/80c17fbadc9442d3a4829af56514a194)) but you can use whatever environment you like.
* Also, you need an OpenAI account (or an alternative provider).

### Setting up GitHub Codespaces
GitHub Codespaces is the recommended environment for this workshop, but you can use any other environment with Jupyter Notebook, including your laptop or Google Colab.

* Create a repository on GitHub, initialize it with README.md
* Add the OpenAI key:
  * Go to Settings -> Secrets and Variables (under Security) -> Codespaces
  * Click "New repository secret"
  * Name: OPENAI_API_KEY, Secret: your key
  * Click "Add secret"
* Create a codespace
  * Click "Code"
  * Select the "Codespaces" tab
  * Click "Create codespace on main"

### Installing required libraries
Next we need to install the required libraries:

In [1]:
#%pip install jupyter openai minsearch requests

# Part 0: Basic RAG


## Minsearch

First, we implement a basic search function that will query our FAQ database. This function takes a query string and returns relevant documents. We will use minsearch for that.

In [2]:
import json
from minsearch import AppendableIndex

In [3]:
# Download and process the FAQ documents for the RAG system:
import requests

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
# Parses the downloaded JSON into a Python object (a list of courses, each with documents)
documents_raw = docs_response.json()

documents = []

for course in documents_raw:
    course_name = course['course'] # Extract the course name for each course

    for doc in course['documents']: # Extract each document in the course
        doc['course'] = course_name # Add the course name to each document
        documents.append(doc)
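
To make the flattening step concrete, here is the same loop run on a tiny hand-made sample. The sample records and the names `sample_raw` and `flat` are made up for illustration; only the nesting (a list of courses, each with a `documents` list) follows the code above. This sketch copies each record instead of mutating it, which is a small deviation from the cell above.

```python
# Made-up sample that mirrors the nesting used in the cell above
sample_raw = [
    {
        "course": "course-a",
        "documents": [
            {"section": "General", "question": "Q1?", "text": "A1."},
            {"section": "General", "question": "Q2?", "text": "A2."},
        ],
    },
    {
        "course": "course-b",
        "documents": [
            {"section": "Setup", "question": "Q3?", "text": "A3."},
        ],
    },
]

flat = []
for course in sample_raw:
    for doc in course["documents"]:
        # Build a new dict so the input records stay untouched
        flat.append({**doc, "course": course["course"]})

print(len(flat))
print(flat[2])
```

Every record in `flat` now carries its own `course` field, which is what lets the index filter by course later.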

In [4]:
# Creates an index of all FAQ documents so they can be searched efficiently:
index = AppendableIndex( # Class from the minsearch library used for fast text search
    text_fields=["question", "text", "section"], # Specifies which fields in each document should be searched as text
    keyword_fields=["course"] # Specifies fields that are treated as exact-match keywords
)

index.fit(documents) # Builds the index from the documents

<minsearch.append.AppendableIndex at 0x73d2ba9b70d0>

## Search

Explanation:

* This function is the foundation of our RAG system
* It searches the FAQ for relevant information
* The result is used to build context for the LLM

In [5]:
# Search the documents:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}
    # Calls the search method on the FAQ index
    results = index.search(
        query=query, # The search query provided by the user
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5, # Returns the top 5 results
        output_ids=True # Returns the IDs of the documents in the results
    )

    return results

In [6]:
results = search('I just discovered the course. Can I join now?')
print(results[0]['text'])

Yes, even if you don't register, you're still eligible to submit the homeworks.
Be aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.
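
The `boost_dict` argument makes matches in some fields count more than matches in others. The toy scorer below illustrates that idea only; it is not how minsearch computes relevance, and `toy_score` and both sample documents are hypothetical.

```python
def toy_score(query, doc, boost):
    """Count query words found in each text field, weighted by the field's boost."""
    query_words = set(query.lower().split())
    score = 0.0
    for field in ("question", "text", "section"):
        field_words = set(doc[field].lower().split())
        overlap = len(query_words & field_words)
        # Fields without an explicit boost get weight 1.0
        score += boost.get(field, 1.0) * overlap
    return score


boost = {"question": 3.0, "section": 0.5}

# Made-up documents: one matches in the question, the other only in the text
doc_a = {"question": "Can I join late", "text": "Yes you can", "section": "General"}
doc_b = {"question": "Where are the slides", "text": "You can join the Slack", "section": "General"}

print(toy_score("can i join", doc_a, boost))
print(toy_score("can i join", doc_b, boost))
```

With a boost of 3.0 on `question`, a document whose question matches the query outranks one that only mentions the same words in its answer text.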


## Prompt

We create a function that formats the search results into a structured context and inserts it into a prompt. The prompt instructs the LLM to answer the user's question using only the FAQ entries found by the search. This is a key step in the RAG workflow: it gives the LLM the context it needs for accurate, grounded responses.

In [7]:
# Defines a string template for the prompt that will be sent to the LLM
prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>
""".strip()
# Defines a function to build the actual prompt for the LLM
def build_prompt(query, search_results):
    context = "" # Initializes an empty string to accumulate context from search results
    # Appends section, question, and answer to the context string in a readable format
    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"
    # Fills in the template with the actual question and the constructed context
    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

## RAG

RAG consists of 3 parts:

* Search
* Prompt
* LLM

In Python, it looks like this:
```python
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
```
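
Because the three parts are independent, the pipeline can be tried out without any API calls by passing stand-ins for each part. The sketch below assumes nothing beyond the three-step structure above; `make_rag` and the `fake_*` functions are hypothetical helpers, not part of the workshop code.

```python
def make_rag(search, build_prompt, llm):
    """Wire the three RAG parts into one function."""
    def rag(query):
        search_results = search(query)
        prompt = build_prompt(query, search_results)
        return llm(prompt)
    return rag


# Hypothetical stand-ins so the flow runs offline
def fake_search(query):
    return [{"section": "General", "question": "Can I join late?", "text": "Yes."}]


def fake_build_prompt(query, search_results):
    context = "\n".join(doc["text"] for doc in search_results)
    return f"QUESTION: {query}\nCONTEXT: {context}"


def fake_llm(prompt):
    # A real implementation would call the model here
    return f"(stub answer for a {len(prompt)}-character prompt)"


offline_rag = make_rag(fake_search, fake_build_prompt, fake_llm)
print(offline_rag("Can I join now?"))
```

Swapping `fake_llm` for a real model call changes nothing else in the flow.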

## The RAG flow

We add a call to an LLM and combine everything into a complete RAG pipeline.

Explanation:

* build_prompt: Formats the search results into a prompt
* llm: Makes the API call to the language model
* rag: Combines search and LLM into a single function

In [8]:
from openai import OpenAI
client = OpenAI()

In [9]:
import os
# Check that the key is set without printing the secret itself
print(os.getenv("OPENAI_API_KEY") is not None)

True


**NB: This cell will incur charges.**

In [10]:
def llm(prompt):
    response = client.chat.completions.create(# Sends a prompt to the OpenAI API
        model='gpt-4o-mini',
        messages=[{"role": "user", "content": prompt}]
    )
    # Extracts the text content of the model's reply
    return response.choices[0].message.content

def rag(query): # Implements the RAG workflow
    search_results = search(query) # Searches FAQ database for relevant documents for the query
    prompt = build_prompt(query, search_results) # Formats search results and query for the LLM
    answer = llm(prompt) # Sends the formatted prompt to the LLM and gets the answer
    return answer

In [11]:
rag('I just discovered the course. Can I join now?')

"Yes, you can still join the course now. Even if you don't register, you're eligible to submit the homeworks. However, be aware that there will be deadlines for turning in the final projects, so it's advisable not to leave everything until the last minute."

In [12]:
rag('how do I run docker on gentoo?')

'The provided context does not include specific instructions for running Docker on Gentoo. Therefore, I cannot provide an answer to the question based on the available information.'

# Part 1: Agentic RAG

Now let's make our flow agentic.

## Agents and Agentic flows

Agents are AI systems that can:

* Make decisions about what actions to take
* Use tools to accomplish tasks
* Maintain state and context
* Learn from previous interactions
* Work towards specific goals

An agentic flow is not necessarily a completely independent agent, but it can still make some decisions during execution.

A typical agentic flow consists of:

1. Receiving a user request
2. Analyzing the request and available tools
3. Deciding on the next action
4. Executing the action using appropriate tools
5. Evaluating the results
6. Either completing the task or continuing with more actions
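
The six steps above can be sketched as a small loop. This is a generic illustration, not the implementation built later in this workshop: `run_agent`, `scripted_decide`, the `LOOKUP` and `FINISH` action names, and the single tool are all hypothetical, and the decision function is scripted instead of calling an LLM.

```python
def run_agent(request, decide, tools, max_steps=5):
    """Repeat decide -> act -> record until the decision is FINISH."""
    history = []
    for _ in range(max_steps):
        # Steps 2-3: look at the request and the history, pick the next action
        decision = decide(request, history)
        if decision["action"] == "FINISH":
            # Step 6: the task is complete
            return decision["answer"], history
        # Step 4: run the chosen tool
        result = tools[decision["action"]](decision["input"])
        # Step 5: keep the result so the next decision can evaluate it
        history.append({"action": decision["action"], "input": decision["input"], "result": result})
    # Step budget exhausted without a final answer
    return None, history


def scripted_decide(request, history):
    # Stand-in for an LLM: look something up once, then finish
    if not history:
        return {"action": "LOOKUP", "input": request}
    return {"action": "FINISH", "answer": f"Based on: {history[-1]['result']}"}


tools = {"LOOKUP": lambda text: f"notes about '{text}'"}

answer, history = run_agent("docker", scripted_decide, tools)
print(answer)
print(len(history))
```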

The key difference from basic RAG is that agents can:

* Make multiple search queries
* Combine information from different sources
* Decide when to stop searching
* Use their own knowledge when appropriate
* Chain multiple actions together

So in agentic RAG, the system

* has access to the history of previous actions
* makes decisions independently based on the current information and the previous actions

Let's implement this step by step.

## Making RAG more agentic

First, we'll take the prompt we have so far and make it a little more "agentic":

* Tell the LLM that it can answer the question directly or look up context
* Provide output templates
* Show clearly what's the source of the answer

In [13]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}
""".strip()

In [14]:
# Let's use it:
question = "how do I run docker on gentoo?"
context = "EMPTY"

prompt = prompt_template.format(question=question, context=context)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
how do I run docker on gentoo?
</QUESTION>

<CONTEXT>
EMPTY
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}

If you can answer the QUESTION using CONTEXT, use this template:

{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}

If the context doesn't contain the answer, use your own knowledge to answer the question

{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}


## Testing the prompt

**Step-by-step flow:**

**1. String Template:**

- Above, `prompt_template` is defined as a string with placeholders for `{question}` and `{context}`.
- It is filled using `prompt = prompt_template.format(question=question, context=context)`.

**2. Calling llm(prompt):**

- The `llm` function takes the formatted prompt and sends it to the OpenAI API using `client.chat.completions.create`.
- This function wraps the API call, specifying the model and passing the prompt as a message.

**3. OpenAI API:**

- The API receives the prompt, generates a response, and returns it.

**4. Extracting the Answer:**

- The `llm` function extracts the answer text from the API response (`response.choices[0].message.content`) and returns it.
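
One detail of the template is easy to miss: the JSON examples are written with doubled braces (`{{` and `}}`) because `str.format` treats single braces as placeholders. A minimal sketch with a shortened, made-up template:

```python
# Shortened template for illustration; the real one is defined above
template = """
<QUESTION>
{question}
</QUESTION>

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}}
""".strip()

filled = template.format(question="how do I run docker on gentoo?")
print(filled)
```

After formatting, the doubled braces come out as single braces, so the LLM sees ordinary JSON.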

In [15]:
# We may get something like this:
answer = llm(prompt) # Uses its own knowledge since the context is empty
print(answer)

{
"action": "ANSWER",
"answer": "To run Docker on Gentoo, you need to follow these steps:\n\n1. **Install Docker**: You can install Docker using the Portage package manager. First, make sure your system is up to date, then run:\n   ```\n   emerge app-emulation/docker\n   ```\n\n2. **Add your user to the Docker group**: This allows you to run Docker commands without sudo. Replace `yourusername` with your actual username:\n   ```\n   usermod -aG docker yourusername\n   ```\n   Then, log out and back in for the group change to take effect.\n\n3. **Start the Docker service**: Use the following commands to start Docker:\n   ```\n   rc-update add docker default\n   service docker start\n   ```\n\n4. **Check Docker version**: You can verify that Docker is installed correctly by checking the version:\n   ```\n   docker --version\n   ```\n\n5. **Run a test container**: To confirm that Docker is working, you can run a simple test container:\n   ```\n   docker run hello-world\n   ```\n\nIf the co

In [16]:
# But if we ask a course-specific question that the model cannot know without the FAQ:
question = "how do I join the course?"
context = "EMPTY"

prompt = prompt_template.format(question=question, context=context)

In [17]:
# Ideally the model requests a SEARCH here; in this run it answered from its own knowledge:
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To join the course, you typically need to register through the course's official website or platform where it is hosted. Look for a 'Sign Up' or 'Enroll' button, create an account if required, and follow the prompts to complete your enrollment. If you have any specific questions about joining, feel free to ask!",
"source": "OWN_KNOWLEDGE"
}


In [18]:
# Here, build_context is a helper function:
def build_context(search_results):
    context = ""
    # Converts a list of FAQ search results into a formatted string
    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    return context.strip()

In [19]:
# Let's implement the search for relevant FAQ entries.
# Query the FAQ database; this returns a list of relevant documents
search_results = search(question)
# Format the results into a readable string (section, question, and answer for each document)
context = build_context(search_results)
# Fill in the user question and the formatted context to create the prompt for the LLM
prompt = prompt_template.format(question=question, context=context)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
how do I join the course?
</QUESTION>

<CONTEXT>
section: General course-related questions
question: Course - Can I still join the course after the start date?
answer: Yes, even if you don't register, you're still eligible to submit the homeworks.
Be aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.

section: General course-related questions
question: Course - When will the course start?
answer: The purpose of this document is to capture frequently asked technical questions
The exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1
Subscribe to course public Google Calendar (it works from Desktop only).
Register before the course starts

In [20]:
# Sends formatted prompt to OpenAI with user question and context built from FAQ search results
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To join the course, you need to register before the course starts using the provided registration link. The course begins on January 15, 2024, at 17:00. You should also join the course's Telegram channel for announcements and register on DataTalks.Club's Slack to stay connected with fellow participants and instructors.",
"source": "CONTEXT"
}


Let's put this together:

* First, let the LLM try to answer with its own knowledge
* If needed, do the lookup and then answer

In [21]:
def agentic_rag_v1(question):
    context = "EMPTY" # Initial context is empty
    prompt = prompt_template.format(question=question, context=context) # Build the initial prompt
    answer_json = llm(prompt) # Send prompt to LLM
    answer = json.loads(answer_json) # The answer is parsed from JSON
    print(answer)
    # Check LLM’s action
    if answer['action'] == 'SEARCH': # If the LLM says to search, proceed to the next step
        print('need to perform search...')
        # Perform search and build context
        search_results = search(question) 
        context = build_context(search_results)
        # Build a new prompt now including the context from the search
        prompt = prompt_template.format(question=question, context=context)
        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(answer)
    # Returns answer either from own knowledge or from the FAQ context
    return answer
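
`agentic_rag_v1` passes the raw model output to `json.loads`, which raises an error if the model adds any text around the JSON object. A more forgiving parser could look like the sketch below; `parse_llm_json` is a hypothetical helper, and it assumes the reply contains exactly one top-level JSON object.

```python
import json


def parse_llm_json(text):
    """Parse the JSON object embedded in an LLM reply, ignoring text around it."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM output")
    # Everything from the first '{' to the last '}' is treated as the object
    return json.loads(text[start:end + 1])


reply = 'Sure, here is my decision:\n{"action": "SEARCH", "reasoning": "context is empty"}\nLet me know!'
print(parse_llm_json(reply))
```

Replies that contain no braces at all still fail, but with a clearer error message.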

In [22]:
# Test it ('OWN_KNOWLEDGE' means the LLM couldn't find relevant information in the FAQ context):
agentic_rag_v1('how do I join the course?')

{'action': 'SEARCH', 'reasoning': 'The context is empty and I need to find information on how to join the course.'}
need to perform search...
{'action': 'ANSWER', 'answer': "To join the course, you need to register using the provided link before the course starts. You can also join the Telegram channel for announcements and make sure to subscribe to the course's public Google Calendar to keep track of important dates and events. Don't forget to create an account on DataTalks.Club's Slack and join the relevant channel.", 'source': 'CONTEXT'}


{'action': 'ANSWER',
 'answer': "To join the course, you need to register using the provided link before the course starts. You can also join the Telegram channel for announcements and make sure to subscribe to the course's public Google Calendar to keep track of important dates and events. Don't forget to create an account on DataTalks.Club's Slack and join the relevant channel.",
 'source': 'CONTEXT'}

In [23]:
agentic_rag_v1('how patch KDE under FreeBSD?')

{'action': 'ANSWER', 'answer': "To patch KDE under FreeBSD, follow these steps: \n\n1. **Obtain the Source Code**: You can typically get the source code for KDE from the FreeBSD Ports Collection. Navigate to the KDE port you want to patch. For instance, if you're interested in 'kde5', you can find it under `/usr/ports/x11/kde5`.\n\n2. **Download the Patch**: Get the patch file you want to apply. This might be available from KDE's official bug tracker, Git repositories, or relevant developer forums.\n\n3. **Apply the Patch**: Once you have the patch file, use the `patch` command to apply it. For example:\n   ```\n   cd /usr/ports/x11/kde5\n   patch < /path/to/your/patch/file.patch\n   ```\n\n4. **Make and Install**: After applying the patch, you need to rebuild and reinstall the port to ensure your changes take effect. You can do this using:\n   ```\n   make clean install\n   ```\n\n5. **Test the Changes**: Run KDE to see if your patch resolves the issue or adds the desired functionalit

{'action': 'ANSWER',
 'answer': "To patch KDE under FreeBSD, follow these steps: \n\n1. **Obtain the Source Code**: You can typically get the source code for KDE from the FreeBSD Ports Collection. Navigate to the KDE port you want to patch. For instance, if you're interested in 'kde5', you can find it under `/usr/ports/x11/kde5`.\n\n2. **Download the Patch**: Get the patch file you want to apply. This might be available from KDE's official bug tracker, Git repositories, or relevant developer forums.\n\n3. **Apply the Patch**: Once you have the patch file, use the `patch` command to apply it. For example:\n   ```\n   cd /usr/ports/x11/kde5\n   patch < /path/to/your/patch/file.patch\n   ```\n\n4. **Make and Install**: After applying the patch, you need to rebuild and reinstall the port to ensure your changes take effect. You can do this using:\n   ```\n   make clean install\n   ```\n\n5. **Test the Changes**: Run KDE to see if your patch resolves the issue or adds the desired functionali

# Part 2: Agentic search

So far, we have had only two actions: search and answer.

But we can let our "agent" formulate one or more search queries and repeat this for a few iterations until it finds an answer.

Let's build a prompt:

* List available actions:
  * Search in FAQ
  * Answer using own knowledge
  * Answer using information extracted from FAQ
* Provide access to the previous actions
* Have clear stop criteria (no more than X iterations)
* We also specify the output format, so it's easier to parse it

In [24]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.


Don't repeat previously performed actions.

Don't perform more than {max_iterations} iterations for a given student question.
The current iteration number: {iteration_number}. If we exceed the allowed number
of iterations, give the best possible answer with the provided information.


Output templates:

If you want to perform search, use this template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>",
"keywords": ["search query 1", "search query 2", ...]
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER_CONTEXT",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}


<QUESTION>
{question}
</QUESTION>

<SEARCH_QUERIES>
{search_queries}
</SEARCH_QUERIES>

<CONTEXT>
{context}
</CONTEXT>

<PREVIOUS_ACTIONS>
{previous_actions}
</PREVIOUS_ACTIONS>
""".strip()

In [25]:
# Our code becomes more complicated. For the first iteration, we have:
question = "how do I join the course?"
# Track search queries, search results, and previous actions to keep history and avoid repeating searches/actions.
search_queries = []
search_results = []
previous_actions = []
# Builds the (empty) context string from the current search results
context = build_context(search_results)

# Fills in question, context, search queries, previous actions, and iteration info (what it knows so far and what it can do next).
prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=1
)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.


Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number: 1. If 

In [26]:
answer_json = llm(prompt)
answer = json.loads(answer_json)

In [27]:
# Save the action: this tracks the agent's decision history for future iterations and prompt construction
previous_actions.append(answer)

In [28]:
# Output:
print(json.dumps(answer, indent=2))

{
  "action": "SEARCH",
  "reasoning": "The context does not contain any information about how to join the course. I need to search for details regarding the enrollment or registration process for the course.",
  "keywords": [
    "how to enroll in the course",
    "course registration process",
    "joining the course instructions"
  ]
}


In [29]:
# Keeps a record of all search queries so far, to avoid duplicate searches and to inform the next prompt
keywords = answer['keywords']
search_queries.extend(keywords)

In [30]:
# For each keyword call search() to query the FAQ database
for k in keywords:
    res = search(k)
    search_results.extend(res)

In [31]:
# Some of the search results will be duplicates, so we need to remove them:
def dedup(seq):
    seen = set()
    result = []
    for el in seq:
        _id = el['_id'] 
        # If the ID is already seen, skip this element
        if _id in seen:
            continue
        seen.add(_id)
        result.append(el)
    return result

In [32]:
search_results = dedup(search_results)
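
To see what `dedup` does, here it is on a few made-up results. The function is repeated in compact form so the snippet runs on its own; the `_id` values and texts are invented.

```python
def dedup(seq):
    """Keep the first occurrence of each _id, preserving order."""
    seen = set()
    result = []
    for el in seq:
        if el['_id'] not in seen:
            seen.add(el['_id'])
            result.append(el)
    return result


# Two different queries returned the same document (_id 3)
hits = [
    {'_id': 3, 'text': 'Register before the course starts'},
    {'_id': 7, 'text': 'Join the Slack channel'},
    {'_id': 3, 'text': 'Register before the course starts'},
]

unique = dedup(hits)
print([el['_id'] for el in unique])
```

This relies on `output_ids=True` in `search`, which is what adds the `_id` field to each result.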

In [33]:
# Now let's make another iteration - use the same code as previously, but remove variable initialization and increase the iteration number:
# The first iteration starts the agentic loop with empty history; this iteration continues the loop, carrying over results and actions from previous steps.
# question = "how do I join the course?"

# search_queries = []
# search_results = []
# previous_actions = []

context = build_context(search_results)

prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=2
)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.


Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number: 2. If 

In [34]:
answer_json = llm(prompt)
answer = json.loads(answer_json)
print(json.dumps(answer, indent=2))

{
  "action": "ANSWER_CONTEXT",
  "answer": "To join the course, you need to register before the course starts using the provided registration link. You can also start learning and submitting homework without formally registering, as registration is mainly to gauge interest prior to the start date. The course is set to begin on January 15, 2024, at 17:00, starting with live Office Hours.",
  "source": "CONTEXT"
}


In [35]:
# Let's put everything together: A multi-step agentic search process to iteratively search, update context, and decide when to stop or answer
question = "what do I need to do to be successful at module 1?"
# Initializes the agent's state: empty search queries, results, and previous actions.
search_queries = []
search_results = []
previous_actions = []


iteration = 0
# Enters a loop for agentic search
while True:
    print(f'ITERATION #{iteration}...') # Prints the current iteration

    context = build_context(search_results)
    prompt = prompt_template.format(
        question=question,
        context=context,
        search_queries="\n".join(search_queries),
        previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
        max_iterations=3,
        iteration_number=iteration
    )

    print(prompt)
    # Sends the prompt to the LLM, process its response, and update the state for the next iteration
    answer_json = llm(prompt)
    answer = json.loads(answer_json)
    print(json.dumps(answer, indent=2))

    previous_actions.append(answer)

    action = answer['action']
    if action != 'SEARCH':
        break

    keywords = answer['keywords']
    search_queries = list(set(search_queries) | set(keywords))

    for k in keywords:
        res = search(k)
        search_results.extend(res)

    search_results = dedup(search_results)

    iteration = iteration + 1
    if iteration >= 4:
        break

    print()


ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.


Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteratio

In [36]:
# Put everything together in a function:
def agentic_search(question):
    search_queries = []
    search_results = []
    previous_actions = []

    iteration = 0

    while True:
        print(f'ITERATION #{iteration}...')

        context = build_context(search_results)
        prompt = prompt_template.format(
            question=question,
            context=context,
            search_queries="\n".join(search_queries),
            previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
            max_iterations=3,
            iteration_number=iteration
        )

        print(prompt)

        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(json.dumps(answer, indent=2))

        previous_actions.append(answer)

        action = answer['action']
        if action != 'SEARCH':
            break

        keywords = answer['keywords']
        search_queries = list(set(search_queries) | set(keywords))

        for k in keywords:
            res = search(k)
            search_results.extend(res)

        search_results = dedup(search_results)

        iteration = iteration + 1
        if iteration >= 4:
            break

        print()

    return answer
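
In `agentic_search` the iteration cap appears twice: `max_iterations=3` in the prompt and `iteration >= 4` in the loop. If the loop exits through that second check, the returned `answer` is the last `SEARCH` action, not a final answer. The sketch below shows one possible way to derive both from a single parameter. It is not the workshop's code: `bounded_loop` and `make_scripted_llm` are hypothetical, the prompt is reduced to a placeholder string, the search step is omitted, and a scripted stand-in replaces the real `llm`.

```python
import json


def bounded_loop(llm, max_iterations=3):
    """Run at most max_iterations LLM calls; return (final_answer_or_None, actions)."""
    previous_actions = []
    for iteration in range(1, max_iterations + 1):
        # Placeholder prompt; the real one would be built from prompt_template
        prompt = f"iteration {iteration} of {max_iterations}"
        answer = json.loads(llm(prompt))
        previous_actions.append(answer)
        if answer["action"] != "SEARCH":
            return answer, previous_actions
        # (search and context update would happen here)
    # Cap reached while the model was still asking to search
    return None, previous_actions


def make_scripted_llm(replies):
    """Stand-in for llm(): returns the given replies one by one."""
    it = iter(replies)
    return lambda prompt: next(it)


search_reply = json.dumps({"action": "SEARCH", "keywords": ["a"]})
final_reply = json.dumps({"action": "ANSWER", "answer": "done", "source": "OWN_KNOWLEDGE"})

answer, actions = bounded_loop(make_scripted_llm([search_reply, final_reply]))
print(answer["action"], len(actions))
```

Returning `None` when the cap is hit makes the "no final answer" case explicit, so the caller can decide how to handle it.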

In [37]:
# Test it:
agentic_search('how do I prepare for the course?')

ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic.

Don't use search queries used at the previous iterations.


Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteratio

{
  "action": "SEARCH",
  "reasoning": "Since there is no existing context related to preparing for the course, I need to search for relevant FAQs that could provide insights or tips for course preparation.",
  "keywords": [
    "course preparation",
    "how to prepare for a course",
    "tips for students preparing for a course"
  ]
}

ITERATION #1...
You're a course teaching assistant.

You're given a QUESTION from a course student that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is built with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action

{'action': 'ANSWER',
 'answer': 'To prepare for the course effectively, here are some strategies: \n1. **Familiarize Yourself with the Course Materials**: Review any provided syllabus, readings, and resources in advance.\n2. **Set Up Your Environment**: If the course involves technical components, ensure you have all the necessary tools and software installed.\n3. **Plan Your Schedule**: Allocate specific times for studying and attending live sessions. \n4. **Engage with Peers and Instructors**: Join the course-related communication platforms like Slack and Telegram to connect with fellow students and ask questions.\n5. **Practice Active Learning**: Engage with the content through discussions, exercises, and practical applications to reinforce your understanding.\n6. **Stay Organized**: Use calendars or planners to keep track of important dates and deadlines.\n7. **Set Goals**: Define what you hope to achieve in the course and track your progress.\n8. **Seek Additional Resources**: Loo

In [38]:
print(_['answer'])

To prepare for the course effectively, here are some strategies: 
1. **Familiarize Yourself with the Course Materials**: Review any provided syllabus, readings, and resources in advance.
2. **Set Up Your Environment**: If the course involves technical components, ensure you have all the necessary tools and software installed.
3. **Plan Your Schedule**: Allocate specific times for studying and attending live sessions. 
4. **Engage with Peers and Instructors**: Join the course-related communication platforms like Slack and Telegram to connect with fellow students and ask questions.
5. **Practice Active Learning**: Engage with the content through discussions, exercises, and practical applications to reinforce your understanding.
6. **Stay Organized**: Use calendars or planners to keep track of important dates and deadlines.
7. **Set Goals**: Define what you hope to achieve in the course and track your progress.
8. **Seek Additional Resources**: Look for supplementary materials or online r

# Part 3: Function calling

## Function calling in OpenAI
So far, we have put all this logic inside our prompt.

But OpenAI and other providers offer a convenient API for adding extra functionality like search.

* https://platform.openai.com/docs/guides/function-calling

It's called "function calling" - you define functions that the model can call, and if it decides to make a call, it returns structured output for that.

For example, let's take our search function:


In [39]:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5,
        output_ids=True
    )

    return results

In [40]:
# We describe it like this:
search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }

}

Here we have:

* name: search
* description: when to use it
* parameters: all the arguments that the function can take and their description
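
The `parameters` field is a JSON Schema object, so the arguments the model sends back can be checked against it before the function runs. The sketch below assumes a hypothetical helper, `check_arguments`, that covers only `required`, `additionalProperties`, and the `string` type; a full JSON Schema validator handles far more.

```python
# Copy of the search tool description from the cell above.
search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }
}


def check_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    schema = tool["parameters"]
    properties = schema["properties"]
    problems = []

    # Every name listed under "required" must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")

    # With additionalProperties set to False, unknown names are rejected.
    if schema.get("additionalProperties") is False:
        for name in arguments:
            if name not in properties:
                problems.append(f"unexpected argument: {name}")

    # Only the "string" type is checked in this sketch.
    for name, value in arguments.items():
        expected = properties.get(name, {}).get("type")
        if expected == "string" and not isinstance(value, str):
            problems.append(f"argument {name} should be a string")

    return problems


print(check_arguments(search_tool, {"query": "do well in module 1"}))
print(check_arguments(search_tool, {}))
```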

In order to use function calling, we'll use a newer API - the "responses" API (not "chat completions" as previously):

In [41]:
tools = [search_tool]

In [42]:
question = "How do I do well in module 1?"

In [43]:
developer_prompt = """
You're a course teaching assistant.
You're given a question from a course student and your task is to answer it.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create( # the "responses" API
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

In [44]:
response

Response(id='resp_689452dc54d081a3a8520ad5544eb3a80df1a2059b15a18a', created_at=1754551004.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4o-mini-2024-07-18', object='response', output=[ResponseFunctionToolCall(arguments='{"query":"do well in module 1"}', call_id='call_mS8OiMYJGN2v8aY6luKWXEAY', name='search', type='function_call', id='fc_689452dd708c81a3af4333b24a3338ac0df1a2059b15a18a', status='completed')], parallel_tool_calls=True, temperature=1.0, tool_choice='auto', tools=[FunctionTool(name='search', parameters={'type': 'object', 'properties': {'query': {'type': 'string', 'description': 'Search query text to look up in the course FAQ.'}}, 'required': ['query'], 'additionalProperties': False}, strict=True, type='function', description='Search the FAQ database')], top_p=1.0, background=False, max_output_tokens=None, previous_response_id=None, reasoning=Reasoning(effort=None, generate_summary=None, summary=None), service_tier='default', status='co

In [45]:
# If the model thinks we should make a function call, it will tell us:
response.output

[ResponseFunctionToolCall(arguments='{"query":"do well in module 1"}', call_id='call_mS8OiMYJGN2v8aY6luKWXEAY', name='search', type='function_call', id='fc_689452dd708c81a3af4333b24a3338ac0df1a2059b15a18a', status='completed')]

In [46]:
# Let's make a call to search:
# (the "responses" API returns results in response.output, not in response.choices as "chat completions" does)
calls = response.output

In [47]:
call = calls[0]
call

ResponseFunctionToolCall(arguments='{"query":"do well in module 1"}', call_id='call_mS8OiMYJGN2v8aY6luKWXEAY', name='search', type='function_call', id='fc_689452dd708c81a3af4333b24a3338ac0df1a2059b15a18a', status='completed')

In [48]:
call.call_id

'call_mS8OiMYJGN2v8aY6luKWXEAY'

In [49]:
f_name = call.name
f_name

'search'

In [50]:
arguments = json.loads(call.arguments)
arguments

{'query': 'do well in module 1'}

In [51]:
# Using f_name we can find the function we need:
f = globals()[f_name] # f is a reference to the function

In [52]:
# And invoke it with the arguments:
results = f(**arguments) # Unpack dictionary into keyword arguments: f(query="do well in module 1")
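
Looking the function up in `globals()` lets the model reach any name in the notebook. One alternative is an explicit registry that only contains the functions we want to expose. This is a sketch under assumptions: `TOOL_FUNCTIONS` and `call_tool` are hypothetical names, and `search` is replaced by a stub so the example runs on its own.

```python
import json


def search(query):
    # Stand-in for the real search function defined earlier in the notebook.
    return [{"question": f"stub result for: {query}"}]


# Explicit allow-list: only functions registered here can be invoked by the model.
TOOL_FUNCTIONS = {"search": search}


def call_tool(name: str, arguments_json: str):
    """Look up a registered function by name and call it with JSON-encoded arguments."""
    if name not in TOOL_FUNCTIONS:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_FUNCTIONS[name](**arguments)


results = call_tool("search", '{"query": "do well in module 1"}')
print(results)
```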

In [53]:
# Now let's save the results as json:
search_results = json.dumps(results, indent=2)
# search_results contains FAQ entries from the course database that match a search query:
print(search_results)

[
  {
    "text": "Even after installing pyspark correctly on linux machine (VM ) as per course instructions, faced a module not found error in jupyter notebook .\nThe solution which worked for me(use following in jupyter notebook) :\n!pip install findspark\nimport findspark\nfindspark.init()\nThereafter , import pyspark and create spark contex<<t as usual\nNone of the solutions above worked for me till I ran !pip3 install pyspark instead !pip install pyspark.\nFilter based on conditions based on multiple columns\nfrom pyspark.sql.functions import col\nnew_final.filter((new_final.a_zone==\"Murray Hill\") & (new_final.b_zone==\"Midwood\")).show()\nKrishna Anand",
    "section": "Module 5: pyspark",
    "question": "Module Not Found Error in Jupyter Notebook .",
    "course": "data-engineering-zoomcamp",
    "_id": 322
  },
  {
    "text": "You need to look for the Py4J file and note the version of the filename. Once you know the version, you can update the export command accordingly, th

In [54]:
# And save both the response and the result of the function call:
chat_messages.append(call) # keep a record of the model's tool usage for context in future API calls

In [55]:
# chat_messages.append(call) saves the function call request (what the model wanted to do).
# The current cell saves the function call output (the actual result/data returned by the function)
chat_messages.append({
    "type": "function_call_output",
    "call_id": call.call_id,
    "output": search_results,
})

In [56]:
# Now chat_messages contains both the call description (so it keeps track of history) and the results
# Let's make another call to the model:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

In [57]:
# This should be the response (a final answer but can also be another call if it needs more information):
r = response.output[0]

In [58]:
print(r.content[0].text)

To do well in Module 1 of your course, here are some key tips:

1. **Understand the Basics**: Make sure you have a strong grasp of Docker and Terraform concepts covered in this module. Familiarize yourself with how these technologies work and their purposes.

2. **Follow Instructions**: Carefully follow all installation and setup instructions provided. Ensure all dependencies are correctly installed, including Docker and the necessary Python packages (like `psycopg2`).

3. **Practice Coding**: Engage in hands-on coding. Experiment with creating Docker images and using Terraform for infrastructure deployment.

4. **Resolve Errors Promptly**: If you encounter errors, such as `ModuleNotFoundError` or others, troubleshoot them immediately. For instance:
   - If facing a `ModuleNotFoundError` like for `psycopg2`, ensure it's installed via pip or conda.
   - If an error arises with SQLAlchemy, double-check your connection strings.

5. **Leverage Resources**: Utilize course materials, documen

In [59]:
r.type # 'message' represents a standard text response from the model

'message'

In [60]:
call.type # 'function_call' represents a tool call request from the model

'function_call'

## Making multiple calls
What if we want to make multiple calls? Change the developer prompt a little:

In [61]:
developer_prompt = """
You're a course teaching assistant.
You're given a question from a course student and your task is to answer it.
If you look up something in FAQ, convert the student question into multiple queries.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

In [62]:
# Let's organize our code a little. First, create a function do_call:
def do_call(tool_call_response):
    function_name = tool_call_response.name # The name of the function to call
    arguments = json.loads(tool_call_response.arguments) # Parses the arguments from JSON

    f = globals()[function_name] # Finds the function in the global namespace
    result = f(**arguments)

    return {
        "type": "function_call_output",
        "call_id": tool_call_response.call_id, # "call_id" (to link output to the original call)
        "output": json.dumps(result, indent=2),
    }

In [63]:
# Now iterate over responses:
for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)

    if entry.type == 'function_call':
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text)

function_call
function_call
function_call


In [64]:
# The first response will probably contain function calls, so we call the API again.
# With the tool results in the conversation history, the next response will likely be a regular text answer:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)
    print()

    if entry.type == 'function_call':
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text)

message

To excel in Module 1 of your course, here are some key strategies and tips:

### Understanding the Content:
1. **Familiarize with Docker and Terraform:** Ensure you have a solid grasp of the basics of both Docker and Terraform, as these technologies are essential for the module. 
2. **Hands-On Practice:** Set up local environments using Docker containers to apply what you've learned in theory.

### Addressing Common Issues:
1. **Dependencies:** If you encounter issues like `ModuleNotFoundError` for libraries such as `psycopg2`, make sure to install the required packages:
   - Use: `pip install psycopg2-binary`
   - If you face further issues, try upgrading the package: `pip install psycopg2-binary --upgrade`.

2. **Connection Strings:** When creating SQLAlchemy engines, ensure your connection strings are accurate:
   - Example: 
   ```python
   conn_string = "postgresql+psycopg://root:root@localhost:5432/ny_taxi"
   engine = create_engine(conn_string)
   ```

### General Study

## Putting it all together

The response above was a text message. But what if the model requests more function calls instead? To handle that, we use two loops:

- First is the main Q&A loop - ask question, get back the answer
- Second is the request-response loop - send requests until there's a message reply from API

In [65]:
developer_prompt = """
You're a course teaching assistant.
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.
When using FAQ, perform deep topic exploration: make one request to FAQ,
and then based on the results, make more requests.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

After running the next cell, type your question when prompted. Type 'stop' to exit the chat loop.

In [66]:
while True: # main Q&A loop (waits for user input in a loop)
    question = input() # How do I do my best for module 1?
    if question == 'stop':
        break

    message = {"role": "user", "content": question}
    chat_messages.append(message)
    # request-response loop - Handles the conversation with the model for each question.
    while True: 
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False
        # Keep sending requests to API and processing responses until a standard text 'message' is received
        for entry in response.output:
            chat_messages.append(entry)

            if entry.type == 'function_call':
                print('function_call:', entry)
                print()
                result = do_call(entry)
                chat_messages.append(result)
            elif entry.type == 'message':
                print(entry.content[0].text)
                print()
                has_messages = True

        if has_messages:
            break

function_call: ResponseFunctionToolCall(arguments='{"query":"best practices for module 1"}', call_id='call_S34plQTcdDqakLEuBLS8d2Pw', name='search', type='function_call', id='fc_68945315cee881a090ca0d2395c1994c08c1c7312e6e0cae', status='completed')

function_call: ResponseFunctionToolCall(arguments='{"query":"module 1 Docker Terraform tips"}', call_id='call_Dwty4vYTSwKPQvggIgoXG1BN', name='search', type='function_call', id='fc_68945317497c81a0807c5e387bb12dc508c1c7312e6e0cae', status='completed')

To excel in Module 1, which focuses on Docker and Terraform, consider the following strategies:

1. **Follow Best Practices for Docker**:
   - Store your code within your default Linux distribution for optimal file system performance, especially if you’re using WSL2 on Windows. This setup can improve the overall efficiency of Docker.

2. **Ensure Proper Installation**:
   - Double-check that all necessary dependencies are installed correctly. For example, if you encounter issues with Psycopg2

We exit the inner loop only once the response contains a message. If it contains only function calls, we execute them and query the model again. After exiting, we ask the user for the next input (or "stop").
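
To see the control flow of the request-response loop without calling the API, here is a small offline sketch. Everything in it is a stand-in: `fake_model` imitates `client.responses.create` by asking for one search and then answering, `fake_do_call` imitates `do_call`, and the `max_rounds` cap is an added safety limit that the cell above does not have.

```python
from types import SimpleNamespace


def fake_model(chat_messages):
    """Hypothetical stand-in for client.responses.create: search once, then answer."""
    already_searched = any(
        isinstance(m, dict) and m.get("type") == "function_call_output"
        for m in chat_messages
    )
    if not already_searched:
        call = SimpleNamespace(type="function_call", name="search",
                               arguments='{"query": "module 1"}', call_id="call_1")
        return SimpleNamespace(output=[call])
    text = SimpleNamespace(text="Here is the answer.")
    message = SimpleNamespace(type="message", content=[text])
    return SimpleNamespace(output=[message])


def fake_do_call(entry):
    """Hypothetical stand-in for do_call: returns a canned tool result."""
    return {"type": "function_call_output", "call_id": entry.call_id, "output": "[]"}


def request_response_loop(chat_messages, max_rounds=5):
    """Call the model until it replies with a message; return the message texts."""
    for _ in range(max_rounds):
        response = fake_model(chat_messages)
        texts = []
        for entry in response.output:
            chat_messages.append(entry)
            if entry.type == "function_call":
                chat_messages.append(fake_do_call(entry))
            elif entry.type == "message":
                texts.append(entry.content[0].text)
        if texts:
            return texts
    raise RuntimeError("model kept calling tools without answering")


history = [{"role": "user", "content": "How do I do well in module 1?"}]
answers = request_response_loop(history)
print(answers)
print(len(history))
```

The history ends up with four entries: the user question, the function call, its output, and the final message.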

Let's make it a bit nicer using HTML:

In [67]:
%pip install markdown

Collecting markdown
  Downloading markdown-3.8.2-py3-none-any.whl.metadata (5.1 kB)
Downloading markdown-3.8.2-py3-none-any.whl (106 kB)
Installing collected packages: markdown
Successfully installed markdown-3.8.2
Note: you may need to restart the kernel to use updated packages.


In [80]:
from IPython.display import display, HTML
# Download chat_assistant.py first (see the wget cell below)
from chat_assistant import ChatInterface
import markdown 

In [84]:
# How do I do well in module 1?
chat_interface = ChatInterface()

developer_prompt = """
You're a course teaching assistant.
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

# Chat loop
while True:
    question = input("You: ") # read the next question from the user
    if question.strip().lower() == 'stop':
        print("Chat ended.")
        break
    print()

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True:  # inner request loop
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False

        for entry in response.output:
            chat_messages.append(entry)

            if entry.type == "function_call":
                result = do_call(entry)
                chat_messages.append(result)
                chat_interface.display_function_call(entry, result) # show the function call and its result as HTML

            elif entry.type == "message":
                chat_interface.display_response(entry) # render the model's reply as Markdown
                has_messages = True

        if has_messages:
            break




Chat ended.


## Using multiple tools

What if we also want to use this chat app to add new entries to the FAQ? We'll need another function for it:

In [85]:
# Add a new custom made FAQ entry to search index
def add_entry(question, answer):
    doc = {
        'question': question,
        'text': answer,
        'section': 'user added',
        'course': 'data-engineering-zoomcamp'
    }
    index.append(doc)

# Description dictionary for OpenAI function calling
add_entry_description = {
    "type": "function",
    "name": "add_entry", # Allow the assistant to call add_entry automatically during a chat
    "description": "Add an entry to the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question to be added to the FAQ database",
            },
            "answer": {
                "type": "string",
                "description": "The answer to the question",
            }
        },
        "required": ["question", "answer"],
        "additionalProperties": False
    }
}

We can just reuse the previous code. But we can also clean it up and make it more modular.

See the result in [chat_assistant.py](https://github.com/alexeygrigorev/rag-agents-workshop/blob/main/chat_assistant.py)

You can download it using wget:

In [None]:
#!wget https://raw.githubusercontent.com/alexeygrigorev/rag-agents-workshop/refs/heads/main/chat_assistant.py

--2025-08-07 07:31:32--  https://raw.githubusercontent.com/alexeygrigorev/rag-agents-workshop/refs/heads/main/chat_assistant.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3495 (3.4K) [text/plain]
Saving to: ‘chat_assistant.py’


2025-08-07 07:31:32 (41.1 MB/s) - ‘chat_assistant.py’ saved [3495/3495]



Here we define multiple classes:

* Tools - manages function tools for the agent
  * ```add_tool(function, description)```: Register a function with its description
  * ```get_tools()```: Return list of registered tool descriptions
  * ```function_call(tool_call_response)```: Execute a function call and return result
* ChatInterface - handles user input and display formatting
  * ```input()```: Get user input
  * ```display(message)```: Print a message
  * ```display_function_call(entry, result)```: Show function calls in HTML format
  * ```display_response(entry)```: Display AI responses with markdown
* ChatAssistant - main orchestrator for chat conversations.
  * ```__init__(tools, developer_prompt, chat_interface, client)```: Initialize assistant
  * ```gpt(chat_messages)```: Make OpenAI API calls
  * ```run()```: Main chat loop handling user input and AI responses

Let's use it:

In [86]:
# Setting up a modular chat assistant using the chat_assistant module
import chat_assistant

tools = chat_assistant.Tools() # Creates a Tools object that will hold our functions
tools.add_tool(search, search_tool) # Registers the search function together with its function-calling description (search_tool)

tools.get_tools() # Retrieve the list of available tools

developer_prompt = """
You're a course teaching assistant.
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_interface = chat_assistant.ChatInterface() # Creates a ChatInterface object for handling user input/output
# Initializes a ChatAssistant object with the tools, prompt, chat interface, and OpenAI client
chat = chat_assistant.ChatAssistant(
    tools=tools,
    developer_prompt=developer_prompt,
    chat_interface=chat_interface,
    client=client
)

In [99]:
# And run it:
chat.run()

Chat ended.


In [88]:
# Register add_entry as a second tool so the assistant can add new FAQ entries to the search index
tools.add_tool(add_entry, add_entry_description)

In [89]:
# Displays all tools (custom functions) registered with the assistant's toolset
tools.get_tools()

[{'type': 'function',
  'name': 'search',
  'description': 'Search the FAQ database',
  'parameters': {'type': 'object',
   'properties': {'query': {'type': 'string',
     'description': 'Search query text to look up in the course FAQ.'}},
   'required': ['query'],
   'additionalProperties': False}},
 {'type': 'function',
  'name': 'add_entry',
  'description': 'Add an entry to the FAQ database',
  'parameters': {'type': 'object',
   'properties': {'question': {'type': 'string',
     'description': 'The question to be added to the FAQ database'},
    'answer': {'type': 'string', 'description': 'The answer to the question'}},
   'required': ['question', 'answer'],
   'additionalProperties': False}}]
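
To make the `add_tool` / `get_tools` / `function_call` interface more concrete, here is a toy registry with the same three methods. It is a sketch written for illustration, not the actual contents of `chat_assistant.py`; the class name `SimpleTools` and the `echo` function are hypothetical.

```python
import json
from types import SimpleNamespace


class SimpleTools:
    """Toy registry mirroring the add_tool / get_tools / function_call interface."""

    def __init__(self):
        self.tools = {}      # tool name -> description dict
        self.functions = {}  # tool name -> Python callable

    def add_tool(self, function, description):
        name = description["name"]
        self.tools[name] = description
        self.functions[name] = function

    def get_tools(self):
        return list(self.tools.values())

    def function_call(self, tool_call_response):
        # Find the registered function, decode the arguments, and run it.
        function = self.functions[tool_call_response.name]
        arguments = json.loads(tool_call_response.arguments)
        result = function(**arguments)
        return {
            "type": "function_call_output",
            "call_id": tool_call_response.call_id,
            "output": json.dumps(result, indent=2),
        }


def echo(text):
    return {"echo": text}


registry = SimpleTools()
registry.add_tool(echo, {
    "type": "function",
    "name": "echo",
    "description": "Echo the text back",
    "parameters": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
        "additionalProperties": False,
    },
})

# A fake tool call shaped like the entries in response.output.
fake_call = SimpleNamespace(name="echo", arguments='{"text": "hi"}', call_id="call_1")
output = registry.function_call(fake_call)
print(output["output"])
```

Keeping descriptions and callables in one object means a new tool needs only a single `add_tool` call, which is what the `tools.add_tool(add_entry, add_entry_description)` cell relies on.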

And talk with the assistant:

* How do I do well for module 1?
* Add this back to FAQ

And check that it's in the index:

In [90]:
index.docs[-1]

{'text': 'Problem description\nInfrastructure created in AWS with CD-Deploy Action needs to be destroyed\nSolution description\nFrom local:\nterraform init -backend-config="key=mlops-zoomcamp-prod.tfstate" --reconfigure\nterraform destroy --var-file vars/prod.tfvars\nAdded by Erick Calderin',
 'section': 'Module 6: Best practices',
 'question': 'How to destroy infrastructure created via GitHub Actions',
 'course': 'mlops-zoomcamp'}

In [91]:
search("how do I do well for module 1?")

[{'text': 'Even after installing pyspark correctly on linux machine (VM ) as per course instructions, faced a module not found error in jupyter notebook .\nThe solution which worked for me(use following in jupyter notebook) :\n!pip install findspark\nimport findspark\nfindspark.init()\nThereafter , import pyspark and create spark contex<<t as usual\nNone of the solutions above worked for me till I ran !pip3 install pyspark instead !pip install pyspark.\nFilter based on conditions based on multiple columns\nfrom pyspark.sql.functions import col\nnew_final.filter((new_final.a_zone=="Murray Hill") & (new_final.b_zone=="Midwood")).show()\nKrishna Anand',
  'section': 'Module 5: pyspark',
  'question': 'Module Not Found Error in Jupyter Notebook .',
  'course': 'data-engineering-zoomcamp',
  '_id': 322},
 {'text': 'You need to look for the Py4J file and note the version of the filename. Once you know the version, you can update the export command accordingly, this is how you check yours:\n`

**Summary of the sequence**
1. You call: search("how do I do well for module 1?")
2. search() calls: index.search(...) (from minsearch)
3. index.search() does the actual search and ranking.
4. Results (top 5 relevant FAQ entries) are returned to you.
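
To illustrate how the filter, the boosts, and the result limit interact in step 3, here is a toy ranking function. It scores documents by plain token overlap, which is an assumption made for brevity and not how minsearch computes relevance; `toy_search` and the sample documents are hypothetical.

```python
def toy_search(docs, query, boost, filter_dict, num_results=5):
    """Rank docs by boosted token overlap; a toy stand-in, not minsearch's algorithm."""
    query_tokens = set(query.lower().split())
    scored = []

    for doc in docs:
        # Keep only documents whose fields match every filter value exactly.
        if any(doc.get(field) != value for field, value in filter_dict.items()):
            continue

        # Matches in a field count more when that field has a larger boost.
        score = 0.0
        for field, weight in boost.items():
            field_tokens = set(doc.get(field, "").lower().split())
            score += weight * len(query_tokens & field_tokens)

        if score > 0:
            scored.append((score, doc))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


docs = [
    {"question": "How to install Docker", "section": "Module 1",
     "course": "data-engineering-zoomcamp"},
    {"question": "Spark setup", "section": "Module 5 Docker notes",
     "course": "data-engineering-zoomcamp"},
    {"question": "How to install Docker", "section": "Module 1",
     "course": "mlops-zoomcamp"},
]

hits = toy_search(
    docs,
    query="install docker",
    boost={"question": 3.0, "section": 0.5},
    filter_dict={"course": "data-engineering-zoomcamp"},
)
print([hit["question"] for hit in hits])
```

The first document ranks higher because both query words match its `question` field, which carries the larger boost; the third is dropped by the course filter.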

# Part 4: Using PydanticAI (Frameworks)
## Installing and using PydanticAI

There are frameworks that make it easier to create agents.

One of them is [PydanticAI](https://ai.pydantic.dev/agents/):

In [None]:
#!pip install pydantic-ai

Collecting pydantic-ai
  Downloading pydantic_ai-0.6.0-py3-none-any.whl.metadata (11 kB)
Collecting pydantic-ai-slim==0.6.0 (from pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,retries,vertexai]==0.6.0->pydantic-ai)
  Downloading pydantic_ai_slim-0.6.0-py3-none-any.whl.metadata (4.2 kB)
Collecting eval-type-backport>=0.2.0 (from pydantic-ai-slim==0.6.0->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,retries,vertexai]==0.6.0->pydantic-ai)
  Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting griffe>=1.3.2 (from pydantic-ai-slim==0.6.0->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,retries,vertexai]==0.6.0->pydantic-ai)
  Downloading griffe-1.10.0-py3-none-any.whl.metadata (5.0 kB)
Collecting opentelemetry-api>=1.28.0 (from pydantic-ai-slim==0.6.0->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohe

In [93]:
from pydantic_ai import Agent, RunContext
from typing import Dict

In [94]:
# Create an agent:
chat_agent = Agent( # PydanticAI class to build an agentic application
    'openai:gpt-4o-mini',
    system_prompt=developer_prompt
)

In [95]:
# Lets the agent call search_tool automatically when it needs to look up information:
@chat_agent.tool # Registers search_tool as a tool the agent can use
# The RunContext argument gives the tool access to the state and metadata of the current agent run
def search_tool(ctx: RunContext, query: str) -> list[dict]:
    """
    Search the FAQ for relevant entries matching the query.

    Parameters
    ----------
    query : str
        The search query string provided by the user.

    Returns
    -------
    list
        A list of search results (up to 5), each containing relevance information
        and associated output IDs.
    """
    print(f"search('{query}')")
    return search(query)

In [96]:
# Add new question-answer pairs to the FAQ database during a conversation
@chat_agent.tool
def add_entry_tool(ctx: RunContext, question: str, answer: str) -> None:
    """
    Add a new question-answer entry to FAQ.

    This function creates a document with the given question and answer,
    tagging it as user-added content.

    Parameters
    ----------
    question : str
        The question text to be added to the index.

    answer : str
        The answer or explanation corresponding to the question.

    Returns
    -------
    None
    """
    return add_entry(question, answer)

In [97]:
# PydanticAI reads the functions' signatures and docstrings to create the tool definitions automatically, so we don't need to write them ourselves:
user_prompt = "I just discovered the course. Can I join now?"
# "await" works directly in a Jupyter cell; in a regular script, this call must run inside an async function
agent_run = await chat_agent.run(user_prompt)

search('join course late enrollment')


In [98]:
agent_run.output

"Yes, you can still join the course even after the start date! Even if you don't officially register, you're eligible to submit homework. However, be mindful that there will be deadlines for turning in the final projects, so try not to leave everything until the last minute.\n\nWould you like to know more about the course content or any specific requirements?"

If you want to learn more about implementing chat applications with PydanticAI:

* https://ai.pydantic.dev/message-history/
* https://ai.pydantic.dev/examples/chat-app/
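
With PydanticAI we no longer wrote the tool descriptions by hand; they were derived from the decorated functions. The general idea can be sketched with the standard library's `inspect` module. This is only an illustration, not PydanticAI's implementation: `describe_function` and `JSON_TYPES` are hypothetical names, and the sketch ignores per-parameter docstring text and the `RunContext` argument.

```python
import inspect

# Mapping from Python annotations to JSON Schema type names (deliberately tiny).
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(function) -> dict:
    """Build an OpenAI-style tool description from a signature and docstring."""
    signature = inspect.signature(function)
    properties = {}
    required = []

    for name, parameter in signature.parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        json_type = JSON_TYPES.get(parameter.annotation, "string")
        properties[name] = {"type": json_type}
        # Parameters without a default value must be supplied by the model.
        if parameter.default is inspect.Parameter.empty:
            required.append(name)

    doc = inspect.getdoc(function) or ""
    return {
        "type": "function",
        "name": function.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }


def add_entry(question: str, answer: str) -> None:
    """Add a new question-answer entry to FAQ."""


description = describe_function(add_entry)
print(description["name"], description["parameters"]["required"])
```

The resulting dictionary has the same shape as the hand-written `add_entry_description` from Part 3, minus the per-parameter descriptions.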

# Wrap up
In this workshop, we took our RAG application and made it agentic, first by tweaking the prompts and then by using the "function calling" functionality from OpenAI.

At the end, we put all the logic into the chat_assistant.py script and also explored PydanticAI to make it simpler.

What's next:

* MCP
* Agent deployment
* Agent monitoring