### From RAG to Agents: Building Smart AI Assistants

- Tutorial: https://github.com/alexeygrigorev/rag-agents-workshop
- Video: https://www.youtube.com/watch?v=GH3lrOsU3AU


In this workshop, we:

- Build a RAG application on the FAQ database
- Make it agentic
- Learn about agentic search
- Give tools to our agents
- Use PydanticAI to make it easier

For this workshop, we will use the following FAQ documents from [our free courses](https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html):

* [Machine Learning Zoomcamp](https://docs.google.com/document/d/1LpPanc33QJJ6BSsyxVg-pWNMplal84TdZtq10naIhD8/edit?tab=t.0) 
* [Data Engineering Zoomcamp](https://docs.google.com/document/d/19bnYs80DwuUimHM65UV3sylsCn2j1vziPOwzBwQrebw/edit?tab=t.0#heading=h.edeyusfgl4b7)
* [MLOps Zoomcamp](https://docs.google.com/document/d/12TlBfhIiKtyBv8RnsoJR6F72bkPDGEvPOItJIxaEzE0/edit?tab=t.0)

#### Environment

* For this workshop, all you need is Python with Jupyter.
* I use GitHub Codespaces to run it (see [here](https://www.loom.com/share/80c17fbadc9442d3a4829af56514a194)) but you can use whatever environment you like.
* Also, you need an [OpenAI account](https://openai.com/) (or an alternative provider).

#### Setting up GitHub Codespaces

GitHub Codespaces is the recommended environment for this
workshop, but you can use any other environment with
Jupyter Notebook, including your laptop or Google Colab.

* Create a repository on GitHub and initialize it with a README.md
* Add the OpenAI key:
    * Go to Settings -> Secrets and Variables (under Security) -> Codespaces
    * Click "New repository secret"
    * Name: `OPENAI_API_KEY`, Secret: your key
    * Click "Add secret"
* Create a codespace
    * Click "Code" 
    * Select the "Codespaces" tab
    * Click "Create codespace on main"

#### Installing required libraries

Next we need to install the required libraries:

In [4]:
%pip install jupyter openai minsearch requests markdown pydantic-ai

Collecting pydantic-ai
  Downloading pydantic_ai-0.3.6-py3-none-any.whl.metadata (11 kB)
Collecting pydantic-ai-slim==0.3.6 (from pydantic-ai-slim[a2a,anthropic,bedrock,cli,cohere,evals,google,groq,mcp,mistral,openai,vertexai]==0.3.6->pydantic-ai)
  Downloading pydantic_ai_slim-0.3.6-py3-none-any.whl.metadata (3.8 kB)
Collecting eval-type-backport>=0.2.0 (from pydantic-ai-slim==0.3.6->pydantic-ai-slim[a2a,anthropic,bedrock,cli,cohere,evals,google,groq,mcp,mistral,openai,vertexai]==0.3.6->pydantic-ai)
  Downloading eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting griffe>=1.3.2 (from pydantic-ai-slim==0.3.6->pydantic-ai-slim[a2a,anthropic,bedrock,cli,cohere,evals,google,groq,mcp,mistral,openai,vertexai]==0.3.6->pydantic-ai)
  Downloading griffe-1.7.3-py3-none-any.whl.metadata (5.0 kB)
Collecting opentelemetry-api>=1.28.0 (from pydantic-ai-slim==0.3.6->pydantic-ai-slim[a2a,anthropic,bedrock,cli,cohere,evals,google,groq,mcp,mistral,openai,vertexai]==0.3.6->pydantic-ai

In [32]:
import os
import requests
from minsearch import AppendableIndex
from openai import OpenAI
from IPython.display import display, HTML
import markdown
from pydantic_ai import Agent, RunContext
from typing import Dict
import json


### Part 0: Basic RAG

#### RAG

RAG consists of 3 parts:
* Search
* Prompt
* LLM

In Python, it looks like this:

```
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer
```

Let's implement each component step-by-step.

#### Search

First, we implement a basic search function that queries our FAQ database. This function takes a query and returns relevant documents.

We will use `minsearch` for that.

Get the documents:

In [6]:
docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
documents_raw = docs_response.json()

documents = []

for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)

In [15]:
documents[2]

{'text': "Yes, even if you don't register, you're still eligible to submit the homeworks.\nBe aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.",
 'section': 'General course-related questions',
 'question': 'Course - Can I still join the course after the start date?',
 'course': 'data-engineering-zoomcamp'}

Index the documents:

In [7]:
index = AppendableIndex(
    text_fields = ["question", "text", "section"],
    keyword_fields = ["course"]
)

index.fit(documents)

<minsearch.append.AppendableIndex at 0x76ebf384b620>

Now search:

In [8]:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query = query,
        filter_dict = {'course': 'data-engineering-zoomcamp'},
        boost_dict = boost,
        num_results = 5,
        output_ids = True
    )

    return results

In [16]:
index.search("Can I still join the course?")

[{'text': 'Yes, you can. You won’t be able to submit some of the homeworks, but you can still take part in the course.\nIn order to get a certificate, you need to submit 2 out of 3 course projects and review 3 peers’ Projects by the deadline. It means that if you join the course at the end of November and manage to work on two projects, you will still be eligible for a certificate.',
  'section': 'General course-related questions',
  'question': 'The course has already started. Can I still join it?',
  'course': 'machine-learning-zoomcamp'},
 {'text': "Yes, even if you don't register, you're still eligible to submit the homeworks.\nBe aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.",
  'section': 'General course-related questions',
  'question': 'Course - Can I still join the course after the start date?',
  'course': 'data-engineering-zoomcamp'},
 {'text': "Here’s how you join a in Slack: https://slack.com/

In [17]:
search("Can I still join the course?")

[{'text': "Yes, even if you don't register, you're still eligible to submit the homeworks.\nBe aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.",
  'section': 'General course-related questions',
  'question': 'Course - Can I still join the course after the start date?',
  'course': 'data-engineering-zoomcamp',
  '_id': 2},
 {'text': "No, you can only get a certificate if you finish the course with a “live” cohort. We don't award certificates for the self-paced mode. The reason is you need to peer-review capstone(s) after submitting a project. You can only peer-review projects at the time the course is running.",
  'section': 'General course-related questions',
  'question': 'Certificate - Can I follow the course in a self-paced mode and get a certificate?',
  'course': 'data-engineering-zoomcamp',
  '_id': 11},
 {'text': 'Yes, we will keep all the materials after the course finishes, so you can follow the cou

**Explanation:**

- This function is the foundation of our RAG system
- It looks up relevant information in the FAQ
- The result is used to build context for the LLM
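
To make the roles of `filter_dict` and `boost_dict` more concrete, here is a toy sketch. It is not how `minsearch` works internally (minsearch scores with TF-IDF and cosine similarity); this version only counts shared words. The function `toy_search` and the sample documents are made up for illustration.

```python
# Toy illustration only: counts shared words instead of using TF-IDF.
def toy_search(docs, query, boost, filter_dict, num_results=5):
    query_words = set(query.lower().split())
    scored = []
    for doc in docs:
        # Keyword filter: skip documents whose keyword fields do not match exactly.
        if any(doc.get(key) != value for key, value in filter_dict.items()):
            continue
        score = 0.0
        for field in ("question", "text", "section"):
            field_words = set(doc.get(field, "").lower().split())
            overlap = len(query_words & field_words)
            # Fields missing from `boost` keep a neutral weight of 1.0.
            score += boost.get(field, 1.0) * overlap
        if score > 0:
            scored.append((score, doc))
    # Highest score first; sort on the score only so dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:num_results]]


toy_docs = [
    {"question": "Can I join late?", "text": "Yes, you can.", "section": "General", "course": "de"},
    {"question": "What is Docker?", "text": "You can join containers to a network.", "section": "Docker", "course": "de"},
    {"question": "Can I join late?", "text": "Yes.", "section": "General", "course": "ml"},
]

hits = toy_search(
    toy_docs,
    "can i join",
    boost={"question": 3.0, "section": 0.5},
    filter_dict={"course": "de"},
)
```

With these inputs, the document whose question matches the query ranks first because matches in `question` weigh three times more, and the `ml` document is excluded by the filter.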

#### Prompt

We create a function to format the search results into a structured context that our LLM can use.

In [18]:
prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>
""".strip()

In [21]:
def build_prompt(query, search_results):
    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

In [22]:
question = "Can I still join the course?"

In [23]:
search_results = search(question)

In [24]:
prompt = build_prompt(question, search_results)
print(prompt)

You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

<QUESTION>
Can I still join the course?
</QUESTION>

<CONTEXT>
section: General course-related questions
question: Course - Can I still join the course after the start date?
answer: Yes, even if you don't register, you're still eligible to submit the homeworks.
Be aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.

section: General course-related questions
question: Certificate - Can I follow the course in a self-paced mode and get a certificate?
answer: No, you can only get a certificate if you finish the course with a “live” cohort. We don't award certificates for the self-paced mode. The reason is you need to peer-review capstone(s) after submitting a project. You can only peer-review projects at the time the course is running.

section: General

**Explanation:**

- Takes search results
- Formats each document
- Puts everything in a prompt


#### The RAG Flow

We add a call to an LLM and combine everything into a complete RAG pipeline:

In [11]:
client = OpenAI()

In [12]:
def llm(prompt):
    response = client.chat.completions.create(
        model = 'gpt-4o-mini',
        messages = [{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

In [13]:
def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer

In [25]:
answer = llm(prompt)
print(answer)

Yes, you can still join the course even after the start date. You are eligible to submit the homework assignments, but be aware that there will be deadlines for the final projects, so make sure not to leave everything until the last minute.


**Explanation:**
- `build_prompt`: Formats the search results into a prompt
- `llm`: Makes the API call to the language model
- `rag`: Combines search, prompt building, and the LLM call into a single function
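
One way to check the wiring of the three components without spending API calls is to plug in stand-ins. The sketch below assumes nothing about the real index or model: `make_rag`, `fake_search`, `simple_build_prompt`, and `fake_llm` are hypothetical helpers written only for this example.

```python
def make_rag(search_fn, build_prompt_fn, llm_fn):
    """Compose the three RAG parts into one function."""
    def rag(query):
        search_results = search_fn(query)
        prompt = build_prompt_fn(query, search_results)
        return llm_fn(prompt)
    return rag


def fake_search(query):
    # Stand-in for the index lookup: always returns one canned FAQ entry.
    return [{"section": "General", "question": "Can I join late?", "text": "Yes."}]


def simple_build_prompt(query, search_results):
    context = "\n\n".join(
        f"question: {doc['question']}\nanswer: {doc['text']}" for doc in search_results
    )
    return f"QUESTION: {query}\nCONTEXT:\n{context}"


def fake_llm(prompt):
    # Stand-in for the model: reports the prompt size instead of answering.
    return f"(stub answer, prompt had {len(prompt)} characters)"


offline_rag = make_rag(fake_search, simple_build_prompt, fake_llm)
result = offline_rag("Can I still join?")
print(result)
```

Swapping `fake_llm` for the real `llm` function (and `fake_search` for `search`) gives the same flow as `rag` above.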

In [26]:
rag("How do I run Kafka in Docker?")

'To run Kafka in Docker, ensure that all your Kafka broker Docker containers are operational. You can verify this by using the command `docker ps` to check the status of your containers. If they are not running, navigate to the folder containing your Docker Compose YAML file and execute the command `docker compose up -d` to start all the instances.'

The next question has no answer in our FAQ. The RAG flow only uses the local context, so it can't answer it:

In [27]:
rag("How do I patch KDE under FreeBSD?")

"I'm sorry, but there is no information available in the provided context regarding how to patch KDE under FreeBSD."

The LLM itself knows more. If we ask it directly, without restricting it to the context, it answers:

In [28]:
print(llm("How do I patch KDE under FreeBSD?"))

Patching KDE under FreeBSD involves several steps, depending on whether you're looking to patch the source code or apply a specific fix/patch to an installed KDE application. Below are the general steps to patch KDE under FreeBSD:

### 1. Install the Required Packages

Before applying a patch, make sure you have the necessary tools installed:

```bash
pkg install git patch
```

### 2. Obtain the KDE Source Code

If you want to patch the source code instead of a binary, you'll need to check out the KDE source from the FreeBSD ports tree or from the official KDE repository.

#### From the Ports Tree

You can check out the KDE ports:

```bash
cd /usr/ports
portsnap fetch update
cd x11/kde5
```

#### From KDE's Git Repository

Alternatively, you can clone the KDE source code directly.

```bash
git clone https://invent.kde.org/<project>.git
cd <project>
```

Replace `<project>` with the specific KDE component you are interested in.

### 3. Apply the Patch

If you have a patch file, you can 

### Part 1: Agentic RAG

Now let's make our flow agentic.

#### Agents and Agentic Flows

Agents are AI systems that can:

- Make decisions about what actions to take
- Use tools to accomplish tasks
- Maintain state and context
- Learn from previous interactions
- Work towards specific goals

An agentic flow is not necessarily a completely independent agent,
but it can still make some decisions during the flow execution.

A typical agentic flow consists of:

1. Receiving a user request
2. Analyzing the request and available tools
3. Deciding on the next action
4. Executing the action using appropriate tools
5. Evaluating the results
6. Either completing the task or continuing with more actions

The key difference from basic RAG is that agents can:

- Make multiple search queries
- Combine information from different sources
- Decide when to stop searching
- Use their own knowledge when appropriate
- Chain multiple actions together

So in agentic RAG, the system:

- has access to the history of previous actions
- makes decisions independently based on the current information
  and the previous actions
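
The loop described above can be sketched in a few lines. This is a simplified illustration, not the implementation we build below: `run_agent` and `scripted_decide` are hypothetical names, and the "decision" is scripted instead of coming from an LLM.

```python
def run_agent(request, decide, tools, max_steps=5):
    history = []  # state: every action taken so far, with its result
    for _ in range(max_steps):
        # Analyze the request plus history and choose the next action.
        decision = decide(request, history)
        if decision["action"] == "FINISH":
            return decision["answer"], history
        # Execute the chosen action with the matching tool.
        tool = tools[decision["action"]]
        result = tool(decision["input"])
        history.append(
            {"action": decision["action"], "input": decision["input"], "result": result}
        )
    # Stop criterion reached without a final answer.
    return None, history


def scripted_decide(request, history):
    # Stand-in for an LLM call: search once, then answer from the result.
    if not history:
        return {"action": "SEARCH", "input": request}
    return {"action": "FINISH", "answer": f"Based on FAQ: {history[-1]['result']}"}


tools = {"SEARCH": lambda query: f"FAQ entry about '{query}'"}

answer, history = run_agent("how do I join?", scripted_decide, tools)
print(answer)
```

The important parts are the history that is passed back into every decision and the `max_steps` limit that guarantees the loop ends.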

Let's implement this step by step.

#### Making RAG More Agentic

First, we'll take the prompt we have so far and make it a little more "agentic":

- Tell the LLM that it can answer the question directly or look up context
- Provide output templates
- Clearly show the source of the answer

In [29]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}
""".strip()

In [30]:
question = 'How can I run Docker on Windows 10?'
context = 'EMPTY'

prompt = prompt_template.format(question=question, context=context)
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To run Docker on Windows 10, you need to follow these steps: 1. Ensure you have Windows 10 Pro, Education, or Enterprise (version 15063 or later) since Docker requires Hyper-V support which is available only in these editions. 2. Download Docker Desktop for Windows from the official Docker website. 3. Run the installer and follow the prompts to complete the installation. 4. After installation, Docker Desktop may prompt you to enable the WSL 2 feature if you haven't already. If prompted, follow the instructions to set up Windows Subsystem for Linux version 2. 5. Once installed, start Docker Desktop. You might need to sign in with a Docker Hub account or create one if you don’t have it. 6. After starting Docker, you can test it by running a simple command in PowerShell or the Command Prompt, such as `docker --version` to check if Docker is installed correctly. You can also run a sample container (e.g., `docker run hello-world`) to verify everything is wor

If we ask something course-specific, which the model can't reliably answer on its own:

In [31]:
question = "how do I join the course?"
context = "EMPTY"

prompt = prompt_template.format(question=question, context=context)
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To join the course, you typically need to register through the course website or platform where it's offered. Look for a registration or enrollment button, and follow the instructions to create an account or log in. If there are prerequisites or deadlines, make sure to check those as well.",
"source": "OWN_KNOWLEDGE"
}

The model's output can vary between runs. Here it answered from its own knowledge, but when we run the same prompt again and parse the response, it asks for a search:


In [None]:
answer_json = llm(prompt)

In [34]:
answer = json.loads(answer_json)
answer['action']

'SEARCH'
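
Note that `json.loads` assumes the model returns bare JSON. Models sometimes wrap the JSON in a Markdown code fence or reply with plain text, and then parsing fails. A more defensive parser could look like the sketch below. The helper `parse_llm_json` and the `"UNPARSED"` source label are assumptions made for this example, not part of the workshop code.

```python
import json

FENCE = "`" * 3  # three backticks, written this way to keep the example readable


def parse_llm_json(raw):
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to treating the raw text as a plain answer.
        return {"action": "ANSWER", "answer": raw, "source": "UNPARSED"}


plain = parse_llm_json('{"action": "SEARCH", "reasoning": "need FAQ"}')
fenced = parse_llm_json(FENCE + 'json\n{"action": "SEARCH"}\n' + FENCE)
broken = parse_llm_json("Sorry, I cannot help with that.")
```

In the rest of the workshop we keep using plain `json.loads` for brevity.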

Let's implement the search:

In [36]:
def build_context(search_results):
    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    return context.strip()
    

In [37]:
search_results = search(question)
context = build_context(search_results)
prompt = prompt_template.format(question=question, context=context)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
how do I join the course?
</QUESTION>

<CONTEXT>
section: General course-related questions
question: Course - Can I still join the course after the start date?
answer: Yes, even if you don't register, you're still eligible to submit the homeworks.
Be aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute.

section: General course-related questions
question: Course - When will the course start?
answer: The purpose of this document is to capture frequently asked technical questions
The exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1
Subscribe to course public Google Calendar (it works from Desktop only).
Register before the course starts

In [38]:
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To join the course, you need to register before the course starts using the provided registration link. The course begins on 15th January 2024 at 17h00. Make sure to also join the course's Telegram channel for announcements and register in DataTalks.Club's Slack to stay updated.",
"source": "CONTEXT"
}


Let's put this together:

- First, ask the LLM with an empty context: it either answers from its own knowledge or requests a search
- If it requests a search, do the lookup and ask again with the retrieved context

In [41]:
def agentic_rag_v1(question):
    context = "EMPTY"
    prompt = prompt_template.format(question=question, context=context)
    answer_json = llm(prompt)
    answer = json.loads(answer_json)
    print(answer)

    if answer['action'] == 'SEARCH':
        print('need to perform search...')
        search_results = search(question)
        context = build_context(search_results)

        prompt = prompt_template.format(question=question, context=context)
        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(answer)

    return answer

In [42]:
agentic_rag_v1('how do I join the course?')

{'action': 'ANSWER', 'answer': 'To join the course, you typically need to visit the course website or the platform where the course is hosted. There, you should find an option to enroll or register. You may need to create an account or log in if you already have one. Follow the provided instructions to complete your enrollment.', 'source': 'OWN_KNOWLEDGE'}


{'action': 'ANSWER',
 'answer': 'To join the course, you typically need to visit the course website or the platform where the course is hosted. There, you should find an option to enroll or register. You may need to create an account or log in if you already have one. Follow the provided instructions to complete your enrollment.',
 'source': 'OWN_KNOWLEDGE'}

In [43]:
agentic_rag_v1('how to patch KDE under FreeBSD?')

{'action': 'ANSWER', 'answer': "To patch KDE under FreeBSD, you can typically follow these steps:\n\n1. **Update Ports Collection:** Make sure your FreeBSD ports tree is up to date. You can update it by using the `portsnap` command:\n   ```\n   portsnap fetch update\n   ```\n\n2. **Navigate to KDE Port Directory:** Go to the directory where KDE is installed. For example, if you are using KDE Plasma, you would navigate to the corresponding KDE port directory like this:\n   ```\n   cd /usr/ports/x11/kde5\n   ```\n\n3. **Fetch the Latest Patch:** If there are any patches available for the KDE port you are using, you can fetch them. Check the KDE development site or FreeBSD's patches repository.\n\n4. **Apply the Patch:** After downloading the patch, you can usually apply it using the `patch` utility. For example:\n   ```\n   patch < path/to/patchfile.patch\n   ```\n\n5. **Build and Install:** Once the patch is applied, you need to rebuild and install KDE. You can do this with:\n   ```\n  

{'action': 'ANSWER',
 'answer': "To patch KDE under FreeBSD, you can typically follow these steps:\n\n1. **Update Ports Collection:** Make sure your FreeBSD ports tree is up to date. You can update it by using the `portsnap` command:\n   ```\n   portsnap fetch update\n   ```\n\n2. **Navigate to KDE Port Directory:** Go to the directory where KDE is installed. For example, if you are using KDE Plasma, you would navigate to the corresponding KDE port directory like this:\n   ```\n   cd /usr/ports/x11/kde5\n   ```\n\n3. **Fetch the Latest Patch:** If there are any patches available for the KDE port you are using, you can fetch them. Check the KDE development site or FreeBSD's patches repository.\n\n4. **Apply the Patch:** After downloading the patch, you can usually apply it using the `patch` utility. For example:\n   ```\n   patch < path/to/patchfile.patch\n   ```\n\n5. **Build and Install:** Once the patch is applied, you need to rebuild and install KDE. You can do this with:\n   ```\n 

### Part 2: Agentic Search

So far, we have had only two actions: search and answer.

But we can let our "agent" formulate one or more search queries and repeat this for a few iterations until it finds an answer.

Let's build a prompt:

- List available actions:
    - Search in FAQ
    - Answer using own knowledge
    - Answer using information extracted from FAQ 
- Provide access to the previous actions
- Have clear stop criteria (no more than X iterations)
- Specify the output format, so it's easier to parse

In [44]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than {max_iterations} iterations for a given student question.
The current iteration number: {iteration_number}. If we exceed the allowed number 
of iterations, give the best possible answer with the provided information.

Output templates:

If you want to perform search, use this template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>",
"keywords": ["search query 1", "search query 2", ...]
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER_CONTEXT",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}

<QUESTION>
{question}
</QUESTION>

<SEARCH_QUERIES>
{search_queries}
</SEARCH_QUERIES>

<CONTEXT> 
{context}
</CONTEXT>

<PREVIOUS_ACTIONS>
{previous_actions}
</PREVIOUS_ACTIONS>
""".strip()

Our code becomes more complicated. For the first iteration, we have:

In [45]:
question = "how do I join the course?"

search_queries = []
search_results = []
previous_actions = []
context = build_context(search_results)

prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=1
)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number

In [46]:
answer_json = llm(prompt)
answer = json.loads(answer_json)
print(json.dumps(answer, indent=2))

{
  "action": "SEARCH",
  "reasoning": "To provide accurate information on how to join the course, I need to look for relevant details in our FAQ database.",
  "keywords": [
    "join course",
    "course enrollment",
    "how to register for course"
  ]
}


We need to save the actions:

In [47]:
previous_actions.append(answer)

Save the search queries:

In [48]:
keywords = answer['keywords']
search_queries.extend(keywords)

In [49]:
for k in keywords:
    res = search(k)
    search_results.extend(res)

Some of the search results will be duplicates, so we need to remove them:

In [50]:
def dedup(seq):
    seen = set()
    result = []
    for el in seq:
        _id = el['_id']
        if _id in seen:
            continue
        seen.add(_id)
        result.append(el)
    return result

In [51]:
search_results = dedup(search_results)

Now let's run another iteration: use the same code as before, but skip the variable initialization and increase the iteration number:

In [52]:
context = build_context(search_results)

prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=2
)
print(prompt)

answer_json = llm(prompt)
answer = json.loads(answer_json)
print(json.dumps(answer, indent=2))

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number

Putting everything together:

In [55]:
question = "what do I need to do to be successful at module 1?"

search_queries = []
search_results = []
previous_actions = []


iteration = 0

while True:
    print(f'ITERATION #{iteration}...')

    context = build_context(search_results)
    prompt = prompt_template.format(
        question=question,
        context=context,
        search_queries="\n".join(search_queries),
        previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
        max_iterations=3,
        iteration_number=iteration
    )

    print(prompt)

    answer_json = llm(prompt)
    answer = json.loads(answer_json)
    print(json.dumps(answer, indent=2))

    previous_actions.append(answer)

    action = answer['action']
    if action != 'SEARCH':
        break

    keywords = answer['keywords']
    search_queries = list(set(search_queries) | set(keywords))
    
    for k in keywords:
        res = search(k)
        search_results.extend(res)

    search_results = dedup(search_results)
    
    iteration = iteration + 1
    if iteration >= 4:
        break

    print()

ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current 

Make it a function:

In [56]:
def agentic_search(question):
    search_queries = []
    search_results = []
    previous_actions = []

    iteration = 0
    
    while True:
        print(f'ITERATION #{iteration}...')
    
        context = build_context(search_results)
        prompt = prompt_template.format(
            question=question,
            context=context,
            search_queries="\n".join(search_queries),
            previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
            max_iterations=3,
            iteration_number=iteration
        )
    
        print(prompt)
    
        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(json.dumps(answer, indent=2))

        previous_actions.append(answer)
    
        action = answer['action']
        if action != 'SEARCH':
            break
    
        keywords = answer['keywords']
        search_queries = list(set(search_queries) | set(keywords))

        for k in keywords:
            res = search(k)
            search_results.extend(res)
    
        search_results = dedup(search_results)
        
        iteration = iteration + 1
        if iteration >= 4:
            break
    
        print()

    return answer

In [57]:
agentic_search('how do I prepare for the course?')

ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current 

{'action': 'ANSWER',
 'answer': "To prepare for the course effectively, consider the following tips:\n\n1. **Familiarize Yourself with Course Materials:** Check the course syllabus and any materials provided beforehand to understand what tools and resources you'll need.\n2. **Join Relevant Channels:** Make sure to join the course's Telegram channel and Slack workspace to stay updated on announcements and engage with peers.\n3. **Set Up Your Environment:** If the course involves coding or using specific tools, prepare your workspace by installing the necessary software and confirming that your hardware meets the requirements.\n4. **Time Management:** Allocate specific times in your schedule for studying to ensure consistency and sufficient preparation.\n5. **Engage in Preliminary Readings:** If the course recommends any readings or preparatory material, engage with these resources to build a foundational understanding before the course starts.\n6. **Review Technical Skills:** Brush up o

### Part 3: Function Calling

#### Function calling in OpenAI

We put all this logic inside our prompt.

But OpenAI and other providers offer a convenient API for adding extra functionality like search.

* https://platform.openai.com/docs/guides/function-calling

It's called "function calling" - you define functions that the model can call, and if it decides to make a call, it returns structured output for that.

For example, let's take our `search` function:


In [58]:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query = query,
        filter_dict = {'course': 'data-engineering-zoomcamp'},
        boost_dict = boost,
        num_results = 5,
        output_ids = True
    )

    return results

In [59]:
search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }
}

Here we have:

- `name`: `search`
- `description`: when to use it
- `parameters`: all the arguments that the function can take and their description
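
Since the description is written by hand, it can drift from the actual function signature. Here is a minimal sketch of a sanity check, assuming the description follows the format above. `check_tool_description`, `lookup`, and `lookup_tool` are hypothetical names used only for illustration; they are not part of the OpenAI SDK.

```python
import inspect


def check_tool_description(func, description):
    """Return a list of mismatches between a function and its tool description."""
    problems = []

    if description["name"] != func.__name__:
        problems.append(f"name mismatch: {description['name']} != {func.__name__}")

    params = inspect.signature(func).parameters
    described = set(description["parameters"]["properties"])
    actual = set(params)

    for name in sorted(actual - described):
        problems.append(f"parameter not described: {name}")
    for name in sorted(described - actual):
        problems.append(f"described but not in signature: {name}")

    # Every parameter without a default value should be listed as required
    required = set(description["parameters"].get("required", []))
    for name, param in params.items():
        if param.default is inspect.Parameter.empty and name not in required:
            problems.append(f"missing from 'required': {name}")

    return problems


# Stand-in function and description (hypothetical, for this example only)
def lookup(query):
    return []


lookup_tool = {
    "type": "function",
    "name": "lookup",
    "description": "Look something up",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string", "description": "Query text"}},
        "required": ["query"],
        "additionalProperties": False,
    },
}

print(check_tool_description(lookup, lookup_tool))  # prints []
```

An empty list means the description and the signature agree on the function name, the parameter names, and the required parameters.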

In order to use function calling, we'll use a newer API - the "responses" API (not "chat completions" as previously):

In [60]:
question = "How do I do well in module 1?"

developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.
""".strip()

tools = [search_tool]

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)
response.output

[ResponseFunctionToolCall(arguments='{"query":"module 1 success tips"}', call_id='call_BSJZ9O1Vd5huaJ44Iu9C1SEa', name='search', type='function_call', id='fc_686ae1076198819b8db38ba4d8ef1f9708e11cc44508575f', status='completed')]

If the model thinks we should make a function call, it will tell us.

Let's make a call to `search`:

In [61]:
calls = response.output
call = calls[0]
call

ResponseFunctionToolCall(arguments='{"query":"module 1 success tips"}', call_id='call_BSJZ9O1Vd5huaJ44Iu9C1SEa', name='search', type='function_call', id='fc_686ae1076198819b8db38ba4d8ef1f9708e11cc44508575f', status='completed')

In [62]:
call_id = call.call_id
call_id

'call_BSJZ9O1Vd5huaJ44Iu9C1SEa'

In [63]:
f_name = call.name
f_name

'search'

In [64]:
arguments = json.loads(call.arguments)
arguments

{'query': 'module 1 success tips'}

Using `f_name` we can find the function we need:

In [65]:
f = globals()[f_name]

and invoke it with the arguments:

In [66]:
results = f(**arguments)
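
Looking functions up in `globals()` is convenient in a notebook, but it lets the model's output select any global name. A minimal sketch of an alternative, assuming an explicit registry of allowed functions. `TOOL_REGISTRY`, `call_tool`, and `echo` are hypothetical names used only for illustration.

```python
import json


# Stand-in tool function (hypothetical, for this example only)
def echo(text):
    return {"echo": text}


# Only functions listed here can be invoked by the model
TOOL_REGISTRY = {"echo": echo}


def call_tool(name, arguments_json):
    """Look up a registered function by name and call it with JSON-encoded arguments."""
    if name not in TOOL_REGISTRY:
        raise ValueError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)
    return TOOL_REGISTRY[name](**arguments)


print(call_tool("echo", '{"text": "hello"}'))  # prints {'echo': 'hello'}
```

With the workshop's code, the registry would contain `search` instead of `echo`, and `name` and `arguments_json` would come from `call.name` and `call.arguments`.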

Now let's serialize the results to JSON:

In [67]:
search_results = json.dumps(results, indent=2)
print(search_results)

[
  {
    "text": "Even after installing pyspark correctly on linux machine (VM ) as per course instructions, faced a module not found error in jupyter notebook .\nThe solution which worked for me(use following in jupyter notebook) :\n!pip install findspark\nimport findspark\nfindspark.init()\nThereafter , import pyspark and create spark contex<<t as usual\nNone of the solutions above worked for me till I ran !pip3 install pyspark instead !pip install pyspark.\nFilter based on conditions based on multiple columns\nfrom pyspark.sql.functions import col\nnew_final.filter((new_final.a_zone==\"Murray Hill\") & (new_final.b_zone==\"Midwood\")).show()\nKrishna Anand",
    "section": "Module 5: pyspark",
    "question": "Module Not Found Error in Jupyter Notebook .",
    "course": "data-engineering-zoomcamp",
    "_id": 322
  },
  {
    "text": "You need to look for the Py4J file and note the version of the filename. Once you know the version, you can update the export command accordingly, th

Then we save both the model's function call and the result of that call in the message history:

In [68]:
chat_messages.append(call)

chat_messages.append({
    "type": "function_call_output",
    "call_id": call.call_id,
    "output": search_results
})

Now `chat_messages` contains both the call description (so it keeps track of history) and the results.

Let's make another call to the model:

In [69]:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

This time it should be a text response (though it can also be another function call):

In [70]:
r = response.output[0]
print(r.content[0].text)

To excel in Module 1, here are some effective tips based on common challenges and solutions from your course:

1. **Understand the Basics**: Make sure you have a solid grasp of Docker and Terraform, as these are the foundations for this module. Review any provided introductory materials.

2. **Set Up Your Environment**: Properly install all required tools. If you're encountering issues (like missing Python packages), try:
   - Running `pip install psycopg2-binary` if you face a `ModuleNotFoundError` related to psycopg2.
   - Updating your existing installations with `pip install --upgrade psycopg2-binary`, if necessary.

3. **Review Common Errors**: Familiarize yourself with the common errors encountered in this module, such as:
   - Ensure you have the correct connection string when using SQLAlchemy.
   - Double-check that environment variables are set properly as per the course instructions.

4. **Engage with Resources**: Utilize any videos, reading materials, and forums provided for

#### Making Multiple Calls

What if we want to make multiple calls?  Change the developer prompt a little:

In [71]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.
If you look up something in FAQ, convert the student question into multiple queries.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

Let's organize the code.

In [72]:
def do_call(tool_call_response):
    function_name = tool_call_response.name
    arguments = json.loads(tool_call_response.arguments)

    f = globals()[function_name]
    result = f(**arguments)

    return {
        "type": "function_call_output",
        "call_id": tool_call_response.call_id,
        "output": json.dumps(result, indent=2),
    }

Iterate over responses:

In [73]:
for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)

    if entry.type == 'function_call':      
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text)

function_call
function_call
function_call


The first response will probably contain only function calls, so let's make another request:

In [74]:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)
    print()

    if entry.type == 'function_call':      
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text) 

message

To do well in Module 1, here are some tips and strategies:

1. **Understand the Course Content**:
   - Focus on the core concepts related to Docker and Terraform, as these are essential for your understanding of the module.

2. **Practice Hands-On Exercises**:
   - Engage in practical exercises to apply what you've learned. This includes setting up Docker containers and managing infrastructure with Terraform.

3. **Troubleshooting Common Issues**:
   - Be prepared to encounter common errors. For instance, you might face errors like "No module named 'psycopg2'." If you encounter this, try running:
     ```bash
     pip install psycopg2-binary
     ```
     If that fails, consider updating your packages or installing PostgreSQL if you need it for your work.

4. **Utilize Community Resources**:
   - Learn from discussions and solutions that other students have shared in forums or groups. For example, referencing common fixes for module import errors in Python.

5. **Review Exampl

This one is a text response.

#### Putting Everything Together

Let's make two loops: 

- First is the main Q&A loop - ask question, get back the answer
- Second is the request loop - send requests until there's a message reply from the API

Note: type `stop` to end the chat (this also applies when running in VS Code).

In [75]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.
When using FAQ, perform deep topic exploration: make one request to FAQ,
and then based on the results, make more requests.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

In [76]:
while True: # main Q&A loop
    question = input() # How do I do my best for module 1?
    if question == 'stop':
        break

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True: # request-response loop - query the API until we get a message
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False
        
        for entry in response.output:
            chat_messages.append(entry)
        
            if entry.type == 'function_call':      
                print('function_call:', entry)
                print()
                result = do_call(entry)
                chat_messages.append(result)
            elif entry.type == 'message':
                print(entry.content[0].text)
                print()
                has_messages = True

        if has_messages:
            break

function_call: ResponseFunctionToolCall(arguments='{"query":"module 1 best practices"}', call_id='call_fHDi4R1NF7WAchwYmO7jw4Yh', name='search', type='function_call', id='fc_686ae43df82081998e24e6d256d4c3890d93f1fa785bda70', status='completed')

function_call: ResponseFunctionToolCall(arguments='{"query":"module 1 study tips"}', call_id='call_nKPv12FFSgFdceAavxX35ZXv', name='search', type='function_call', id='fc_686ae43ec0d48199a66822b9430bb0780d93f1fa785bda70', status='completed')

To excel in Module 1, which covers Docker and Terraform, here are some best practices and tips you can follow:

1. **Understand Docker Basics**: Familiarize yourself with Docker concepts such as containers, images, and Dockerfiles. This foundational knowledge is crucial as it's heavily utilized in the module.

2. **Follow Installation Instructions Carefully**: Make sure you correctly install any required software. For example, installing the `psycopg2` package may be necessary if you're working with SQLAlch

It's also possible that a response contains both a message and tool calls, but we'll ignore this case for now. (It's easy to fix: check that there are no function calls, and only then ask the user for input.)
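
The fix mentioned above can be sketched as follows. This is a minimal sketch, assuming each output entry has a `type` attribute as in the loop above. `has_function_calls` is a hypothetical helper, and the `SimpleNamespace` objects are fake entries standing in for real API output.

```python
from types import SimpleNamespace


def has_function_calls(output):
    """Return True if any entry in the response output is a function call."""
    return any(entry.type == "function_call" for entry in output)


# Fake entries standing in for real API output
mixed = [SimpleNamespace(type="message"), SimpleNamespace(type="function_call")]
only_message = [SimpleNamespace(type="message")]

print(has_function_calls(mixed))         # prints True
print(has_function_calls(only_message))  # prints False
```

In the inner loop, the exit condition `if has_messages: break` could then become `if not has_function_calls(response.output): break`, so the user is asked for input only after all function calls have been processed.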

Let's make it a bit nicer using HTML:

In [77]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

# Chat loop
while True:
    question = input()
    if question.strip().lower() == 'stop':
        print("Chat ended.")
        break
    print()

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True:  # inner request loop
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False

        for entry in response.output:
            chat_messages.append(entry)

            if entry.type == "function_call":
                result = do_call(entry)
                chat_messages.append(result)
                display_function_call(entry, result)

            elif entry.type == "message":
                display_response(entry)
                has_messages = True

        if has_messages:
            break

Chat ended.
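
The loop above relies on two helpers, `display_function_call` and `display_response`, that are not shown in this section. Below is a minimal sketch of what they might look like, assuming the code runs inside Jupyter (so `IPython.display` is available) and that `result` is the dictionary returned by `do_call`. The HTML layout and the `function_call_html` / `response_html` helpers are assumptions, not the workshop's actual implementation.

```python
import html
from types import SimpleNamespace


def function_call_html(entry, result):
    """Build a collapsible HTML block describing one function call."""
    return (
        "<details>"
        f"<summary>Function call: <tt>{html.escape(entry.name)}"
        f"({html.escape(entry.arguments)})</tt></summary>"
        f"<pre>{html.escape(result['output'])}</pre>"
        "</details>"
    )


def response_html(entry):
    """Build an HTML block with the assistant's text (escaped, not rendered as markdown)."""
    text = entry.content[0].text
    return f"<div><b>Assistant:</b><pre>{html.escape(text)}</pre></div>"


def display_function_call(entry, result):
    from IPython.display import display, HTML  # imported lazily: needs Jupyter
    display(HTML(function_call_html(entry, result)))


def display_response(entry):
    from IPython.display import display, HTML  # imported lazily: needs Jupyter
    display(HTML(response_html(entry)))


# Fake entry standing in for a real function call from the API
fake_call = SimpleNamespace(name="search", arguments='{"query": "docker"}')
print(function_call_html(fake_call, {"output": "[]"}))
```

Keeping the HTML construction separate from the `display` calls makes the formatting easy to inspect outside a notebook.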


#### Using Multiple Tools

What if we also want to use this chat app to add new entries to the FAQ?  We'll need another function for it:

In [78]:
def add_entry(question, answer):
    doc = {
        'question': question,
        'text': answer,
        'section': 'user added',
        'course': 'data-engineering-zoomcamp'
    }
    index.append(doc)

Description:

In [79]:
add_entry_description = {
    "type": "function",
    "name": "add_entry",
    "description": "Add an entry to the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question to be added to the FAQ database",
            },
            "answer": {
                "type": "string",
                "description": "The answer to the question",
            }
        },
        "required": ["question", "answer"],
        "additionalProperties": False
    }
}

We can just reuse the previous code. But we can also clean it up
and make it more modular.

See the result in [`chat_assistant.py`](chat_assistant.py)

You can download it using `wget`:

```bash
wget https://raw.githubusercontent.com/alexeygrigorev/rag-agents-workshop/refs/heads/main/chat_assistant.py
```

Here we define multiple classes:

- `Tools` - manages function tools for the agent
    - `add_tool(function, description)`: Register a function with its description
    - `get_tools()`: Return list of registered tool descriptions
    - `function_call(tool_call_response)`: Execute a function call and return result
- `ChatInterface` - handles user input and display formatting
    - `input()`: Get user input
    - `display(message)`: Print a message
    - `display_function_call(entry, result)`: Show function calls in HTML format
    - `display_response(entry)`: Display AI responses with markdown
- `ChatAssistant` - main orchestrator for chat conversations.
    - `__init__(tools, developer_prompt, chat_interface, client)`: Initialize assistant
    - `gpt(chat_messages)`: Make OpenAI API calls
    - `run()`: Main chat loop handling user input and AI responses
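
To give an idea of how the `Tools` class described above could work, here is a minimal sketch. It is not the actual content of `chat_assistant.py`: the class name `SimpleTools`, its internal dictionaries, and the `echo` function are assumptions made for illustration, and the fake tool call stands in for a real API response.

```python
import json
from types import SimpleNamespace


class SimpleTools:
    """Keeps tool descriptions and the functions that implement them."""

    def __init__(self):
        self.tools = {}      # name -> description sent to the API
        self.functions = {}  # name -> Python function to execute

    def add_tool(self, function, description):
        name = description["name"]
        self.tools[name] = description
        self.functions[name] = function

    def get_tools(self):
        return list(self.tools.values())

    def function_call(self, tool_call_response):
        # Only registered functions can be called, unlike a globals() lookup
        function = self.functions[tool_call_response.name]
        arguments = json.loads(tool_call_response.arguments)
        result = function(**arguments)
        return {
            "type": "function_call_output",
            "call_id": tool_call_response.call_id,
            "output": json.dumps(result, indent=2),
        }


# Stand-in tool and fake tool call (hypothetical, for this example only)
def echo(text):
    return {"echo": text}


tools_sketch = SimpleTools()
tools_sketch.add_tool(echo, {"type": "function", "name": "echo"})

fake_call = SimpleNamespace(name="echo", arguments='{"text": "hi"}', call_id="call_1")
print(tools_sketch.function_call(fake_call)["call_id"])  # prints call_1
```

The returned dictionary has the same shape as the one produced by `do_call` earlier, so it can be appended to `chat_messages` directly.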

In [80]:
import chat_assistant

tools = chat_assistant.Tools()
tools.add_tool(search, search_tool)

tools.get_tools()

[{'type': 'function',
  'name': 'search',
  'description': 'Search the FAQ database',
  'parameters': {'type': 'object',
   'properties': {'query': {'type': 'string',
     'description': 'Search query text to look up in the course FAQ.'}},
   'required': ['query'],
   'additionalProperties': False}}]

In [81]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

In [82]:
chat_interface = chat_assistant.ChatInterface()

chat = chat_assistant.ChatAssistant(
    tools=tools,
    developer_prompt=developer_prompt,
    chat_interface=chat_interface,
    client=client
)

Run the chat assistant:

In [83]:
chat.run()

Chat ended.


Now let's add the `add_entry` tool along with its description:

In [84]:
tools.add_tool(add_entry, add_entry_description)
tools.get_tools()

[{'type': 'function',
  'name': 'search',
  'description': 'Search the FAQ database',
  'parameters': {'type': 'object',
   'properties': {'query': {'type': 'string',
     'description': 'Search query text to look up in the course FAQ.'}},
   'required': ['query'],
   'additionalProperties': False}},
 {'type': 'function',
  'name': 'add_entry',
  'description': 'Add an entry to the FAQ database',
  'parameters': {'type': 'object',
   'properties': {'question': {'type': 'string',
     'description': 'The question to be added to the FAQ database'},
    'answer': {'type': 'string', 'description': 'The answer to the question'}},
   'required': ['question', 'answer'],
   'additionalProperties': False}}]

In [85]:
chat.run()

Chat ended.


Let's check that the new entry is in the index:

In [86]:
index.docs[-1]

{'question': 'How do I do well for module 1?',
 'text': 'To do well in Module 1 of the course, here are some tips:\n\n1. **Understand the Basics**: Familiarize yourself with Docker and Terraform concepts, as they are pivotal for this module.\n2. **Environment Setup**: Make sure your environment is set up correctly. This includes installing necessary packages like `psycopg2` for PostgreSQL. If you encounter errors like "ModuleNotFoundError: No module named \'psycopg2\'", try running:\n   ```\n   pip install psycopg2-binary\n   ```\n   or (if you already have it):\n   ```\n   pip install psycopg2-binary --upgrade\n   ```\n3. **Use Correct Connection Strings**: If you face errors with SQLAlchemy, ensure you’re using the correct connection string format. Instead of:\n   ```python\n   create_engine(\'postgresql://root:root@localhost:5432/ny_taxi\')\n   ```\n   Use:\n   ```python\n   conn_string = "postgresql+psycopg://root:root@localhost:5432/ny_taxi"\n   engine = create_engine(conn_string)

### Part 4: Using PydanticAI

#### Installing and Using PydanticAI

There are frameworks that make it easier for us to create agents.

One of them is [PydanticAI](https://ai.pydantic.dev/agents/).

To install: `pip install pydantic-ai`

To import: `from pydantic_ai import Agent, RunContext`

Create an agent:

In [87]:
chat_agent = Agent(  
    'openai:gpt-4o-mini',
    system_prompt=developer_prompt
)

Now we can use it to automate tool description:

In [88]:
@chat_agent.tool
def search_tool(ctx: RunContext, query: str) -> list[dict]:
    """
    Search the FAQ for relevant entries matching the query.

    Parameters
    ----------
    query : str
        The search query string provided by the user.

    Returns
    -------
    list
        A list of search results (up to 5), each containing relevance information 
        and associated output IDs.
    """
    print(f"search('{query}')")
    return search(query)


@chat_agent.tool
def add_entry_tool(ctx: RunContext, question: str, answer: str) -> None:
    """
    Add a new question-answer entry to FAQ.

    This function creates a document with the given question and answer, 
    tagging it as user-added content.

    Parameters
    ----------
    question : str
        The question text to be added to the index.

    answer : str
        The answer or explanation corresponding to the question.

    Returns
    -------
    None
    """
    return add_entry(question, answer)

PydanticAI reads each function's signature and docstring to create the tool definition automatically, so we don't need to write it by hand.
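
To illustrate the general idea (this is not how PydanticAI implements it internally), here is a minimal sketch that derives a tool description from a function's type hints and docstring using the standard `inspect` module. `describe_function`, `PYTHON_TO_JSON`, and `add_note` are hypothetical names, and the type mapping covers only a few basic types.

```python
import inspect

# Simplified mapping from Python types to JSON Schema types (assumption)
PYTHON_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(func):
    """Build a tool description from a function's signature and docstring."""
    signature = inspect.signature(func)
    properties = {}
    required = []

    for name, param in signature.parameters.items():
        # Unknown or missing annotations fall back to "string"
        json_type = PYTHON_TO_JSON.get(param.annotation, "string")
        properties[name] = {"type": json_type}
        if param.default is inspect.Parameter.empty:
            required.append(name)

    doc = inspect.getdoc(func) or ""
    summary = doc.splitlines()[0] if doc else ""

    return {
        "type": "function",
        "name": func.__name__,
        "description": summary,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }


# Stand-in function (hypothetical, for this example only)
def add_note(question: str, answer: str, priority: int = 0) -> None:
    """Add a note with a question and an answer."""


description = describe_function(add_note)
print(description["description"])             # prints Add a note with a question and an answer.
print(description["parameters"]["required"])  # prints ['question', 'answer']
```

The result has the same shape as the hand-written `search_tool` and `add_entry_description` dictionaries from Part 3.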

Let's use it:

In [89]:
user_prompt = "I just discovered the course. Can I join now?"
agent_run = await chat_agent.run(user_prompt)
print(agent_run.output)

search('Can I join the course now?')
Yes, you can still join the course even if you haven't registered. You are eligible to start learning and submit your homework without registering. However, keep in mind that there are deadlines for the final projects, so try not to leave everything to the last minute.

Would you like to know more about the course materials or deadlines?


If you want to learn more about implementing chat applications with PydanticAI:

- https://ai.pydantic.dev/message-history/
- https://ai.pydantic.dev/examples/chat-app/


### Wrap Up

In this workshop, we took our RAG application and made it agentic, by first tweaking the prompts, and then using the "function calling" functionality from OpenAI.

At the end, we put all the logic into the `chat_assistant.py` script, and also explored PydanticAI to make it simpler.

What's next:

- MCP
- Agent Deployment
- Agent Monitoring