# Agents + RAG - Workshop

In [1]:
# install minsearch
!pip install minsearch

Collecting minsearch
  Downloading minsearch-0.0.4-py3-none-any.whl.metadata (8.1 kB)
Downloading minsearch-0.0.4-py3-none-any.whl (11 kB)
Installing collected packages: minsearch
Successfully installed minsearch-0.0.4


In [2]:
# get the documents
import requests 

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
documents_raw = docs_response.json()

documents = []

for course in documents_raw:
    course_name = course['course']

    for doc in course['documents']:
        doc['course'] = course_name
        documents.append(doc)

In [3]:
# index the documents
from minsearch import AppendableIndex

index = AppendableIndex(
    text_fields=["question", "text", "section"],
    keyword_fields=["course"]
)

index.fit(documents)

<minsearch.append.AppendableIndex at 0x103ce9000>

In [4]:
# now search for a question
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5,
        output_ids=True
    )

    return results

**Explanation:**

- This search function is the foundation of our RAG system.
- It searches the FAQ for documents relevant to the query.
- The result is used to build context for the LLM.


## Prompt

We create a function to format the search results into a structured context that our LLM can use.

In [6]:
prompt_template = """
You're a course teaching assistant. Answer the QUESTION based on the CONTEXT from the FAQ database.
Use only the facts from the CONTEXT when answering the QUESTION.

<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>
""".strip()

def build_prompt(query, search_results):
    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    prompt = prompt_template.format(question=query, context=context).strip()
    return prompt

**Explanation:**

- Takes search results
- Formats each document
- Puts everything into a prompt
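To see what the model actually receives, here is a minimal sketch that applies the same formatting logic to two made-up FAQ entries. The documents, the shortened template, and the `demo_` names are illustrative assumptions, not records or code from the real FAQ.

```python
# Shortened template and made-up documents, for illustration only.
demo_template = """
<QUESTION>
{question}
</QUESTION>

<CONTEXT>
{context}
</CONTEXT>
""".strip()

demo_results = [
    {"section": "General", "question": "Can I join late?", "text": "Yes, you can."},
    {"section": "Module 1", "question": "Which Docker version?", "text": "Any recent one."},
]

def demo_build_prompt(query, search_results):
    # Same idea as build_prompt: one "section / question / answer" block per document.
    context = ""
    for doc in search_results:
        context += (
            f"section: {doc['section']}\n"
            f"question: {doc['question']}\n"
            f"answer: {doc['text']}\n\n"
        )
    return demo_template.format(question=query, context=context).strip()

demo_prompt = demo_build_prompt("Can I still enroll?", demo_results)
print(demo_prompt)
```

Printing the prompt like this is a cheap way to check that the context is laid out the way you expect before spending tokens on an LLM call.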

## The RAG flow

We add a call to an LLM and combine everything into a complete RAG pipeline.

In [7]:
from openai import OpenAI
client = OpenAI()

def llm(prompt):
    response = client.chat.completions.create(
        model='gpt-4o-mini',
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def rag(query):
    search_results = search(query)
    prompt = build_prompt(query, search_results)
    answer = llm(prompt)
    return answer

**Explanation:**

- `build_prompt`: Formats the search results into a prompt
- `llm`: Makes the API call to the language model
- `rag`: Combines search and LLM into a single function


## Part 1: Agentic RAG
Now let's make our flow agentic.

### Agents and Agentic flows

**Agents are AI systems that can:**

- Make decisions about what actions to take
- Use tools to accomplish tasks
- Maintain state and context
- Learn from previous interactions
- Work towards specific goals

An agentic flow is not necessarily a completely independent agent, but it can still make some decisions during execution.

**A typical agentic flow consists of:**

1. Receiving a user request
2. Analyzing the request and available tools
3. Deciding on the next action
4. Executing the action using appropriate tools
5. Evaluating the results
6. Either completing the task or continuing with more actions
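
The six steps above can be sketched as a small loop. This is only an illustration: `decide` is a hypothetical stand-in for the LLM and `faq_lookup` is a hypothetical tool, so the snippet runs without any API key.

```python
def decide(request, observations):
    # Hypothetical stand-in for the LLM: search once, then answer.
    if not observations:
        return {"action": "SEARCH", "query": request}
    return {
        "action": "ANSWER",
        "answer": f"Based on {len(observations)} result(s): {observations[0]}",
    }

def faq_lookup(query):
    # Hypothetical tool: a real flow would query the FAQ index here.
    return [f"FAQ entry about '{query}'"]

def run_flow(request, max_steps=3):
    observations = []                               # state kept between steps
    for _ in range(max_steps):                      # stop criterion
        decision = decide(request, observations)    # steps 2-3: analyze and decide
        if decision["action"] == "ANSWER":          # step 6: task complete
            return decision["answer"]
        # steps 4-5: execute the tool and keep the results for the next decision
        observations.extend(faq_lookup(decision["query"]))
    return "No answer within the step limit."

flow_answer = run_flow("how do I join the course?")
print(flow_answer)
```

The step limit matters: without it, a model that keeps choosing to search would never return.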

**The key difference from basic RAG is that agents can:**

- Make multiple search queries
- Combine information from different sources
- Decide when to stop searching
- Use their own knowledge when appropriate
- Chain multiple actions together

So in **agentic RAG**, the system:

- has access to the **history of previous actions**
- **makes decisions independently** based on the current information and the previous actions

Let's implement this step by step.

### Making RAG more agentic

First, we'll take the prompt we have so far and make it a little more "agentic":

- Tell the LLM that it can answer the question directly or look up context
- Provide output templates
- Clearly show the source of the answer

In [9]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
{question}
</QUESTION>

<CONTEXT> 
{context}
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}
""".strip()

👩🏽‍💻 Let's use it. 

In [10]:
question = "how do I run docker on gentoo?"
context = "EMPTY"

prompt = prompt_template.format(question=question, context=context)
print(prompt)

answer = llm(prompt)
print(answer)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
how do I run docker on gentoo?
</QUESTION>

<CONTEXT> 
EMPTY
</CONTEXT>

If CONTEXT is EMPTY, you can use our FAQ database.
In this case, use the following output template:

{
"action": "SEARCH",
"reasoning": "<add your reasoning here>"
}

If you can answer the QUESTION using CONTEXT, use this template:

{
"action": "ANSWER",
"answer": "<your answer>",
"source": "CONTEXT"
}

If the context doesn't contain the answer, use your own knowledge to answer the question

{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}
{
"action": "ANSWER",
"answer": "To run Docker on Gentoo, you'll first need to install Docker. You can do this by emerging the Docker package. Use the following command:\n\n```bash\nemerge app-emulation/docker\n```\n\nAfter the installation, you w

👀 Implementing **MAKE THE SEARCH**

In [12]:
def build_context(search_results):
    context = ""

    for doc in search_results:
        context = context + f"section: {doc['section']}\nquestion: {doc['question']}\nanswer: {doc['text']}\n\n"

    return context.strip()

In [13]:
search_results = search(question)
context = build_context(search_results)
prompt = prompt_template.format(question=question, context=context)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.
At the beginning the context is EMPTY.

<QUESTION>
how do I run docker on gentoo?
</QUESTION>

<CONTEXT> 
section: Module 1: Docker and Terraform
question: Docker - Error response from daemon: error while creating buildmount source path '/run/desktop/mnt/host/c/<your path>': mkdir /run/desktop/mnt/host/c: file exists
answer: When you run this command second time
docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v <your path>:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:13
The error message above could happen. That means you should not mount on the second run. This command helped me:
When you run this command second time
docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-p 5432:5432 \
postgres:13

section: Module 1: Docker and 

In [14]:
# query the LLM directly
answer = llm(prompt)
print(answer)

{
"action": "ANSWER",
"answer": "To run Docker on Gentoo, you will need to follow these steps: \n\n1. **Install Docker:** You can install Docker using the Portage package manager. Run the following command: \n   ```bash\n   sudo emerge app-emulation/docker\n   ```\n\n2. **Start the Docker service:** You need to start the Docker daemon for it to function properly. You can do this using OpenRC (the default init system for Gentoo) by running: \n   ```bash\n   sudo rc-service docker start\n   ```\n\n3. **Add your user to the docker group:** This will allow you to run Docker commands without using 'sudo'. Run: \n   ```bash\n   sudo gpasswd -a $USER docker\n   ``` \n   After adding your user, make sure to log out and log back in for the group changes to take effect.\n\n4. **Verify installation:** You can verify that Docker is running correctly by executing: \n   ```bash\n   docker run hello-world\n   ``` \n   This command should pull the hello-world image and run it, displaying a message if 

Let's put this together:

- First, attempt to answer it with our own knowledge
- If needed, do the lookup and then answer

In [17]:
import json

def agentic_rag_v1(question):
    context = "EMPTY"
    prompt = prompt_template.format(question=question, context=context)
    answer_json = llm(prompt)
    answer = json.loads(answer_json)
    print(answer)

    if answer['action'] == 'SEARCH':
        print('need to perform search...')
        search_results = search(question)
        context = build_context(search_results)
        
        prompt = prompt_template.format(question=question, context=context)
        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(answer)

    return answer
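
Note that `json.loads` assumes the model returns bare JSON. Models sometimes wrap the object in a Markdown code fence or add a sentence around it, and then parsing fails. Below is a minimal sketch of a more tolerant parser; `parse_llm_json` is a hypothetical helper, not part of the `openai` library, and it only handles a single top-level JSON object.

```python
import json

def parse_llm_json(raw):
    """Parse a JSON object from an LLM reply, tolerating surrounding text."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, e.g. when the reply is fenced.
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in the reply")
    return json.loads(raw[start:end + 1])

# Made-up reply in which the model wrapped its JSON in a code fence.
fence = "`" * 3
fenced_reply = (
    fence + "json\n"
    + '{"action": "SEARCH", "reasoning": "context is empty"}'
    + "\n" + fence
)
parsed = parse_llm_json(fenced_reply)
print(parsed["action"])
```

In `agentic_rag_v1` this helper could be used wherever `json.loads(answer_json)` appears.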

In [18]:
# test it
agentic_rag_v1('how do I join the course?')


{'action': 'SEARCH', 'reasoning': 'The question about how to join the course requires checking the FAQ database as there is no information in the current context.'}
need to perform search...
{'action': 'ANSWER', 'answer': "To join the course, make sure to register before it starts using the provided registration link. The course will commence on January 15, 2024, at 17:00. Additionally, you should join the course's public Google Calendar, Telegram channel for announcements, and Slack for communication. Even if you miss the registration deadline, you can still participate by submitting homework assignments, but be aware of the final project deadlines.", 'source': 'CONTEXT'}


{'action': 'ANSWER',
 'answer': "To join the course, make sure to register before it starts using the provided registration link. The course will commence on January 15, 2024, at 17:00. Additionally, you should join the course's public Google Calendar, Telegram channel for announcements, and Slack for communication. Even if you miss the registration deadline, you can still participate by submitting homework assignments, but be aware of the final project deadlines.",
 'source': 'CONTEXT'}

In [None]:
# test it again
agentic_rag_v1('what happens in gossip girl?')

{'action': 'ANSWER', 'answer': "In 'Gossip Girl,' a drama series set in New York City, the story revolves around the lives of privileged teenagers attending an elite private school on the Upper East Side. The plot is narrated by an anonymous blogger known as 'Gossip Girl,' who exposes the secrets and scandalous affairs of the characters. Central themes include friendship, love, betrayal, and the complexities of adolescence as the characters navigate their social lives, relationships, and personal challenges. The show features a mix of romance, intrigue, and social commentary, culminating in various twists and turns throughout its seasons.", 'source': 'OWN_KNOWLEDGE'}


{'action': 'ANSWER',
 'answer': "In 'Gossip Girl,' a drama series set in New York City, the story revolves around the lives of privileged teenagers attending an elite private school on the Upper East Side. The plot is narrated by an anonymous blogger known as 'Gossip Girl,' who exposes the secrets and scandalous affairs of the characters. Central themes include friendship, love, betrayal, and the complexities of adolescence as the characters navigate their social lives, relationships, and personal challenges. The show features a mix of romance, intrigue, and social commentary, culminating in various twists and turns throughout its seasons.",
 'source': 'OWN_KNOWLEDGE'}

## Part 2: Agentic search
So far we had only two actions: search and answer.

But we can let our "agent" formulate one or more search queries and repeat this for a few iterations until it finds an answer.

Let's build a prompt:

1. List available actions:
    - Search in FAQ
    - Answer using own knowledge
    - Answer using information extracted from FAQ
2. Provide access to the previous actions
3. Have clear stop criteria (no more than X iterations)
4. We also specify the output format, so it's easier to parse it

In [24]:
prompt_template = """
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than {max_iterations} iterations for a given student question.
The current iteration number: {iteration_number}. If we exceed the allowed number 
of iterations, give the best possible answer with the provided information.

Output templates:

If you want to perform search, use this template:

{{
"action": "SEARCH",
"reasoning": "<add your reasoning here>",
"keywords": ["search query 1", "search query 2", ...]
}}

If you can answer the QUESTION using CONTEXT, use this template:

{{
"action": "ANSWER_CONTEXT",
"answer": "<your answer>",
"source": "CONTEXT"
}}

If the context doesn't contain the answer, use your own knowledge to answer the question

{{
"action": "ANSWER",
"answer": "<your answer>",
"source": "OWN_KNOWLEDGE"
}}

<QUESTION>
{question}
</QUESTION>

<SEARCH_QUERIES>
{search_queries}
</SEARCH_QUERIES>

<CONTEXT> 
{context}
</CONTEXT>

<PREVIOUS_ACTIONS>
{previous_actions}
</PREVIOUS_ACTIONS>
""".strip()

Our code becomes more complicated. For the first iteration, we have:

In [25]:
question = "how do I join the course?"

search_queries = []
search_results = []
previous_actions = []
context = build_context(search_results)

prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=1
)
print(prompt)

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number

In [26]:
answer_json = llm(prompt)
answer = json.loads(answer_json)
print(json.dumps(answer, indent=2))

{
  "action": "SEARCH",
  "reasoning": "The student's question pertains to joining the course, which likely involves specific enrollment procedures, eligibility, and other related details not currently available in the context. Therefore, I need to search for information related to course enrollment.",
  "keywords": [
    "how to enroll in the course",
    "course registration process",
    "joining the course details"
  ]
}


We need to save the actions, so let's do it:

In [27]:
previous_actions.append(answer)

Save the search queries:

In [28]:
keywords = answer['keywords']
search_queries.extend(keywords)

And perform the search:

In [29]:
for k in keywords:
    res = search(k)
    search_results.extend(res)

Some of the search results will be duplicates, so we need to remove them:

In [30]:
def dedup(seq):
    seen = set()
    result = []
    for el in seq:
        _id = el['_id']
        if _id in seen:
            continue
        seen.add(_id)
        result.append(el)
    return result

search_results = dedup(search_results)
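
To make the behaviour concrete, here is a small self-contained check on made-up search hits. `dedup_demo` repeats the logic of `dedup` above so that the snippet runs on its own; the documents are invented for illustration.

```python
def dedup_demo(seq):
    # Same logic as dedup above: keep the first document seen for each _id.
    seen = set()
    result = []
    for el in seq:
        if el["_id"] in seen:
            continue
        seen.add(el["_id"])
        result.append(el)
    return result

# Made-up results: two different queries both returned document 7.
hits = [
    {"_id": 7, "question": "How do I register?"},
    {"_id": 12, "question": "When does the course start?"},
    {"_id": 7, "question": "How do I register?"},
]
unique_hits = dedup_demo(hits)
print([h["_id"] for h in unique_hits])
```

The first occurrence of each document is kept and the original order is preserved, so documents retrieved by earlier queries stay at the top of the context.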

Now let's make another iteration - use the same code as previously, but remove variable initialisation and increase the iteration number.

In [31]:
# question = "how do I join the course?"

# search_queries = []
# search_results = []
# previous_actions = []

context = build_context(search_results)

prompt = prompt_template.format(
    question=question,
    context=context,
    search_queries="\n".join(search_queries),
    previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
    max_iterations=3,
    iteration_number=2
)
print(prompt)

answer_json = llm(prompt)
answer = json.loads(answer_json)
print(json.dumps(answer, indent=2))

You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current iteration number

Let's put everything together.

In [32]:
question = "what do I need to do to be successful at module 1?"

search_queries = []
search_results = []
previous_actions = []


iteration = 0

while True:
    print(f'ITERATION #{iteration}...')

    context = build_context(search_results)
    prompt = prompt_template.format(
        question=question,
        context=context,
        search_queries="\n".join(search_queries),
        previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
        max_iterations=3,
        iteration_number=iteration
    )

    print(prompt)

    answer_json = llm(prompt)
    answer = json.loads(answer_json)
    print(json.dumps(answer, indent=2))

    previous_actions.append(answer)

    action = answer['action']
    if action != 'SEARCH':
        break

    keywords = answer['keywords']
    search_queries = list(set(search_queries) | set(keywords))
    
    for k in keywords:
        res = search(k)
        search_results.extend(res)

    search_results = dedup(search_results)
    
    iteration = iteration + 1
    if iteration >= 4:
        break

    print()

ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current 

Or, as a function:

In [33]:
def agentic_search(question):
    search_queries = []
    search_results = []
    previous_actions = []

    iteration = 0
    
    while True:
        print(f'ITERATION #{iteration}...')
    
        context = build_context(search_results)
        prompt = prompt_template.format(
            question=question,
            context=context,
            search_queries="\n".join(search_queries),
            previous_actions='\n'.join([json.dumps(a) for a in previous_actions]),
            max_iterations=3,
            iteration_number=iteration
        )
    
        print(prompt)
    
        answer_json = llm(prompt)
        answer = json.loads(answer_json)
        print(json.dumps(answer, indent=2))

        previous_actions.append(answer)
    
        action = answer['action']
        if action != 'SEARCH':
            break
    
        keywords = answer['keywords']
        search_queries = list(set(search_queries) | set(keywords))

        for k in keywords:
            res = search(k)
            search_results.extend(res)
    
        search_results = dedup(search_results)
        
        iteration = iteration + 1
        if iteration >= 4:
            break
    
        print()

    return answer

In [34]:
# test it
agentic_search('how do I prepare for the course?')

ITERATION #0...
You're a course teaching assistant.

You're given a QUESTION from a course student and that you need to answer with your own knowledge and provided CONTEXT.

The CONTEXT is build with the documents from our FAQ database.
SEARCH_QUERIES contains the queries that were used to retrieve the documents
from FAQ to and add them to the context.
PREVIOUS_ACTIONS contains the actions you already performed.

At the beginning the CONTEXT is empty.

You can perform the following actions:

- Search in the FAQ database to get more data for the CONTEXT
- Answer the question using the CONTEXT
- Answer the question using your own knowledge

For the SEARCH action, build search requests based on the CONTEXT and the QUESTION.
Carefully analyze the CONTEXT and generate the requests to deeply explore the topic. 

Don't use search queries used at the previous iterations.

Don't repeat previously performed actions.

Don't perform more than 3 iterations for a given student question.
The current 

{'action': 'ANSWER',
 'answer': "To prepare for the course effectively, consider the following steps: \n\n1. **Register Early**: Ensure you register for the course before it starts. This is important to gain access to materials and communications.\n2. **Join Communication Channels**: Sign up for the course Telegram channel for announcements and join DataTalks.Club's Slack for community engagement and support.\n3. **Study Resources**: Familiarize yourself with general course materials and review any recommended reading or resources provided by the instructors.\n4. **Technical Setup**: Make sure your technical environment is ready, which may include downloading necessary software tools and setting up Git/GitHub as needed for course assignments.\n5. **Time Management**: Plan a study schedule to keep up with the course pace when it begins.\n6. **Pre-course Information**: Look for any pre-course materials that might be shared ahead of the start date, and read through all available documenta

## Part 3: Function Calling

### **Function Calling in OpenAI**

We put all this logic inside our prompt.

But OpenAI and other providers offer a convenient API for adding extra functionality like search.

[https://platform.openai.com/docs/guides/function-calling](https://platform.openai.com/docs/guides/function-calling)

It's called "function calling" - you define functions that the model can call, and if it decides to make a call, it returns structured output for that.

For example, let's take our search function:

In [35]:
def search(query):
    boost = {'question': 3.0, 'section': 0.5}

    results = index.search(
        query=query,
        filter_dict={'course': 'data-engineering-zoomcamp'},
        boost_dict=boost,
        num_results=5,
        output_ids=True
    )

    return results

We describe it like this:

In [36]:
search_tool = {
    "type": "function",
    "name": "search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }
}

Here we have:

- `name`: `search`
- `description`: when to use it
- `parameters`: all the arguments that the function can take and their description
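
Writing these descriptions by hand gets repetitive once there are several tools. As a rough sketch, the same structure can be generated from a function signature. `describe_tool` and `faq_search` are hypothetical names introduced here, and the helper assumes every parameter is a string, which is true for `search` but not in general.

```python
import inspect

def describe_tool(func, description, param_descriptions):
    """Build a tool description for a function whose parameters are all strings."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            "type": "string",  # assumption: every parameter is a string
            "description": param_descriptions[name],
        }
        # Parameters without a default value are mandatory.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "name": func.__name__,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }

def faq_search(query):
    # Placeholder body; only the signature matters for this example.
    return []

generated = describe_tool(
    faq_search,
    "Search the FAQ database",
    {"query": "Search query text to look up in the course FAQ."},
)
print(generated["parameters"]["required"])
```

Apart from the function name, the generated dictionary has the same shape as the hand-written `search_tool`.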

In order to use function calling, we'll use a newer API - the "responses" API (not "chat completions" as previously):

In [37]:
question = "How do I do well in module 1?"

developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.
""".strip()

tools = [search_tool]

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)
response.output

[ResponseFunctionToolCall(arguments='{"query":"do well in module 1"}', call_id='call_atn301uMuGbUlUilPpHNBXaF', name='search', type='function_call', id='fc_687fdadd907c81928231caf3c04fa4c703b79f1d22fc6ddd', status='completed')]

If the model thinks we should make a function call, it will tell us:

`[ResponseFunctionToolCall(arguments='{"query":"How to do well in module 1"}', call_id='call_AwYwOak5Ljeidh4HbE3RxMZJ', name='search', type='function_call', id='fc_6848604db67881a298ec38121c1555ef0dee5fa0cdb59912', status='completed')]`

Let's make a call to `search`. 

In [38]:
calls = response.output
call = calls[0]
call

call_id = call.call_id
call_id

f_name = call.name
f_name

arguments = json.loads(call.arguments)
arguments

{'query': 'do well in module 1'}

Using `f_name`, we can find the function we need:

In [39]:
f = globals()[f_name]

And invoke it with the arguments:

In [40]:
results = f(**arguments)
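
Looking the function up in `globals()` is convenient in a notebook, but it lets the model's output select any global name. One possible alternative, sketched here under the assumption that you control the list of tools, is an explicit registry. `TOOL_REGISTRY`, `call_tool`, and `faq_search_stub` are hypothetical names used only for this illustration.

```python
import json

def faq_search_stub(query):
    # Hypothetical stand-in for the real search function.
    return [{"question": f"stub result for: {query}"}]

# Only functions listed here can be called on behalf of the model.
TOOL_REGISTRY = {"search": faq_search_stub}

def call_tool(name, arguments_json):
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    arguments = json.loads(arguments_json)  # the API returns arguments as a JSON string
    return TOOL_REGISTRY[name](**arguments)

stub_results = call_tool("search", '{"query": "do well in module 1"}')
print(stub_results)
```

With a registry, an unexpected tool name fails with a clear error instead of calling an arbitrary function.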

Now let's save the results as JSON:

In [41]:
search_results = json.dumps(results, indent=2)
print(search_results)

[
  {
    "text": "Following dbt with BigQuery on Docker readme.md, after `docker-compose build` and `docker-compose run dbt-bq-dtc init`, encountered error `ModuleNotFoundError: No module named 'pytz'`\nSolution:\nAdd `RUN python -m pip install --no-cache pytz` in the Dockerfile under `FROM --platform=$build_for python:3.9.9-slim-bullseye as base`",
    "section": "Module 4: analytics engineering with dbt",
    "question": "DBT - Error: No module named 'pytz' while setting up dbt with docker",
    "course": "data-engineering-zoomcamp",
    "_id": 299
  },
  {
    "text": "Even after installing pyspark correctly on linux machine (VM ) as per course instructions, faced a module not found error in jupyter notebook .\nThe solution which worked for me(use following in jupyter notebook) :\n!pip install findspark\nimport findspark\nfindspark.init()\nThereafter , import pyspark and create spark contex<<t as usual\nNone of the solutions above worked for me till I ran !pip3 install pyspark inst

And save both the response and the result of the function call:

In [42]:
chat_messages.append(call)

chat_messages.append({
    "type": "function_call_output",
    "call_id": call.call_id,
    "output": search_results,
})

Now `chat_messages` contains both the call description (so it keeps track of history) and the results.

Let's make another call to the model.

In [43]:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

This time the output should be a message with the answer (but it can also be another function call):

In [44]:
r = response.output[0]
print(r.content[0].text)

To do well in Module 1 of your course, here are some tips based on common challenges and strategies for success:

1. **Understand the Basics**: Ensure you have a good grasp of the foundational concepts covered in Module 1. This includes Docker and Terraform, as well as how they relate to your course subject.

2. **Installation and Setup**: Pay close attention to the installation steps. Common issues include module not found errors (such as 'psycopg2'). Make sure you have the necessary libraries installed. For example, you can install `psycopg2` using:
   ```bash
   pip install psycopg2-binary
   ```

3. **Hands-On Practice**: Engage actively with the practical components of the module. Create and manage Docker containers as instructed and practice writing Terraform configurations.

4. **Check Dependencies**: If you encounter errors, verify that all required modules are installed and check that your environmental paths are set correctly.

5. **Utilize Resources**: Refer to the course ma

### **Making multiple calls**


What if we want to make multiple calls? Change the developer prompt a little:

In [45]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.
If you look up something in FAQ, convert the student question into multiple queries.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
    {"role": "user", "content": question}
]

response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

This time let's start to organise the code a little.

First, create a function `do_call`: 

In [46]:
def do_call(tool_call_response):
    function_name = tool_call_response.name
    arguments = json.loads(tool_call_response.arguments)

    f = globals()[function_name]
    result = f(**arguments)

    return {
        "type": "function_call_output",
        "call_id": tool_call_response.call_id,
        "output": json.dumps(result, indent=2),
    }

Now iterate over responses:

In [47]:
for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)

    if entry.type == 'function_call':      
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text)

function_call
function_call
function_call


The first response will probably consist of function calls, so let's make another request:

In [48]:
response = client.responses.create(
    model='gpt-4o-mini',
    input=chat_messages,
    tools=tools
)

for entry in response.output:
    chat_messages.append(entry)
    print(entry.type)
    print()

    if entry.type == 'function_call':      
        result = do_call(entry)
        chat_messages.append(result)
    elif entry.type == 'message':
        print(entry.content[0].text) 

message

To excel in Module 1, here are some useful tips and strategies:

### 1. Understand the Key Topics
Module 1 focuses on **Docker and Terraform**, which are essential tools in data engineering. Make sure you review the following:

- **Docker Basics**: Understand containerization, images, and how to work with Docker containers.
- **Terraform Fundamentals**: Familiarize yourself with infrastructure as code, creating and managing infrastructure using Terraform.

### 2. Follow the Course Instructions
Pay close attention to the course materials and instructions. Here are some common challenges and solutions from students:

- **ModuleNotFoundError** for libraries like `psycopg2`:
  - If you encounter `ModuleNotFoundError: No module named 'psycopg2'`, you may need to install it using:
    ```bash
    pip install psycopg2-binary
    ```
  - If you've already installed it, try updating it with:
    ```bash
    pip install psycopg2-binary --upgrade
    ```

- **SQLAlchemy Errors**: To reso

### **Putting everything together**

But what if the second response isn't a message either?

Let's make two loops:

- First is the main Q&A loop - ask question, get back the answer
- Second is the request loop - send requests until there's a message reply from the API

In [49]:
developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.
When using FAQ, perform deep topic exploration: make one request to FAQ,
and then based on the results, make more requests.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

In [None]:
while True: # main Q&A loop
    question = input() # How do I do my best for module 1?
    if question == 'stop':
        break

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True: # request-response loop - query the API until we get a message
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False
        
        for entry in response.output:
            chat_messages.append(entry)
        
            if entry.type == 'function_call':      
                print('function_call:', entry)
                print()
                result = do_call(entry)
                chat_messages.append(result)
            elif entry.type == 'message':
                print(entry.content[0].text)
                print()
                has_messages = True

        if has_messages:
            break

It seems like you didn't include your question. Could you please provide more details or ask anything specific you'd like assistance with?

It looks like your message didn't come through. Please try sending your question again, and I'll be happy to help!



It's also possible that a response contains both a message and tool calls, but we'll ignore this case for now. (It's easy to fix: check whether there are any function calls, and ask the user for input only when there are none.)
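That fix could look like the sketch below: track whether the response contained any function calls, and go back to the user only when it did not. `handle_output` is a hypothetical helper name, and the entries are faked with `SimpleNamespace` so the snippet runs without an API key.

```python
from types import SimpleNamespace


def handle_output(output, chat_messages, do_call):
    """Process one API response; return True if it contained function calls."""
    has_function_calls = False

    for entry in output:
        chat_messages.append(entry)

        if entry.type == 'function_call':
            chat_messages.append(do_call(entry))
            has_function_calls = True
        elif entry.type == 'message':
            print(entry.content[0].text)

    return has_function_calls


def fake_do_call(entry):
    # Stand-in for the do_call function defined earlier
    return {"type": "function_call_output", "call_id": entry.call_id, "output": "[]"}


# Fake response with both a message and a tool call (illustration only)
mixed_output = [
    SimpleNamespace(type='message', content=[SimpleNamespace(text='Let me check the FAQ.')]),
    SimpleNamespace(type='function_call', name='search', arguments='{}', call_id='call_1'),
]

messages = []
needs_another_request = handle_output(mixed_output, messages, fake_do_call)
print(needs_another_request)
```

In the request loop, the exit condition would then become "break when there were no function calls" instead of "break when there was a message".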

Let's make it a bit nicer using HTML:

In [4]:
pip install markdown

Collecting markdown
  Downloading markdown-3.8.2-py3-none-any.whl.metadata (5.1 kB)
Downloading markdown-3.8.2-py3-none-any.whl (106 kB)
Installing collected packages: markdown
Successfully installed markdown-3.8.2
Note: you may need to restart the kernel to use updated packages.


In [5]:
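from html import escape
from IPython.display import display, HTML
import markdown # pip install markdown


# Display helpers used by the chat loop below
# (a minimal sketch; the exact HTML markup is an assumption)
def display_function_call(entry, result):
    call_html = f"""
        <details>
        <summary>Function call: <tt>{entry.name}({escape(entry.arguments)})</tt></summary>
        <pre>{escape(result['output'])}</pre>
        </details>
    """
    display(HTML(call_html))


def display_response(entry):
    response_html = markdown.markdown(entry.content[0].text)
    display(HTML(f"<div><b>Assistant:</b> {response_html}</div>"))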

developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_messages = [
    {"role": "developer", "content": developer_prompt},
]

# Chat loop
while True:
    question = input()
    if question.strip().lower() == 'stop':
        print("Chat ended.")
        break
    print()

    message = {"role": "user", "content": question}
    chat_messages.append(message)

    while True:  # inner request loop
        response = client.responses.create(
            model='gpt-4o-mini',
            input=chat_messages,
            tools=tools
        )

        has_messages = False

        for entry in response.output:
            chat_messages.append(entry)

            if entry.type == "function_call":
                result = do_call(entry)
                chat_messages.append(result)
                display_function_call(entry, result)

            elif entry.type == "message":
                display_response(entry)
                has_messages = True

        if has_messages:
            break

### **Using multiple tools**

What if we also want to use this chat app to add new entries to the FAQ? We will need another function for it:

In [None]:
def add_entry(question, answer):
    doc = {
        'question': question,
        'text': answer,
        'section': 'user added',
        'course': 'data-engineering-zoomcamp'
    }
    index.append(doc)

In [None]:
# description

add_entry_description = {
    "type": "function",
    "name": "add_entry",
    "description": "Add an entry to the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question to be added to the FAQ database",
            },
            "answer": {
                "type": "string",
                "description": "The answer to the question",
            }
        },
        "required": ["question", "answer"],
        "additionalProperties": False
    }
}
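Because the description is written by hand, it can drift from the function it describes. A quick sanity check, sketched here with a hypothetical `check_description` helper and a stub `add_entry`, compares the schema with the function signature:

```python
import inspect


def add_entry(question, answer):
    # Stub with the same signature as the function above (illustration only)
    pass


add_entry_description = {
    "type": "function",
    "name": "add_entry",
    "description": "Add an entry to the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {"type": "string"},
            "answer": {"type": "string"},
        },
        "required": ["question", "answer"],
        "additionalProperties": False,
    },
}


def check_description(function, description):
    """Return True if the description matches the function's name and parameters."""
    signature_params = set(inspect.signature(function).parameters)
    schema = description["parameters"]
    described_params = set(schema["properties"])
    required_params = set(schema["required"])

    return (
        description["name"] == function.__name__
        and described_params == signature_params
        and required_params <= signature_params
    )


print(check_description(add_entry, add_entry_description))
```

If the check fails, the model may send arguments that the function does not accept, and the `f(**arguments)` call in `do_call` raises a `TypeError`.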

We can just reuse the previous code. But we can also clean it up
and make it more modular.

See the result in [`chat_assistant.py`](chat_assistant.py)

You can download it using `wget`:

```bash
wget https://raw.githubusercontent.com/alexeygrigorev/rag-agents-workshop/refs/heads/main/chat_assistant.py
```

Here we define multiple classes:

- `Tools` - manages function tools for the agent
    - `add_tool(function, description)`: Register a function with its description
    - `get_tools()`: Return list of registered tool descriptions
    - `function_call(tool_call_response)`: Execute a function call and return result
- `ChatInterface` - handles user input and display formatting
    - `input()`: Get user input
    - `display(message)`: Print a message
    - `display_function_call(entry, result)`: Show function calls in HTML format
    - `display_response(entry)`: Display AI responses with markdown
- `ChatAssistant` - main orchestrator for chat conversations.
    - `__init__(tools, developer_prompt, chat_interface, client)`: Initialize assistant
    - `gpt(chat_messages)`: Make OpenAI API calls
    - `run()`: Main chat loop handling user input and AI responses
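
To make the role of `Tools` concrete, here is a rough sketch of how such a registry could be written. It is an assumption based on the method list above, not a copy of `chat_assistant.py`. In this sketch, functions are looked up in the registry rather than in `globals()`, and the `echo` tool is a made-up example.

```python
import json
from types import SimpleNamespace


class Tools:
    def __init__(self):
        self.tools = {}      # function name -> description
        self.functions = {}  # function name -> callable

    def add_tool(self, function, description):
        self.tools[function.__name__] = description
        self.functions[function.__name__] = function

    def get_tools(self):
        return list(self.tools.values())

    def function_call(self, tool_call_response):
        function_name = tool_call_response.name
        arguments = json.loads(tool_call_response.arguments)

        f = self.functions[function_name]
        result = f(**arguments)

        return {
            "type": "function_call_output",
            "call_id": tool_call_response.call_id,
            "output": json.dumps(result, indent=2),
        }


def echo(text):
    # Hypothetical tool used only to exercise the registry
    return {"echo": text}


echo_description = {
    "type": "function",
    "name": "echo",
    "description": "Return the text unchanged",
    "parameters": {
        "type": "object",
        "properties": {"text": {"type": "string"}},
        "required": ["text"],
        "additionalProperties": False,
    },
}

tools_sketch = Tools()
tools_sketch.add_tool(echo, echo_description)

call = SimpleNamespace(name="echo", arguments='{"text": "hi"}', call_id="call_1")
print(tools_sketch.function_call(call)["output"])
```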

Let's use it:

In [None]:
!wget https://raw.githubusercontent.com/alexeygrigorev/rag-agents-workshop/refs/heads/main/chat_assistant.py

In [None]:
import chat_assistant

tools = chat_assistant.Tools()
tools.add_tool(search, search_tool)

tools.get_tools()

developer_prompt = """
You're a course teaching assistant. 
You're given a question from a course student and your task is to answer it.

Use FAQ if your own knowledge is not sufficient to answer the question.

At the end of each response, ask the user a follow up question based on your answer.
""".strip()

chat_interface = chat_assistant.ChatInterface()

chat = chat_assistant.ChatAssistant(
    tools=tools,
    developer_prompt=developer_prompt,
    chat_interface=chat_interface,
    client=client
)

In [None]:
# and run it
chat.run()


In [None]:
# now let's add the new tool
tools.add_tool(add_entry, add_entry_description)
tools.get_tools()

And talk with the assistant:

- How do I do well for module 1?
- Add this back to FAQ

And check that it's in the index:

In [None]:
index.docs[-1]

## Part 4: Using Pydantic AI

### Installing and using PydanticAI

There are frameworks that make it easier for us to create agents.

One of them is PydanticAI:

In [1]:
pip install pydantic-ai

Collecting pydantic-ai
  Downloading pydantic_ai-0.4.5-py3-none-any.whl.metadata (11 kB)
Collecting pydantic-ai-slim==0.4.5 (from pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,vertexai]==0.4.5->pydantic-ai)
  Downloading pydantic_ai_slim-0.4.5-py3-none-any.whl.metadata (4.1 kB)
Collecting eval-type-backport>=0.2.0 (from pydantic-ai-slim==0.4.5->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,vertexai]==0.4.5->pydantic-ai)
  Using cached eval_type_backport-0.2.2-py3-none-any.whl.metadata (2.2 kB)
Collecting griffe>=1.3.2 (from pydantic-ai-slim==0.4.5->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,huggingface,mcp,mistral,openai,vertexai]==0.4.5->pydantic-ai)
  Using cached griffe-1.7.3-py3-none-any.whl.metadata (5.0 kB)
Collecting opentelemetry-api>=1.28.0 (from pydantic-ai-slim==0.4.5->pydantic-ai-slim[ag-ui,anthropic,bedrock,cli,cohere,evals,google,groq,hu

Let's import it.

In [6]:
from pydantic_ai import Agent, RunContext

# and create an agent
chat_agent = Agent(  
    'openai:gpt-4o-mini',
    system_prompt=developer_prompt
)

Now we can use it to automate tool descriptions:

In [8]:
from typing import Any, Dict, List


@chat_agent.tool
def search_tool(ctx: RunContext, query: str) -> List[Dict[str, Any]]:
    """
    Search the FAQ for relevant entries matching the query.

    Parameters
    ----------
    query : str
        The search query string provided by the user.

    Returns
    -------
    list
        A list of search results (up to 5), each containing relevance information 
        and associated output IDs.
    """
    print(f"search('{query}')")
    return search(query)


@chat_agent.tool
def add_entry_tool(ctx: RunContext, question: str, answer: str) -> None:
    """
    Add a new question-answer entry to FAQ.

    This function creates a document with the given question and answer, 
    tagging it as user-added content.

    Parameters
    ----------
    question : str
        The question text to be added to the index.

    answer : str
        The answer or explanation corresponding to the question.

    Returns
    -------
    None
    """
    return add_entry(question, answer)

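UserError: Tool name conflicts with existing tool: 'search_tool'

The `UserError` above appears when the cell is run again on an agent that already has these tools registered. On a freshly created agent the cell runs cleanly.

PydanticAI reads the functions' signatures and docstrings to create the tool definitions automatically, so we don't need to write them by hand.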
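
To get a feel for how a description can be derived from a function, here is a rough illustration using only the standard library. It is not PydanticAI's implementation. `describe_function` and `JSON_TYPES` are hypothetical names, and the type mapping is deliberately simplified.

```python
import inspect

# Mapping from Python annotations to JSON Schema types (simplified)
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(function):
    """Build a function-calling description from a signature and docstring."""
    properties = {}
    required = []

    for name, param in inspect.signature(function).parameters.items():
        properties[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required
        if param.default is inspect.Parameter.empty:
            required.append(name)

    docstring = inspect.getdoc(function) or ""
    summary = docstring.split("\n")[0]

    return {
        "type": "function",
        "name": function.__name__,
        "description": summary,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }


def add_entry(question: str, answer: str) -> None:
    """Add a new question-answer entry to FAQ."""


description = describe_function(add_entry)
print(description)
```

The result has the same shape as the hand-written `add_entry_description`, minus the per-parameter descriptions.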

Let's use it:

In [None]:
user_prompt = "I just discovered the course. Can I join now?"
agent_run = await chat_agent.run(user_prompt)
print(agent_run.output)

If you want to learn more about implementing chat applications with Pydantic AI, see:

https://ai.pydantic.dev/message-history/
https://ai.pydantic.dev/examples/chat-app/

## Wrap up

In this workshop, we took our RAG application and made it agentic, by first tweaking the prompts, and then using the "function calling" functionality from OpenAI.

At the end, we put all the logic into the `chat_assistant.py` script, and also explored PydanticAI to make it simpler.

What's next:

- MCP
- Agent deployment
- Agent monitoring