<a href="https://colab.research.google.com/github/salmantec/AI-Agents-Crash-Course/blob/feat%2FDay-4/Day-4/Day_4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
## Agents and Tools

In [4]:
# So far, we have done:
# - Day 1: Downloaded the data from a GitHub repository
# - Day 2: Processed it by chunking it where necessary
# - Day 3: Indexed the data so it's searchable

# Note that it took us quite a lot of time. We're halfway through the course, and only now we started working on agents. Most of the time so far, we have spent on data preparation.

# This is not a coincidence. Data preparation is the most time-consuming and critical part of building AI agents. Without properly prepared, cleaned, and indexed data, even the most sophisticated agent will provide poor results.

# Now it's time to create an AI agent that will use this data through the search engine that we created yesterday.

# This allows us to build context-aware agents. They can provide accurate, relevant answers based on your specific domain knowledge rather than just general training data.

# In particular, we will:
# - Learn what makes an AI system "agentic" through tool use
# - Build an agent that can use the search function
# - Use Pydantic AI to make it easier to implement agents

# At the end of this lesson, you'll have a working AI Agent that you can answer your questions in a Jupyter notebook.

In [5]:
## 1. Tools and Agents

# Agent - an agent is an LLM that can not only generate texts, but also invoke tools. Tools are external functions that the LLM can call in order to retrieve information, perform calculations, or take actions.

# In our case, the agent needs to answer our questions using the content of the GitHub repository. So, the tool (only one) is a search(query).

# But first, let's consider a situation where we have no tools at all. This is not an agent, it's just an LLM that can generate texts. Access to tools is what makes agents "agentic".


In [6]:
!pip install uv

Collecting uv
  Downloading uv-0.8.22-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading uv-0.8.22-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21.3 MB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m21.3/21.3 MB[0m [31m66.2 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: uv
Successfully installed uv-0.8.22


In [7]:
!uv pip install openai minsearch requests python-frontmatter

[2mUsing Python 3.12.11 environment at: /usr[0m
[2K[2mResolved [1m32 packages[0m [2min 670ms[0m[0m
[2K[2mPrepared [1m2 packages[0m [2min 33ms[0m[0m
[2K[2mInstalled [1m2 packages[0m [2min 10ms[0m[0m
 [32m+[39m [1mminsearch[0m[2m==0.0.5[0m
 [32m+[39m [1mpython-frontmatter[0m[2m==1.1.0[0m


In [None]:
from google.colab import userdata
userdata.get('GROQ_API_KEY')

In [9]:
# Let's see the difference with an example.

# We will try asking a question without giving the LLM access to search:


from openai import OpenAI

openai_client = OpenAI(api_key=userdata.get('GROQ_API_KEY'), base_url="https://api.groq.com/openai/v1")

user_prompt = "I just discovered the course, can I joni now?"

messages = [
    {"role": "user", "content": user_prompt}
]

response = openai_client.responses.create(
    model='openai/gpt-oss-20b',
    input=messages
)

print(response.output_text)

Sure thing! I’d be happy to help you get on board.  
Here’s what usually needs to happen to join a course:

| Step | What to do | Why it matters |
|------|------------|----------------|
| **1️⃣ Create an account (if you don’t already have one)** | Sign‑up on the platform with your name, email, and a secure password. | You need a profile to track progress, receive updates, and interact with instructors or peers. |
| **2️⃣ Find the course** | Navigate to the “Courses” or “Catalog” page, and search for the course title or code. | Makes sure you’re looking at the right offering (there can be multiple sections). |
| **3️⃣ Check prerequisites & eligibility** | Read the syllabus or course description. Some courses require prior coursework, a certain GPA, or a specific level of proficiency. | Helps you avoid enrollment if you’re not ready, and saves you time. |
| **4️⃣ Enroll / Register** | Click the “Enroll,” “Register,” or “Add to Cart” button. If it’s a paid course, you’ll be prompted to co

In [10]:
# The response is generic. In our case, it's this:

# “It depends on the course you're interested in. Many courses allow late enrollment, while others might have specific deadlines. I recommend checking the course's official website or contacting the instructor or administration for more details on joining.”

# This answer is not really useful.

# But if we let it invoke the search(query), the agent can give us a more useful answer.

# Here's how the conversation would flow with our agent using the search tool:

# - User: "I just discovered the course, can I join now?"
# - Agent thinking: I can't answer this question, so I need to search for information about course enrollment and timing.
# - Tool call: search("course enrollment join registration deadline")
# - Tool response: (...search results...)
# - Agent response: "Yes, you can still join the course even after the start date..."

# We will now explore how to implement it with OpenAI.


In [11]:
#  find read_repo_data in the first lesson and sliding_window in the second lesson

import io
import zipfile
import requests
import frontmatter

def read_repo_data(repo_owner, repo_name):
  """
  Download and parse all markdown files from a github repository

  Args:
    repo_owner : Github username or organization
    repo_name: Repository name

  Returns:
    List of dictionaries containing file content and metadata
  """
  prefix = 'https://codeload.github.com'
  url = f'{prefix}/{repo_owner}/{repo_name}/zip/refs/heads/main'
  resp = requests.get(url)

  if resp.status_code != 200:
    raise Exception(f"Failed to download repository {repo_owner}/{repo_name}: {resp.status_code}")

  repository_data = []

  # Create a ZipFile object from the downloaded content
  zf = zipfile.ZipFile(io.BytesIO(resp.content))

  for file_info in zf.infolist():
    filename = file_info.filename
    filename_lower = filename.lower()

    if not (filename_lower.endswith('.md') or (filename_lower.endswith('.mdx'))):
      continue

    try:
      with zf.open(file_info) as f_in:
        content = f_in.read().decode('utf-8', errors='ignore')
        post = frontmatter.loads(content)
        data = post.to_dict()
        data['filename'] = filename
        repository_data.append(data)
    except Exception as e:
      print(f"Error processing {filename}: {e}")
      continue

  zf.close()
  return repository_data

In [12]:
# Let's now index this data with minsearch:

from minsearch import Index

# For DataTalksClub FAQ, it's similar, except we don't need to chunk the data. For the data engineering course, it'll look like this:

dtc_faq = read_repo_data('DataTalksClub', 'faq')

de_dtc_faq = [d for d in dtc_faq if 'data-engineering' in d['filename']]

faq_index = Index(
    text_fields=["question", "content"],
    keyword_fields=[]
)

faq_index.fit(de_dtc_faq)

query = 'Course: Can I still join the course after the start date?'
results = faq_index.search(query)
print(results)

# This is text search, also known as "lexical search". We look for exact matches between our query and the documents.

[{'id': '3f1424af17', 'question': 'Course: Can I still join the course after the start date?', 'sort_order': 3, 'content': "Yes, even if you don't register, you're still eligible to submit the homework.\n\nBe aware, however, that there will be deadlines for turning in homeworks and the final projects. So don't leave everything for the last minute.", 'filename': 'faq-main/_questions/data-engineering-zoomcamp/general/003_3f1424af17_course-can-i-still-join-the-course-after-the-start.md'}, {'id': '9e508f2212', 'question': 'Course: When does the course start?', 'sort_order': 1, 'content': "The next cohort starts January 13th, 2025. More info at [DTC](https://datatalks.club/blog/guide-to-free-online-courses-at-datatalks-club.html).\n\n- Register before the course starts using this [link](https://airtable.com/shr6oVXeQvSI5HuWD).\n- Join the [course Telegram channel with announcements](https://t.me/dezoomcamp).\n- Don’t forget to register in DataTalks.Club's Slack and join the channel.", 'file

In [13]:
## 2. Function Calling with OpenAI

# Let's create an agent now. In OpenAI's terminology, we'll need to use "function calling" (https://platform.openai.com/docs/guides/function-calling)

# We will begin with our FAQ example and text search. You can easily extend it to vector or hybrid search or change it to the Evidently docs.

# This is the function we implemented yesterday:

def text_search(query):
    return faq_index.search(query, num_results=5)

# We can't just pass this function to OpenAI. First, we need to describe this function, so the LLM understands how to use it.

# This is done using a special description format:

text_search_tool = {
    "type": "function",
    "name": "text_search",
    "description": "Search the FAQ database",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Search query text to look up in the course FAQ."
            }
        },
        "required": ["query"],
        "additionalProperties": False
    }
}


# This description tells OpenAI:
# - The function is called text_search
# - It searches the FAQ database
# - It takes one required parameter: query (a string)
# - The query should be the search text to look up in the course FAQ

# Now we can use it

system_prompt = """
You are a helpful assistant for a course
"""

question = "I just discovered the course, can I join now?"

chat_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": question}
]

response = openai_client.responses.create(
    model='openai/gpt-oss-20b',
    input=chat_messages,
    tools=[text_search_tool]
)

# Previously, we had a simple text response; now, the response includes function calls.
# Let's look at response.output. In my case, it contains the following:

print(response.output)

# The agent analyzed the user's question and determined that to answer it, it needs to invoke the text_search function with the arguments {"query":"join course"}.

# Let's invoke the function with these arguments:

import json

call = response.output[1]

arguments = json.loads(call.arguments)
result = text_search(**arguments)

call_output = {
    "type": "function_call_output",
    "call_id": call.call_id,
    "output": json.dumps(result),
}

# Here's what's happening:
# - The LLM decided to execute a function and let us know about it
# - We executed the function and saved the results
# - Now we need to pass this information back to the LLM

# We do it by extending the chat_messages list and sending the entire conversation history back to the LLM:

chat_messages.append(call)
chat_messages.append(call_output)

response = openai_client.responses.create(
    model='openai/gpt-oss-20b',
    input=chat_messages,
    tools=[text_search_tool]
)

print(response.output_text)

# LLMs are stateless. When we make one call to the OpenAI API and then shortly afterwards make another, it doesn't know anything about the first call. So if we only send it call_output, it would have no idea how to respond to it.

# This is why we need to send it the entire conversation history. It needs to know everything that happened so far:
# - The system prompt (so it knows what the initial instructions are) - system_prompt
# - The user prompt (so it knows what task it needs to perform) - question
# - The decision to invoke the text_search tool (so it knows what function was called) - that's our call
# - The output of the function (so it knows what the function returned) - that's our call_output


# After we invoke it, we get back the response:
#      “Yes, you can still join the course even after the start date. While you won't be able to officially register, you are eligible to submit your homework. Just keep in mind that there are deadlines
#       for submitting assignments and final projects, so it's best not to leave everything to the last minute.”
# This is a useful response that we were hoping to get.



[ResponseReasoningItem(id='resp_01k69t1s81fsd81m2q30bv7da7', summary=[], type='reasoning', content=[Content(text='The user asks: "I just discovered the course, can I join now?" We need to answer. We might need to search FAQ for relevant answer. Use text_search. The query: "join now after discovering course" or "can I join course now" etc. Let\'s use search.', type='reasoning_text')], encrypted_content=None, status='completed'), ResponseFunctionToolCall(arguments='{"query":"join course now"}', call_id='fc_e047c4b9-6933-4e26-8f4f-b2637339804a', name='text_search', type='function_call', id='fc_e047c4b9-6933-4e26-8f4f-b2637339804a', status='completed')]
Absolutely! You can still join the Data Engineering Zoomcamp – even after the cohort has started.

### How to get in

| Step | What to do | Link |
|------|------------|------|
| 1️⃣ | **Register** (the official sign‑up form). Even if you’re signing up after the start date, you’ll still receive all the learning material and can submit the as

In [14]:
## 3. System Prompt: Instructions

# Let's take another look at the code we wrote previously

chat_messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": question}
]

response = openai_client.responses.create(
    model='openai/gpt-oss-20b',
    input=chat_messages,
    tools=[text_search_tool]
)

# We have two things here:
# - system_prompt contains instructions for the LLM
# - question ("user prompt") is the actual question or task

# The system prompt is very important: it influences how the agent behaves. This is how we can control what the agent does and how it responds to user questions.

# Usually, the more complete the instructions in the system prompt are, the better the results.

# So we can extend it

system_prompt = """
You are a helpful assistant for a course.

Use the search tool to find relevant information from the course materials before answering questions.

If you can find specific information through search, use it to provide accurate answers.
If the search doesn't return relevant results, let the user know and provide general guidance.
"""

# When working with agents, the system prompt becomes one of the most essential variables we can adjust to influence our agent.
# For example, if we want the agent to make multiple search queries, we can modify the prompt:

system_prompt = """
You are a helpful assistant for a course.

Always search for relevant information before answering.
If the first search doesn't give you enough information, try different search terms.

Make multiple searches if needed to provide comprehensive answers.
"""


In [15]:
## 4. Pydantic AI

# Dealing with function calls can be cumbersome. We first need to understand which function we need to invoke. Then we need to pass the results back to the LLM and perform other tasks. It's easy to make a mistake there.

# That's why we'll use a library to handle it. There are many agentic libraries: OpenAI Agents SDK, Langchain, Pydantic AI, and many more.

# Today, we will use Pydantic AI. I like its API; it's simpler than other libraries and has good documentation.

# Let's install it

!uv pip install pydantic-ai

[2mUsing Python 3.12.11 environment at: /usr[0m
[2mAudited [1m1 package[0m [2min 127ms[0m[0m


In [16]:
# For Pydantic AI (and for other agents libraries), we don't need to describe the function in the JSON format like we did witht the plain OpenAI API. The libraries take care of it.

# But we do need to add docstrings and type hints to our function. I asked ChatGPT to do it:

from typing import List, Any

def text_search(query: str) -> List[Any]:
    """
    Perform a text-based search on the FAQ index.

    Args:
        query (str): The search query string

    Returns:
        List[Any]: A list of up to 5 search results returned by the FAQ index
    """

    return faq_index.search(query, num_results=5)

In [None]:
from pydantic_ai import Agent

agent = Agent(
    name="faq_agent",
    instructions=system_prompt,
    tools=[text_search],
    model='gpt-4o-mini'
)


# We don't need to do anything with our text_search function. We just pass it directly to the agent.
# Let's run it:

question = "I just discovered the course, can I join now?"

result = await agent.run(user_prompt=question)

In [None]:
# We use await because Pydantic AI is asynchronous. If you're not running in Jupyter, you need to use asyncio.run():

import asyncio

result = asyncio.run(agent.run(user_prompt=question))

# The output:
# “Yes, you can still join the course even after the start date. Although you may not officially register, you are eligible to submit your homework. Just keep in mind that there are deadlines for turning in homework and final projects, so it's advisable not to delay everything until the last minute.”
# We can also look inside the result to get a detailed breakdown of the agent's reasoning and actions:
result.new_messages()

# It contains four items:
# - ModelRequest: Represents a request sent to the model. It includes the user's prompt (UserPromptPart) and the agent's instructions.
# - ModelResponse: The model's reply. We see a ToolCallPart with the decision to invoke text_search.
# - ModelRequest: Contains ToolReturnPart - the results returned by the tool (search results from the FAQ index).
# - ModelResponse: The final answer generated by the model in TextPart.

# Pydantic AI and other frameworks handle all the complexity of function calling for us. We don't need to manually parse responses, handle tool calls, or manage conversation history. This makes our code cleaner and less error-prone.

# We implemented an agent. Great! But how good is it? Is the prompt we came up good? What's better for our agent, text search, vector search or hybrid? Tomorrow we will be able to answer these questions: we will learn how to use AI to evaluate our agent.
