# Building a Simple Agent in Python

Welcome to this tutorial on building a simple agent in Python! In this notebook, we will focus on creating a basic agent that can communicate with a **Large Language Model (LLM)** to perform tasks autonomously. We will build upon the concepts you learned in the previous notebook, such as prompts and LLM interaction, and now put those into practice to create a functional agent.

By the end of this tutorial, you will know how to configure a simple agent, interact with an LLM, and process the responses it provides.

## What is an AI Agent?

An **AI Agent** is a system that can use a **Large Language Model (LLM)** to understand, reason, plan, and execute tasks. It operates autonomously, meaning it can perform actions and make decisions with minimal human input. AI agents can handle complex problems by breaking them down into smaller steps, using tools, accessing memory, and adjusting their behavior based on the context provided.

At its core, an AI agent is designed to:

1. **Receive a task**: Take input from the user, such as a question or command.
2. **Plan a solution**: Break down the problem, select appropriate tools, and reason through possible solutions.
3. **Execute the plan**: Perform actions, such as retrieving information or generating responses, based on the plan it has created.
4. **Respond with the result**: Present the final output in a structured and actionable format.
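
The four steps above can be sketched as a small loop. This is a minimal illustration only, not the agent built later in this notebook: `plan`, `execute`, `run_agent`, and `demo_tools` are hypothetical names, and the planner is a hard-coded stand-in for what an LLM would normally produce.

```python
from typing import Callable, Dict, List


def plan(task: str) -> List[str]:
    """Stand-in planner: a real agent would ask the LLM to produce these steps."""
    return [f"look up: {task}", f"summarize: {task}"]


def execute(step: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Run one step by dispatching on the text before the colon."""
    tool_name, _, argument = step.partition(":")
    tool = tools.get(tool_name.strip())
    if tool is None:
        return f"no tool for step '{step}'"
    return tool(argument.strip())


def run_agent(task: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Receive a task, plan it, execute each step, and respond with the result."""
    results = [execute(step, tools) for step in plan(task)]
    return "\n".join(results)


# Hypothetical tools; real ones would call an API or the LLM.
demo_tools = {
    "look up": lambda query: f"[facts about {query}]",
    "summarize": lambda text: f"[summary of {text}]",
}

print(run_agent("Python release history", demo_tools))
```

Keeping planning and execution in separate functions means either one can later be swapped for an LLM-backed version without touching the loop.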

![agent](images/agentic-vs-non-agentic.png)

### Key Components of an AI Agent:

- **Agent Core**: The central decision-making unit that defines the goals of the agent, selects tools, and coordinates task execution.
- **Memory Module**: Tracks the agent’s past interactions. This can include short-term memory (for current tasks) or long-term memory (for ongoing interactions over time).
- **Tools**: External resources or APIs that the agent can use to perform specific tasks, like retrieving data, interacting with other systems, or performing calculations.
- **Planning Module**: Helps the agent break down complex tasks into smaller, more manageable sub-tasks. It allows the agent to handle sophisticated queries by decomposing them into simpler steps.
- **State Management**: The "state" of an agent is the information it holds at any given time. For instance, if an agent is working through a list of tasks, its state might record which tasks are completed, which are in progress, and which have not yet started. *Tracking this state lets the agent know what it has done and what it needs to do next. Without it, an agent might lose track of its progress, repeat tasks, or skip important steps. Later in this notebook, the `Agent` class keeps its state in a simple dictionary.*
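
To make the task-list example concrete, here is a minimal sketch of state tracking. The `TaskState` class and its method names are hypothetical and are not used elsewhere in this notebook; the status labels are likewise assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TaskState:
    """Tracks each task's status: 'pending', 'in_progress', or 'done'."""

    status: Dict[str, str] = field(default_factory=dict)

    def add(self, task: str) -> None:
        """Register a task as pending unless it is already tracked."""
        self.status.setdefault(task, "pending")

    def mark(self, task: str, state: str) -> None:
        """Change the status of a known task; unknown tasks are an error."""
        if task not in self.status:
            raise KeyError(f"unknown task: {task}")
        self.status[task] = state

    def next_pending(self) -> Optional[str]:
        """Return the first task not yet started, or None if there is none."""
        for task, state in self.status.items():
            if state == "pending":
                return task
        return None


state = TaskState()
for name in ["fetch data", "analyze", "report"]:
    state.add(name)

state.mark("fetch data", "done")
state.mark("analyze", "in_progress")
print(state.next_pending())  # report
```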

### Example:

Imagine an agent designed to help with financial analysis. When asked, "What are the three key takeaways from the Q2 earnings call for FY 2023?", the agent doesn’t just search for a single piece of information. Instead, it breaks the query into multiple steps:
1. Identify relevant sections of the earnings call.
2. Analyze the key points about technology or business developments.
3. Present the findings clearly to the user.

In this way, an AI agent uses the reasoning capabilities of an LLM combined with external tools to provide a comprehensive response.

![agent](images/agent.png)

Throughout this notebook, we will keep things simple, focusing on building the core functionality of the agent, so you can see how everything fits together in practice.

## 1. Setting up the Environment

Before we can communicate with the LLM, let’s install any required libraries and ensure our environment is ready.

In [19]:
%pip install requests jsonschema tenacity

Note: you may need to restart the kernel to use updated packages.


These packages are used for:

- **requests:** Making HTTP requests to interact with models.
- **jsonschema:** Validating the structure of the agent's output.
- **tenacity:** Handling retries in case of errors when communicating with the model.
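
To illustrate what retrying and output validation mean in practice, the sketch below hand-rolls both ideas using only the standard library. It is a stand-in for the concepts, not a demonstration of how `tenacity` or `jsonschema` are used by this notebook's services; `call_with_retries`, `has_required_keys`, and `flaky_model_call` are hypothetical names.

```python
import time
from typing import Any, Callable, Dict, Iterable


def call_with_retries(func: Callable[[], Any], attempts: int = 3, delay: float = 0.0) -> Any:
    """Call func until it succeeds or the attempts run out, then re-raise."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(delay)


def has_required_keys(payload: Dict[str, Any], required: Iterable[str]) -> bool:
    """Very small structural check: every required key must be present."""
    return all(key in payload for key in required)


calls = {"count": 0}


def flaky_model_call() -> Dict[str, Any]:
    """Simulated model call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return {"response": "ok"}


result = call_with_retries(flaky_model_call, attempts=3)
print(result, has_required_keys(result, ["response"]))
```

The libraries listed above offer richer versions of both ideas, such as configurable wait strategies and full schema checks.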

### Datetime Function

First, we'll create a simple function to get the current time. This matters because our agent may need to timestamp certain actions or events. The function below returns the current date and time in UTC:

In [20]:
from datetime import datetime, timezone


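def get_current_utc_datetime():
    """Return the current UTC time as a string with millisecond precision."""
    now_utc = datetime.now(timezone.utc)
    # Trim microseconds to milliseconds first, then append the "UTC" suffix.
    return now_utc.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] + " UTC"


# Example usage:
print("Current UTC datetime:", get_current_utc_datetime())

Current UTC datetime: 2024-09-06 18:41:14.438 UTC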
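
If a machine-readable timestamp is preferred, `datetime.isoformat` is one alternative. This is an optional variation rather than part of the notebook's code, and `get_current_utc_iso` is a hypothetical name.

```python
from datetime import datetime, timezone


def get_current_utc_iso() -> str:
    """Return the current UTC time in ISO 8601 format with millisecond precision."""
    return datetime.now(timezone.utc).isoformat(timespec="milliseconds")


stamp = get_current_utc_iso()
print(stamp)

# The string can be parsed back into a timezone-aware datetime.
parsed = datetime.fromisoformat(stamp)
print(parsed.utcoffset())
```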


## 2. Configuring a Simple Model

In this section, we configure the model that our agent will use to process tasks. The `ModelService` class manages the interaction with the model (in this case, `llama3.1:latest`), sending prompts to it and returning its responses to the agent.

### Model Configuration

We initialize the `ModelService` with a model configuration. This can include parameters such as the model endpoint and temperature (which controls randomness); the cell below passes only the model name and leaves any other settings at the service's defaults. This step enables our agent to perform model-based tasks using that configuration.

In [21]:
from services.model_service import ModelService

# Initialize the service with the model configuration
ollama_service = ModelService(model="llama3.1:latest")
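
The `services.model_service` module is not shown in this notebook. As a rough idea of what such a wrapper might contain, here is a minimal sketch with the three methods the agent calls later (`prepare_payload`, `request_model_generate`, `process_model_response`). Everything else in it is an assumption: the class name `SketchModelService`, the default endpoint (a local Ollama server's `/api/generate` route), the `temperature` default, and the use of `urllib` in place of `requests`. The real `ModelService` may differ.

```python
import json
import urllib.request
from typing import Any, Dict


class SketchModelService:
    """Hypothetical stand-in for ModelService; not the notebook's actual class."""

    def __init__(
        self,
        model: str,
        endpoint: str = "http://localhost:11434/api/generate",  # assumed default
        temperature: float = 0.0,  # assumed default
    ):
        self.model = model
        self.endpoint = endpoint
        self.temperature = temperature

    def prepare_payload(self, user_prompt: str, sys_prompt: str) -> Dict[str, Any]:
        """Build the JSON body for a single, non-streaming generation request."""
        return {
            "model": self.model,
            "prompt": sys_prompt + user_prompt,
            "raw": True,  # the prompts already carry the chat template tokens
            "stream": False,
            "options": {"temperature": self.temperature},
        }

    def request_model_generate(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """POST the payload and return the decoded JSON reply (needs a running server)."""
        request = urllib.request.Request(
            self.endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=120) as reply:
            return json.loads(reply.read().decode("utf-8"))

    def process_model_response(self, response_json: Dict[str, Any]) -> str:
        """Extract the generated text from the reply."""
        return response_json.get("response", "").strip()


service = SketchModelService(model="llama3.1:latest")
payload = service.prepare_payload("Hello", "Be brief. ")
print(payload["prompt"])
```

Only `prepare_payload` is exercised here, so the example runs without a model server.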

## 3. System Prompt for the Simple Agent

In this cell, we define the **System Prompt** for our agent. The system prompt helps set the behavior of the agent when interacting with the **Large Language Model (LLM)**. This particular prompt instructs the LLM to act as a helpful assistant, focused on answering questions clearly and concisely.

### Key Aspects of the System Prompt:
1. **Assistant Role:** The agent is instructed to provide useful, clear, and concise responses to user queries.
2. **Guidelines:**
   - **Clear Responses:** The agent is expected to give direct and simple answers to the user's questions.
   - **Avoid Unnecessary Details:** It avoids overloading the user with too much information—responses should be brief and relevant.
   - **Polite Tone:** The agent will always maintain a friendly and helpful tone in its responses.
   - **Answer Format:** Responses are expected to be returned in simple text, with no special formatting unless specifically requested by the user.
   
### Example Interaction:
This system prompt also includes an example to illustrate the expected behavior of the agent:
- If the user asks, "What is Python used for?", the agent will respond with something like:
  - "Python is a versatile programming language used for web development, data analysis, automation, and more."

This **System Prompt** ensures that the agent remains consistent in how it responds, providing useful and easy-to-understand answers in all cases.


In [22]:
SYS_PROMPT = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Cutoff Knowledge Date: December 2023
Current Date: {datetime}

You are a helpful assistant. Your job is to answer questions clearly and concisely. Make sure your responses are easy to understand and provide useful information.

---

### Example Interaction:

User: "What is Python used for?"

Assistant: "Python is a versatile programming language used for web development, data analysis, automation, and more."

---

### Key Points:
- **Be Direct:** Answer only what is asked, without extra information.
- **Be Polite:** Maintain a friendly and helpful tone.
<|eot_id|>
"""

## 4. Agent Class for Interacting with the LLM

In this cell, we define the `Agent` class, which serves as the core component responsible for interacting with the **Large Language Model (LLM)**. This class is designed to handle the interaction between user input and the LLM, process the model's responses, and maintain an internal state.

### Key Components:

1. **Initialization (`__init__`):**
   - The `Agent` class is initialized with:
     - `state`: A dictionary to track the agent's current state.
     - `role`: The defined role of the agent (e.g., "assistant").
     - `ollama_service`: A service interface that handles communication with the LLM (in this case, represented by `ModelService`).

2. **State Management (`update_state`):**
   - The `update_state` method allows the agent to update its internal state based on key-value pairs. This helps the agent track progress or store important data.
   - If the agent tries to update a state key that doesn't exist, it will warn the user.

3. **Model Interaction (`invoke_model`):**
   - This is the core method that interacts with the LLM. It prepares the input payload by combining the **system prompt** and **user prompt**.
   - The payload is then sent to the model service (`ollama_service`), which communicates with the LLM, retrieves the response, and processes it.
   - The final processed response from the model is returned for further use.

4. **Main Task Execution (`work`):**
   - The `work` method is the primary interface for executing tasks based on user input.
   - It formats the **system prompt** (using a predefined prompt template) and combines it with the **user prompt** (the user's specific question or request).
   - The `work` method then invokes the model and returns the final response from the LLM.
   - This method can be easily extended for more complex tasks.
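
The template-formatting step in `work` relies on Python's `str.format`. The sketch below shows the mechanism on a toy template; `TEMPLATE` is a hypothetical stand-in for the system prompt. One detail worth knowing: any literal braces in such a template must be doubled, or `format` will treat them as placeholders.

```python
# Toy template: {datetime} is a placeholder, {{...}} is a literal pair of braces.
TEMPLATE = "Current Date: {datetime}\nLiteral braces look {{like this}}."

filled = TEMPLATE.format(datetime="2024-09-06 18:41:14 UTC")
print(filled)
```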

### Example Workflow:
- The agent receives a **user request**.
- It prepares the system and user prompts and sends them to the model.
- The LLM processes the input and returns a response.
- The agent processes the LLM’s response and returns the final output.

This structure makes it easy to build on top of the agent, allowing you to handle more complex interactions, manage state efficiently, and extend the agent’s behavior over time.


In [23]:
from typing import Dict, Any


class Agent:

    def __init__(self, state: Dict[str, Any], role: str, ollama_service: ModelService):
        """
        Initialize the Agent with a state, role, and model configuration.
        """
        self.state = state
        self.role = role
        self.ollama_service = ollama_service

    def update_state(self, key: str, value: Any):
        """
        Update the state of the agent. Warn if the key doesn't exist.
        """
        if key in self.state:
            self.state[key] = value
        else:
            print(f"Warning: Attempting to update a non-existing state key '{key}'.")

    def invoke_model(self, sys_prompt: str, user_prompt: str):
        """
        Prepare the payload, send the request to the model, and process the response.
        """
        # Prepare the payload
        payload = self.ollama_service.prepare_payload(
            user_prompt,
            sys_prompt,
        )

        # Invoke the model and get the response
        response_json = self.ollama_service.request_model_generate(
            payload,
        )

        # Process the model's response
        response_content = self.ollama_service.process_model_response(response_json)

        # Return the processed response
        return response_content

    def work(
        self,
        user_request: str,
        sys_prompt: str = SYS_PROMPT,
    ) -> str:
        """
        Execute a simple task based on the user's request.
        """
        # Fill the {datetime} placeholder in the system prompt template
        formatted_sys_prompt = sys_prompt.format(datetime=get_current_utc_datetime())
        # Wrap the request in Llama 3 header tokens and open the assistant turn
        user_prompt = (
            f"<|start_header_id|>user<|end_header_id|>\n\n{user_request}<|eot_id|>"
            "<|start_header_id|>assistant<|end_header_id|>\n\n"
        )

        # Invoke the model with the user's request
        response = self.invoke_model(
            sys_prompt=formatted_sys_prompt, user_prompt=user_prompt
        )

        # Return the processed response
        return response
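
The user turn assembled inside `work` follows the Llama 3 header-token layout. Isolating that step in a helper makes the exact string easy to inspect; `build_user_turn` is a hypothetical name and is not part of the class above.

```python
def build_user_turn(user_request: str) -> str:
    """Wrap a request in Llama 3 style header tokens and open the assistant turn."""
    return (
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_request}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


# repr() makes the newlines visible so stray whitespace is easy to spot.
print(repr(build_user_turn("What is Python used for?")))
```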

### Running the Agent

Finally, let's demonstrate how to run the agent with a simple example. We'll use the agent class we've just implemented to process a task based on user input.

In [24]:
# Initialize the agent with an initial state and a role
agent_state = {"response": ""}
agent_role = "Helper Agent"
agent = Agent(state=agent_state, role=agent_role, ollama_service=ollama_service)

# Execute a task
user_input = "Tell me everything you know about Tesla."
response = agent.work(user_request=user_input)
print("Agent's response:", response)

Agent's response: {
    "name": "Tesla",
    "description": "Tesla, Inc., commonly known as Tesla, is an American multinational corporation that specializes in electric vehicle (EV) manufacturing and clean energy technologies.",
    "founder": "Elon Musk",
    "founding_date": "2003",
    "headquarters": "Austin, Texas",
    "products": {
        "vehicles": [
            {
                "model_s": "Sedan, SUVs, trucks, and Cybertruck",
                "features": "Autopilot technology, long-range battery options, all-wheel drive"
            },
            {
                "Model X": "Full-size luxury SUV with falcon-wing doors",
                "features": "Panoramic windshield, semi-autonomous driving capabilities"
            },
            {
                "Cybertruck": "Electric pickup truck",
                "features": "Blazer-like bed liner, quad-motor setup, glass roof"
            }
        ],
        "solar_products": [
            {
                "SolarRoof": "A phot

## 5. Switching to Llama 3.1 Instruct Model

In this section, we switch our model to **Llama 3.1 Instruct**, which is tuned for both conversation and tool calling. Alongside ordinary text responses, the model can emit calls to built-in tools such as **Brave Search** and **Wolfram Alpha** to obtain real-time information or perform complex calculations. The model only produces the call; the application running it is responsible for executing the tool and passing the result back.

### Why Use Llama 3.1 Instruct?

The **Instruct** model is optimized for tool-based interactions and can:
- **Request Tool Calls:** Emit calls to tools such as Brave Search or Wolfram Alpha for questions that require real-time data or mathematical computation. The calling application must execute them.
- **Generate Python Code:** When the system prompt declares an `ipython` environment, the model can generate Python code, which the application can then execute to produce results.
- **Custom Tools:** Developers can define custom tools within prompts, letting the model choose among the available resources.
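
Because the model only emits a tool call, the surrounding application has to detect it. In Meta's documented prompt format, a built-in tool call appears in the generated text in a form such as `<|python_tag|>brave_search.call(query="...")`. The sketch below parses that form. `parse_builtin_tool_call` is a hypothetical helper, and the `ModelService` used in this notebook may already convert the model output into a different shape.

```python
import re
from typing import Optional, Tuple

# Matches text such as: brave_search.call(query="latest Tesla news")
TOOL_CALL = re.compile(r'(?P<tool>\w+)\.call\(query="(?P<query>[^"]*)"\)')


def parse_builtin_tool_call(text: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, query) if the text contains a built-in tool call."""
    match = TOOL_CALL.search(text)
    if match is None:
        return None
    return match.group("tool"), match.group("query")


sample = '<|python_tag|>brave_search.call(query="latest Tesla news")<|eom_id|>'
print(parse_builtin_tool_call(sample))
print(parse_builtin_tool_call("Python is a programming language."))
```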

### Setting Up the Model Service

We initialize the **Llama 3.1 Instruct** model service below. This service is designed to manage interactions between the agent and the model, facilitating tool usage when required.


In [25]:
# Initialize the service with the model configuration
ollama_instruct_service = ModelService(model="llama3.1:8b-instruct-fp16")

## 6. Updated System Prompt for Tool Use

In this section, we define a **System Prompt** specifically for the **Llama 3.1 Instruct** model. This prompt activates tool usage within the model, allowing the agent to utilize built-in tools like **Brave Search** when necessary.

### Key Changes to the System Prompt:
- **Environment Setup:** The prompt declares `Environment: ipython`, which tells the model it may produce code and tool calls.
- **Tools Definition:** The `Tools: brave_search` line lets the model request **Brave Search** when real-time information is needed.
- **Instructions for Real-Time Queries:** If the required data isn’t available in the model’s training set, the model is told to request a **Brave Search** call for web-based answers.

This **System Prompt** instructs the agent to respond concisely and clearly, and enables it to use **Brave Search** when real-time data is needed, making the agent more dynamic and adaptable.

In [29]:
INSTRUCT_SYS_PROMPT = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Environment: ipython
Tools: brave_search
Cutoff Knowledge Date: December 2023
Today’s Date: {datetime}

You are a helpful and concise assistant. Your role is to provide accurate, direct, and easy-to-understand answers in a friendly tone. For real-time information, use brave_search. If real-time data is unavailable, inform the user politely.<|eot_id|>
"""

## 7. Running the Updated Agent with Tool Support

Now that we have set up the **Llama 3.1 Instruct** model and the updated system prompt with tool integration, let’s test the agent by asking it a question that requires real-time information: the latest news on Tesla.

The agent sends the request to the **Llama 3.1 Instruct** model, which decides whether it needs **Brave Search** to answer. As the output below shows, the model responds with a search request (a `search_term`) rather than with search results. This notebook stops at that point and does not execute the search.

In [32]:
# Initialize the agent with an initial state and a role
agent_state = {"response": ""}
agent_role = "Helper Agent"
agent = Agent(
    state=agent_state, role=agent_role, ollama_service=ollama_instruct_service
)

# Execute a task
user_input = "What is the latest news on Tesla?"
response = agent.work(user_request=user_input, sys_prompt=INSTRUCT_SYS_PROMPT)
print("Agent's response:", response)

Agent's response: {
    "search_term": "latest news on Tesla"
}
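
The response above is a search request, not search results, so something still has to run the query. One possible follow-up step is sketched below. `handle_response` and `fake_search` are hypothetical; a real version would call an actual search API and feed the results back to the model for a final answer.

```python
import json
from typing import Callable


def fake_search(term: str) -> str:
    """Placeholder for a real search call (for example, a web search API)."""
    return f"[search results for '{term}']"


def handle_response(raw: str, search: Callable[[str], str]) -> str:
    """Run a search if the model asked for one; otherwise return the text as is."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # plain-text answer, nothing to dispatch
    if isinstance(data, dict) and "search_term" in data:
        return search(data["search_term"])
    return raw


print(handle_response('{"search_term": "latest news on Tesla"}', fake_search))
print(handle_response("Python is a programming language.", fake_search))
```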


## 8. Conclusion

In this notebook, we’ve built a **simple AI-powered agent** that interacts with a **Large Language Model (LLM)** to answer user queries. Let’s recap what we've accomplished:

- **Defined the agent's behavior** through a **System Prompt**, instructing the LLM on how to respond to questions.
- **Created the `Agent` class** to manage interactions between the user and the LLM.
- **Implemented state management** to allow the agent to track progress or other necessary information.
- **Configured a model service** to handle communication between the agent and the LLM, enabling it to retrieve and process responses.
- **Executed the agent** with a user query, receiving a helpful response from the LLM.
