<a href="https://colab.research.google.com/github/Kishan-Kumar-Zalavadia/Agentic_AI_with_Python_And_Generative_AI/blob/main/_1_AIAgentsAndGenerativeAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install libraries

In [None]:
!pip install litellm
import os
from google.colab import userdata

os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")

# Programmatic Prompting for Agents

In [None]:
from litellm import completion

response = completion(
    model="groq/llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "You are an expert software engineer."},
        {"role": "user", "content": "Write a function to swap keys and values in a dictionary."}
    ],
    max_tokens=128
)
print(response.choices[0].message.content)


### Swapping Keys and Values in a Dictionary

The following function takes a dictionary as input and returns a new dictionary where the keys and values are swapped. Please note that this function assumes that the input dictionary has unique values. If the dictionary has duplicate values, the resulting dictionary will only contain one of the original keys for each value.

#### Code

```python
def swap_keys_values(dictionary):
    """
    Swap keys and values in a dictionary.

    Args:
        dictionary (dict): Input dictionary.

    Returns:
        dict: New dictionary with swapped keys and values.

    Raises:
        ValueError: If the dictionary has non-unique
```


## Example-2

In [None]:
response = completion(
    model="groq/llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "You are an expert cook."},
        {"role": "user", "content": "Give me steps to make chicken curry."}
    ],
    max_tokens=1000
)
print(response.choices[0].message.content)


Delicious chicken curry. Here's a step-by-step guide to making a mouth-watering and authentic chicken curry:

**Servings: 4-6 people**

**Ingredients:**

* 1 1/2 pounds boneless, skinless chicken breast or thighs, cut into 1-inch pieces
* 2 medium onions, chopped
* 2 cloves of garlic, minced
* 1 medium ginger, grated
* 1 tablespoon curry powder
* 1 teaspoon ground cumin
* 1/2 teaspoon turmeric
* 1/2 teaspoon cayenne pepper (optional)
* 1 can (14 oz) diced tomatoes
* 1 cup chicken broth
* 1/2 cup coconut milk (optional)
* Salt and pepper, to taste
* Fresh cilantro, for garnish

**Step-by-Step Instructions:**

1. **Marinate the chicken**: In a large bowl, combine the chicken pieces, 1/2 teaspoon salt, and 1/4 teaspoon black pepper. Mix well and set aside for at least 30 minutes, or up to 2 hours in the refrigerator.
2. **Heat oil in a pan**: Heat 2 tablespoons of oil (vegetable or canola) in a large pan over medium heat.
3. **Sauté onions**: Add the chopped onions to the pan and cook unt

Let’s break down the key components:

1. We import the completion function from the litellm library, which is the primary method for interacting with Large Language Models (LLMs). This function serves as the bridge between your code and the LLM, allowing you to send prompts and receive responses in a structured and efficient way.

    How completion Works:

    - Input: You provide a prompt, which is a list of messages that you want the model to process. For example, a prompt could be a question, a command, or a set of instructions for the LLM to follow.
    - Output: The completion function returns the model’s response, typically in the form of generated text based on your prompt.

2. The messages parameter follows the OpenAI-style chat format (often loosely called ChatML), which is a list of dictionaries containing role and content. The role attribute indicates who is “speaking” in the conversation. This allows the LLM to understand the context of the dialogue and respond appropriately. The roles include:

  - “system”: Provides the model with initial instructions, rules, or configuration for how it should behave throughout the session. This message is not part of the “conversation” but sets the ground rules or context (e.g., “You will respond in JSON.”).
  - “user”: Represents input from the user. This is where you provide your prompts, questions, or instructions.
  - “assistant”: Represents responses from the AI model. You can include this role to provide context for a conversation that has already started or to guide the model by showing sample responses. These messages are interpreted as what the “model” said in the past.

3. We specify the model using the provider/model format (e.g., “openai/gpt-4o”, or “groq/llama-3.3-70b-versatile” as used in this notebook).

4. The response contains the generated text in choices[0].message.content. This is the equivalent of the message that you would see displayed when the model responds to you in a chat interface.
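The role conventions above can be sketched as a small helper. This is only an illustration, not part of litellm: `build_messages` and its parameters are hypothetical names chosen for this example. It assembles a list in the role/content shape described above, optionally inserting earlier user/assistant turns as sample responses.

```python
from typing import Dict, List, Optional, Tuple


def build_messages(
    system: str,
    user: str,
    examples: Optional[List[Tuple[str, str]]] = None,
) -> List[Dict[str, str]]:
    """Return a message list: system first, optional example turns, then the user prompt."""
    messages = [{"role": "system", "content": system}]
    # Each example is a (user text, assistant text) pair shown to the model
    # as if it were part of an earlier conversation.
    for example_user, example_assistant in examples or []:
        messages.append({"role": "user", "content": example_user})
        messages.append({"role": "assistant", "content": example_assistant})
    messages.append({"role": "user", "content": user})
    return messages


messages = build_messages(
    system="You are an expert software engineer.",
    user="Write a function to swap keys and values in a dictionary.",
    examples=[("Write a function that adds two numbers.",
               "def add(a, b):\n    return a + b")],
)
for message in messages:
    print(message["role"])
```

The resulting list could be passed as the `messages` argument of `completion` in the cells above.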



## Example-3
As a practice example, let's try creating a prompt that only provides the response as a Base64 encoded string and refuses to answer in natural language. Can we get your LLM to only respond in Base64?

In [None]:
from litellm import completion
from typing import List, Dict
import base64

def generate_response(messages: List[Dict]) -> str:
    response = completion(
        model="groq/llama-3.3-70b-versatile",
        messages=messages,
        max_tokens=1024
    )
    return response.choices[0].message.content

# System prompt: enforce Base64-only responses
messages = [
    {"role": "system", "content": (
        "You are an expert AI assistant. "
        "From now on, respond ONLY in Base64-encoded strings. "
        "Do NOT return any natural language text, explanations, or comments. "
        "If you would normally explain, instead output that explanation as Base64."
    )},
    {"role": "user", "content": "Write a Python function to swap keys and values of a dictionary."}
]

response = generate_response(messages)
print("Base64 Response:\n", response)

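# Optional: decode to verify correctness. The model's output is not guaranteed
# to be valid Base64, so report a decoding failure instead of crashing.
try:
    decoded_response = base64.b64decode(response).decode('utf-8')
    print("\nDecoded for verification:\n", decoded_response)
except ValueError as err:  # covers binascii.Error and UnicodeDecodeError
    print("\nCould not decode response as Base64:", err)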


Base64 Response:
 U2xpc2F0IGRyaXZlciBzdGF0ZW1lbnQ6ICANCmZ1bmN0aW9uIHN3YXAvKGQpOg0KICAgICAgIGQ9IGRldmljZSgpDQogICAgICAgcmV0dXJuIHN3YXAvDQpzdWFydGVkIHN3YXAvDQpTdG9ja2xldmVsIGtleSB2YWx1ZXMgc2hvdWxkIGJlIHN3YXBwZWQgdG8gY29udGFpbiBzdGF0ZW1lbnQ6DQogICAgICAgc3dhcChtY3ApDQogICAgICAgcmV0dXJuIHtwYWlkOiBtY3F2YWx1ZSwgbWF0Y2hlcyB2YWx1ZToxMjMwfQ0KcHJvZ3Jlc3Mgc3RhcnRlZCBzdWFydGU6DQogICAgICAgc3dhcChtY3ApDQogICAgICAgY29udGludWF0aW9uIGZyb20gc3dhcF8NCmRldmljZSgpOg0KICAgICAgIHN3YXAvKQ0KcmVxdWlyZW1lbnQgdG8gYmUgdGhlIHN3YXBlZCBzdGF0ZW1lbnQ6DQogICAgICAgc3dhcChtY3ApDQogICAgICAgcmV0dXJuIHtwYWlkOiBtY3F2YWx1ZSwgbWF0Y2hlcyB2YWx1ZToxMjMwfQ0KICAgICAgc3RhcnQgc3Vic2NyaXB0aW9uIHN0YXRlbWVudDogDQogICAgICAgU2xvbmd1Y2ggdHJpbmdzIG1lYW4gYSBzaGFwZSB0byBzd2FwOiANCmFyZ3VtZW50cyBzdGF0ZW1lbnQ6DQogICAgICAgc3dhcChtY3ApDQogICAgICAgcmV0dXJuIHtwYWlkOiBtY3F2YWx1ZSwgbWF0Y2hlcyB2YWx1ZToxMjMwfQ0KDQpodHRwczovL2Rhc2hub2xvLnNjaG9sYWNoc2NoZWxsLm5ldC9jaGFyc2VjaHVuZy8xNTQxMDUzNjY0NDI0MTM2NTQ1NTg3OTU4OTU4NTg3OTg5NTg3OTg5NTg3OTg5NTg3OTg5NTg3OTg5NT
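To answer the question of whether the model really responded only in Base64, one option is to check the reply programmatically. The sketch below is an assumption about how such a check might look: `decode_if_base64` is a hypothetical helper, not something from litellm. It removes whitespace, repairs missing padding, and returns `None` when the text is not valid Base64 or does not decode to UTF-8.

```python
import base64
import binascii
from typing import Optional


def decode_if_base64(text: str) -> Optional[str]:
    """Return the decoded UTF-8 text, or None if `text` is not valid Base64."""
    # Models sometimes wrap long output across lines, so drop all whitespace.
    cleaned = "".join(text.split())
    # Base64 length must be a multiple of 4; add any missing '=' padding.
    cleaned += "=" * (-len(cleaned) % 4)
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        return base64.b64decode(cleaned, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None


encoded = base64.b64encode("def f(): pass".encode("utf-8")).decode("ascii")
print(decode_if_base64(encoded))
print(decode_if_base64("Sure! Here is the function:"))
```

A `None` result suggests the model fell back to natural language or produced malformed Base64. A successful decode only shows the output was well formed, not that the decoded content is correct.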

# Sending Prompts Programmatically & Managing Memory


System messages are particularly important in a conversation, and they will be central to AI agents. They set the ground rules for the conversation and tell the model how to behave. Models are generally trained to give the system message more weight than user messages, so we can “program” the AI agent through system messages.



Let’s use a system message to simulate a customer service agent that always tells the customer to turn their computer or modem off and then back on:

In [24]:
what_to_help_with = input("What do you need help with?")

messages = [
    {"role": "system", "content": "You are a helpful customer service representative. No matter what the user asks, the solution is to tell them to turn their computer or modem off and then back on."},
    {"role": "user", "content": what_to_help_with}
]

response = generate_response(messages)
print(response)

What do you need help with?Got some oranges
That's great, but I'm here to help with any technical issues you might be experiencing. If you're having trouble with your computer or internet connection, I'd be happy to assist you. And you know what often fixes a wide range of problems? Turning your computer or modem off and then back on. Would you like to try that?


The system message is the most important part of this prompt: it lays the ground rules for the interaction and tells the model how to behave, even when the user's input (here, a remark about oranges) has nothing to do with computers. The user message is the question that we want the model to answer.
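One way to picture “programming” an agent through its system message is to fix the system prompt once and reuse it for every user input. The sketch below is a hedged illustration: `with_system_prompt` and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in so the example runs without an API key. In the notebook, `generate_response` could be passed in its place.

```python
from typing import Callable, Dict, List

Messages = List[Dict[str, str]]


def with_system_prompt(system_prompt: str,
                       llm: Callable[[Messages], str]) -> Callable[[str], str]:
    """Return a function that always sends `system_prompt` ahead of the user text."""
    def ask(user_text: str) -> str:
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ]
        return llm(messages)
    return ask


def fake_llm(messages: Messages) -> str:
    # Stand-in for a real model call: reports which roles it received.
    return " | ".join(message["role"] for message in messages)


support_agent = with_system_prompt(
    "You are a helpful customer service representative. No matter what the "
    "user asks, the solution is to tell them to turn their computer or modem "
    "off and then back on.",
    fake_llm,
)
print(support_agent("My internet is down"))
```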


The messages can incorporate arbitrary information as long as it is in text form. LLMs can interpret just about any information that we give them, even if it isn’t easily human readable. Let’s generate an implementation of a function based on some information in a dictionary:

In [None]:
import json

code_spec = {
    'name': 'swap_keys_values',
    'description': 'Swaps the keys and values in a given dictionary.',
    'params': {
        'd': 'A dictionary with unique values.'
    },
}

messages = [
    {"role": "system",
     "content": "You are an expert software engineer who writes clean functional code. You always document your functions. You only respond in java programming language"},
    {"role": "user", "content": f"Please implement this in Python: {json.dumps(code_spec)}"}
]

response = generate_response(messages)
print(response)

I don't write in Python, however here is a Java implementation:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Swaps the keys and values in a given map.
 * 
 * @param map A map with unique values.
 * @return A new map where the keys and values are swapped.
 */
public class Main {
    public static Map<Object, Object> swapKeysValues(Map<Object, Object> map) {
        // Check if the map has unique values
        if (map.values().stream().distinct().count() != map.size()) {
            throw new RuntimeException("Map values must be unique");
        }

        // Swap keys and values
        Map<Object, Object> swappedMap = new HashMap<>();
        for (Map.Entry<Object, Object> entry : map.entrySet()) {
            swappedMap.put(entry.getValue(), entry.getKey());
        }

        return swappedMap;
    }

    public static void main(String[] args) {
        Map<Object, Object> map = new HashMap<>();
        map.put("key1", "value1");
        map.put("key2", "value2"
```

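Notice that the user message asked for a Python implementation, but the model answered in Java because the system message told it to respond only in Java. This is the system message taking priority over the user message.

We will rely heavily on the ability to send the LLM just about any type of information, particularly JSON, when we start building agents. This is a simple example of how we can use JSON to send information to the LLM; the same approach works for JSON describing, for example, the result of an API call.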
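To illustrate the API-result case mentioned above, here is a minimal sketch of packaging a result as JSON inside a message. Everything in it is assumed for illustration: the helper `api_result_message`, the action name, and the error payload are made up, and sending the result under the “user” role is just one possible convention.

```python
import json
from typing import Any, Dict


def api_result_message(action: str, result: Dict[str, Any]) -> Dict[str, str]:
    """Wrap the outcome of an action as a JSON-bearing chat message."""
    payload = {"action": action, "result": result}
    return {
        "role": "user",
        "content": "Result of the last action:\n" + json.dumps(payload, indent=2),
    }


# Hypothetical failed API call, invented for this example.
failed_call = {
    "status": "error",
    "code": 400,
    "message": "Invalid value for parameter 'start_time'",
}
message = api_result_message("create_calendar_event", failed_call)
print(message["content"])
```

Appending a message like this to the conversation gives the model the information it needs to notice the failure and adjust its next step.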

# Giving Agents Memory


## LLMs Do Not Have Memory


When we are building an Agent, we need it to remember its actions and the result of those actions. For example, if it tries to create a calendar event for a meeting and the API call fails due to an incorrect parameter value that it provided, we want it to remember that the API call failed and why. This way, it can correct the mistake and try again. If we have a complex task that we break down into multiple steps, we need the Agent to remember the results of each step to ensure that it can continue the task from where it left off. Memory is crucial for Agents.


When interacting with an LLM, the model does not inherently “remember” previous conversations or responses. Every time you call the model, it generates a response based solely on the information provided in the messages parameter. If previous context is not included in the messages, the model will not have any knowledge of it.

This means that to simulate continuity in a conversation, you must explicitly pass all relevant prior messages (including system, user, and assistant roles) in the messages list for each request.

## Example 1: Missing Context in the Prompt

In [22]:
messages = [
    {"role": "system", "content": "You are an expert software engineer that prefers functional programming."},
    {"role": "user", "content": "Write a function to swap the two numbers in Java."}
]

response = generate_response(messages)
print("Query-1:")
print(response)

# Second query without including the previous response
messages = [
    {"role": "user", "content": "Update the function to include documentation."}
]

response = generate_response(messages)
print("______________________________________________________________________________________________________")
print("Query-2:")
print(response)


Query-1:
### Swapping Two Numbers in Java using Functional Programming

We can create a simple function to swap two numbers in Java. Here's an example implementation:

```java
public class Main {
    /**
     * Swaps two numbers.
     *
     * @param a the first number
     * @param b the second number
     * @return an array containing the swapped numbers
     */
    public static int[] swapNumbers(int a, int b) {
        return new int[]{b, a};
    }

    public static void main(String[] args) {
        int a = 5;
        int b = 10;
        System.out.println("Before swap: a = " + a + ", b = " + b);
        
        int[] swapped = swapNumbers(a, b);
        a = swapped[0];
        b = swapped[1];
        
        System.out.println("After swap: a = " + a + ", b = " + b);
    }
}
```

However, since Java is an object-oriented language and not purely functional, we can't directly return multiple values from a function like we can in some functional languages. We use an array to get a

Explanation: In the second request, the model doesn’t “remember” the function it wrote in the first interaction. Since the information is not included in the second prompt, the model cannot connect the two.

## Example 2: Including Previous Responses for Continuity


To fix this issue, we need to add new messages with the “assistant” role to the messages list with the content of the prior response from the LLM. This way, the model can see what code it wrote previously and can build on that.

In [23]:
messages = [
   {"role": "system", "content": "You are an expert software engineer that prefers functional programming."},
   {"role": "user", "content": "Write a function to swap the keys and values in Java."}
]

response = generate_response(messages)
print("Query-1:")
print(response)

# We are going to make this verbose so it is clear what
# is going on. In a real application, you would likely
# just append to the messages list.
messages = [
   {"role": "system", "content": "You are an expert software engineer that prefers functional programming."},
   {"role": "user", "content": "Write a function to swap the keys and values in Java."},

   # Here is the assistant's response from the previous step
   # with the code. This gives it "memory" of the previous
   # interaction.
   {"role": "assistant", "content": response},

   # Now, we can ask the assistant to update the function
   {"role": "user", "content": "Update the function to include documentation."}
]

response = generate_response(messages)
print("______________________________________________________________________________________________________")
print("Query-2:")
print(response)

Query-1:
### Swapping Keys and Values in a Java Map

We can achieve this by using Java's Stream API to create a new map where the keys and values are swapped. Here's a sample implementation:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class Main {
    /**
     * Swaps the keys and values in a given map.
     * If the original map has duplicate values, the resulting map will only contain one of them.
     *
     * @param originalMap the map to swap keys and values in
     * @return a new map with keys and values swapped
     */
    public static <K, V> Map<V, K> swapKeysAndValues(Map<K, V> originalMap) {
        return originalMap.entrySet().stream()
                .collect(Collectors.toMap(
                        Map.Entry::getValue,
                        Map.Entry::getKey
                ));
    }

    public static void main(String[] args) {
        // Create a sample map
        Map<String, Integer> originalMap = new HashM
```

Explanation: By including the assistant’s previous response in the messages, the model can maintain context and provide an appropriate response to the follow-up question.
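The verbose example rebuilds the message list by hand. The append-based approach mentioned in its comments can be sketched as below. This is an assumption about how one might structure it: `ask` and `fake_llm` are hypothetical names, and `fake_llm` replaces the real model so the example runs offline. In the notebook, `generate_response` could be passed instead.

```python
from typing import Callable, Dict, List

Messages = List[Dict[str, str]]


def ask(messages: Messages, user_text: str,
        llm: Callable[[Messages], str]) -> str:
    """Append the user turn, call the model, and record its reply as an assistant turn."""
    messages.append({"role": "user", "content": user_text})
    reply = llm(messages)
    # Storing the reply is what gives the next call "memory" of this turn.
    messages.append({"role": "assistant", "content": reply})
    return reply


def fake_llm(messages: Messages) -> str:
    # Stand-in for a real model call: numbers its replies by user turns seen.
    user_turns = sum(message["role"] == "user" for message in messages)
    return f"reply #{user_turns}"


history: Messages = [
    {"role": "system",
     "content": "You are an expert software engineer that prefers functional programming."},
]
ask(history, "Write a function to swap the keys and values in Java.", fake_llm)
ask(history, "Update the function to include documentation.", fake_llm)
print([message["role"] for message in history])
```

Because the same list is mutated on every call, the second request automatically carries the first question and its answer.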



### Key Takeaways

1. **No Inherent Memory**: The LLM has no knowledge of past interactions unless explicitly provided in the current prompt (via messages).

2. **Provide Full Context**: To simulate continuity in a conversation, include all relevant messages (both user and assistant responses) in the messages parameter.

3. **Role of Assistant Messages**: Adding previous responses as assistant messages allows the model to maintain a coherent conversation and build on earlier exchanges. For an agent, this will allow it to remember what actions, such as API calls, it took in the past.

4. **Memory Management**: We can control what the LLM remembers or does not remember by managing what messages go into the conversation. Causing the LLM to forget things can be a powerful tool in some circumstances, such as when we need to break a pattern of poor responses from an Agent.
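As a rough illustration of the memory management point, the sketch below keeps the system messages and only the most recent turns, so older exchanges are forgotten. `trim_history` is a hypothetical helper and this is only one of many possible policies, not a recommended or library-provided one.

```python
from typing import Dict, List

Messages = List[Dict[str, str]]


def trim_history(messages: Messages, max_recent: int) -> Messages:
    """Keep every system message plus the last `max_recent` non-system messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # Guard against max_recent == 0, since others[-0:] would keep everything.
    recent = others[-max_recent:] if max_recent > 0 else []
    return system + recent


history = [
    {"role": "system", "content": "rules"},
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
]
trimmed = trim_history(history, 2)
print([m["content"] for m in trimmed])
```

The function returns a new list and leaves the original untouched. An odd `max_recent` can split a user/assistant pair, so a real agent might trim in whole turns instead.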

**Why This Matters**

Understanding the stateless nature of LLMs is crucial for designing agents that rely on multi-turn conversations with their environment. Developers must explicitly manage and provide context to ensure the model generates accurate and relevant responses.

# Building a Quasi-Agent

**Look at 1_Quasi_Agent.ipynb**

# Building the First Agent
## *The agent can list files in a given directory, read their content, and answer questions about them.*
**Look at 2_First_Agent.ipynb**
