# LLM Agent Helpers Walkthrough

Welcome to the **LLM Agent Helpers Walkthrough** notebook.  

This notebook demonstrates how to use the `llm_agent_helpers` module to interact with large language models (LLMs) in a Jupyter notebook environment, with a focus on AI agent workflows. It first walks through how the helper functions were developed, so that you can customize them for your own use, and then shows examples of how they can be used.

## What you will learn in this notebook:

1. **Asking questions to an LLM**  
   - How to use the `ask_question()` function to query the model
   - How conversation memory is stored within a kernel session
   - How the helper builds on instructor-provided patterns for OpenAI and Ollama clients

2. **Managing conversation memory**  
   - Trimming the conversation history to prevent memory growth
   - Resetting the conversation with `reset_memory()` when needed

3. **Viewing LLM responses in the notebook**  
   - Displaying answers as Markdown
   - Maintaining multi-turn context across multiple queries

4. **Incremental learning and agent workflows**  
   - How to build LLM-powered workflows gradually
   - How to follow course-approved patterns for Python, data science, and AI agents

This notebook is intended to be **your guided demo** of the helper functions, showing practical examples for AI agent coursework while following best practices for working with LLMs using the OpenAI Python SDK.


## LLM Agent Helpers Development

This work started with the week 1 exercise of Ed Donner's 'AI Engineer Core Track: LLM Engineering, RAG, QLoRA, Agents' Udemy course. The task was:
- *To demonstrate your familiarity with OpenAI API, and also Ollama, build a tool that takes a technical question,
    and responds with an explanation. This is a tool that you will be able to use yourself during the course!*

For this exercise, I decided to develop a conversation-aware LLM helper function that I can run locally and customize with knowledge of the course material, so that it can serve as a resource while I complete the course exercises.

## Step 1: Set up notebook

In [2]:
# imports
import os
from dotenv import load_dotenv
from IPython.display import Markdown, display, update_display
from openai import OpenAI

In [3]:
# constants

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'
OLLAMA_BASE_URL = "http://localhost:11434/v1"

In [4]:
# set up environment
load_dotenv(override=True)
# If you don't already have a .env file, create one that sets OPENAI_API_KEY to your OpenAI API key
api_key = os.getenv('OPENAI_API_KEY')

if api_key and api_key.startswith('sk-') and len(api_key)>10:
    print("API key looks good so far")
else:
    print("There might be a problem with your API key? Please visit the troubleshooting notebook!")

openai_client = OpenAI()
ollama_client = OpenAI(base_url=OLLAMA_BASE_URL, api_key='ollama')

API key looks good so far
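
The setup cell creates both an OpenAI client and an Ollama client, although the rest of this walkthrough only calls `openai_client`. As a rough sketch of how the two could be selected by name, the hypothetical helper below (`provider_settings` is not part of the course code or the SDK) returns the model name and the keyword arguments that would be passed to `OpenAI(...)`. It assumes the constants defined above and builds no client itself, so it runs without an API key or a local Ollama server.

```python
# Constants repeated from the setup cells so the sketch is self-contained
MODEL_GPT = "gpt-4o-mini"
MODEL_LLAMA = "llama3.2"
OLLAMA_BASE_URL = "http://localhost:11434/v1"


def provider_settings(provider: str) -> dict:
    """Hypothetical helper: return the model name and OpenAI(...) keyword arguments."""
    if provider == "openai":
        # OpenAI() with no arguments reads OPENAI_API_KEY from the environment
        return {"model": MODEL_GPT, "client_kwargs": {}}
    if provider == "ollama":
        # Same settings as ollama_client in the setup cell
        return {
            "model": MODEL_LLAMA,
            "client_kwargs": {"base_url": OLLAMA_BASE_URL, "api_key": "ollama"},
        }
    raise ValueError(f"Unknown provider: {provider!r}")


settings = provider_settings("ollama")
print(settings["model"], settings["client_kwargs"]["base_url"])
```

With these settings, a client could be created as `OpenAI(**settings["client_kwargs"])` and called with `model=settings["model"]`.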


## Step 2: Build basic functions that call the OpenAI API

In [5]:
# Define the system prompt that sets the context for the LLM
# It is refined as the notebook develops
system_prompt = (
    """You are a Python and data science assistant specializing in large language models (LLMs) and AI agentic workflows. 
    You only provide answers that are related to **software, coding, data science, LLMs, and AI agents**. 
    Do not answer questions about business, human clients, general life topics, or anything unrelated to software or LLMs. 
    The data scientist asking questions has a basic understanding of Python and data science, but limited understanding of LLMs, LLM vocabulary, and agentic workflows.
    Your answers should be technical but beginner-friendly, always defining key terms and concepts in the question, and including examples or use cases when appropriate.
    Provide Python code examples for implementing LLM-related workflows, software clients, API calls, or data science applications whenever relevant.
    Always interpret ambiguous terms in the context of **software, coding, LLMs, and AI agents**. 
    Avoid explanations about business or human contexts.
    """
)

# Build the list of messages that will be passed to the LLM client
def create_prompt(user_prompt):
    """Return the chat messages for a single question.

    user_prompt (str): a user-defined question about LLMs, agentic workflows, or general Python coding
    """
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
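
To make the return value concrete, here is a small self-contained sketch of the same function. The shortened `system_prompt` is a placeholder for illustration only, not the prompt used in this notebook.

```python
# Placeholder system prompt, shortened for illustration
system_prompt = "You are a Python and data science assistant."


def create_prompt(user_prompt):
    """Return the system message followed by the user's question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = create_prompt("What is a client?")
for message in messages:
    print(message["role"], "->", message["content"])
```

Each call builds a fresh two-message list, so nothing from earlier questions is carried over.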

In [6]:
# Test out the OpenAI client with a sample prompt
question = "What is a client?"

response = openai_client.chat.completions.create(
    model=MODEL_GPT,
    messages=create_prompt(question)
)

result = response.choices[0].message.content
display(Markdown(result))

In the context of software and data science, a **client** refers to a program or system that accesses a service provided by a server. The client sends requests to the server and receives responses. This architecture is often part of a **client-server model**, which helps in distributing the workload between multiple machines.

#### Key Terms:
- **Server:** A system that provides resources or services to clients, such as databases, web pages, or APIs.
- **API (Application Programming Interface):** A set of rules and protocols for building and interacting with software applications. It allows clients to communicate with servers.

### Example Usage
For example, if you are using a web application, your web browser acts as the client. It sends requests to a web server to retrieve web pages, and the server responds by sending the requested content back to the browser.

### Python Example
Here’s a simple example of a client using Python to make an API call to a hypothetical server that returns data about books:

```python
import requests

# Define the endpoint
url = 'https://api.example.com/books'

# Send a GET request to the server
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    books = response.json()
    for book in books:
        print(f"Title: {book['title']}, Author: {book['author']}")
else:
    print(f"Error: {response.status_code}")
```

In this example:
- We use the `requests` library to send a GET request to the server.
- The response from the server is checked, and if successful, the data (assumed to be in JSON format) is parsed and printed.

### Use Cases
1. **Web Applications:** Browser clients request HTML, CSS, and JavaScript files from web servers.
2. **Mobile Applications:** Mobile apps send requests to back-end servers for data, similar to web applications.
3. **Data Science Applications:** A data science client can request data from an API for analysis, such as retrieving datasets from a cloud service.

Understanding how clients operate is foundational in building efficient software solutions, especially when working with APIs and LLMs (Large Language Models).

That response is a good start, but because we are using OpenAI's API (or other models through Ollama), I want the code examples to use the OpenAI Python SDK instead of `requests`. I will revise the system prompt to be more explicit about the packages and code examples I would like it to return.

In [14]:
# Update system prompt to be more specific to the coursework and relevant packages
system_prompt = (
    """You are a Python and data science assistant specializing in large language models (LLMs)
            and AI agentic workflows.

            You ONLY provide answers related to:
            - Python software development
            - Data science workflows
            - LLM APIs and implementations
            - AI agent design and orchestration

            Do NOT answer questions about business, human clients, general life topics,
            or anything unrelated to software or LLMs.

            ---
            ABSOLUTE IMPLEMENTATION RULES (CRITICAL):

            - ALL LLM interactions MUST use the OpenAI Python SDK:
                from openai import OpenAI

            - The OpenAI client abstraction is the ONLY permitted interface.
            This applies to:
                - OpenAI-hosted models
                - Local Ollama models (via OpenAI-compatible endpoints)

            - You MUST NOT:
                - Use the `requests` library
                - Show raw HTTP calls
                - Show REST, curl, or JSON POST examples
                - Describe how to manually call endpoints
                - Suggest alternative client libraries unless explicitly asked

            - If a solution would normally use `requests`,
            you MUST instead reframe it using the OpenAI Python client.

            - If a task cannot be performed using the OpenAI client,
            explicitly state that it is out of scope.

            ---
            USER BACKGROUND:

            The user is a data scientist with a solid foundation in Python and data science,
            but limited familiarity with:
            - LLM-specific terminology
            - LLM APIs
            - Agentic workflows

            Your explanations must be:
            - Technical but beginner-friendly
            - Grounded in real code
            - Focused on how LLM systems are actually implemented

            Always define key LLM-related terms used in your response.

            ---
            LLM CLIENT CONVENTION (MANDATORY):

            All examples MUST follow one of these patterns:

            - OpenAI-hosted models:
                from openai import OpenAI
                client = OpenAI()

            - Local Ollama models (OpenAI-compatible):
                from openai import OpenAI
                client = OpenAI(
                    base_url=OLLAMA_BASE_URL,
                    api_key="ollama"
                )

            These clients are referred to as:
            - "LLM clients"
            - "OpenAI clients"
            - "model clients"

            The word "client" ALWAYS means a software API client.

            ---
            COURSE CONTEXT (IMPORTANT):

            The user is taking an AI agent course and is building incrementally on
            instructor-provided code.

            Treat the following patterns as canonical and preferred:

            - Environment configuration via `.env` and `dotenv`
            - OpenAI client initialization
            - Chat completions using `messages`
            - Streaming responses in Jupyter notebooks
            - Stateful conversations stored in Python memory
            - Gradual construction of agentic workflows

            Example pattern:

                from openai import OpenAI
                client = OpenAI()

                messages = [
                    {"role": "system", "content": "..."},
                    {"role": "user", "content": "..."}
                ]

                response = client.chat.completions.create(
                    model="gpt-4.1-mini",
                    messages=messages
                )

            ---
            INTERPRETATION RULES:

            - Always interpret ambiguous terms in the context of:
                software clients, APIs, LLMs, and AI agents
            - Never interpret terms in a business or human-client sense
            - Prefer concrete code over abstract discussion

            You are not a general-purpose assistant.
            You are a focused implementation guide for building LLM-powered
            and agentic systems using the OpenAI Python SDK.
            """
)

In [15]:
question = "What is a client?"

response = openai_client.chat.completions.create(
    model=MODEL_GPT,
    messages=create_prompt(question)
)

result = response.choices[0].message.content
display(Markdown(result))

In the context of software and APIs (Application Programming Interfaces), a **client** refers to a piece of software or code that communicates with a server to request resources or services. It's essentially an interface that allows developers to interact with external systems, like web services, databases, or in this case, LLMs (Large Language Models).

For example, when using an LLM client, the client abstracts the underlying HTTP requests and responses, allowing developers to send prompts to the model and receive generated text without dealing directly with the low-level details of the communication.

In our context, the **OpenAI client** is specifically designed to interact with OpenAI's models. It encapsulates methods for various tasks, like generating text, and it can be initialized and used in Python code to make requests to the OpenAI API.

Here's an example of initializing an OpenAI client:

```python
from openai import OpenAI

client = OpenAI()
```

In this case, `client` is an instance of the OpenAI class, and it can be used to interact with the models hosted by OpenAI.

This response is more relevant to the course. It explains what a client is in the software sense, as requested, and its code example uses the package I specified.

## Step 3: Build a function that takes in a user question and returns a response

In [17]:
def ask_question(question: str):
    """
    question (str): The question you want to ask the LLM
    
    This function will send the question to the OpenAI client,
    use the system prompt defined above, and display the answer
    as Markdown in the notebook.
    """
    response = openai_client.chat.completions.create(
        model=MODEL_GPT,
        messages=create_prompt(question)
    )
    
    result = response.choices[0].message.content
    display(Markdown(result))

In [18]:
ask_question("""I get a response from the client, but it is not in an easily human-readable format.
             How can I just get a clean response?""")

When you receive a response from the OpenAI client, it typically comes in a structured format. To extract a clean, human-readable output, you'll want to access specific elements of the response object. Here’s how you can do that:

1. **Send a request** to the model and get the response.
2. **Extract the text** content from the response.

Here's an example of how to implement this:

```python
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages
)

# Extract the clean response
clean_response = response.choices[0].message['content']
print(clean_response)
```

### Explanation:

- **`response.choices[0]`**: The response from the `chat.completions.create` method is a nested object, and `choices` is a list containing all the responses provided by the model. You typically want the first choice (hence `[0]`).
- **`.message['content']`**: This accesses the `message` dictionary and retrieves the `content`, which is the actual text you want to display.

This method ensures you get a clear and concise output from the model, ready for human consumption.

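One caveat about the model's answer: it reads the reply with `response.choices[0].message['content']`. With version 1 or later of the OpenAI Python SDK, `message` is an object rather than a dictionary, so that indexing raises a `TypeError`; use `response.choices[0].message.content`, as `ask_question()` does. The same applies wherever the generated examples in this notebook use `message['content']`.

The function itself works well, but it handles only one question at a time and has no memory of previous questions and answers. To make it more useful, I will store prior questions and answers and send them back to the client with each new question (i.e., multi-turn prompting).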
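
Before wiring this into `ask_question()`, the idea can be sketched without calling a model. In the sketch below, `fake_model_reply` is a hypothetical stand-in for `openai_client.chat.completions.create(...)`, so the example runs offline; only the way the history list grows is the point.

```python
def fake_model_reply(messages):
    """Hypothetical stand-in for a real chat completion call."""
    # Report how many messages the "model" was given
    return f"(reply based on {len(messages)} messages)"


# The history starts with the system prompt and grows with every turn
history = [{"role": "system", "content": "You are a helpful assistant."}]


def ask(question):
    """Append the question, get a reply from the stand-in, and store the reply."""
    history.append({"role": "user", "content": question})
    answer = fake_model_reply(history)  # the whole history is sent each time
    history.append({"role": "assistant", "content": answer})
    return answer


first = ask("What is a client?")
second = ask("Can you give an example?")
print(first)
print(second)
print(len(history))
```

The second call sees the first question and answer because they are still in `history`, which is what lets a follow-up question refer back to an earlier reply.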

## Step 4: Add multi-turn prompting functionality

In [19]:
def ask_question(question: str):
    """
    question (str): The question you want to ask the LLM

    This function sends the question to the OpenAI client along with
    the system prompt and all earlier questions and answers from the
    current kernel session, then displays the answer as Markdown in
    the notebook.
    """

    # Define the system prompt
    system_prompt = (
    """You are a Python and data science assistant specializing in large language models (LLMs)
    and AI agentic workflows.

    You ONLY provide answers related to:
    - Python software development
    - Data science workflows
    - LLM APIs and implementations
    - AI agent design and orchestration

    Do NOT answer questions about business, human clients, general life topics,
    or anything unrelated to software or LLMs.

    ---
    ABSOLUTE IMPLEMENTATION RULES (CRITICAL):

    - ALL LLM interactions MUST use the OpenAI Python SDK:
        from openai import OpenAI

    - The OpenAI client abstraction is the ONLY permitted interface.
    This applies to:
        - OpenAI-hosted models
        - Local Ollama models (via OpenAI-compatible endpoints)

    - You MUST NOT:
        - Use the `requests` library
        - Show raw HTTP calls
        - Show REST, curl, or JSON POST examples
        - Describe how to manually call endpoints
        - Suggest alternative client libraries unless explicitly asked

    - If a solution would normally use `requests`,
    you MUST instead reframe it using the OpenAI Python client.

    - If a task cannot be performed using the OpenAI client,
    explicitly state that it is out of scope.

    ---
    USER BACKGROUND:

    The user is a data scientist with a solid foundation in Python and data science,
    but limited familiarity with:
    - LLM-specific terminology
    - LLM APIs
    - Agentic workflows

    Your explanations must be:
    - Technical but beginner-friendly
    - Grounded in real code
    - Focused on how LLM systems are actually implemented

    Always define key LLM-related terms used in your response.

    ---
    LLM CLIENT CONVENTION (MANDATORY):

    All examples MUST follow one of these patterns:

    - OpenAI-hosted models:
        from openai import OpenAI
        client = OpenAI()

    - Local Ollama models (OpenAI-compatible):
        from openai import OpenAI
        client = OpenAI(
            base_url=OLLAMA_BASE_URL,
            api_key="ollama"
        )

    These clients are referred to as:
    - "LLM clients"
    - "OpenAI clients"
    - "model clients"

    The word "client" ALWAYS means a software API client.

    ---
    COURSE CONTEXT (IMPORTANT):

    The user is taking an AI agent course and is building incrementally on
    instructor-provided code.

    Treat the following patterns as canonical and preferred:

    - Environment configuration via `.env` and `dotenv`
    - OpenAI client initialization
    - Chat completions using `messages`
    - Streaming responses in Jupyter notebooks
    - Stateful conversations stored in Python memory
    - Gradual construction of agentic workflows

    Example pattern:

        from openai import OpenAI
        client = OpenAI()

        messages = [
            {"role": "system", "content": "..."},
            {"role": "user", "content": "..."}
        ]

        response = client.chat.completions.create(
            model="gpt-4.1-mini",
            messages=messages
        )

    ---
    INTERPRETATION RULES:

    - Always interpret ambiguous terms in the context of:
        software clients, APIs, LLMs, and AI agents
    - Never interpret terms in a business or human-client sense
    - Prefer concrete code over abstract discussion

    You are not a general-purpose assistant.
    You are a focused implementation guide for building LLM-powered
    and agentic systems using the OpenAI Python SDK.
    """
    )


    # Initialize conversation history on the first call
    if not hasattr(ask_question, "conversation_history"):
        ask_question.conversation_history = [
            {"role": "system", "content": system_prompt}
        ]

    # Add the new user question
    ask_question.conversation_history.append({"role": "user", "content": question})

    # Call the LLM with the full conversation history (i.e., all questions and responses from the current kernel session)
    response = openai_client.chat.completions.create(
        model=MODEL_GPT,
        messages=ask_question.conversation_history
    )

    answer = response.choices[0].message.content

    # Add assistant response to history
    ask_question.conversation_history.append({"role": "assistant", "content": answer})

    display(Markdown(answer))

In [20]:
ask_question("What are the basic steps for making a call to an OpenAI API?")

To make a call to an OpenAI API using the OpenAI Python SDK, you can follow these basic steps:

1. **Install the OpenAI Python SDK**: If you haven't already installed the OpenAI library, you can do so using pip:
   ```bash
   pip install openai
   ```

2. **Set Up Environment Variables**: It's a good practice to manage your API keys securely. You can use environment variables to store your OpenAI API key. For example, create a `.env` file in your project directory and add:
   ```
   OPENAI_API_KEY=your_api_key_here
   ```

3. **Initialize the OpenAI Client**: Use the OpenAI client in your Python code to interact with the API. You can read the API key from the environment variable using the `dotenv` library.
   
   Here’s how to do it:
   ```python
   import os
   from openai import OpenAI
   from dotenv import load_dotenv

   load_dotenv()  # Load environment variables from .env file
   client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
   ```

4. **Create a Request**: Prepare the request by setting up the message structure required for the model. For example, to interact with chat models:
   ```python
   messages = [
       {"role": "system", "content": "You are a helpful assistant."},
       {"role": "user", "content": "How do I call the OpenAI API?"}
   ]
   ```

5. **Make the API Call**: Use the `chat.completions.create` method to get a response from the model:
   ```python
   response = client.chat.completions.create(
       model="gpt-4.1-mini",  # specify the model to use
       messages=messages       # provide the message structure
   )
   ```

6. **Process the Response**: Extract and use the response content from the model:
   ```python
   assistant_reply = response.choices[0].message['content']
   print(assistant_reply)
   ```

Putting it all together, here’s a complete example:

```python
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load the environment variable
load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Prepare messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I call the OpenAI API?"}
]

# Make the request
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages
)

# Get the assistant's reply
assistant_reply = response.choices[0].message['content']
print(assistant_reply)
```

### Key Terms:
- **API Key**: A unique identifier used to authenticate requests to the API.
- **.env file**: A file used to store environment variables.
- **messages**: A structured way to pass information to the model where different roles (like user and system) help shape the conversation context.

This pattern allows you to effectively call the OpenAI API and receive responses from the model.

In [21]:
ask_question("Can you go into more detail on step 4?")

Sure! Step 4 involves creating a request by setting up a message structure. This message structure is crucial because it defines the conversation context and helps the model understand how to respond. Let's break it down in more detail.

### Message Structure

When using chat-based models like GPT, you'll typically structure your input as a list of messages. Each message has a role and content:

1. **Role**: This indicates the part that the message is playing in the conversation. Common roles include:
   - `"system"`: This message sets the behavior or context of the assistant. It provides instructions or context that shapes the entire conversation.
   - `"user"`: Represents input from the end user (the person interacting with the model). This is where you ask questions or provide data.
   - `"assistant"`: Represents messages that the model itself has generated as replies. You may include previous assistant messages if you're maintaining context in a multi-turn conversation.

2. **Content**: This is the actual text of the message that will be sent to the model. It can be a question, instruction, or any other type of content intended for the model or from the model.

### Example Messages

Here’s how you can construct messages for different purposes:

#### Basic Interaction

For a simple query where you're asking the model a question:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]
```

In this case:
- The `"system"` message instructs the assistant to be helpful.
- The `"user"` message asks a specific question about the capital of France.

#### Interactive Conversation

For a more interactive conversation where you want to maintain context over multiple exchanges:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Tell me about Python programming."},
    {"role": "assistant", "content": "Python is an interpreted, high-level programming language."},
    {"role": "user", "content": "What are some common libraries?"}
]
```

Here, the structure captures two exchanges:
- The user's first question about Python.
- The assistant's response.
- The user's follow-up question about libraries in Python.

### Keys for Effective Message Structuring

1. **Clarity**: Ensure each message clearly communicates its intent. If the user’s question is vague, the assistant might not provide a useful answer.
   
2. **Context**: Include previous messages in the conversation to maintain context. This is especially important for multi-turn conversations where the response might depend on earlier inputs.

3. **Role Balance**: Use the roles effectively to create a conversational flow. The assistant's response should build logically on the user’s input based on the context provided by the system message.

### Summary

Here’s a recap of the message setup code within the full context of your API call:

```python
# Prepare messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How can I improve my Python skills?"}
]
```

This structured input allows the OpenAI model to understand the context and provide relevant responses based on the specified roles and content.

### Additional Tips

- You can modify the content of the system message based on the specific use case of your assistant. For example, if you're building a specialized assistant (like a coding helper), you might say, "You are an experienced Python programming assistant."
- The messages list can evolve as user requirements change, allowing you to build dynamic and responsive applications.

This detailed breakdown should enhance your understanding of how to effectively structure message inputs for interacting with the OpenAI API!

The function now takes all previous interactions into account. However, the memory is cleared only when the kernel session ends, so the conversation history can become very long. I'd like to limit the number of stored questions and responses: once the limit is reached, the earliest question and response are removed from the conversation history.
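
One way to express such a limit, as a sketch rather than the implementation used in this notebook: the hypothetical `trim_history` function below keeps the system prompt at index 0 and drops the oldest user/assistant pair until the list fits. It assumes the history always starts with a system message followed by complete question/answer pairs.

```python
def trim_history(history, max_messages):
    """Hypothetical helper: return a trimmed copy that always keeps the system prompt."""
    trimmed = list(history)
    # Index 0 is the system prompt; indices 1 and 2 are the oldest question and answer
    while len(trimmed) > max_messages and len(trimmed) >= 3:
        del trimmed[1:3]
    return trimmed


history = [{"role": "system", "content": "system prompt"}]
for i in range(1, 5):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_messages=5)
print([m["content"] for m in trimmed])
```

Removing messages in pairs keeps each remaining answer next to the question that produced it.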

## Step 5: Add functionality to limit conversation length

In [28]:
def ask_question(question: str, max_messages: int = 20):
    """
    Ask a question to the LLM, storing the entire conversation in memory for the current kernel session.
    Automatically trims history to the last `max_messages` messages to prevent memory from growing too large.

    Parameters:
        question (str): Your question related to Python, data science, or LLMs.
        max_messages (int): Maximum number of messages to keep in conversation history (including system prompt). Default is 20.

    The function automatically:
        - Initializes system prompt and conversation memory on first call
        - Trims conversation history if too long
        - Sends all prior conversation to the LLM
        - Displays the response as Markdown
    """


    # Initialize conversation history on the first call
    if not hasattr(ask_question, "conversation_history"):
        # Define the system prompt
        system_prompt = (
            """You are a Python and data science assistant specializing in large language models (LLMs)
            and AI agentic workflows.

            You ONLY provide answers related to:
            - Python software development
            - Data science workflows
            - LLM APIs and implementations
            - AI agent design and orchestration

            Do NOT answer questions about business, human clients, general life topics,
            or anything unrelated to software or LLMs.

            ---
            ABSOLUTE IMPLEMENTATION RULES (CRITICAL):

            - ALL LLM interactions MUST use the OpenAI Python SDK:
                from openai import OpenAI

            - The OpenAI client abstraction is the ONLY permitted interface.
            This applies to:
                - OpenAI-hosted models
                - Local Ollama models (via OpenAI-compatible endpoints)

            - You MUST NOT:
                - Use the `requests` library
                - Show raw HTTP calls
                - Show REST, curl, or JSON POST examples
                - Describe how to manually call endpoints
                - Suggest alternative client libraries unless explicitly asked

            - If a solution would normally use `requests`,
            you MUST instead reframe it using the OpenAI Python client.

            - If a task cannot be performed using the OpenAI client,
            explicitly state that it is out of scope.

            ---
            USER BACKGROUND:

            The user is a data scientist with a solid foundation in Python and data science,
            but limited familiarity with:
            - LLM-specific terminology
            - LLM APIs
            - Agentic workflows

            Your explanations must be:
            - Technical but beginner-friendly
            - Grounded in real code
            - Focused on how LLM systems are actually implemented

            Always define key LLM-related terms used in your response.

            ---
            LLM CLIENT CONVENTION (MANDATORY):

            All examples MUST follow one of these patterns:

            - OpenAI-hosted models:
                from openai import OpenAI
                client = OpenAI()

            - Local Ollama models (OpenAI-compatible):
                from openai import OpenAI
                client = OpenAI(
                    base_url=OLLAMA_BASE_URL,
                    api_key="ollama"
                )

            These clients are referred to as:
            - "LLM clients"
            - "OpenAI clients"
            - "model clients"

            The word "client" ALWAYS means a software API client.

            ---
            COURSE CONTEXT (IMPORTANT):

            The user is taking an AI agent course and is building incrementally on
            instructor-provided code.

            Treat the following patterns as canonical and preferred:

            - Environment configuration via `.env` and `dotenv`
            - OpenAI client initialization
            - Chat completions using `messages`
            - Streaming responses in Jupyter notebooks
            - Stateful conversations stored in Python memory
            - Gradual construction of agentic workflows

            Example pattern:

                from openai import OpenAI
                client = OpenAI()

                messages = [
                    {"role": "system", "content": "..."},
                    {"role": "user", "content": "..."}
                ]

                response = client.chat.completions.create(
                    model="gpt-4.1-mini",
                    messages=messages
                )

            ---
            INTERPRETATION RULES:

            - Always interpret ambiguous terms in the context of:
                software clients, APIs, LLMs, and AI agents
            - Never interpret terms in a business or human-client sense
            - Prefer concrete code over abstract discussion

            You are not a general-purpose assistant.
            You are a focused implementation guide for building LLM-powered
            and agentic systems using the OpenAI Python SDK.
                    """
        )

        ask_question.conversation_history = [
            {"role": "system", "content": system_prompt}
            ]

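    # Trim history so it stays below max_messages before the new question is added.
    # Keep the system prompt (index 0) and remove the oldest user/assistant messages.
    # A loop is needed because each call appends two messages (question and answer).
    while (len(ask_question.conversation_history) >= max_messages
           and len(ask_question.conversation_history) > 1):
        ask_question.conversation_history.pop(1)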

    # Append the new user question
    ask_question.conversation_history.append({"role": "user", "content": question})


    # Call the LLM with the full conversation history (i.e., all questions and responses from the current kernel session)
    response = openai_client.chat.completions.create(
        model=MODEL_GPT,
        messages=ask_question.conversation_history
    )

    answer = response.choices[0].message.content

    # Add assistant response to history
    ask_question.conversation_history.append({"role": "assistant", "content": answer})

    display(Markdown(answer))
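
The trimming rule used in `ask_question` (always keep the system prompt, drop the oldest user/assistant messages) can also be sketched as a standalone function, which makes it easy to check without calling a model. This is only an illustration: `trim_history` is a hypothetical name that is not part of the helper module, and it assumes the system prompt is always the first entry in the list. It trims before the new question is added so that the list sent to the model never exceeds `max_messages`.

```python
def trim_history(history, max_messages):
    """Return a trimmed copy of history that leaves room for one new message.

    Assumes history[0] is the system prompt, which is always kept.
    """
    system, rest = history[:1], history[1:]
    # Keep at most max_messages - 2 of the newest non-system messages, so that
    # system prompt + kept messages + the next user message fit in max_messages.
    keep = max(max_messages - 2, 0)
    return system + (rest[-keep:] if keep else [])


# Simulate five question/answer turns with a small limit.
history = [{"role": "system", "content": "You are a helpful assistant."}]
sent_lengths = []
for i in range(5):
    history = trim_history(history, max_messages=6)
    history.append({"role": "user", "content": f"question {i}"})
    sent_lengths.append(len(history))  # size of the list that would be sent
    history.append({"role": "assistant", "content": f"answer {i}"})

print(sent_lengths)           # [2, 4, 6, 6, 6]
print(history[0]["role"])     # system
print(history[1]["content"])  # question 2
```

Because each turn appends two messages (the question and the answer), the sketch removes as many old messages as needed rather than a single one.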

### Step 6: Add a helper function to reset the conversation history at any point

In [24]:
# Helper function to manually reset the conversation history
def reset_memory():
    """Clear the conversation history for ask_question."""
    if hasattr(ask_question, "conversation_history"):
        del ask_question.conversation_history
        print("Conversation history has been reset.")
    else:
        print("No conversation history exists yet.")
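
Both `ask_question` and `reset_memory` rely on the fact that a Python function can carry attributes, which persist for as long as the kernel session keeps the function object alive. The following is a minimal sketch of that pattern on its own, using the hypothetical names `remember` and `forget` (they are not part of the helper module) and no LLM calls.

```python
def remember(item):
    """Append item to a list stored as an attribute on the function itself."""
    if not hasattr(remember, "items"):
        remember.items = []  # created on the first call only
    remember.items.append(item)
    return list(remember.items)


def forget():
    """Delete the stored list; return True if there was something to delete."""
    if hasattr(remember, "items"):
        del remember.items
        return True
    return False


first = remember("a")
second = remember("b")       # state persists between calls
was_reset = forget()         # the attribute existed, so it is deleted
reset_again = forget()       # nothing left to delete
after_reset = remember("c")  # starts over with a fresh list

print(first, second, was_reset, reset_again, after_reset)
```

Deleting the attribute, rather than assigning an empty list, means the next call goes through the same first-call branch again; in `ask_question` that is the branch that rebuilds the history with the system prompt.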

In [25]:
ask_question("how could I make a call to openAI's api?")

To make a call to OpenAI's API using the OpenAI Python SDK, you'll first need to install the package (if you haven't already) and then initialize the client. Below is a simple example demonstrating how to set up the OpenAI client and make a call to the chat completions API.

First, make sure you have the OpenAI Python package installed:

```bash
pip install openai
```

Then, you can use the following code snippet to perform a chat completion request:

```python
from openai import OpenAI

# Initialize the OpenAI client
client = OpenAI()

# Define the conversation messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Can you help me with my Python code?"}
]

# Make a call to the chat completions API
response = client.chat.completions.create(
    model="gpt-4.1-mini",  # Specify the model you want to use
    messages=messages
)

# Print the response content
print(response.choices[0].message.content)
```

### Explanation:
- **OpenAI Client Initialization**: The line `client = OpenAI()` initializes the client that you will use to interact with the API.
- **Messages**: A list of messages that forms the conversation context, where roles can be "system", "user", or "assistant".
  - The "system" message sets the behavior of the assistant.
  - The "user" message represents input from the user.
- **Making a Request**: The `client.chat.completions.create()` method is called with the model name and conversation messages to get the assistant's response.
- **Response Handling**: The response is printed, showing the assistant's reply to the user's input.

Ensure that your API key is configured in your environment (the client reads `OPENAI_API_KEY` by default), or pass it directly when initializing the client.
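
As a side note to the recorded answer above: a missing API key usually surfaces only when the first request fails, so it can help to check for it up front. The sketch below is one possible way to do that. `require_api_key` is a hypothetical helper, not part of the OpenAI SDK or of this notebook's module; the only assumption it makes is that the key lives in the `OPENAI_API_KEY` environment variable, which is where the SDK looks by default.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from env, or raise a clear error if it is missing."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; add it to your environment or .env file."
        )
    return key


# Demonstration with explicit dictionaries, so no real key is needed.
print(require_api_key({"OPENAI_API_KEY": "sk-example"}))
try:
    require_api_key({})
except RuntimeError as err:
    print(err)
```

In a notebook that loads a `.env` file, a check like this would run after the environment has been loaded and before the client is created.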

In [26]:
# It should state that the history has been reset since I just asked it a question
reset_memory()

Conversation history has been reset.


In [27]:
# Since I just reset the memory, it should state that there is no conversation history
reset_memory()

No conversation history exists yet.
