# Exercise - Add Memory and Self-reflection - STARTER

In this exercise, you’ll enhance your AI agent by adding self-reflection and memory. These features allow the agent to iteratively critique its responses and improve over time while maintaining a log of all interactions. 

This mimics how human learning and feedback loops work, pushing your agent towards more refined and accurate outputs.

**Challenge**

You are tasked with upgrading the existing agent so that it can learn from its previous answers, identify mistakes, and refine its responses automatically.

## 0. Import the necessary libs

In [1]:
import json
from typing import List, Dict, Literal
from openai import OpenAI
from openai.types.chat.chat_completion_message import ChatCompletionMessage

## 1. Recap: how to use OpenAI client with your API Key

To connect to OpenAI, instantiate an OpenAI client with your API key.

You can pass the `api_key` argument directly.
```python
client = OpenAI(api_key="voc-")
```

In [3]:
# TODO - Instantiate your client
from dotenv import load_dotenv
load_dotenv()
client = OpenAI(
    # api_key = "YOUR_API_KEY_HERE"
)
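
Since the cell above relies on `load_dotenv()` rather than a hard-coded key, it can help to fail early when the key is missing. The following is a minimal sketch, assuming the key is stored in an environment variable named `OPENAI_API_KEY`; the helper `resolve_api_key` is a hypothetical name introduced here for illustration and is not part of the OpenAI library.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    env: Optional[Mapping[str, str]] = None,
    var_name: str = "OPENAI_API_KEY",  # assumed variable name
) -> str:
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Add it to your .env file or export it."
        )
    return key


# Possible usage, after load_dotenv() has run:
# client = OpenAI(api_key=resolve_api_key())
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.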

In [4]:
response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer all user questions"},
            {"role": "user", "content":"What have I asked?"},
        ],
        temperature=0.0,
    )
response.choices[0].message.content

"I'm unable to access previous interactions or remember past conversations. However, I'm here to help with any questions or topics you'd like to discuss now! What would you like to know?"

## 2. Recap: Adding Memory

To add reflection, the agent must keep track of every interaction. Let's quickly recap how to do that with a simple list.

In [5]:
memory = [
    {"role": "system", "content": "Answer all user questions"},
    {"role": "user", "content": "What's an API"},
]

In [6]:
new_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=memory,
    temperature=0.0,
)

memory.append(
    {"role": "assistant", "content": new_response.choices[0].message.content}
)

memory

[{'role': 'system', 'content': 'Answer all user questions'},
 {'role': 'user', 'content': "What's an API"},
 {'role': 'assistant',
  'content': 'An API, or Application Programming Interface, is a set of rules and protocols that allows different software applications to communicate with each other. It defines the methods and data formats that applications can use to request and exchange information. APIs enable developers to access the functionality of other software services, libraries, or platforms without needing to understand their internal workings.\n\nFor example, a weather application might use an API to retrieve weather data from a remote server. The API specifies how the application can request the data (e.g., through specific URLs and parameters) and what format the data will be returned in (e.g., JSON or XML). This allows developers to build applications that can leverage existing services and data, facilitating integration and enhancing functionality.'}]

In [7]:
memory.append(
    {"role": "user", "content": "What have I asked?"}
)

memory

[{'role': 'system', 'content': 'Answer all user questions'},
 {'role': 'user', 'content': "What's an API"},
 {'role': 'assistant',
  'content': 'An API, or Application Programming Interface, is a set of rules and protocols that allows different software applications to communicate with each other. It defines the methods and data formats that applications can use to request and exchange information. APIs enable developers to access the functionality of other software services, libraries, or platforms without needing to understand their internal workings.\n\nFor example, a weather application might use an API to retrieve weather data from a remote server. The API specifies how the application can request the data (e.g., through specific URLs and parameters) and what format the data will be returned in (e.g., JSON or XML). This allows developers to build applications that can leverage existing services and data, facilitating integration and enhancing functionality.'},
 {'role': 'user', 'c

In [8]:
new_response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=memory,
    temperature=0.0,
)

memory.append(
    {"role": "assistant", "content": new_response.choices[0].message.content}
)

memory

[{'role': 'system', 'content': 'Answer all user questions'},
 {'role': 'user', 'content': "What's an API"},
 {'role': 'assistant',
  'content': 'An API, or Application Programming Interface, is a set of rules and protocols that allows different software applications to communicate with each other. It defines the methods and data formats that applications can use to request and exchange information. APIs enable developers to access the functionality of other software services, libraries, or platforms without needing to understand their internal workings.\n\nFor example, a weather application might use an API to retrieve weather data from a remote server. The API specifies how the application can request the data (e.g., through specific URLs and parameters) and what format the data will be returned in (e.g., JSON or XML). This allows developers to build applications that can leverage existing services and data, facilitating integration and enhancing functionality.'},
 {'role': 'user', 'c

## 3. Create a memory layer

Now that you remember how to use a list of messages, it's recommended to have a proper class to deal with more complicated cases.

Create the Memory class and add the following methods to it:
- add_message
- get_messages
- last_message

In [10]:
class Memory:
    def __init__(self):
        self.messages: List[Dict[str, str]] = []

    def add_message(self, role: Literal['user', 'system', 'assistant'], content: str):
        self.messages.append({
            "role": role,
            "content": content
        })

    def get_messages(self) -> List[Dict[str, str]]:
        return self.messages

    def last_message(self) -> Dict[str, str]:
        return self.messages[-1]
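
A short usage sketch of the memory layer may help before wiring it into the agent. The class is repeated in compact form only so the block runs on its own, and the assistant reply is a made-up stand-in rather than real model output.

```python
from typing import Dict, List, Literal

Role = Literal["user", "system", "assistant"]


class Memory:
    """Same interface as the Memory class above, repeated so this runs alone."""

    def __init__(self) -> None:
        self.messages: List[Dict[str, str]] = []

    def add_message(self, role: Role, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def get_messages(self) -> List[Dict[str, str]]:
        return self.messages

    def last_message(self) -> Dict[str, str]:
        return self.messages[-1]


memory = Memory()
memory.add_message("system", "Answer all user questions")
memory.add_message("user", "What's an API")
# Illustrative reply; in the notebook this would come from the model.
memory.add_message("assistant", "An API lets programs talk to each other.")
memory.add_message("user", "What have I asked?")

# The full history is what would be passed as `messages=` to the chat call.
for message in memory.get_messages():
    print(f"{message['role']}: {message['content']}")
```

Because every turn is appended in order, the final question is sent together with the earlier exchange, which is what lets the model answer it.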

## 4. Update the Agent class

In this exercise, you will enhance the AI Agent with self-reflection capabilities, allowing it to critique its own responses and refine them iteratively. This feature enables the agent to evaluate its output and improve the response quality before delivering a final answer.

**Objective**

Your task is to modify the agent so that it can:

- Store conversation history – Implement a memory mechanism to track interactions.
- Generate an initial response – Process user input and return a response using the language model.
- Critique its own response when enabled – If self-reflection is activated, the agent should generate feedback on its own answer.
- Refine its response iteratively – Based on the self-critique, the agent should adjust its reply, improving clarity, accuracy, and relevance.

**Steps**

- Implement a memory layer to retain conversation history.
- Introduce a self-reflection mechanism that allows the agent to analyze its response and refine it.
- Limit the number of self-reflection iterations to prevent excessive loops (minimum 1, maximum 3).
- Ensure flexibility by allowing users to toggle self-reflection on or off.

**Considerations**

- The agent should always generate at least one response before self-reflection.
- If self-reflection is enabled, it should run at least once more to critique and improve its output.
- The number of iterations should be controlled and not exceed three refinements.
- Implement logging functionality (verbose mode) to track the refinement process.

**Invoke**

Refactor the `invoke()` method. It should now include:
- a `self_reflection` parameter (default: False);
- a `max_iter` parameter (default: 1).

If `self_reflection` is set to True, the method should generate an initial response and then critique and refine it in a loop, for up to the number of iterations defined in `max_iter`.

Use `self.memory` to store each step.

Rules for self-reflection:

- Don't allow `max_iter` values less than 1.
- Don't allow `max_iter` values greater than 3.
- `max_iter` only takes effect when the `self_reflection` flag is set.
- If `self_reflection` is set to True, the agent needs to call the LLM at least once more for the criticism.

Your self critique prompt should start with something like: `Reflect on your previous response`.
Extend it to make sure it identifies errors and provides a revised version.
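
The rules above can be sketched without calling the API by swapping the model for a stub. This is a minimal sketch, not the reference solution: `run_with_reflection` and `fake_generate` are hypothetical names, and it assumes `max_iter` counts critique rounds that run after the initial response.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def run_with_reflection(
    generate: Callable[[List[Message]], str],
    user_message: str,
    critique_prompt: str = "Reflect on your previous response.",
    self_reflection: bool = False,
    max_iter: int = 1,
) -> List[Message]:
    """Return the message history after an optional self-reflection loop."""
    if not 1 <= max_iter <= 3:
        raise ValueError("max_iter must be between 1 and 3")

    # Always produce an initial response first.
    messages: List[Message] = [{"role": "user", "content": user_message}]
    messages.append({"role": "assistant", "content": generate(messages)})

    # Critique rounds only run when self-reflection is enabled.
    rounds = max_iter if self_reflection else 0
    for _ in range(rounds):
        messages.append({"role": "user", "content": critique_prompt})
        messages.append({"role": "assistant", "content": generate(messages)})
    return messages


def fake_generate(messages: List[Message]) -> str:
    """Stub model: numbers each draft by how many replies already exist."""
    replies = sum(1 for m in messages if m["role"] == "assistant")
    return f"draft {replies + 1}"


history = run_with_reflection(
    fake_generate, "What's an API?", self_reflection=True, max_iter=2
)
print(history[-1]["content"])  # draft 3
```

With `max_iter=2` the stub is called three times: once for the initial answer and once per critique round.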

In [13]:
# TODO - Create your critique prompt
SELF_CRITIQUE_PROMPT = """Instructions: Use this framework to analyze your previous work version ([INSERT_VERSION_IDENTIFIER HERE]). Answer the following questions with rigorous honesty. The goal is not to justify past decisions but to understand their outcomes and chart a more effective path forward toward the ultimate objective: [STATE_YOUR_PRIMARY_END_GOAL_HERE].

Part 1: Context and Objective Alignment
Re-state the Primary Objective: What was the specific, measurable goal of the previous version? (e.g., "Increase user engagement by 15%," "Finalize the literature review," "Build a working prototype of feature X").

Strategic Intent: Beyond the immediate goal, what was the broader strategic intent? How was this version supposed to move us closer to the final end goal?

Part 2: Critical Analysis of Outcomes vs. Expectations
Success Audit:

What specific aspects of the previous version succeeded or exceeded expectations? Provide evidence or examples.

Why did these elements work well? (e.g., Was it due to a specific method, a resource, an insight?)

Gap Analysis:

Where did the previous version fall short or fail to meet its specific goal? Be precise.

What was the measurable delta between the expected result and the actual result?

What were the immediate, observable consequences of this shortfall?

Unintended Consequences: Did the previous version produce any significant unexpected outcomes (positive or negative)? Did it create new problems or opportunities you hadn't foreseen?

Part 3: Root Cause Investigation
Assumptions Check: What key assumptions did you make during the creation of the previous version that turned out to be incorrect or incomplete?

Constraint Analysis: What were the most impactful constraints (time, resources, knowledge, technology) that limited the outcome? Were any of these constraints self-imposed or misjudged?

Decision Reconciliation: Review 2-3 critical decisions you made. With the benefit of hindsight, were these the right decisions? If not, what would have been a better choice and why?

Part 4: Progress Assessment Towards the End Goal
Vector Check: Are we closer to the final end goal now than before we started the previous version?

If yes: What tangible evidence do we have? What new ground have we gained? (e.g., "We now have validated data on X," "We've eliminated a non-viable path," "We've built a foundational component.")

If no or neutral: Why not? Did we waste effort, go down a dead end, or simply maintain the status quo?

Learning Value: Regardless of success or failure, what is the single most important lesson learned from this cycle? What do you now know that you didn't know before?

Part 5: Synthesis and Forward Strategy
Path Correction: Based on this analysis, what is the most critical adjustment that needs to be made to the plan, strategy, or approach for the next version?

Actionable Next Steps: List the top 3 concrete, actionable steps for the next iteration. These should directly address the gaps and root causes identified above.

Step 1:

Step 2:

Step 3:

Final Confidence Assessment: On a scale of 1-10, how confident are you that the revised path will more effectively advance us toward the end goal? Justify your rating."""

In [22]:
class Agent:
    """A self-reflection AI Agent"""

    def __init__(
        self,
        name:str = "Agent", 
        role:str = "Personal Assistant",
        instructions:str = "Help users with any question",
        model:str = "gpt-4o-mini",
        temperature:float = 0.0,
    ):
        self.name = name
        self.role = role
        self.instructions = instructions
        self.model = model
        self.temperature = temperature

        # TODO - Instantiate your client properly
        # With no arguments, the client reads OPENAI_API_KEY from the environment
        # (populated by load_dotenv() in the earlier cell).
        self.client = OpenAI()

        # TODO - Create your memory layer
        self.memory = Memory()
        self.memory.add_message(
            role="system",
            content=f"You are an AI agent. Your role is: {self.role}. Instructions: {self.instructions}",
        )

        # TODO - Create your critique prompt
        self.critique_prompt = SELF_CRITIQUE_PROMPT


    def chat(self, role: str, user_message: str) -> str:
        """Add a message to memory, call the model, and store its reply."""
        self.memory.add_message(role=role, content=user_message)
        # Use the agent's own client and settings, not the notebook globals.
        response = self.client.chat.completions.create(
            model=self.model,
            temperature=self.temperature,
            messages=self.memory.get_messages(),
        )
        ai_message = response.choices[0].message.content
        self.memory.add_message(role="assistant", content=ai_message)

        return ai_message

    def invoke(self, 
               user_message: str, 
               self_reflection: bool = False, 
               max_iter: int = 1, 
               verbose: bool = False) -> str:
        # TODO - refactor the invoke method to add self-reflection
        # Rules
        # - Don't allow max_iter values less than 1
        # - Don't allow max_iter values greater than 3
        # - max_iter only takes effect when self_reflection is True
        # - If set to True, it needs to call the LLM at least once more for the criticism
        if not 1 <= max_iter <= 3:
            raise ValueError("max_iter must be between 1 and 3")

        # Always generate an initial response first.
        ai_message = self.chat('user', user_message)
        if verbose:
            print(f"[initial] {ai_message}")

        if not self_reflection:
            return ai_message

        # Run max_iter critique rounds, so at least one extra LLM call is made.
        for i in range(max_iter):
            ai_message = self.chat('user', self.critique_prompt)
            if verbose:
                print(f"[reflection {i + 1}] {ai_message}")
        return ai_message


## 5. Build some agents and have fun

Create some specific agents and invoke them with self_reflection = True

In [23]:
# TODO - create a default agent with role and instructions
# Then ask it a subjective question like:  
# "Pick only one. Who is the best character in Game of Thrones?"
agent = Agent(
    name="Agent",
    role="Personal Assistant",
    instructions="You are a Game of Thrones expert",
    model="gpt-4o-mini",
    temperature=0.5,
)

In [24]:
agent.invoke(
    "Pick only one. Who is the best character in Game of Thrones?",
    self_reflection=True,
    max_iter=2,
    verbose=True,
)

KeyboardInterrupt: 

In [None]:
agent.memory.get_messages()

In [None]:
json.loads(agent.memory.last_message()["content"])["updated_response"]
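
The cell above only works if the last message is a JSON object with an `updated_response` key, which in turn depends on the critique prompt asking for that format. The following is a minimal sketch under that assumption: `JSON_CRITIQUE_PROMPT` and `extract_updated_response` are hypothetical names, and the prompt wording is only one possible phrasing.

```python
import json

# Hypothetical critique prompt that asks for a JSON reply.
JSON_CRITIQUE_PROMPT = (
    "Reflect on your previous response. Identify any errors or weaknesses, "
    "then reply with a JSON object containing two keys: "
    '"critique" (what was wrong) and "updated_response" (the revised answer).'
)


def extract_updated_response(content: str, key: str = "updated_response") -> str:
    """Return the revised answer if content is JSON with `key`, else the raw text."""
    try:
        parsed = json.loads(content)
    except json.JSONDecodeError:
        # The model answered in plain text, so fall back to it unchanged.
        return content
    if isinstance(parsed, dict) and key in parsed:
        return str(parsed[key])
    return content


print(extract_updated_response('{"critique": "Too vague", "updated_response": "Tyrion"}'))
print(extract_updated_response("Tyrion Lannister, because of his wit."))
```

Falling back to the raw text avoids a `JSONDecodeError` when the model ignores the format request. It could be used as `extract_updated_response(agent.memory.last_message()["content"])`.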

## 6. Experiment

Now that you understand how it works, experiment with new things.

- Experiment with new critique prompts
- What happens when you increase the number of iterations?
- Try adding an argument to the invoke() method to inspect it (verbose=True)
- What else can you try?