# Reflexion

[Reflexion](https://arxiv.org/abs/2303.11366) by Shinn et al. is an architecture designed to learn through verbal feedback and self-reflection. The agent explicitly critiques its responses to a task in order to generate a higher-quality final response, at the expense of longer execution time.

![reflexion diagram](./imgs/reflexion.png)

The paper outlines 3 main components:

1. Actor (agent) with self-reflection
2. External evaluator (task-specific, e.g. code compilation steps)
3. Episodic memory that stores the reflections from (1).

In their code, the last two components are very task-specific, so in this notebook, you will build the _actor_ in LangGraph.
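
To make the interaction between the three components concrete before building the LangGraph version, here is a minimal, dependency-free sketch of the loop. It is not the paper's implementation: `reflexion_loop`, `toy_actor`, and `toy_evaluator` are hypothetical names, and the evaluator's feedback is stored directly as the "reflection" rather than being generated by a model.

```python
from typing import Callable, List, Tuple


def reflexion_loop(
    actor: Callable[[str, List[str]], str],
    evaluator: Callable[[str], Tuple[bool, str]],
    task: str,
    max_trials: int = 3,
) -> Tuple[str, List[str]]:
    """Run the actor until the evaluator passes or trials run out."""
    memory: List[str] = []  # episodic memory of verbal reflections
    attempt = ""
    for _ in range(max_trials):
        attempt = actor(task, memory)
        passed, feedback = evaluator(attempt)
        if passed:
            break
        # Store the lesson from the failed trial for the next attempt.
        memory.append(f"Attempt {attempt!r} failed: {feedback}")
    return attempt, memory


# Toy stand-ins (hypothetical): the "actor" guesses a number and
# adjusts based on how many reflections it has accumulated.
def toy_actor(task: str, memory: List[str]) -> str:
    return str(1 + len(memory))


def toy_evaluator(attempt: str) -> Tuple[bool, str]:
    return (attempt == "3", "the expected answer is larger")


answer, reflections = reflexion_loop(toy_actor, toy_evaluator, "guess the number")
print(answer, len(reflections))
```

In the rest of this notebook, the actor is an LLM chain and the reflections travel in the message history instead of a separate list.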

To skip to the graph definition, see the [Construct Graph section](#Construct-Graph) below.

## 0. Prerequisites

Install `langgraph` (for the framework), `langchain_openai` (for the LLM), and `langchain` + `tavily-python` (for the search engine).

We will use Tavily search as a tool. You can get an API key [here](https://app.tavily.com/sign-in) or replace it with a different tool of your choosing.

In [7]:
%pip install -U --quiet  langchain langgraph langchain_openai
%pip install -U --quiet tavily-python


In [2]:
import getpass
import os


def _set_if_undefined(var: str) -> None:
    if os.environ.get(var):
        return
    os.environ[var] = getpass.getpass(var)


# Optional: Configure tracing to visualize and debug the agent
_set_if_undefined("LANGCHAIN_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "Reflexion"

_set_if_undefined("OPENAI_API_KEY")
_set_if_undefined("TAVILY_API_KEY")

## 1. Actor (with reflection)

The main component of Reflexion is the "`actor`", an agent that reflects on its response and re-executes to improve based on self-critique. Its main sub-components include:
1. Tools/tool execution
2. Initial responder: generate an initial response (and self-reflection)
3. Revisor: re-respond (and reflect) based on previous reflections

We'll first define the tool execution context.

#### Construct tools

In [3]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.utilities.tavily_search import TavilySearchAPIWrapper

search = TavilySearchAPIWrapper()
tavily_tool = TavilySearchResults(api_wrapper=search, max_results=5)

The tools are invoked _in context_. Create a function that invokes all the requested tools.

In [4]:
import json
from collections import defaultdict
from typing import List

from langchain.output_parsers.openai_tools import (
    JsonOutputToolsParser,
    PydanticToolsParser,
)
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage
from langgraph.prebuilt.tool_executor import ToolExecutor, ToolInvocation

# This is a helper class that is useful for running tools.
# It takes in an agent action, calls that tool, and returns the result.
tool_executor = ToolExecutor([tavily_tool])
# Parse the tool messages for the execution / invocation
parser = JsonOutputToolsParser(return_id=True)


def execute_tools(state: List[BaseMessage]) -> List[BaseMessage]:
    tool_invocation: AIMessage = state[-1]
    parsed_tool_calls = parser.invoke(tool_invocation)
    ids = []
    tool_invocations = []
    for parsed_call in parsed_tool_calls:
        for query in parsed_call["args"]["search_queries"]:
            tool_invocations.append(
                ToolInvocation(
                    # Only one tool is registered for now. Map tool names
                    # here if more are added.
                    tool="tavily_search_results_json",
                    tool_input=query,
                )
            )
            ids.append(parsed_call["id"])

    outputs = tool_executor.batch(tool_invocations)
    outputs_map = defaultdict(dict)
    for id_, output, invocation in zip(ids, outputs, tool_invocations):
        outputs_map[id_][invocation.tool_input] = output

    return [
        ToolMessage(content=json.dumps(query_outputs), tool_call_id=id_)
        for id_, query_outputs in outputs_map.items()
    ]
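
The regrouping step in `execute_tools` is easy to miss: a single tool call can carry several search queries, but each tool call ID gets exactly one reply message. The sketch below reproduces only that bookkeeping with plain dictionaries. `fake_search` and the sample `parsed_tool_calls` are hypothetical stand-ins, and the input shape is an assumption based on the keys that `execute_tools` reads.

```python
import json
from collections import defaultdict

# Hypothetical parsed tool calls with the keys that execute_tools reads.
parsed_tool_calls = [
    {"id": "call_a", "args": {"search_queries": ["q1", "q2"]}},
    {"id": "call_b", "args": {"search_queries": ["q3"]}},
]


def fake_search(query: str) -> list:
    """Stand-in for the search tool; returns a canned result."""
    return [{"url": f"https://example.com/{query}"}]


# Flatten to one (call id, query) pair per query.
pairs = [
    (call["id"], query)
    for call in parsed_tool_calls
    for query in call["args"]["search_queries"]
]
outputs = [fake_search(query) for _, query in pairs]

# Regroup so each tool call ID maps to a single {query: results} payload.
outputs_map = defaultdict(dict)
for (call_id, query), output in zip(pairs, outputs):
    outputs_map[call_id][query] = output

payloads = {call_id: json.dumps(results) for call_id, results in outputs_map.items()}
print(sorted(payloads))
```

Each value in `payloads` corresponds to the `content` of one `ToolMessage` in the real function.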

#### Initial responder

In [5]:
import datetime

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.pydantic_v1 import BaseModel, Field, ValidationError
from langchain_openai import ChatOpenAI
from langsmith import traceable

actor_prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are an expert researcher.
Current time: {time}

1. {first_instruction}
2. Reflect and critique your answer. Be severe to maximize improvement.
3. Recommend search queries to research information and improve your answer.""",
        ),
        MessagesPlaceholder(variable_name="messages"),
        ("system", "Answer the user's question above using the required format."),
    ]
).partial(
    time=lambda: datetime.datetime.now().isoformat(),
)


class Reflection(BaseModel):
    missing: str = Field(description="Critique of what is missing.")
    superfluous: str = Field(description="Critique of what is superfluous")


class AnswerQuestion(BaseModel):
    """Answer the question."""

    answer: str = Field(description="~250 word detailed answer to the question.")
    reflection: Reflection = Field(description="Your reflection on the initial answer.")
    search_queries: List[str] = Field(
        description="1-3 search queries for researching improvements to address the critique of your current answer."
    )


llm = ChatOpenAI(model="gpt-4-turbo-preview")
initial_answer_chain = actor_prompt_template.partial(
    first_instruction="Provide a detailed ~250 word answer."
) | llm.bind_tools(tools=[AnswerQuestion], tool_choice="AnswerQuestion")
validator = PydanticToolsParser(tools=[AnswerQuestion])


class ResponderWithRetries:
    def __init__(self, runnable, validator):
        self.runnable = runnable
        self.validator = validator

    @traceable
    def respond(self, state: List[BaseMessage]):
        response = []
        for attempt in range(3):
            try:
                response = self.runnable.invoke({"messages": state})
                self.validator.invoke(response)
                return response
            except ValidationError as e:
                state = state + [HumanMessage(content=repr(e))]
        return response
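
The retry logic in `ResponderWithRetries` can be studied in isolation. The following is a minimal sketch of the same pattern without any LLM: call a generator, validate the result, and on failure append the error to the input so the next attempt can correct itself. All names here (`respond_with_retries`, `toy_generate`, `toy_validate`) are hypothetical, and `ValueError` stands in for the Pydantic `ValidationError`.

```python
from typing import Callable, List


def respond_with_retries(
    generate: Callable[[List[str]], str],
    validate: Callable[[str], None],
    messages: List[str],
    max_attempts: int = 3,
) -> str:
    """Call generate until validate accepts the result or attempts run out."""
    response = ""
    for _ in range(max_attempts):
        response = generate(messages)
        try:
            validate(response)
            return response
        except ValueError as exc:
            # Feed the error back so the next attempt can correct it.
            messages = messages + [f"Validation error: {exc}"]
    return response  # last attempt, possibly still invalid


# Hypothetical stand-ins: the generator only produces valid output
# once it has seen an error message in its input.
def toy_generate(messages: List[str]) -> str:
    return "valid" if any("error" in m for m in messages) else "broken"


def toy_validate(response: str) -> None:
    if response != "valid":
        raise ValueError(f"unexpected response {response!r}")


result = respond_with_retries(toy_generate, toy_validate, ["question"])
print(result)
```

Note that, as in the class above, the final attempt is returned even if it never validated, so callers that need a hard guarantee should validate once more.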

In [6]:
first_responder = ResponderWithRetries(
    runnable=initial_answer_chain, validator=validator
)

In [7]:
example_question = "Why is reflection useful in AI?"
initial = first_responder.respond([HumanMessage(content=example_question)])

In [8]:
initial

AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_9Tc3bt8wjMz1sCaxh9I10Sk3', 'function': {'arguments': '{"answer":"Reflection in AI, often referred to as self-awareness or meta-reasoning, is crucial for several reasons. Firstly, it allows AI systems to understand and improve their own decision-making processes. By reflecting on their actions and outcomes, AI can identify patterns of success or failure, leading to self-improvement over time. This capability is essential for developing AI that can adapt to new challenges and environments without human intervention.\\n\\nSecondly, reflection enables AI to explain its decisions and actions to humans. This is particularly important in areas where transparency and trust are critical, such as healthcare, finance, and autonomous vehicles. An AI that can introspect and communicate the reasoning behind its decisions is more likely to be trusted by users and can help in demystifying AI operations.\\n\\nLastly, reflection is key

In [9]:
parsed = parser.invoke(initial)
parsed[0].keys()

dict_keys(['args', 'id', 'type'])

In [10]:
parsed[0]['args'].keys()

dict_keys(['answer', 'reflection', 'search_queries'])

In [11]:
parsed

[{'args': {'answer': "Reflection in AI refers to the ability of an AI system to analyze and improve its own processes, decisions, and outcomes. This metacognitive capability enhances the AI's adaptability, efficiency, and reliability, making it crucial for creating more sophisticated and autonomous systems. Reflective AI models can identify weaknesses or biases in their decision-making processes, learn from past actions, and adjust their strategies accordingly. This self-improvement loop enables AI to adapt to new or changing environments without requiring constant human intervention, making it particularly valuable in dynamic or complex domains such as autonomous vehicles, healthcare diagnosis, and financial market analysis.\n\nMoreover, reflective AI plays a significant role in explicability and trustworthiness. By understanding and explaining its decision-making process, AI can provide insights into its behavior, fostering trust and acceptance among users. This is particularly impor

#### Revision

The second part of the actor is a revision step.

In [None]:
revise_instructions = """Revise your previous answer using the new information.
    - You should use the previous critique to add important information to your answer.
        - You MUST include numerical citations in your revised answer to ensure it can be verified.
        - Add a "References" section to the bottom of your answer (which does not count towards the word limit). In form of:
            - [1] https://example.com
            - [2] https://example.com
    - You should use the previous critique to remove superfluous information from your answer and make SURE it is not more than 250 words.
"""


# Extend the initial answer schema to include references.
# Forcing citation in the model encourages grounded responses
class ReviseAnswer(AnswerQuestion):
    """Revise your original answer to your question."""

    references: List[str] = Field(
        description="Citations motivating your updated answer."
    )


revision_chain = actor_prompt_template.partial(
    first_instruction=revise_instructions
) | llm.bind_tools(tools=[ReviseAnswer], tool_choice="ReviseAnswer")
revision_validator = PydanticToolsParser(tools=[ReviseAnswer])

revisor = ResponderWithRetries(runnable=revision_chain, validator=revision_validator)

In [None]:
import json

revised = revisor.respond(
    [
        HumanMessage(content=example_question),
        initial,
        ToolMessage(
            tool_call_id=initial.additional_kwargs["tool_calls"][0]["id"],
            content=json.dumps(
                tavily_tool.invoke(str(parsed[0]["args"]["search_queries"]))
            ),
        ),
    ]
)

In [None]:
parsed = parser.invoke(revised)
parsed

[{'args': {'answer': 'Reflection in AI involves systems analyzing, understanding, and learning from their actions to improve continuously. This capability is essential for intelligent systems operating in dynamic environments, allowing them to evaluate performance, identify inefficiencies, and adjust strategies autonomously, thereby enhancing autonomy and adaptability [1]. Reflective AI plays a significant role in developing explainable AI (XAI), facilitating systems to provide insights into their decision-making processes, thus improving transparency and trust [2]. Moreover, it aids in identifying and rectifying ethical and bias-related issues, ensuring fairness and accountability in AI operations [3]. However, implementing reflective AI presents challenges, including the complexity of AI systems, the interdisciplinary divide between ethics and engineering, and the lack of established methods for ethical AI engineering [4]. These challenges highlight the necessity for continuous ethic

## Construct Graph


Now we can wire all our components together.

In [None]:
from langgraph.graph import END, MessageGraph

MAX_ITERATIONS = 5
builder = MessageGraph()
builder.add_node("draft", first_responder.respond)
builder.add_node("execute_tools", execute_tools)
builder.add_node("revise", revisor.respond)
# draft -> execute_tools
builder.add_edge("draft", "execute_tools")
# execute_tools -> revise
builder.add_edge("execute_tools", "revise")

# Define looping logic:


def _get_num_iterations(state: List[BaseMessage]):
    i = 0
    for m in state[::-1]:
        if not isinstance(m, (ToolMessage, AIMessage)):
            break
        i += 1
    return i


def event_loop(state: List[BaseMessage]) -> str:
    # Stop once more than MAX_ITERATIONS AI/tool messages have accumulated
    # since the last human message.
    num_iterations = _get_num_iterations(state)
    if num_iterations > MAX_ITERATIONS:
        return END
    return "execute_tools"


# revise -> execute_tools OR end
builder.add_conditional_edges("revise", event_loop)
builder.set_entry_point("draft")
graph = builder.compile()
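
Because `_get_num_iterations` counts messages rather than revise rounds, it is worth checking how many revisions `MAX_ITERATIONS = 5` actually allows. The sketch below simulates the loop with empty stand-in classes (`Human`, `AI`, and `Tool` are hypothetical placeholders for the LangChain message types) and applies the same counting rule.

```python
class Human:
    """Stand-in for HumanMessage."""


class AI:
    """Stand-in for AIMessage."""


class Tool:
    """Stand-in for ToolMessage."""


MAX_ITERATIONS = 5


def count_trailing(state: list) -> int:
    """Count AI/tool messages after the most recent human message."""
    count = 0
    for message in reversed(state):
        if not isinstance(message, (Tool, AI)):
            break
        count += 1
    return count


state = [Human(), AI(), Tool()]  # after draft -> execute_tools
revisions = 0
while True:
    state.append(AI())  # revise
    revisions += 1
    if count_trailing(state) > MAX_ITERATIONS:
        break  # event_loop would return END here
    state.append(Tool())  # execute_tools

print(revisions, count_trailing(state))
```

Under these assumptions the loop ends after three revisions with seven trailing messages, which agrees with the three `revise` steps in the streamed output of this notebook.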

In [None]:
events = graph.stream(
    [HumanMessage(content="How should we handle the climate crisis?")]
)
for i, step in enumerate(events):
    node, output = next(iter(step.items()))
    print(f"## {i+1}. {node}")
    print(str(output)[:100] + " ...")
    print("---")

## 1. draft
content='' additional_kwargs={'tool_calls': [{'id': 'call_vXIkxGNqzzqsmkmVUvpY81Wx', 'function': {'a ...
---
## 2. execute_tools
[ToolMessage(content='{"examples of successful climate policies": [{"url": "https://www.washingtonpo ...
---
## 3. revise
content='' additional_kwargs={'tool_calls': [{'id': 'call_PQqwbqcN9a2pVpf11sTTMAB4', 'function': {'a ...
---
## 4. execute_tools
[ToolMessage(content='{"successful climate change mitigation projects": [{"url": "https://unfccc.int ...
---
## 5. revise
content='' additional_kwargs={'tool_calls': [{'id': 'call_v0NA4y1gJgg4fRiwU0cgGMaS', 'function': {'a ...
---
## 6. execute_tools
[ToolMessage(content='{"successful climate change mitigation projects": [{"url": "https://unfccc.int ...
---
## 7. revise
content='' additional_kwargs={'tool_calls': [{'id': 'call_8o5fOQHDTArbp5yPk3ffsLTZ', 'function': {'a ...
---


In [79]:
revised_args = parser.invoke(step["revise"])[0]["args"]
# Print each field of the final revision, label first and then its value.
for args_label, value in revised_args.items():
    print(args_label)
    print(value)
    print("----")

answer
To effectively address the climate crisis, a multifaceted strategy involving international cooperation, strong national policies, technological innovations, education, and ecosystem protection is crucial. International cooperation, exemplified by the Paris Agreement, is essential for setting and achieving global emissions reduction targets. Sharing technologies and strategies across nations can significantly mitigate climate change impacts [1]. On a national level, innovative policies like Finland's pioneering carbon tax demonstrate the potential of national initiatives to combat climate change [2]. Technological advancements also play a critical role, with investments in renewable energy and energy efficiency initiatives leading to significant climate policy successes [3]. Education is a powerful tool that empowers individuals with the knowledge and skills necessary for climate action, highlighting its importance in the global fight against climate change [4]. Moreover, protect

In [None]:
# print(parser.invoke(step[END][-1])[0]["args"]["answer"])

## Conclusion

Congrats on building a Reflexion actor! I'll leave you with a few observations to save you some time when choosing which parts of this agent to adapt to your workflow:
1. This agent trades off execution time for quality. It explicitly forces the agent to critique and revise the output over several steps, which usually (not always) increases the response quality but takes much longer to return a final answer.
2. The 'reflections' can be paired with additional external feedback (such as validators) to further guide the actor.
3. In the paper, one environment (AlfWorld) uses external memory. It does this by storing summaries of the reflections in an external store and using them in subsequent trials/invocations.
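
If you want to experiment with the external memory idea from the last point, one possible starting point is a small bounded store of reflection summaries that is rendered into the prompt of the next trial. This is only a sketch: `ReflectionMemory`, its `max_items` bound, and the prompt wording are assumptions for illustration, not the paper's code.

```python
from collections import deque
from typing import Deque


class ReflectionMemory:
    """Keep the most recent reflection summaries for reuse across trials."""

    def __init__(self, max_items: int = 3) -> None:
        # A bounded deque silently drops the oldest summary when full.
        self._items: Deque[str] = deque(maxlen=max_items)

    def add(self, summary: str) -> None:
        self._items.append(summary)

    def as_prompt(self) -> str:
        """Render stored reflections as text to prepend to the next trial."""
        if not self._items:
            return ""
        lines = [f"- {item}" for item in self._items]
        return "Lessons from earlier trials:\n" + "\n".join(lines)


memory = ReflectionMemory(max_items=2)
for summary in ["cite sources", "shorten the intro", "check the dates"]:
    memory.add(summary)

print(memory.as_prompt())
```

Bounding the store keeps the prompt short; whether to keep the newest or the most useful reflections is a design choice left open here.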