In [3]:
from langchain_community.chat_models import ChatOpenAI
import dotenv
import os
from discussion_agents.cog.agent.reflexion import ReflexionReActAgent

dotenv.load_dotenv("../.env")
openai_api_key = os.getenv("OPENAI_API_KEY")
llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125", openai_api_key=openai_api_key)


Given we change the examples/prompts for agent/reflect, what changes must be made?
- the states of the agent

The Reflexion ReAct agent has two pieces of state: the reflector and the memory.
- memory stores scratchpad info only
- reflector stores reflections only

Given the above context, if we modify examples, prompt, reflect_examples, and reflect_prompt, what happens to the agent states?
- if we modify examples and/or prompt, the input to the LLM for prompting the agent differs, but the kind of output
does not; memory stores only the scratchpad (output), so nothing changes here; the reflector is unaffected in this case
- if we modify reflect_examples and/or reflect_prompt, the input to the LLM for reflection differs, but the kind of output
does not; the reflector stores only reflections (output), so nothing changes here; memory does not store reflections

So memory and the reflector don't bleed into each other. They are mutually exclusive? Yes.
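
The separation above can be sketched with two stand-in classes. `ScratchpadMemory` and `ReflectionStore` are hypothetical names
made up for illustration, not the actual discussion_agents classes; the only point is that each one holds its own output and
neither holds a prompt or examples.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScratchpadMemory:
    """Hypothetical stand-in for the agent's memory: scratchpad text only."""

    scratchpad: str = ""

    def add(self, step: str) -> None:
        self.scratchpad += step + "\n"


@dataclass
class ReflectionStore:
    """Hypothetical stand-in for the reflector: reflections only."""

    reflections: List[str] = field(default_factory=list)

    def add(self, reflection: str) -> None:
        self.reflections.append(reflection)


memory = ScratchpadMemory()
reflector = ReflectionStore()

# Agent output goes to memory; reflection output goes to the reflector.
memory.add("Thought 1: I need to search for the film.")
reflector.add("I should have verified the release year before answering.")

# Neither store keeps a prompt or few-shot examples, so swapping those
# changes what is sent to the LLM, not the shape of either state.
state_fields = (set(vars(memory)), set(vars(reflector)))
```

Under these assumptions, changing examples/prompt or reflect_examples/reflect_prompt never touches `state_fields`.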

If I make changes to the prompt for the agent (examples/prompt), should these changes be reflected in the prompt for reflection?
For example, if I include "insights" or few-shot examples in the prompt for the agent, shouldn't these also be present in the
prompt input during reflection?

Reflection takes in "question", "examples", "reflections". In this case, examples refers to the few-shot reflection examples, not the ones used
to prompt the agent. I notice that the original ReAct and Reflexion don't include the agent's few-shot examples in reflection, and it
wouldn't make much sense to anyway. They're not relevant context for the sake of reflection.
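
A minimal sketch of that separation, assuming a made-up reflect template: `REFLECT_TEMPLATE`, `build_reflect_prompt`, and the toy
strings are hypothetical and only illustrate that the reflect prompt is filled from the question, the reflection examples, and the
prior reflections, while the agent's few-shot examples are never passed in.

```python
# Hypothetical template for illustration; the library's reflect prompt may differ.
REFLECT_TEMPLATE = (
    "Here are some examples:\n"
    "{examples}\n"
    "(END OF EXAMPLES)\n\n"
    "Previous reflections:\n"
    "{reflections}\n\n"
    "Question: {question}\n"
    "Reflection:"
)

# Toy strings standing in for the two separate few-shot sets.
AGENT_FEWSHOT = "Question: Who wrote Hamlet?\nAction 1: Search[Hamlet]"
REFLECT_FEWSHOT = "Previous trial: wrong answer.\nReflection: I stopped searching too early."


def build_reflect_prompt(question: str, reflect_examples: str, reflections: list) -> str:
    """Fill the reflect template; agent few-shot examples are never passed in."""
    joined = "\n".join(f"- {r}" for r in reflections)
    return REFLECT_TEMPLATE.format(
        examples=reflect_examples, reflections=joined, question=question
    )


reflect_prompt = build_reflect_prompt(
    question="Who wrote Hamlet?",
    reflect_examples=REFLECT_FEWSHOT,
    reflections=["I should confirm the author on the play's page."],
)
```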

What about insights? They're slightly different from reflections; shouldn't they be included in the prompts for reflection (since they're
used for prompting the agent)? This makes some sense, except these insights are meant to be fixed and not updated by the LLM during
reflection (unless it's done by the ExpeL insight extraction stage 2 process). So if they were included during reflection, they would be static/fixed.

Ok, but then if it's static/fixed in the reflection prompt, don't you think it'll help out the reflection process? 

The ExpeL paper uses basic ReAct during stage 3 evaluation. I have no idea how they would go about it with Reflexion + ReAct. That being said,
generally, nothing in the input used to prompt the agent is used as input to the reflect component of Reflexion. This is true for CoT and ReAct.

Hmm, then what's stopping you from incorporating these insights into the prompt? Well, first off, we know that these insights shouldn't be
part of the reflection output/reflector class state. But that raises the question: should they be part of the input to the reflection component?

My answer is a bit mixed on this. It makes sense to include them, but generally you don't pass the agent's prompt input to the reflection process.

It kinda makes sense that the insights would aid in the reflection process, but then how would this even look? Well, the reflect prompt
would probably have either a new argument (unlikely) or the insights are appended to the examples (probably the case). The problem is
the reflector should be focused on the reflection process. These insights aid in inference not reflection. That's what they're geared for.

I'm leaning towards no. If it is yes, then the insights would have to be appended to the examples, or we could have kwargs (but this gets complicated
very quickly; let's stay away from this).

So if we do it, it must be by appending to the examples. Though I don't think this matters because we won't do this. It wouldn't make
sense. The insights are for inference, not for reflection, but then again, the insights would be partially responsible for the inference output.
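
If the option were offered, the "append to the examples" route could look roughly like this. `append_insights` and the section
header text are hypothetical, not part of discussion_agents; the insights are treated as read-only text, so nothing here ends up in
the reflector's state.

```python
from typing import List


def append_insights(reflect_examples: str, insights: List[str]) -> str:
    """Return reflect_examples with a static, numbered insights block appended.

    Hypothetical helper: the insights are only formatted into the prompt text,
    never modified, so they stay fixed across reflection calls.
    """
    if not insights:
        # No insights: leave the reflection examples exactly as they were.
        return reflect_examples
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(insights, start=1))
    return (
        f"{reflect_examples}\n\n"
        f"Insights (fixed, not updated during reflection):\n{numbered}"
    )


augmented = append_insights(
    "Reflection example 1",
    ["Verify entity names before answering.", "Prefer Lookup after a successful Search."],
)
```

The result would then be passed wherever the reflection examples normally go, which avoids adding a new argument or kwargs to the reflect prompt.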

OK, I think I won't do it. But, to be comprehensive, is there harm in providing the option to do that? No. Let's just implement something and see where this takes us.


In [12]:
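import joblib

# Sample of the HotpotQA distractor set (has question/answer columns used below).
hotpotqa = joblib.load('../../../tests/assets/hotpotqa/hotpot-qa-distractor-sample.joblib')
# Pre-generated fake ExpeL experiences from the test assets.
experiences = joblib.load('../../../tests/assets/expel/expel_experiences_10_fake.joblib')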


In [None]:
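# ExpeLAgent is not imported anywhere above; this module path is an assumption
# based on where ReflexionReActAgent lives.
from discussion_agents.cog.agent.expel import ExpeLAgent

agent = ExpeLAgent(
    llm=llm,
    self_reflect_llm=llm,
    action_llm=llm
)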

In [None]:
from discussion_agents.cog.prompts.react import REACT_WEBTHINK_SIMPLE6_FEWSHOT_EXAMPLES

In [None]:
# Relies on `a` from the template cell below; run that cell first.
a == REACT_WEBTHINK_SIMPLE6_FEWSHOT_EXAMPLES + "\n(END OF EXAMPLES)\n"

In [None]:
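# Template form: {examples} is filled in later via str.format.
a = """
Here are some examples:
{examples}
(END OF EXAMPLES)
"""

# Eager form: the examples are interpolated immediately by the f-string.
b = (
    f"""
Here are some examples:
{REACT_WEBTHINK_SIMPLE6_FEWSHOT_EXAMPLES}
"""
    + "(END OF EXAMPLES)\n"
)

# Both routes should produce the same prompt text.
a.format(examples=REACT_WEBTHINK_SIMPLE6_FEWSHOT_EXAMPLES) == b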

In [None]:
a

In [None]:
from discussion_agents.cog.agent.reflexion import ReflexionReActAgent

reflexion_agent = ReflexionReActAgent(
    self_reflect_llm=llm,
    action_llm=llm
)

In [None]:
reflexion_agent.generate(
    hotpotqa.iloc[0].question,
    hotpotqa.iloc[0].answer
)

In [None]:
agent.insight_memory.insights

In [None]:
agent.gather_experience(
    questions=hotpotqa.question.values[:1],
    keys=hotpotqa.answer.values[:1]
)

In [None]:
agent.insight_memory.insights

In [None]:
agent.experience_memory.experiences['trajectories'][0]

In [None]:
agent.experience_memory.success_traj_docs

In [None]:
agent.experience_memory.vectorstore

In [None]:
agent.insight_memory.insights