# Questioning Barbie and Oppenheimer Through the Use of Agents

In the following notebook we will build an application that queries both the Barbie and Oppenheimer movies' Wikipedia pages, as well as their reviews.

The main focus of this notebook is to showcase a brief introduction to Agents.

## Build 🏗️

There are 3 main tasks in this notebook:

1. Construct a Barbie retriever
2. Construct an Oppenheimer retriever
3. Combine the two and allow users to query both resources from a single input through the use of Agents

## Ship 🚢

Based on Tuesday's session - construct a Chainlit (or Gradio) application that allows users to interface with the application.

## Share 🚀

Make a social media post about your final application.

### Dependencies

As always, let's start with some dependencies!

In [31]:
!pip install -q -U langchain openai

In [32]:
import getpass
import os
openai_api_key = getpass.getpass("Enter your OpenAI API Key: ")
os.environ["OPENAI_API_KEY"] = openai_api_key

### LLM 

We will be leveraging OpenAI's `gpt-3.5-turbo` throughout the notebook, and we can keep it consistent throughout!

In [33]:
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature = 0)

### Data Collection and Transformation

We'll be leveraging the `WikipediaLoader` tool to collect information from Wikipedia. 

Be sure to set the `doc_content_chars_max` parameter so that you capture the *entire* Wikipedia article content.

In [34]:
!pip install -q -U wikipedia

In [35]:
from langchain.document_loaders import WikipediaLoader, CSVLoader

barbie_wikipedia_docs = WikipediaLoader(
    query="Barbie (film)",
    load_max_docs=2,
    # NOTE: this cuts each article off at 500 characters;
    # raise the value to capture the entire article content
    doc_content_chars_max=500
    ).load()

barbie_csv_docs = CSVLoader(
    file_path='barbie_data/barbie.csv',
    source_column='Review_Url'
    ).load()

Since our source documents share the same formats and are only separated by topic, we can save ourselves some extra effort and set up our splitters once.

We're going to leverage the `RecursiveCharacterTextSplitter` again, this time paying close attention to the format our Wikipedia articles and reviews are in so we can be sure to chunk them appropriately.

> HINT: You can pass a list of separators when you initialize your `RecursiveCharacterTextSplitter`! They are tried in order, from element 0 to element len(list) - 1.

RELEVANT DOCS:
- [`RecursiveCharacterTextSplitter`](https://api.python.langchain.com/en/latest/text_splitter/langchain.text_splitter.RecursiveCharacterTextSplitter.html#langchain.text_splitter.RecursiveCharacterTextSplitter)
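
To make the "separators are tried in order" idea concrete, here is a minimal pure-Python sketch. It is a simplified illustration, not LangChain's implementation: the `recursive_split` helper is hypothetical, it drops the separators it splits on, and it does not merge small pieces or apply `chunk_overlap`.

```python
def recursive_split(text, separators, chunk_size):
    """Split on the first separator; recurse with the remaining
    separators only on pieces that are still longer than chunk_size."""
    if len(text) <= chunk_size or not separators:
        return [text]
    first, rest = separators[0], separators[1:]
    chunks = []
    for piece in text.split(first):
        if not piece:
            continue  # skip empty pieces produced by leading/trailing separators
        if len(piece) > chunk_size:
            chunks.extend(recursive_split(piece, rest, chunk_size))
        else:
            chunks.append(piece)
    return chunks


# Toy text shaped like a Wikipedia article: "== Heading ==" lines mark sections.
sample = (
    "Intro line one.\nIntro line two."
    "\n== Plot ==\nBarbie lives in Barbieland."
    "\n== Cast ==\nMargot Robbie as Barbie."
)

chunks = recursive_split(sample, ["\n==", "\n"], chunk_size=35)
for chunk in chunks:
    print(repr(chunk))
```

Only the "Plot" section exceeds 35 characters here, so it is the only piece that falls through to the second separator (`"\n"`); the other sections stay whole.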

In [36]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

wikipedia_text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500,
    chunk_overlap = 40,
    length_function = len,
    is_separator_regex= False,
    separators = ['\n==', '\n']  # split on section headings first, then on line breaks
)

csv_text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 500, 
    chunk_overlap = 40,
    length_function = len,
    is_separator_regex= False,
    separators = ['\n']  # split on line breaks (one CSV field per line)
)
print(barbie_csv_docs)
chunked_barbie_wikipedia_docs = wikipedia_text_splitter.split_documents(barbie_wikipedia_docs)
chunked_barbie_csv_docs = csv_text_splitter.split_documents(barbie_csv_docs)

print("Chunked Barbie csv Docs: ", len(chunked_barbie_csv_docs), chunked_barbie_csv_docs)

[Document(page_content=': 0\nReview_Date: 21 July 2023\nAuthor: LoveofLegacy\nRating: 6\nReview_Title: Beautiful film, but so preachy\nReview: Margot does the best with what she\'s given, but this film was very disappointing to me. It was marketed as a fun, quirky satire with homages to other movies. It started that way, but ended with over-dramatized speeches and an ending that clearly tried to make the audience feel something, but left everyone just feeling confused. And before you say I\'m a crotchety old man, I\'m a woman in my 20s, so I\'m pretty sure I\'m this movie\'s target audience. The saddest part is there were parents with their kids in the theater that were victims of the poor marketing, because this is not a kid\'s movie. Overall, the humor was fun on occasion and the film is beautiful to look at, but the whole concept falls apart in the second half of the film and becomes a pity party for the "strong" woman.\nReview_Url: /review/rw9199947/?ref_=tt_urv', metadata={'source

#### Retrieval and Embedding Strategy

We've already discussed the useful application of `CacheBackedEmbeddings`, so let's do it again!

RELEVANT DOCS:
- [`CacheBackedEmbeddings`](https://api.python.langchain.com/en/latest/embeddings/langchain.embeddings.cache.CacheBackedEmbeddings.html#langchain-embeddings-cache-cachebackedembeddings)
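
As a reminder of why caching embeddings is useful, here is a minimal sketch of the idea in plain Python. It is not how `CacheBackedEmbeddings` is implemented: `CachedEmbedder` and `toy_embedding` are hypothetical stand-ins, and the cache is an in-memory dict rather than a file store.

```python
import hashlib


class CachedEmbedder:
    """Wrap an embedding function and reuse vectors for texts seen before."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.cache = {}   # maps a hash of the text to its vector
        self.calls = 0    # number of calls made to the underlying model

    def embed_documents(self, texts):
        vectors = []
        for text in texts:
            key = hashlib.sha1(text.encode("utf-8")).hexdigest()
            if key not in self.cache:
                self.calls += 1
                self.cache[key] = self.embed_fn(text)
            vectors.append(self.cache[key])
        return vectors


def toy_embedding(text):
    # Stand-in for a real (paid, slow) embedding call.
    return [float(len(text)), float(text.count(" "))]


embedder_sketch = CachedEmbedder(toy_embedding)
first = embedder_sketch.embed_documents(["Barbie review", "Oppenheimer review"])
second = embedder_sketch.embed_documents(["Barbie review", "A new review"])
print(embedder_sketch.calls)  # only the unseen texts reach the underlying model
```

The second batch repeats "Barbie review", so it is served from the cache and only the new text triggers another call to the underlying function.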

In [37]:
!pip install -q -U rank_bm25 tiktoken faiss-cpu

In [38]:
from langchain.retrievers import BM25Retriever, EnsembleRetriever
from langchain.vectorstores import FAISS
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore

# set up cached embeddings store
store = LocalFileStore('./barbie_data/')

core_embeddings_model = OpenAIEmbeddings()

embedder = CacheBackedEmbeddings.from_bytes_store(
    core_embeddings_model, store, namespace=core_embeddings_model.model
)

We'll implement a `FAISS` vectorstore, and create a retriever from it.

In [39]:
barbie_csv_faiss_retriever = FAISS.from_documents(chunked_barbie_csv_docs, embedder).as_retriever()


There are a number of excellent options to retrieve documents - we'll be looking at an additional example today, which is called the `EnsembleRetriever`.

The method it uses is outlined in [this paper](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf).

The brief explanation is:

1. We collect results from two different retrieval methods over the same corpus
2. We apply a reranking algorithm to our source documents so that the *most relevant* ones rise to the top, without losing specific or information-rich documents that one of the methods ranked low
3. We feed the top-k results into the LLM with our query as context.

> HINT: Your weight list should be of type `List[float]` and the `sum(List[float])` should be `1`.
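
The reranking step from the linked paper (Reciprocal Rank Fusion) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `weighted_rrf` helper and the document ids are hypothetical, and the per-retriever weighting is our assumption about how a weight list like the one in the hint would be applied. The constant `c=60` is the value suggested in the paper.

```python
def weighted_rrf(rankings, weights, c=60):
    """Fuse several ranked lists of document ids.

    Each document scores weight / (c + rank) for every list it
    appears in (rank starts at 1); scores are summed across lists.
    """
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results from a keyword retriever and a vector retriever.
bm25_ranking = ["plot", "cast", "reviews"]
faiss_ranking = ["reviews", "plot", "box_office"]

fused = weighted_rrf([bm25_ranking, faiss_ranking], weights=[0.5, 0.5])
print(fused)
```

Documents that both retrievers return ("plot" and "reviews") end up ahead of documents that only one retriever found, even when that one retriever ranked them fairly high.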

In [40]:
# set up BM25 retriever
barbie_wikipedia_bm25_retriever = BM25Retriever.from_documents(
    chunked_barbie_wikipedia_docs
)
barbie_wikipedia_bm25_retriever.k = 1

# set up FAISS vector store
barbie_wikipedia_faiss_store = FAISS.from_documents(
    chunked_barbie_wikipedia_docs,
    embedder
)
barbie_wikipedia_faiss_retriever = barbie_wikipedia_faiss_store.as_retriever(search_kwargs={"k": 1})

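# set up ensemble retriever
# note: this pairs the review (CSV) FAISS retriever with the Wikipedia BM25 retriever,
# so the ensemble spans both sources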
barbie_ensemble_retriever = EnsembleRetriever(
    retrievers=[barbie_csv_faiss_retriever, barbie_wikipedia_bm25_retriever],
    weights= [0.5, 0.5]  # should sum to 1
)

#### Retrieval Agent

We can create a simple conversational retrieval Agent by using the built-ins provided by LangChain!

> HINT: Be sure to provide good natural language descriptions of what the tool should be used for to get the best results.

RELEVANT DOCS:
- [`create_retriever_tool`](https://api.python.langchain.com/en/latest/agents/langchain.agents.agent_toolkits.conversational_retrieval.tool.create_retriever_tool.html#langchain.agents.agent_toolkits.conversational_retrieval.tool.create_retriever_tool)

In [41]:
from langchain.agents.agent_toolkits import create_retriever_tool

barbie_wikipedia_retrieval_tool = create_retriever_tool(
    barbie_wikipedia_faiss_retriever,
    "barbie-wikipedia-retriever-tool",
    "Uses the Wikipedia FAISS retriever to retrieve Barbie Wikipedia articles."
)

barbie_csv_retrieval_tool = create_retriever_tool(
    barbie_csv_faiss_retriever,
    "barbie-csv-retriever-tool",
    "Uses CSV FAISS retriever to retrieve barbie reviews."
)

barbie_ensemble_retriever_tool = create_retriever_tool(
    barbie_ensemble_retriever,
    "barbie-ensemble-retriever-tool",
    "Uses an ensemble of the CSV FAISS retriever and the Wikipedia BM25 retriever to retrieve Barbie reviews and Wikipedia articles."
)

barbie_retriever_tools = [barbie_ensemble_retriever_tool, barbie_csv_retrieval_tool, barbie_wikipedia_retrieval_tool]

Now that we've created our tools, we can combine them into an agent!

RELEVANT DOCS:
- [`create_conversational_retrieval_agent`](https://api.python.langchain.com/en/latest/agents/langchain.agents.agent_toolkits.conversational_retrieval.openai_functions.create_conversational_retrieval_agent.html#langchain.agents.agent_toolkits.conversational_retrieval.openai_functions.create_conversational_retrieval_agent)

In [42]:
from langchain.agents.agent_toolkits import create_conversational_retrieval_agent

barbie_retriever_agent_executor = create_conversational_retrieval_agent(
    llm=llm,
    tools=barbie_retriever_tools,
    remember_intermediate_steps=False,
    verbose=True,
)

In [43]:
barbie_retriever_agent_executor({"input" : "Did people like Barbie, or did they find it too Philosphical? If they did, can you tell me why the movie is so Philosophical?"})



> Entering new AgentExecutor chain...
Invoking: `barbie-ensemble-retriever-tool` with `Did people like Barbie?`


[0m[36;1m[1;3m[Document(page_content='A complete list of CGI or computer-animated feature films and streaming television films starring or featuring Barbie, a fashion doll manufactured by American toy and entertainment company Mattel, is as follows.\nIn response to a growing rise of digital and interactive media as well as the gradual decline of the sales in dolls, toys and accessories in the 1980s, Mattel partnered with animation studios to produce films which were broadcast on Nickelodeon in the United States from 2002 and released', metadata={'title': 'List of Barbie animated films', 'summary': 'A complete list of CGI or computer-animated feature films and streaming television films starring or featuring Barbie, a fashion doll manufactured by American toy and entertainment company Mattel, is as follows.\nIn response to a growing rise of digital

{'input': 'Did people like Barbie, or did they find it too Philosphical? If they did, can you tell me why the movie is so Philosophical?',
 'chat_history': [HumanMessage(content='Did people like Barbie, or did they find it too Philosphical? If they did, can you tell me why the movie is so Philosophical?', additional_kwargs={}, example=False),
  AIMessage(content='Based on the reviews and information available, it seems that people have different opinions about Barbie. Some people enjoyed the Barbie movies and found them entertaining, colorful, and fun for all ages. They appreciated the doll-like elements in the movies, such as the plastic food and drink, gliding cars, and doll-like movements of the actors. These viewers found the movies to be enjoyable and emotionally engaging at times.\n\nOn the other hand, there are also viewers who found the Barbie movies to be more philosophical and profound than expected. They were surprised by the depth of the messages conveyed in the movies and 

In [44]:
barbie_retriever_agent_executor({"input" : "What is a very quick summary of the plot of the Barbie movie?"})



> Entering new AgentExecutor chain...
Invoking: `barbie-wikipedia-retriever-tool` with `Barbie (film)`


[0m[38;5;200m[1;3m[Document(page_content='Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig from a screenplay she wrote with Noah Baumbach. Based on the Barbie fashion dolls by Mattel, it is the first live-action Barbie film after numerous computer-animated films and specials. The film follows Barbie (Margot Robbie) and Ken (Ryan Gosling) on a journey of self-discovery following an existential crisis. It features an ensemble supporting cast, including America Ferrera, Kate McKinnon, Helen Mirren, Issa Rae, Simu Liu,', metadata={'title': 'Barbie (film)', 'summary': 'Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig from a screenplay she wrote with Noah Baumbach. Based on the Barbie fashion dolls by Mattel, it is the first live-action Barbie film after numerous computer-animated films and specials. The film follows B

{'input': 'What is a very quick summary of the plot of the Barbie movie?',
 'chat_history': [HumanMessage(content='Did people like Barbie, or did they find it too Philosphical? If they did, can you tell me why the movie is so Philosophical?', additional_kwargs={}, example=False),
  AIMessage(content='Based on the reviews and information available, it seems that people have different opinions about Barbie. Some people enjoyed the Barbie movies and found them entertaining, colorful, and fun for all ages. They appreciated the doll-like elements in the movies, such as the plastic food and drink, gliding cars, and doll-like movements of the actors. These viewers found the movies to be enjoyable and emotionally engaging at times.\n\nOn the other hand, there are also viewers who found the Barbie movies to be more philosophical and profound than expected. They were surprised by the depth of the messages conveyed in the movies and appreciated the thought-provoking moments. For example, one revi

### Oppenheimer Retrieval System

We're going to repurpose some of what we created previously, but this time we'll explore a different multi-source retrieval system.

In [45]:
oppenheimer_wikipedia_docs = WikipediaLoader(
    query="Oppenheimer (film)", 
    load_max_docs=2,
    doc_content_chars_max= 2000
    ).load()

oppenheimer_csv_docs = CSVLoader(
    file_path='oppenheimer_data/oppenheimer.csv',
    source_column='Review_Url'
    ).load()



In [46]:

chunked_opp_wikipedia_docs =  wikipedia_text_splitter.split_documents(oppenheimer_wikipedia_docs)
chunked_opp_csv_docs = csv_text_splitter.split_documents(oppenheimer_csv_docs)


In [47]:
opp_csv_faiss_retriever = FAISS.from_documents(chunked_opp_csv_docs, embedder).as_retriever()

# set up BM25 retriever
opp_wikipedia_bm25_retriever = BM25Retriever.from_documents(
    chunked_opp_wikipedia_docs
)
opp_wikipedia_bm25_retriever.k = 1

# set up FAISS vector store
opp_wikipedia_faiss_store = FAISS.from_documents(
    chunked_opp_wikipedia_docs,
    embedder
)
opp_wikipedia_faiss_retriever = opp_wikipedia_faiss_store.as_retriever(search_kwargs={"k": 1})

# set up ensemble retriever
opp_ensemble_retriever = EnsembleRetriever(
    retrievers=[opp_wikipedia_faiss_retriever, opp_wikipedia_bm25_retriever],
    weights= [0.5, 0.5]  # should sum to 1
)

#### Multi-source chain

We're going to let the LLM decide which information is most valuable and which is least.

The way we'll do this is with LangChain's rather powerful "Expression Language"!

> HINT: You can leverage [this](https://python.langchain.com/docs/use_cases/question_answering/how_to/multiple_retrieval) resource if you get stuck - but experiment with different prompts/formats.
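
Conceptually, the multi-source pattern sends the same question to several retrievers and drops each result into its own slot in the prompt. The sketch below shows that data flow in plain Python. It is only an illustration: `run_multisource`, the two fake retrievers, and `template_sketch` are hypothetical stand-ins, not LangChain objects.

```python
def run_multisource(question, retrievers, template):
    """Send one question to every retriever and fill the prompt template."""
    context = {name: retrieve(question) for name, retrieve in retrievers.items()}
    return template.format(question=question, **context)


def fake_review_retriever(question):
    # Stand-in for a retriever over user reviews.
    return "Review: visually stunning and thought-provoking."


def fake_wikipedia_retriever(question):
    # Stand-in for a retriever over the Wikipedia article.
    return "Oppenheimer is a 2023 film directed by Christopher Nolan."


template_sketch = (
    "<source1>\n{source1}\n</source1>\n"
    "<source2>\n{source2}\n</source2>\n"
    "Question: {question}"
)

filled = run_multisource(
    "What did people think of the movie?",
    {"source1": fake_review_retriever, "source2": fake_wikipedia_retriever},
    template_sketch,
)
print(filled)
```

Keeping each source in its own labelled slot is what lets the prompt describe the sources separately, so the LLM can weigh them against each other.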

In [48]:
from langchain.prompts import ChatPromptTemplate

system_message = """Use the information from the below two sources to answer any questions.

Source 1: public user reviews about the Oppenheimer movie
<source1>
{source1}
</source1>

Source 2: the wikipedia page for the Oppenheimer movie including the plot summary, cast, and production information
<source2>
{source2}
</source2>
"""

prompt = ChatPromptTemplate.from_messages([("system", system_message), ("human", "{question}")])

In [49]:
oppenheimer_multisource_chain = {
    "source1": (lambda x: x["question"]) | opp_csv_faiss_retriever,
    "source2": (lambda x: x["question"]) | opp_ensemble_retriever,
    "question": lambda x: x["question"],
} | prompt | llm

In [50]:
oppenheimer_multisource_chain.invoke({"question" : "What did people think of the Oppenheimer movie?"})

AIMessage(content="Based on the public user reviews, opinions about the Oppenheimer movie are mixed. One reviewer mentioned that they found the movie to be thought-provoking and visually stunning, but felt that the writing could have been smoother. Another reviewer had a great experience watching the movie, praising the compelling storytelling, engaging performances, and beautiful cinematography. They also mentioned that the film sparked an interest in history. However, another reviewer felt that the film missed the mark, despite its technical achievements and outstanding acting. They mentioned that the movie focused more on Oppenheimer's personal life rather than the creation of the atomic bomb, and they were not a fan of the non-linear storytelling. Overall, opinions vary, with some praising the movie and others expressing disappointment.", additional_kwargs={}, example=False)

# Agent Creation

Now we can finally start building our Agent!

The first thing we'll need to do is provide our Agent with a Toolbelt (a list of tools). Much like Batman, our LLM-powered Agent can use these tools as it sees fit.

While the examples we're constructing in this notebook are straightforward for brevity and simplicity's sake, there is no limit to what you can build with Agents, as we'll see as we progress through the program.

So, let's begin by setting up our Tools!

You'll notice that we have to set up a function to allow our `OppenheimerInfo` tool to interface with the Agent. This is because it has a specific required input. Creating custom tools is a pattern that you'll want to grow accustomed to as you use LangChain more and more.

Now that we've set up our Agent's toolbelt, let's set up the LLM that will be powering it!

I would suggest playing around with these prompts and experimenting to find what works best for you.
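
The wrapper-function pattern can be sketched as follows. This is a hedged, library-free illustration: `ToolSketch` and `FakeChain` are hypothetical stand-ins (in the notebook you would presumably use `langchain.agents.Tool` and the real chain), and the `"question"` key is assumed from the multi-source chain's input format. The tool name `query-oppenheimer` matches the one that appears in the Agent traces.

```python
class ToolSketch:
    """Minimal stand-in for an agent tool: a name, a callable, a description."""

    def __init__(self, name, func, description):
        self.name = name
        self.func = func
        self.description = description


class FakeChain:
    """Stand-in for a chain that expects a dict with a "question" key."""

    def invoke(self, inputs):
        return f"answer to: {inputs['question']}"


fake_oppenheimer_chain = FakeChain()


def query_oppenheimer(query):
    # The Agent hands a tool a plain string, but the chain wants a dict,
    # so the wrapper adapts one input format to the other.
    return fake_oppenheimer_chain.invoke({"question": query})


tools_sketch = [
    ToolSketch(
        name="query-oppenheimer",
        func=query_oppenheimer,
        description="Useful for questions about the Oppenheimer movie and its reviews.",
    ),
]

print(tools_sketch[0].func("What did people like about the movie?"))
```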

RELEVANT DOCS:
- [`ZeroShotAgent`](https://api.python.langchain.com/en/latest/agents/langchain.agents.mrkl.base.ZeroShotAgent.html#langchain-agents-mrkl-base-zeroshotagent)

In [52]:
from langchain.agents import ZeroShotAgent, AgentExecutor

prefix = """Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:"""
suffix = """Begin!

Question: {input}
{agent_scratchpad}"""

prompt = ZeroShotAgent.create_prompt(tools, prefix, suffix, input_variables=["input", "agent_scratchpad"])

In [53]:
from langchain import LLMChain

llm_chain = LLMChain(llm=llm, prompt=prompt)

All that's left to do now is create our `ZeroShotAgent` and our `AgentExecutor`, which are the "reasoner" and "actor" halves of the `ReAct` method of Agent implementation.

Read all about the `ReAct` framework [here](https://react-lm.github.io/).
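
To see how the "reasoner" and "actor" fit together, it helps to look at the text format the reasoner is expected to produce and how it is turned into an action. The sketch below is a simplified, hypothetical parser (`parse_react_output` is not LangChain's own output parser); it assumes the usual `Action:` / `Action Input:` / `Final Answer:` markers.

```python
import re


def parse_react_output(text):
    """Turn ReAct-style LLM text into either a tool call or a final answer."""
    if "Final Answer:" in text:
        return ("finish", text.split("Final Answer:")[-1].strip())
    match = re.search(r"Action:\s*(.*?)\s*Action Input:\s*(.*)", text, re.DOTALL)
    if match is None:
        # Free-form text with no markers cannot be mapped to a tool call.
        raise ValueError(f"Could not parse LLM output: {text[:40]!r}")
    tool_name = match.group(1).strip()
    tool_input = match.group(2).strip().strip('"')
    return ("act", tool_name, tool_input)


step = parse_react_output(
    "Thought: I should look this up.\n"
    "Action: query-oppenheimer\n"
    'Action Input: "What did people like about the Oppenheimer movie?"'
)
print(step)

done = parse_react_output("Thought: I know enough.\nFinal Answer: People liked the acting.")
print(done)
```

If the LLM answers in free-form prose without any of these markers, a parser like this has nothing to act on and raises an error, which is one way an Agent run can fail.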

In [54]:
barbenheimer_agent = ZeroShotAgent(
    llm_chain = llm_chain,
    tools = tools,
    verbose = True
)

barbenheimer_agent_chain = AgentExecutor.from_agent_and_tools(
    agent = barbenheimer_agent, 
    tools = tools, 
    verbose = True
)

## Conclusion

All that is left to do now is feed inputs to your Agent and watch it work!

Remember to use the `{"input" : "YOUR QUERY HERE"}` format when prompting the Agent.

In [59]:
# barbenheimer_agent_chain.invoke({"input" : "What did people like about the Barbie movie?"})
barbenheimer_agent_chain.invoke({"input" : "What do people think about feminist movies?"})



> Entering new AgentExecutor chain...
Thought: I don't have enough knowledge about people's opinions on feminist movies.
Action: query-oppenheimer
Action Input: "What are some common opinions about feminist movies?"
Observation: content='The provided sources do not contain any information about common opinions about feminist movies.' additional_kwargs={} example=False
Thought: I couldn't find any information about common opinions on feminist movies using the Oppenheimer model.
Action: query-barbie
Action Input: "What are some common opinions about feminist movies?"

> Entering new AgentExecutor chain...
Opinions about feminist movies can vary widely, as people have different perspectives and beliefs. Here are some common opinions that people may have about feminist movies:

1. Empowering and Inspiring: Many people view feminist movies as empowering and inspiring, as they often highlight the strength, resil

OutputParserException: Could not parse LLM output: `Based on the information provided by the Barbie agent, here are some common opinions about feminist movies:

1. Empowering and Inspiring: Many people view feminist movies as empowering and inspiring, as they often highlight the strength, resilience, and agency of female characters. These movies can serve as a source of inspiration for women and girls, promoting gender equality and challenging traditional gender roles.

2. Representation and Diversity: Feminist movies are often praised for their efforts to represent and amplify the voices of women from diverse backgrounds. They can provide a platform for underrepresented groups and shed light on the experiences and challenges faced by women of different races, ethnicities, sexual orientations, and abilities.

3. Social Commentary: Feminist movies often tackle social issues and provide social commentary on topics such as sexism, misogyny, gender inequality, and women's rights. They can spark important conversations and raise awareness about these issues, encouraging viewers to critically examine societal norms and expectations.

4. Subversion of Stereotypes: Feminist movies often challenge and subvert traditional gender stereotypes, presenting female characters who defy expectations and break free from societal limitations. This can be seen as a refreshing and progressive approach that promotes gender equality and encourages viewers to question and challenge existing norms.

5. Criticism and Controversy: Not all opinions about feminist movies are positive. Some people may criticize feminist movies for being too political or agenda-driven. They may argue that these movies prioritize a specific ideology over storytelling or that they are overly focused on gender-related themes at the expense of other aspects of filmmaking.

6. Impact on Society: Some people believe that feminist movies have the potential to create positive social change by challenging patriarchal systems and promoting gender equality. They see these movies as important tools for cultural transformation and the advancement of women's rights.

It's important to note that these opinions are not exhaustive, and individuals may have their own unique perspectives on feminist movies based on their personal beliefs, experiences, and values.`

In [58]:
barbenheimer_agent_chain.run({"input" : "What did people like about the Oppenheimer movie?"})



> Entering new AgentExecutor chain...
Thought: I don't have any information about the Oppenheimer movie, so I should use the query-oppenheimer tool to find out what people liked about it.
Action: query-oppenheimer
Action Input: "What did people like about the Oppenheimer movie?"
Observation: content="Based on the public user reviews, people liked the following aspects of the Oppenheimer movie:\n\n- The film was engaging and never boring, driven by dialogue and character drama.\n- The actors' performances, including Cillian Murphy, Robert Downey Jr., and Cillian Murphy, were praised.\n- The cinematography was beautifully crafted.\n- The movie sparked an interest in history and made viewers want to learn more.\n- The film was visually stunning and thought-provoking.\n- The story focused on Oppenheimer's personal life and his role in creating the atomic bomb.\n- Despite not understanding the scientific aspects, viewers were able to understand the st

'People liked the Oppenheimer movie for its engaging storytelling, performances, and visual appeal.'

In [60]:
barbenheimer_agent_chain.run({"input" : "Did the movies Barbie and Oppenheimer share similar themes or ideas?"})



> Entering new AgentExecutor chain...
Thought: I'm not sure about the themes or ideas of the movies Barbie and Oppenheimer. I should use the available tools to gather information.
Action: query-oppenheimer
Action Input: "What are the themes or ideas in the movie Oppenheimer?"
Observation: content="The themes and ideas in the movie Oppenheimer include:\n\n1. Ethical dilemmas: The movie explores the ethical implications of J. Robert Oppenheimer's involvement in the creation of the atomic bomb. It raises questions about the morality of developing such a destructive weapon and the consequences of its use.\n\n2. Personal conflicts: Oppenheimer's internal struggles and personal matters are depicted in the film. It delves into his psyche and the psychological impact of his choices, highlighting the pressure and conflicts he faced.\n\n3. McCarthyism and paranoia: The movie touches on the era of the red scare and McCarthyism, portraying the shadow hanging

InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4500 tokens (4371 in the messages, 129 in the functions). Please reduce the length of the messages or functions.

## Next Steps

It's time to build a Chainlit (or Gradio) application and host it on Hugging Face Spaces! 🚢