[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/docs/langchain-retrieval-agent.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/docs/langchain-retrieval-agent.ipynb)

#### [LangChain Handbook](https://pinecone.io/learn/langchain)

# Retrieval Agents

We've seen in previous chapters how powerful [retrieval augmentation](https://www.pinecone.io/learn/langchain-retrieval-augmentation/) and [conversational agents](https://www.pinecone.io/learn/langchain-agents/) can be. They become even more impressive when we begin using them together.

Conversational agents can struggle with data freshness, knowledge about specific domains, or accessing internal documentation. Coupling agents with retrieval augmentation tools addresses these problems.

On the other hand, using "naive" retrieval augmentation without an agent means we retrieve contexts for *every* query. This isn't always ideal, as not every query requires access to external knowledge.

Merging these methods gives us the best of both worlds. In this notebook we'll learn how to do this.

[![Open full notebook](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/full-link.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb)

To begin, we must install the prerequisite libraries that we will be using in this notebook.

In [1]:
!pip install -qU \
    openai==0.27.7 \
    pinecone-client=="3.0.0.dev8" \
    git+https://github.com/pinecone-io/pinecone-datasets.git \
    langchain==0.0.162 \
    tiktoken==0.4.0

## Building the Knowledge Base

We will download a pre-embedded dataset from `pinecone-datasets`, which allows us to skip the embedding and preprocessing steps. If you'd rather work through those steps, you can find the [full notebook here](https://github.com/pinecone-io/examples/blob/master/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb).

In [2]:
from pinecone_datasets import load_dataset

dataset = load_dataset("squad-text-embedding-ada-002")
dataset.head()

| | id | values | sparse_values | metadata | blob |
|---|---|---|---|---|---|
| 0 | 5733be284776f41900661182 | [-0.010262451963272523, 0.02222637996192584, -... | | {'text': 'Architecturally, the school has a Ca... | |
| 1 | 5733bf84d058e614000b61be | [-0.009786712423983223, -0.013988726438873078,... | | {'text': 'As at most other universities, Notre... | |
| 2 | 5733bed24776f41900661188 | [0.013343917696606181, -0.0007001232846109822,... | | {'text': 'The university is the major seat of ... | |
| 3 | 5733a6424776f41900660f51 | [-0.0085222901071539, 0.004399558219521822, -0... | | {'text': 'The College of Engineering was estab... | |
| 4 | 5733a70c4776f41900660f64 | [-0.006695996885869355, -0.02067068565761649, ... | | {'text': 'All of Notre Dame's undergraduate st... | |


In [3]:
len(dataset)

18891

We'll format the dataset ready for upsert, keeping only the columns we need.

In [4]:
# we drop sparse_values and blob as they are not needed for this example
dataset.documents.drop(['sparse_values', 'blob'], axis=1, inplace=True)

dataset.head()

| | id | values | metadata |
|---|---|---|---|
| 0 | 5733be284776f41900661182 | [-0.010262451963272523, 0.02222637996192584, -... | {'text': 'Architecturally, the school has a Ca... |
| 1 | 5733bf84d058e614000b61be | [-0.009786712423983223, -0.013988726438873078,... | {'text': 'As at most other universities, Notre... |
| 2 | 5733bed24776f41900661188 | [0.013343917696606181, -0.0007001232846109822,... | {'text': 'The university is the major seat of ... |
| 3 | 5733a6424776f41900660f51 | [-0.0085222901071539, 0.004399558219521822, -0... | {'text': 'The College of Engineering was estab... |
| 4 | 5733a70c4776f41900660f64 | [-0.006695996885869355, -0.02067068565761649, ... | {'text': 'All of Notre Dame's undergraduate st... |


## Vector Database

### Serverless or Pod-based?

Before getting started, decide whether to use a serverless or a pod-based index.

In [None]:
import os

use_serverless = os.environ.get("USE_SERVERLESS", "False").lower() == "true"
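
Note that the check above treats only the string `true` (in any letter case) as enabling serverless; values such as `1` or `yes` leave `use_serverless` as `False`. If a more forgiving switch is wanted, a small helper along the following lines could be used instead. The name `env_flag` and the set of accepted spellings are assumptions made for illustration and are not part of the original notebook.

```python
import os

# Hypothetical set of spellings treated as "on"; adjust to taste.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean switch from the environment, tolerating common spellings."""
    raw = os.environ.get(name)
    if raw is None:
        # Variable not set at all: fall back to the caller's default.
        return default
    return raw.strip().lower() in TRUTHY


use_serverless = env_flag("USE_SERVERLESS")
```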

## Creating an Index

Now that the data is ready, we can set up our index to store it.

We begin by initializing our connection to Pinecone. To do this we need a [free API key](https://app.pinecone.io).

In [None]:
from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.environ.get('PINECONE_API_KEY') or 'PINECONE_API_KEY'
environment = os.environ.get('PINECONE_ENVIRONMENT') or 'PINECONE_ENVIRONMENT'

# configure client
pc = Pinecone(api_key=api_key)
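
The cell above falls back to the literal placeholder string `'PINECONE_API_KEY'` when the environment variable is unset, so a missing key would only show up later as a failed request. One way to fail earlier with a clearer message is sketched below; `require_env` is a hypothetical helper, not part of the Pinecone client.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        # Covers both "not set" and "set to an empty string".
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before running this notebook."
        )
    return value


# Usage (commented out so the sketch runs without credentials):
# api_key = require_env("PINECONE_API_KEY")
```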

Now we set up our index specification, which defines the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects).

In [None]:
from pinecone import ServerlessSpec, PodSpec

if use_serverless:
    spec = ServerlessSpec(cloud='aws', region='us-west-2')
else:
    spec = PodSpec(environment=environment)

In [5]:
index_name = 'langchain-retrieval-agent-fast'

In [9]:
import time

if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# we create a new index
pc.create_index(
        index_name,
        dimension=1536,  # dimensionality of text-embedding-ada-002
        metric='dotproduct',
        spec=spec
    )

# wait for index to be initialized
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)
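
The wait loop above polls until the index reports ready, and would keep polling indefinitely if that never happens. A variant with an upper bound on the waiting time might look like the sketch below. `wait_until` and its default timeout are assumptions for illustration; only `time.sleep` and the readiness check come from the original cell.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout_s: float = 300.0,
    poll_interval_s: float = 1.0,
) -> None:
    """Poll `is_ready` until it returns True or `timeout_s` seconds elapse."""
    deadline = time.monotonic() + timeout_s
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout_s} seconds")
        time.sleep(poll_interval_s)


# Usage with the objects from the cell above (commented out: needs a live index):
# wait_until(lambda: pc.describe_index(index_name).status['ready'])
```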

Then connect to the index:

In [10]:
index = pc.Index(index_name)
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

We should see that the new Pinecone index has a `total_vector_count` of `0`, as we haven't added any vectors yet.

Now we upsert the data to Pinecone:

In [11]:
index.upsert_from_dataframe(dataset.documents, batch_size=100)

sending upsert requests:   0%|          | 0/18891 [00:00<?, ?it/s]

collecting async responses:   0%|          | 0/189 [00:00<?, ?it/s]

upserted_count: 18891
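
The two progress bars are consistent with the batch size: 18,891 rows sent 100 at a time gives 189 requests, the last of which holds 91 rows. The sketch below shows that kind of batching on a plain iterable. `batched` is a hypothetical helper written for illustration; it is not how `upsert_from_dataframe` is implemented.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for the 18,891 rows of the dataset.
batches = list(batched(range(18891), 100))
print(len(batches), len(batches[-1]))  # prints: 189 91
```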

We've indexed everything. We can now check the number of vectors in our index like so:

In [12]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 18891}},
 'total_vector_count': 18891}

## Creating a Vector Store and Querying

In [13]:
from langchain.embeddings.openai import OpenAIEmbeddings

openai_api_key = os.environ.get('OPENAI_API_KEY') or 'OPENAI_API_KEY'
model_name = 'text-embedding-ada-002'

embed = OpenAIEmbeddings(
    model=model_name,
    openai_api_key=openai_api_key
)

Now that we've built our index, we can switch back over to LangChain. We start by initializing a vector store using the same index we just built. We do that like so:

In [14]:
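# note: this import rebinds the name `Pinecone`, shadowing the client class
# imported earlier; the existing `pc` client object is unaffected
from langchain.vectorstores import Pinecone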

text_field = "text"

# switch back to normal index for langchain
index = pc.Index(index_name)

vectorstore = Pinecone(
    index, embed.embed_query, text_field
)

As in previous examples, we can use the `similarity_search` method to do a pure semantic search (without the generation component).

In [15]:
query = "when was the college of engineering in the University of Notre Dame established?"

vectorstore.similarity_search(
    query,  # our search query
    k=3  # return 3 most relevant docs
)

[Document(page_content="In 1919 Father James Burns became president of Notre Dame, and in three years he produced an academic revolution that brought the school up to national standards by adopting the elective system and moving away from the university's traditional scholastic and classical emphasis. By contrast, the Jesuit colleges, bastions of academic conservatism, were reluctant to move to a system of electives. Their graduates were shut out of Harvard Law School for that reason. Notre Dame continued to grow over the years, adding more colleges, programs, and sports teams. By 1921, with the addition of the College of Commerce, Notre Dame had grown from a small college to a university with five colleges and a professional law school. The university continued to expand and add new residence halls and buildings with each subsequent president.", metadata={'title': 'University_of_Notre_Dame'}),
 Document(page_content='The College of Engineering was established in 1920, however, early c
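
Conceptually, a search like this embeds the query and ranks the stored vectors by the index metric, which is `dotproduct` for the index created earlier. The sketch below shows that ranking on toy 3-dimensional vectors rather than the 1536-dimensional embeddings used here. The `top_k` function and the toy data are assumptions for illustration and say nothing about how Pinecone performs the search internally.

```python
from typing import Dict, List, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def top_k(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k (id, score) pairs with the highest dot-product score."""
    scored = [(vec_id, dot(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "index" with made-up ids and vectors.
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.6, 0.8, 0.0],
    "doc-c": [0.0, 0.0, 1.0],
}
results = top_k([0.8, 0.6, 0.0], toy_index, k=2)
print([vec_id for vec_id, _ in results])  # prints: ['doc-b', 'doc-a']
```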

Looks like we're getting good results. Let's take a look at how we can begin integrating this into a conversational agent.

## Initializing the Conversational Agent

Our conversational agent needs a Chat LLM, conversational memory, and a `RetrievalQA` chain to initialize. We create these using:

In [16]:
from langchain.chat_models import ChatOpenAI
from langchain.chains.conversation.memory import ConversationBufferWindowMemory
from langchain.chains import RetrievalQA

# chat completion llm
llm = ChatOpenAI(
    openai_api_key=openai_api_key,
    model_name='gpt-3.5-turbo',
    temperature=0.0
)
# conversational memory
conversational_memory = ConversationBufferWindowMemory(
    memory_key='chat_history',
    k=5,
    return_messages=True
)
# retrieval qa chain
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

With these in place, we can generate an answer with the `run` method:

In [17]:
qa.run(query)

'The College of Engineering at the University of Notre Dame was established in 1920.'
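
The `chain_type="stuff"` setting means that all retrieved documents are inserted ("stuffed") into a single prompt alongside the question. The sketch below illustrates the idea only. The prompt wording, the `build_stuff_prompt` name, and the shortened example documents are assumptions for illustration; this is not LangChain's actual template.

```python
from typing import List


def build_stuff_prompt(question: str, documents: List[str]) -> str:
    """Concatenate every retrieved document into one prompt with the question."""
    context = "\n\n".join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Shortened stand-ins for retrieved documents.
prompt = build_stuff_prompt(
    "When was the College of Engineering established?",
    [
        "The College of Engineering was established in 1920.",
        "Notre Dame continued to grow over the years.",
    ],
)
print(prompt)
```

Because every document goes into one prompt, this approach is simple but limited by the model's context window when many or long documents are retrieved.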

But this isn't yet ready for our conversational agent. For that we need to convert this retrieval chain into a tool. We do that like so:

In [18]:
from langchain.agents import Tool

tools = [
    Tool(
        name='Knowledge Base',
        func=qa.run,
        description=(
            'use this tool when answering general knowledge queries to get '
            'more information about the topic'
        )
    )
]

Now we can initialize the agent like so:

In [19]:
from langchain.agents import initialize_agent

agent = initialize_agent(
    agent='chat-conversational-react-description',
    tools=tools,
    llm=llm,
    verbose=True,
    max_iterations=3,
    early_stopping_method='generate',
    memory=conversational_memory
)

With that our retrieval augmented conversational agent is ready and we can begin using it.

### Using the Conversational Agent

To make queries we simply call the `agent` directly.

In [20]:
agent(query)

> Entering new AgentExecutor chain...
{
    "action": "Knowledge Base",
    "action_input": "When was the College of Engineering in the University of Notre Dame established?"
}
Observation: The College of Engineering at the University of Notre Dame was established in 1920.
Thought:{
    "action": "Final Answer",
    "action_input": "The College of Engineering at the University of Notre Dame was established in 1920."
}

> Finished chain.


{'input': 'when was the college of engineering in the University of Notre Dame established?',
 'chat_history': [],
 'output': 'The College of Engineering at the University of Notre Dame was established in 1920.'}
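
The trace shows the pattern the agent follows: the LLM emits a JSON object whose `action` names either a tool or `Final Answer`, tool results are fed back as observations, and the loop ends at a final answer or once `max_iterations` is reached. A simplified sketch of such a loop follows. It is not LangChain's implementation: `run_agent`, the scripted stand-in for the LLM, the fake tool, and the fixed stop message are all assumptions for illustration.

```python
import json
from typing import Callable, Dict, List

FINAL_ANSWER = "Final Answer"


def run_agent(
    llm: Callable[[List[str]], str],
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_iterations: int = 3,
) -> str:
    """Ask the LLM for a JSON action, run the named tool, and repeat."""
    transcript: List[str] = [f"Human: {user_input}"]
    for _ in range(max_iterations):
        step = json.loads(llm(transcript))
        if step["action"] == FINAL_ANSWER:
            return step["action_input"]
        # Any other action is treated as a tool name.
        observation = tools[step["action"]](step["action_input"])
        transcript.append(f"Observation: {observation}")
    # Simplification: a fixed message instead of a final generation pass.
    return "Agent stopped: iteration limit reached."


def scripted_llm(transcript: List[str]) -> str:
    """Stand-in LLM: call the tool once, then repeat its observation."""
    last = transcript[-1]
    prefix = "Observation: "
    if last.startswith(prefix):
        return json.dumps(
            {"action": FINAL_ANSWER, "action_input": last[len(prefix):]}
        )
    return json.dumps({"action": "Knowledge Base", "action_input": last})


def fake_knowledge_base(question: str) -> str:
    """Stand-in for the retrieval tool; returns a canned answer."""
    return "The College of Engineering was established in 1920."


answer = run_agent(
    scripted_llm,
    {"Knowledge Base": fake_knowledge_base},
    "when was the college of engineering established?",
)
print(answer)
```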

Looks great. Now, what if we ask a question that doesn't require the knowledge base?

In [21]:
agent("what is 2 * 7?")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "The product of 2 multiplied by 7 is 14."
}

> Finished chain.


{'input': 'what is 2 * 7?',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?', additional_kwargs={}, example=False),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.', additional_kwargs={}, example=False)],
 'output': 'The product of 2 multiplied by 7 is 14.'}

Perfect, the agent is able to recognize that it doesn't need to refer to its knowledge base tool for that question. Let's try some more questions.

In [22]:
agent("can you tell me some facts about the University of Notre Dame?")

> Entering new AgentExecutor chain...
{
    "action": "Knowledge Base",
    "action_input": "Facts about the University of Notre Dame"
}
Observation: - The University of Notre Dame is a Catholic research university located in South Bend, Indiana, USA.
- The university's main campus covers 1,250 acres and is known for its recognizable landmarks such as the Golden Dome, the "Word of Life" mural (Touchdown Jesus), and the Basilica.
- Notre Dame is consistently ranked among the top twenty universities in the United States and is considered a major global university.
- The undergraduate component of the university is organized into four colleges: Arts and Letters, Science, Engineering, and Business, along with the Architecture School.
- The university offers over 50 master's, doctoral, and professional degree programs through its five schools, including the Notre Dame Law School and a MD-PhD program in collaboration with IU Medical School.
- Notre Dam

{'input': 'can you tell me some facts about the University of Notre Dame?',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?', additional_kwargs={}, example=False),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.', additional_kwargs={}, example=False),
  HumanMessage(content='what is 2 * 7?', additional_kwargs={}, example=False),
  AIMessage(content='The product of 2 multiplied by 7 is 14.', additional_kwargs={}, example=False)],
 'output': 'The University of Notre Dame is a Catholic research university located in South Bend, Indiana, USA. It is consistently ranked among the top twenty universities in the United States and is known for its recognizable landmarks such as the Golden Dome and the Basilica. The university offers a wide range of undergraduate and graduate programs, and has a strong emphasis on research. It has a diverse student body and a strong a

In [23]:
agent("can you summarize these facts in two short sentences")

> Entering new AgentExecutor chain...
{
    "action": "Final Answer",
    "action_input": "The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It is known for its strong academic programs and iconic landmarks."
}

> Finished chain.


{'input': 'can you summarize these facts in two short sentences',
 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?', additional_kwargs={}, example=False),
  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.', additional_kwargs={}, example=False),
  HumanMessage(content='what is 2 * 7?', additional_kwargs={}, example=False),
  AIMessage(content='The product of 2 multiplied by 7 is 14.', additional_kwargs={}, example=False),
  HumanMessage(content='can you tell me some facts about the University of Notre Dame?', additional_kwargs={}, example=False),
  AIMessage(content='The University of Notre Dame is a Catholic research university located in South Bend, Indiana, USA. It is consistently ranked among the top twenty universities in the United States and is known for its recognizable landmarks such as the Golden Dome and the Basilica. The university offers a wide rang

Looks great! We're also able to ask questions that refer to previous interactions in the conversation, and the agent uses the conversation history as a source of information.
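
The history available to the agent is bounded: `ConversationBufferWindowMemory` was created with `k=5`, so only the most recent five exchanges are kept. The idea can be sketched with a fixed-size deque. `WindowMemory` below is a hypothetical stand-in written for illustration, not the LangChain class.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Keep only the `k` most recent (human, ai) exchanges."""

    def __init__(self, k: int = 5) -> None:
        # A deque with maxlen drops the oldest entry once it is full.
        self._exchanges: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, human: str, ai: str) -> None:
        self._exchanges.append((human, ai))

    def history(self) -> List[Tuple[str, str]]:
        return list(self._exchanges)


memory = WindowMemory(k=2)
memory.save("question 1", "answer 1")
memory.save("question 2", "answer 2")
memory.save("question 3", "answer 3")
print(memory.history())
# prints: [('question 2', 'answer 2'), ('question 3', 'answer 3')]
```

With a small window, a follow-up that depends on an exchange older than the last `k` would no longer have that context available.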

That's all for this example of building a retrieval augmented conversational agent with OpenAI and Pinecone (the OP stack) and LangChain.

Once finished, we delete the Pinecone index to save resources:

In [24]:
pc.delete_index(index_name)

---