[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/docs/langchain-retrieval-agent.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/docs/langchain-retrieval-agent.ipynb)

#### [LangChain Handbook](https://pinecone.io/learn/langchain)

# Retrieval Agents

Conversational agents can struggle with data freshness, knowledge about specific domains, or accessing internal documentation. By coupling agents with retrieval augmentation tools, we can address these problems.

On the other hand, using "naive" retrieval augmentation without an agent means we retrieve contexts with *every* query. This isn't always ideal, as not every query requires access to external knowledge.

Merging these methods gives us the best of both worlds. In this notebook we'll learn how to do this.

To begin, we must install the prerequisite libraries that we will be using in this notebook.

In [1]:
!pip install -qU \
    openai==0.27.7 \
    pinecone-client==3.1.0 \
    pinecone-datasets==0.7.0 \
    langchain==0.1.1 \
    langchain-community==0.0.13 \
    tiktoken==0.4.0

## Building the Knowledge Base

We will download a pre-embedded dataset from `pinecone-datasets`, which allows us to skip the embedding and preprocessing steps. If you'd rather work through those steps, you can find the [full notebook here](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb).

In [2]:
from pinecone_datasets import load_dataset

dataset = load_dataset("langchain-python-docs-text-embedding-ada-002")
dataset.head()

| | id | values | sparse_values | metadata | blob |
|---|---|---|---|---|---|
| 0 | 417ede5d-39be-498f-b518-f47ed4e53b90 | [0.005949743557721376, 0.01983247883617878, -0... | | | {'chunk': 0, 'text': '.rst .pdf Welcome to Lan... |
| 1 | 110f550d-110b-4378-b95e-141397fa21bc | [0.009401749819517136, 0.02443608082830906, 0.... | | | {'chunk': 1, 'text': 'Use Cases# Best practice... |
| 2 | d5f00f02-3295-4567-b297-5e3262dc2728 | [-0.005517194513231516, 0.0208403542637825, 0.... | | | {'chunk': 2, 'text': 'Gallery: A collection of... |
| 3 | 0b6fe3c6-1f0e-4608-a950-43231e46b08a | [-0.006499645300209522, 0.0011573900701478124,... | | | {'chunk': 0, 'text': 'Search Error Please acti... |
| 4 | 39d5f15f-b973-42c0-8c9b-a2df49b627dc | [-0.005658374633640051, 0.00817849114537239, 0... | | | {'chunk': 0, 'text': '.md .pdf Dependents Depe... |


In [3]:
len(dataset)

6952

We'll format the dataset for upsert and reduce it to a subset of the full dataset.

In [4]:
# we drop sparse_values as they are not needed for this example
# we also rename the blob column to be our metadata since it contains useful information
# TODO:
...
# TODO:
...

dataset.documents

| | id | values | metadata |
|---|---|---|---|
| 0 | 417ede5d-39be-498f-b518-f47ed4e53b90 | [0.005949743557721376, 0.01983247883617878, -0... | {'chunk': 0, 'text': '.rst .pdf Welcome to Lan... |
| 1 | 110f550d-110b-4378-b95e-141397fa21bc | [0.009401749819517136, 0.02443608082830906, 0.... | {'chunk': 1, 'text': 'Use Cases# Best practice... |
| 2 | d5f00f02-3295-4567-b297-5e3262dc2728 | [-0.005517194513231516, 0.0208403542637825, 0.... | {'chunk': 2, 'text': 'Gallery: A collection of... |
| 3 | 0b6fe3c6-1f0e-4608-a950-43231e46b08a | [-0.006499645300209522, 0.0011573900701478124,... | {'chunk': 0, 'text': 'Search Error Please acti... |
| 4 | 39d5f15f-b973-42c0-8c9b-a2df49b627dc | [-0.005658374633640051, 0.00817849114537239, 0... | {'chunk': 0, 'text': '.md .pdf Dependents Depe... |
| ... | ... | ... | ... |
| | 8019ac6d-786e-4802-9c31-ed11b87c362b | [0.003017181996256113, 0.01940435729920864, -0... | {'chunk': 2, 'text': '[{'text': '\n\nEnvironme... |
| | 05087c76-15e4-4abc-b523-2898ba13918a | [0.008789504878222942, 0.004951421171426773, -... | {'chunk': 3, 'text': 'of any programming langu... |
| | 7f0ebe3b-bec3-4910-8fda-0c6b7276da09 | [0.0019375714473426342, 0.004722204990684986, ... | {'chunk': 4, 'text': 'important when it comes ... |
| | 819c1e3e-5cbe-40f2-aaf4-e2196fcf8b19 | [0.01285293698310852, 0.0072921025566756725, 0... | {'chunk': 5, 'text': 'explore both of these op... |
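
To make the reshaping step concrete, here is a minimal sketch on plain Python records rather than on the real dataset object. The rows are made-up stand-ins that mimic the columns shown above, and `to_upsert_record` is a hypothetical helper, not part of `pinecone-datasets`. It only illustrates the idea of dropping the unused columns and exposing the `blob` contents as `metadata`.

```python
# Toy rows mimicking the dataset's columns; the values are made up.
rows = [
    {"id": "a", "values": [0.1, 0.2], "sparse_values": None,
     "metadata": None, "blob": {"chunk": 0, "text": "first chunk"}},
    {"id": "b", "values": [0.3, 0.4], "sparse_values": None,
     "metadata": None, "blob": {"chunk": 1, "text": "second chunk"}},
]


def to_upsert_record(row):
    """Keep id and values, and expose the blob contents as metadata."""
    return {"id": row["id"], "values": row["values"], "metadata": row["blob"]}


# sparse_values and the empty metadata column are simply left out.
records = [to_upsert_record(row) for row in rows]
print(records[0])
```

Each resulting record carries only the three fields that appear in the formatted table above: `id`, `values`, and `metadata`.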


## Creating an Index

Now that the data is ready, we can set up our index to store it.

We begin by initializing our connection to Pinecone. To do this we need a [free API key](https://app.pinecone.io).

In [5]:
import os
from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = ""

# TODO:
# configure client
pc = ...
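
Rather than pasting the key into the notebook, one option is to read it from an environment variable. The sketch below assumes the key is stored under `PINECONE_API_KEY`; both that variable name and the `read_api_key` helper are assumptions for illustration, not something this notebook defines.

```python
import os


def read_api_key(var_name="PINECONE_API_KEY"):
    """Return the key stored in the given environment variable, or raise."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key
```

With the variable set, `api_key = read_api_key()` could replace the empty string in the cell above, which keeps the key out of the saved notebook.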

Next we set up our index specification, which defines the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects).

In [6]:
from pinecone import ServerlessSpec

cloud = os.environ.get('PINECONE_CLOUD') or 'aws'
region = os.environ.get('PINECONE_REGION') or 'us-east-1'

spec = ServerlessSpec(cloud=cloud, region=region)

In [7]:
index_name = 'langchain-retrieval-agent-fast'

In [8]:
import time

if index_name in pc.list_indexes().names():
    pc.delete_index(index_name)

# TODO:
# we create a new index
pc.create_index(
    index_name,
    dimension=1536,  # dimensionality of text-embedding-ada-002
    metric=...,
    spec=spec
)

# wait for index to be initialized
while not pc.describe_index(index_name).status['ready']:
    time.sleep(1)
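
The wait loop above polls forever if the index never becomes ready. A minimal sketch of the same polling pattern with a timeout is shown below. `wait_until_ready` is a hypothetical helper, and the 60-second default is an arbitrary assumption.

```python
import time


def wait_until_ready(is_ready, timeout=60.0, interval=1.0):
    """Poll is_ready() until it returns True; give up after timeout seconds."""
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError("index was not ready in time")
        time.sleep(interval)
```

In this notebook it could be called as `wait_until_ready(lambda: pc.describe_index(index_name).status['ready'])`, reusing the same readiness check as the loop above.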

Then connect to the index:

In [9]:
# TODO:
# Connect the index
index = ...
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}

We should see that the new Pinecone index has a `total_vector_count` of `0`, as we haven't added any vectors yet.

Now we upsert the data to Pinecone:

In [10]:
# TODO:
# Upsert the data to Pinecone
...

sending upsert requests:   0%|          | 0/6952 [00:00<?, ?it/s]

{'upserted_count': 6952}
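
When upserting records by hand, it is common to send them in fixed-size batches instead of one large request. The sketch below shows only the batching logic. `batched` is a hypothetical helper, and the batch size of 100 is an assumption rather than a value taken from this notebook.

```python
from itertools import islice


def batched(items, batch_size=100):
    """Yield consecutive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 6952 records split into batches of 100: 69 full batches plus one of 52.
sizes = [len(batch) for batch in batched(range(6952), batch_size=100)]
print(len(sizes), sizes[-1])  # prints "70 52"
```

Each batch of records could then be passed to a separate `index.upsert(vectors=batch)` call.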

We've indexed everything. We can now check the number of vectors in our index like so:

In [11]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 3476}},
 'total_vector_count': 3476}

## Creating a Vector Store and Querying

In [12]:
from langchain.embeddings.openai import OpenAIEmbeddings

openai_api_key = ""
model_name = 'text-embedding-ada-002'

embed = ...


Now that we've built our index, we can switch back over to LangChain. We start by initializing a vector store using the same index we just built:

In [13]:
from langchain.vectorstores import Pinecone

text_field = "text"

# switch back to normal index for langchain
index = pc.Index(index_name)

# TODO:
# Initialize the vectorstore
vectorstore = ...



As in previous examples, we can use the `similarity_search` method to do a pure semantic search (without the generation component).

In [14]:
# Perform a similarity search using the vectorstore
query = "What are some of the features of Langchain?"

# TODO
# Perform a similarity search
...

[Document(page_content='LangChain is an intuitive framework created to assist in developing applications driven by a language model, such as OpenAI or Hugging Face. Missing: decentralized | Must include:decentralized. LangChain, created by Harrison Chase, is a Python library that provides out-of-the-box support to build NLP applications using LLMs. Missing: decentralized | Must include:decentralized. LangChain provides a standard interface for chains, enabling developers to create sequences of calls that go beyond a single LLM call. Chains ... Missing: decentralized platform natural. LangChain is a powerful framework that simplifies the process of building advanced language model applications. Missing: platform | Must include:platform. Are your language models ignoring previous instructions ... Duration: 32:23. Posted: Feb 21, 2023. LangChain is a framework that enables quick and easy development of applications ... Prompting is the new way of programming NLP models. Missing: decentral
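
Conceptually, a pure semantic search ranks stored vectors by how similar they are to the query embedding and returns the closest texts. The toy sketch below uses cosine similarity over three made-up 3-dimensional vectors. The metric, the vectors, and the `top_k` helper are all assumptions for illustration. The real search runs inside Pinecone over 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, documents, k=3):
    """Rank (text, vector) pairs by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Made-up embeddings standing in for stored document vectors.
docs = [
    ("chains", [1.0, 0.0, 0.0]),
    ("agents", [0.0, 1.0, 0.0]),
    ("memory", [0.7, 0.7, 0.0]),
]
print(top_k([0.9, 0.1, 0.0], docs, k=2))  # prints ['chains', 'memory']
```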

Once finished, we delete the Pinecone index to save resources:

In [22]:
# TODO:
# Delete the index from Pinecone
...

---