# Retrieval

This notebook gives a basic walkthrough of the retrieval functionality in LangChain. For more information, see:

- [Retrieval Documentation](https://python.langchain.com/docs/modules/data_connection/)

- [Advanced Retrieval Types](https://python.langchain.com/docs/modules/data_connection/retrievers/)

- [QA with RAG Use Case Documentation](https://python.langchain.com/docs/use_cases/question_answering/)

## Load Documents

In [1]:
import logging
from dotenv import load_dotenv

In [2]:
logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

load_dotenv()

True

In [3]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")

docs = loader.load()

INFO:langchain_community.document_loaders.web_base:fake_useragent not found, using default user agent.To get a realistic header for requests, `pip install fake_useragent`.


In [4]:
len(docs)

1

## Split Documents

In [5]:
from langchain.text_splitter import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)

In [6]:
len(documents)

5
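The splitter above runs with its default settings, which we understand to be a chunk size of 4000 characters with a 200-character overlap. To show what "chunk size" and "overlap" mean, here is a minimal sketch of a fixed-window splitter. It is a simplification, not the recursive separator-based algorithm that `RecursiveCharacterTextSplitter` uses; the function name `chunk_text` and the small sizes are hypothetical and chosen for illustration only.

```python
def chunk_text(text: str, chunk_size: int = 10, chunk_overlap: int = 3) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters.

    Simplified illustration only: the real splitter prefers to break on
    separators such as paragraphs and newlines before falling back to characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample_chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in sample_chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, so a sentence cut at a boundary still appears intact in at least one chunk.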

## Index Documents

In [7]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
vector = FAISS.from_documents(documents, embeddings)

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:faiss.loader:Loading faiss.
INFO:faiss.loader:Successfully loaded faiss.
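Conceptually, the vector store keeps one embedding per chunk and answers a query by returning the chunks whose embeddings lie closest to the query embedding. The sketch below illustrates that idea with hand-written two-dimensional vectors and Euclidean distance, which we assume is the default metric for this FAISS store. The helper `nearest_chunks` and the toy data are hypothetical; real embeddings come from `OpenAIEmbeddings` and have many more dimensions.

```python
import math


def nearest_chunks(query_vector, index, k=4):
    """Return the texts of the k entries closest to query_vector.

    index is a list of (text, vector) pairs; distance is Euclidean (L2).
    """
    ranked = sorted(index, key=lambda item: math.dist(query_vector, item[1]))
    return [text for text, _ in ranked[:k]]


# Toy "embeddings" invented for illustration only.
toy_index = [
    ("chunk about tracing", [1.0, 0.0]),
    ("chunk about testing", [0.0, 1.0]),
    ("chunk about datasets", [0.2, 0.9]),
]

top_two = nearest_chunks([0.1, 1.0], toy_index, k=2)
print(top_two)
```

A brute-force scan like this is fine for a handful of chunks; FAISS exists to make the same lookup fast over much larger collections.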


## Query Documents


In [8]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")
llm = ChatOpenAI()

document_chain = create_stuff_documents_chain(llm, prompt)
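The "stuff" strategy places the text of every retrieved document into the `{context}` slot of the prompt before a single LLM call. The sketch below mimics that formatting step in plain Python, assuming the documents are joined with a blank line between them (which we believe is the default separator). The function `stuff_documents` and the sample strings are hypothetical and do not call LangChain or the model.

```python
PROMPT_TEMPLATE = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def stuff_documents(page_contents, question, separator="\n\n"):
    """Join all document texts and fill them into the prompt template."""
    context = separator.join(page_contents)
    return PROMPT_TEMPLATE.format(context=context, input=question)


filled_prompt = stuff_documents(
    ["First retrieved chunk.", "Second retrieved chunk."],
    "how can langsmith help with testing?",
)
print(filled_prompt)
```

Because everything goes into one prompt, this approach only works while the combined chunks fit in the model's context window.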

In [9]:
from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [10]:
response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
print(response["answer"])

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


LangSmith can help with testing by providing the following features:

1. Dataset Uploading: LangSmith simplifies the process of uploading datasets for testing changes to a prompt or chain.

2. Running Chains and Agents: LangSmith allows running chains over data points and visualizing the outputs. This can be done client-side using the LangSmith client.

3. Logging Results: When running chains over data points, LangSmith logs the results to a new project associated with the dataset, making it easy to review them.

4. Assigning Feedback: LangSmith enables assigning feedback to runs and marking them as correct or incorrect directly in the web app. This helps in reviewing and analyzing the results.

5. Evaluators: LangSmith provides a set of evaluators in the open-source LangChain library. These evaluators can be specified when initiating a test run and will evaluate the results once the test run completes. While these evaluators are not perfect, they can guide the user to examples that re

## Advanced Retrieval

In [11]:
from langchain.retrievers import MultiQueryRetriever

advanced_retriever = MultiQueryRetriever.from_llm(retriever=retriever, llm=llm)
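`MultiQueryRetriever` asks the LLM for several rephrasings of the question, runs the underlying retriever once per rephrasing, and merges the results without duplicates. The sketch below shows only the merge step, under the assumption that documents can be compared directly; `unique_union`, `multi_query_retrieve`, and the toy lookup table are hypothetical stand-ins, not LangChain APIs.

```python
def unique_union(result_lists):
    """Merge several result lists, keeping the first occurrence of each item."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def multi_query_retrieve(queries, retrieve):
    """Run retrieve(query) for every query and return the de-duplicated union."""
    return unique_union(retrieve(query) for query in queries)


# Toy retriever: a lookup table standing in for a vector store search.
toy_results = {
    "benefits of langsmith for testing": ["chunk A", "chunk B"],
    "how does langsmith assist testing": ["chunk B", "chunk C"],
    "langsmith contribution to testing": ["chunk A", "chunk D"],
}

merged_chunks = multi_query_retrieve(
    list(toy_results), lambda query: toy_results[query]
)
print(merged_chunks)
```

Different phrasings tend to surface different chunks, so the merged set is usually broader than what a single query returns, at the cost of one extra LLM call and several extra embedding requests.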

In [12]:
retrieval_chain = create_retrieval_chain(advanced_retriever, document_chain)

In [13]:
response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
print(response["answer"])

INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the benefits of using Langsmith for testing purposes?', '2. In what ways does Langsmith assist with testing?', '3. Can you explain how Langsmith contributes to the testing process?']
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


LangSmith can help with testing by providing various features and functionalities. It allows users to:

1. Easily curate datasets: LangSmith makes it easy to create and manage datasets that can be used for testing and evaluation purposes. These datasets can be exported for use in other contexts.

2. Run chains over data points: Users can run chains over data points and visualize the outputs. This helps in testing and analyzing the chain's behavior and performance.

3. Assign feedback programmatically: LangSmith allows users to associate feedback programmatically with runs. This is useful for tracking performance over time and identifying underperforming data points.

4. Extract insights from logged runs: LangSmith provides examples and guidance on extracting insights from logged runs. It helps in understanding the sequence of events, identifying the exact input and output of LLM calls, and troubleshooting specific issues.

5. Evaluate runs with evaluators: LangSmith offers a set of eva