# Retrieval

This notebook walks through the basic retrieval functionality in LangChain. For more information, see:

- [Retrieval Documentation](https://js.langchain.com/docs/modules/data_connection/)
- [Advanced Retrieval Types](https://js.langchain.com/docs/modules/data_connection/retrievers/)
- [QA with RAG Use Case Documentation](https://js.langchain.com/docs/use_cases/question_answering/)

## Setup

This notebook uses the `CheerioWebBaseLoader`, which requires installing `cheerio`:

```bash
npm install cheerio
```

In [1]:
import "cheerio";
import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";

const loader = new CheerioWebBaseLoader(
  "https://docs.smith.langchain.com/overview"
);

const docs = await loader.load();


## Split documents


In [2]:
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const textSplitter = new RecursiveCharacterTextSplitter();
const documents = await textSplitter.splitDocuments(docs);

In [3]:
console.log(documents.length);

60
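
To make the splitting step less opaque, here is a minimal sketch of the idea behind recursive character splitting: try the coarsest separator first, and fall back to finer ones only for pieces that are still too long. This is not LangChain's implementation. The function name `splitRecursively` is hypothetical, the separator list is assumed to match the splitter's usual defaults, and chunk overlap is left out for brevity.

```typescript
// Simplified sketch of recursive character splitting (no chunk overlap).
// `splitRecursively` is an illustrative helper, not a LangChain API.
function splitRecursively(
  text: string,
  chunkSize: number,
  separators: string[] = ["\n\n", "\n", " ", ""]
): string[] {
  // Short enough already: keep the text as a single chunk.
  if (text.length <= chunkSize) {
    return text.length > 0 ? [text] : [];
  }
  const [separator, ...rest] = separators;
  // No usable separator left: fall back to fixed-width slices.
  if (separator === undefined || separator === "") {
    const slices: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      slices.push(text.slice(i, i + chunkSize));
    }
    return slices;
  }
  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(separator)) {
    if (piece.length === 0) continue;
    // Greedily merge neighbouring pieces while they still fit in one chunk.
    const candidate = current === "" ? piece : current + separator + piece;
    if (candidate.length <= chunkSize) {
      current = candidate;
      continue;
    }
    if (current !== "") chunks.push(current);
    current = "";
    if (piece.length <= chunkSize) {
      current = piece;
    } else {
      // The piece alone is too long: retry with the next, finer separator.
      chunks.push(...splitRecursively(piece, chunkSize, rest));
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

const sampleChunks = splitRecursively(
  "alpha beta\n\ngamma delta epsilon\n\nzeta",
  12
);
console.log(sampleChunks);
// → [ "alpha beta", "gamma delta", "epsilon", "zeta" ]
```

The middle paragraph is longer than 12 characters, so it is split again on spaces, while the other two paragraphs survive intact.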


## Index Documents

In [4]:
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const embeddings = new OpenAIEmbeddings();
const vector = await MemoryVectorStore.fromDocuments(documents, embeddings);
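
Conceptually, an in-memory vector store keeps one embedding per document and ranks documents by their similarity to the embedded query. The sketch below illustrates that idea with cosine similarity. It is not `MemoryVectorStore` itself: `toyEmbed` is a hypothetical keyword-count stand-in for a real embedding model such as `OpenAIEmbeddings`, and the sample texts are made up for illustration.

```typescript
// Illustrative in-memory similarity search; all names here are hypothetical.
type Entry = { text: string; vector: number[] };

// Toy embedding: counts occurrences of a few hand-picked keywords.
// A real embedding model returns dense vectors instead.
const VOCABULARY = ["test", "dataset", "trace", "monitor"];
function toyEmbed(text: string): number[] {
  const lower = text.toLowerCase();
  return VOCABULARY.map((word) => lower.split(word).length - 1);
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Embed the query, score every entry, and return the k best matches.
function similaritySearch(entries: Entry[], query: string, k: number): Entry[] {
  const queryVector = toyEmbed(query);
  return entries
    .map((entry) => ({ entry, score: cosineSimilarity(entry.vector, queryVector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ entry }) => entry);
}

const texts = [
  "Datasets collect examples for running a test suite.",
  "Traces are logged so you can monitor latency.",
  "Monitor token usage over time.",
];
const store: Entry[] = texts.map((text) => ({ text, vector: toyEmbed(text) }));
const hits = similaritySearch(store, "how do I test with a dataset?", 1);
console.log(hits[0]?.text);
// → Datasets collect examples for running a test suite.
```

A linear scan like this is fine for a few dozen chunks. Larger collections usually call for a dedicated vector database.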

## Query Documents

In [5]:
import { createStuffDocumentsChain } from "langchain/chains/combine_documents";
import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate } from "@langchain/core/prompts";

const prompt = ChatPromptTemplate.fromTemplate(`Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}`);
const llm = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  temperature: 0,
});

const documentChain = await createStuffDocumentsChain({
  llm,
  prompt,
});
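
The "stuff" strategy is simple enough to sketch by hand: the text of every retrieved document is concatenated and substituted for `{context}` in the prompt, and the question fills `{input}`. The helper below is a hypothetical illustration of that idea, not the code inside `createStuffDocumentsChain`. The `"\n\n"` separator and the sample documents are assumptions.

```typescript
// Illustrative sketch of "stuffing" documents into a prompt.
// `stuffDocuments` is a hypothetical helper, not a LangChain API.
type Doc = { pageContent: string };

// Same template text as the prompt defined in the cell above.
const template = `Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}`;

function stuffDocuments(docs: Doc[], input: string, separator = "\n\n"): string {
  // Join all document contents into one context string.
  const context = docs.map((doc) => doc.pageContent).join(separator);
  // Fill both placeholders in a single pass so inserted text is never rescanned.
  return template.replace(/\{(context|input)\}/g, (_match, key: string) =>
    key === "context" ? context : input
  );
}

const stuffed = stuffDocuments(
  [
    { pageContent: "LangSmith logs traces." },
    { pageContent: "Datasets hold test examples." },
  ],
  "how can langsmith help with testing?"
);
console.log(stuffed);
```

Because every document goes into a single prompt, this approach only works while the combined text fits within the model's context window.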

In [6]:
import { createRetrievalChain } from "langchain/chains/retrieval";

const retriever = vector.asRetriever();
const retrievalChain = await createRetrievalChain({
  retriever,
  combineDocsChain: documentChain,
});

In [7]:
const response = await retrievalChain.invoke({ input: "how can langsmith help with testing?" });
console.log(response.answer);

LangSmith can help with testing by allowing users to quickly edit examples and add them to datasets. This helps expand the surface area of evaluation sets and fine-tune models for improved quality or reduced costs. Additionally, LangSmith can be used to monitor applications, log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Users can also associate feedback programmatically with runs, track performance over time, and pinpoint underperforming data points.


## Advanced Retrieval

In [8]:
import { MultiQueryRetriever } from "langchain/retrievers/multi_query";

const advancedRetriever = MultiQueryRetriever.fromLLM({ llm, retriever });
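
The idea behind multi-query retrieval can be sketched without an LLM: rephrase the question several ways, run the base retriever once per variant, and keep the unique union of the results. The code below is a rough illustration under stated assumptions, not `MultiQueryRetriever`'s implementation. The `id` field, the keyword retriever, and the hard-coded rephraser are all hypothetical stand-ins, and in practice the LLM writes the variants.

```typescript
// Illustrative sketch of multi-query retrieval; all names are hypothetical.
type Doc = { id: string; pageContent: string };
type Retriever = (query: string) => Doc[];

function multiQueryRetrieve(
  question: string,
  generateQueries: (question: string) => string[],
  retrieve: Retriever
): Doc[] {
  const seen = new Set<string>();
  const merged: Doc[] = [];
  for (const query of generateQueries(question)) {
    for (const doc of retrieve(query)) {
      // Keep each document once, even if several variants return it.
      if (!seen.has(doc.id)) {
        seen.add(doc.id);
        merged.push(doc);
      }
    }
  }
  return merged;
}

const corpus: Doc[] = [
  { id: "d1", pageContent: "Datasets let you run tests on examples." },
  { id: "d2", pageContent: "Evaluators score outputs during evaluation." },
  { id: "d3", pageContent: "Traces help with monitoring." },
];

// Stand-in retriever: plain substring match instead of vector similarity.
const keywordRetriever: Retriever = (query) =>
  corpus.filter((doc) =>
    doc.pageContent.toLowerCase().includes(query.toLowerCase())
  );

// Stand-in for the LLM call that would rephrase the question.
const fakeRephraser = (_question: string): string[] => [
  "tests",
  "evaluation",
  "examples",
];

const mergedDocs = multiQueryRetrieve(
  "how can langsmith help with testing?",
  fakeRephraser,
  keywordRetriever
);
console.log(mergedDocs.map((doc) => doc.id));
// → [ "d1", "d2" ]
```

Here `d1` matches two of the variants but appears only once in the result, and `d2` is found only because a rephrased query reached it. That wider recall is the motivation for using several queries.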

In [9]:
const advancedRetrievalChain = await createRetrievalChain({
  retriever: advancedRetriever,
  combineDocsChain: documentChain,
});

In [10]:
const advancedResponse = await advancedRetrievalChain.invoke({ input: "how can langsmith help with testing?" });
console.log(advancedResponse.answer);

LangSmith can help with testing by allowing users to quickly edit examples and add them to datasets, expanding the surface area of evaluation sets. It can also be used to fine-tune a model for improved quality or reduced costs. Additionally, LangSmith provides the ability to monitor applications, log traces, visualize latency and token usage statistics, and troubleshoot specific issues. Users can also associate feedback programmatically with runs, track performance over time, and pinpoint underperforming data points. LangSmith simplifies debugging by providing insights into unexpected end results, agent looping, chain speed, and token usage. It also allows for the construction and export of datasets for future testing and evaluation.
