# Building with LangChain

LangChain enables building applications that connect external sources of data and computation to LLMs. In this quickstart, we will walk through a few different ways of doing that. We will start with a simple LLM chain, which relies only on information in the prompt template to respond. Next, we will build a retrieval chain, which fetches data from a separate database and passes it into the prompt template. We will then add chat history to create a conversation retrieval chain. This allows you to interact with the LLM in a chat manner, so it remembers previous questions. Finally, we will build an agent, which uses an LLM to determine whether or not it needs to fetch data to answer questions. We will cover these at a high level, but there are a lot of details to all of these! We will link to relevant docs.

## LLM Chain

Ollama allows you to run open-source large language models, such as Llama 2, locally.

First, follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance:

* [Download](https://ollama.ai/download)
* Fetch a model via `ollama pull llama3`

Then, make sure the Ollama server is running with `ollama serve`. After that, you can initialize the model:

In [1]:
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")

Once you've installed and initialized the LLM of your choice, we can try using it! Let's ask it what LangSmith is. LangSmith wasn't present in the training data, so the model shouldn't give a very good response.

In [2]:
llm.invoke("How can LangSmith help with testing?")

"Langsmith is a powerful AI-powered language understanding and generation tool that can significantly assist with testing in various ways. Here are some examples:\n\n1. **Automated Testing**: Langsmith's API allows you to integrate its natural language processing capabilities into your automated testing frameworks, such as Selenium or Appium. This enables you to test the functionality of text-based user interfaces, like chatbots, IVR systems, or websites with complex text inputs.\n\n2. **Text-Based Test Data Generation**: Langsmith can generate realistic and diverse test data for text-based tests. For instance, you can use it to create test cases that simulate different types of customer inquiries or product reviews.\n\n3. **Natural Language Understanding (NLU) Testing**: Langsmith's NLU capabilities allow you to test the comprehension and interpretation of natural language inputs by your application or system. This is particularly useful for testing voice assistants, chatbots, or othe

We can also guide its response with a prompt template. Prompt templates convert raw user input into input that is better suited to the LLM.

In [3]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
  ("system", "You are a world class technical documentation writer."),
  ("user", "{input}")
])
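
To make the idea of a prompt template concrete, here is a minimal plain-Python sketch of the substitution step. It is not LangChain's implementation; `format_messages` is a hypothetical helper that only illustrates how the `{input}` placeholder gets filled in for each `(role, text)` pair.

```python
# Illustrative sketch only: a tiny stand-in for what a chat prompt template does.
# `format_messages` is a hypothetical helper, not part of LangChain.

def format_messages(template, **values):
    """Fill the placeholders in each (role, text) pair with the given values."""
    return [(role, text.format(**values)) for role, text in template]

template = [
    ("system", "You are a world class technical documentation writer."),
    ("user", "{input}"),
]

messages = format_messages(template, input="How can LangSmith help with testing?")
for role, text in messages:
    print(f"{role}: {text}")
```

The system message is passed through unchanged because it has no placeholders, while the user message receives the raw question.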

We can now combine these into a simple LLM chain:

In [4]:
chain = prompt | llm
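
The `|` operator pipes the output of one step into the next. The following plain-Python sketch shows one way such composition can work by overloading `__or__`. The `Step` class and the stand-in model are hypothetical and much simpler than LangChain's actual runnables.

```python
# Illustrative sketch only: composing steps with `|` via operator overloading.
# `Step` is a hypothetical class, not LangChain's Runnable.

class Step:
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Build a new step that runs `self` first, then feeds the result to `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))

fill_prompt = Step(lambda d: f"Question: {d['input']}")
fake_llm = Step(lambda text: text.upper())  # stand-in for a real model call

pipeline = fill_prompt | fake_llm
result = pipeline.invoke({"input": "hi"})
print(result)
```

The `chain` defined above composes the prompt and the model in the same left-to-right fashion.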

We can now invoke it and ask the same question. It still won't know the answer, but it should respond in a tone more appropriate for a technical writer!

In [5]:
chain.invoke({
  "input": "How can LangSmith help with testing?"
})

"As a world-class technical documentation writer, I'm delighted to discuss how LangSmith can assist with testing.\n\nLangSmith is an AI-powered language generation platform that specializes in producing high-quality, human-like content. When it comes to testing, Langsmith can be leveraged in several ways:\n\n1. **Test Case Generation**: Langsmith can generate test cases based on the requirements of a feature or functionality. This helps ensure that all possible scenarios are covered during testing.\n2. **Test Data Generation**: The platform can create realistic test data, such as user inputs, responses, and expected outcomes, making it easier to test various aspects of an application or system.\n3. **Error Message Generation**: Langsmith can generate error messages for different scenarios, helping you simulate and test error handling mechanisms in your software.\n4. **Test Scripts**: The platform can assist in creating test scripts by generating pre-defined steps and expected results f

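The output of a chat model is a message, while a plain LLM such as the `Ollama` class used here already returns a string. It's often much more convenient to work with strings, so let's add a simple output parser. It converts a chat message to a string and passes a string through unchanged, so the chain keeps working if you later swap in a chat model.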

In [6]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()
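
As a rough mental model, a string output parser normalizes whatever the model returns into plain text. The sketch below assumes a hypothetical `Message` class with a `content` attribute. It illustrates the idea and is not the `StrOutputParser` source.

```python
# Illustrative sketch only: normalizing model output to a plain string.
# `Message` and `to_text` are hypothetical names used for this example.
from dataclasses import dataclass

@dataclass
class Message:
    content: str

def to_text(output):
    """Return strings unchanged; unwrap the text of message-like objects."""
    return output if isinstance(output, str) else output.content

from_llm = to_text("plain string from an LLM")
from_chat = to_text(Message(content="text wrapped in a chat message"))
print(from_llm)
print(from_chat)
```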

In [7]:
chain = prompt | llm | output_parser

In [8]:
chain.invoke({
  "input": "How can LangSmith help with testing?"
})

"As a technical documentation writer, I'd be happy to highlight how Langsmith can facilitate effective testing.\n\nLangsmith's language translation capabilities can significantly streamline the testing process by:\n\n1. **Automated Translation**: Langsmith's AI-powered translations ensure that test cases, test scripts, and test data are accurately translated for different languages, reducing manual errors and increasing efficiency.\n2. **Consistent Test Data**: By translating test data, such as input values or expected results, Langsmith helps maintain consistency across languages, ensuring that tests remain relevant and reliable.\n3. **Reduced Manual Effort**: With Langsmith's automated translations, testers can focus on testing rather than manually translating test scripts and data.\n4. **Improved Collaboration**: Langsmith's real-time translation capabilities enable cross-cultural collaboration among teams, allowing developers, QA engineers, and other stakeholders to work together s

## Retrieval Chain

To properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM. We can do this via *retrieval*. Retrieval is useful when you have **too much data** to pass to the LLM directly. You can then use a retriever to fetch only the most relevant pieces and pass those in.

In this process, we will look up relevant documents from a Retriever and then pass them into the prompt. A Retriever can be backed by anything - a SQL table, the internet, etc - but in this instance we will populate a vector store and use that as a retriever. For more information on vectorstores, see [this documentation](https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/).

First, we need to load the data that we want to index. To do this, we will use the WebBaseLoader. This requires installing BeautifulSoup.

```sh
pip install beautifulsoup4
```

After that, we can import and use WebBaseLoader.

In [13]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")

docs = loader.load()

Next, we need to index it into a vectorstore. This requires a few components, namely an [embedding model](https://python.langchain.com/v0.1/docs/modules/data_connection/text_embedding/) and a [vectorstore](https://python.langchain.com/v0.1/docs/modules/data_connection/vectorstores/).

Embedding models can be accessed via an API or run locally. Here we use a local embedding model served by Ollama.

In [10]:
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="llama3")

Now, we can use this embedding model to ingest documents into a vectorstore. For simplicity, we will use a local vectorstore, FAISS.

First we need to install the required packages for that:

```sh
pip install faiss-cpu
```

Then we can build our index:

In [11]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
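
Splitting breaks long documents into chunks small enough to embed and retrieve individually. The sketch below shows the simplest possible variant, fixed-size character chunks with a small overlap. `RecursiveCharacterTextSplitter` is more sophisticated (it splits on separators such as paragraph and line breaks), so treat `split_text` as a hypothetical simplification.

```python
# Illustrative sketch only: fixed-size chunking with overlap.
# `split_text` is a hypothetical helper, simpler than LangChain's splitters.

def split_text(text, chunk_size, overlap):
    """Cut `text` into chunks of `chunk_size` characters sharing `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining text is already covered by the previous chunk.
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = split_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)
```

The overlap means the last character of each chunk is repeated at the start of the next, which helps keep context that would otherwise be cut at a chunk boundary.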

In [12]:
from langchain_community.vectorstores.faiss import FAISS

vector = FAISS.from_documents(
  documents,
  embeddings,
)
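
Conceptually, a vector store embeds each document and, at query time, returns the documents whose vectors are closest to the query's vector. The sketch below uses a toy bag-of-words "embedding" and cosine similarity in plain Python. It stands in for neither `OllamaEmbeddings` nor FAISS, and all names in it are hypothetical.

```python
# Illustrative sketch only: similarity search over toy word-count vectors.
# Real embeddings are dense vectors from a model; this is a deliberate simplification.
import math
from collections import Counter

def embed(text):
    """Toy embedding: a word-count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, texts, k=1):
    """Return the k texts most similar to the query."""
    q = embed(query)
    return sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)[:k]

texts = [
    "LangSmith lets you run tests on datasets",
    "Ollama runs models locally",
    "FAISS stores vectors in memory",
]
top = search("how do I run tests", texts)
print(top)
```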

Now that we have this data indexed in a vectorstore, we will create a retrieval chain. This chain will take an incoming question, look up relevant documents, then pass those documents along with the original question into an LLM and ask it to answer the original question.

First, let's set up the chain that takes a question and the retrieved documents and generates an answer.

In [14]:
from langchain.chains.combine_documents import create_stuff_documents_chain

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

If we wanted to, we could run this ourselves by passing in documents directly:

In [15]:
from langchain_core.documents import Document

document_chain.invoke({
    "input": "How can LangSmith help with testing?",
    "context": [Document(page_content="LangSmith can let you visualize test results")]
})

'According to the provided context, LangSmith can let you visualize test results.'
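
The "stuff" strategy simply concatenates the documents and inserts them into the `{context}` slot of the prompt. A minimal plain-Python sketch of that formatting step, assuming documents are plain strings and using a hypothetical `stuff_prompt` helper, could look like this:

```python
# Illustrative sketch only: "stuffing" documents into a single prompt.
# `stuff_prompt` is a hypothetical helper, not create_stuff_documents_chain itself.

PROMPT = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""

def stuff_prompt(question, docs):
    """Join all document texts and place them in the prompt's context slot."""
    return PROMPT.format(context="\n\n".join(docs), input=question)

text = stuff_prompt(
    "How can LangSmith help with testing?",
    ["LangSmith can let you visualize test results"],
)
print(text)
```

Because every document goes into one prompt, this approach only works while the combined text fits in the model's context window.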

However, we want the documents to first come from the retriever we just set up. That way, we can use the retriever to *dynamically select the most relevant documents* and pass those in for a given question.

In [16]:
from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

Note that the documents are stored in the vectorstore, which in this example is held only in memory, so the index is rebuilt each time the code runs.

We can now invoke this chain. This returns a dictionary, and the response from the LLM is in the `answer` key.
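
The overall flow of a retrieval chain is: retrieve documents for the question, pass them along with the question to the answering step, and return everything in one dictionary. The plain-Python sketch below mirrors that flow with hypothetical stand-ins for the retriever and the answering step. It is not how `create_retrieval_chain` is implemented.

```python
# Illustrative sketch only: the data flow of a retrieval chain.
# `make_retrieval_chain`, `retrieve`, and `answer` are hypothetical stand-ins.

def make_retrieval_chain(retrieve, answer):
    def invoke(inputs):
        docs = retrieve(inputs["input"])        # look up relevant documents
        enriched = {**inputs, "context": docs}  # keep the question, add context
        return {**enriched, "answer": answer(enriched)}
    return invoke

notes = {
    "testing": "LangSmith can let you visualize test results",
    "local": "Ollama runs models locally",
}

def retrieve(question):
    # Toy retriever: keyword match instead of vector similarity.
    return [text for key, text in notes.items() if key in question.lower()]

def answer(inputs):
    # Stand-in for the LLM call.
    return f"Based on {len(inputs['context'])} document(s): " + " ".join(inputs["context"])

toy_chain = make_retrieval_chain(retrieve, answer)
toy_response = toy_chain({"input": "How can LangSmith help with testing?"})
print(toy_response["answer"])
```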

In [17]:
response = retrieval_chain.invoke({"input": "How can LangSmith help with testing?"})
response

{'input': 'How can LangSmith help with testing?',
 'context': [Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookThis is outdated documentation for 🦜️🛠️ LangSmith, which is no longer actively maintained.For up-to-date documentation, see the latest version.User GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly und

In [18]:
response["answer"]

'According to the provided context, LangSmith can help with testing in several ways:\n\n1. **Prototyping**: LangSmith allows for rapid experimentation between prompts, model types, retrieval strategy, and other parameters.\n2. **Test Driven Approach**: Developers can create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications.\n3. **Native Rendering**: Langsmith provides native rendering of chat messages, functions, and retrieve documents for debugging purposes.\n4. **Comparison View**: Users can view results for different configurations on the same datapoints side-by-side, allowing for easy comparison and diagnosis of regressions in test scores.\n5. **Test Case Management**: LangSmith allows users to create test cases, upload them in bulk, or export them from application traces.\n6. **Evaluation and Scoring**: Users can run custom evaluations (both LLM and heuristic-based) to score test results.\n7. **Automated Testing**