# Quickstart
https://python.langchain.com/docs/get_started/quickstart/

In this quickstart we'll show you how to:

- Get set up with LangChain, LangSmith and LangServe
- Use the most basic and common components of LangChain: prompt templates, models, and output parsers
- Use LangChain Expression Language, the protocol that LangChain is built on and which facilitates component chaining
- Build a simple application with LangChain
- Trace your application with LangSmith
- Serve your application with LangServe

In [1]:
from IPython.display import display, Markdown

from langchain import hub

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.prompts import MessagesPlaceholder
from langchain_core.output_parsers import StrOutputParser
from langchain_core.documents import Document
from langchain_core.messages import HumanMessage, AIMessage

from langchain_community.llms import Ollama
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

from langchain_text_splitters import RecursiveCharacterTextSplitter

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain.chains import create_history_aware_retriever

from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor

from langchain.tools.retriever import create_retriever_tool
from langchain_community.tools.tavily_search import TavilySearchResults

from langchain_openai import ChatOpenAI

from dotenv import load_dotenv
load_dotenv()

llm = Ollama(model="llama3:instruct")

# LLM Chain

In [2]:
# Once you've installed and initialized the LLM of your choice, we can try using it! 
# Let's ask it what LangSmith is - this is something that wasn't present in the training data so it shouldn't have a very good response.
markdown_text = llm.invoke("how can langsmith help with testing?")
display(Markdown(markdown_text))

Langsmith, a language model and AI-powered tool, can indeed assist in testing, particularly in the realm of natural language processing (NLP) and text-based applications. Here are some ways Langsmith can help:

1. **Automated Testing**: Langsmith can be used to generate test data for NLP models, such as language understanding, sentiment analysis, or named entity recognition. This helps ensure your model is functioning correctly and consistently.
2. **Error Detection**: By analyzing input text, Langsmith can identify potential errors in grammar, syntax, or semantic meaning, which can help detect issues in your application's logic.
3. **Conversational Flow Testing**: Langsmith can simulate conversations with users, allowing you to test the flow of interactions and detect any inconsistencies or errors in the dialogue.
4. **Text Classification**: Langsmith can classify text into predefined categories (e.g., positive/negative sentiment, spam/not spam) to help evaluate the accuracy of your application's text classification capabilities.
5. **NLU Benchmarking**: Langsmith provides a benchmark for evaluating NLP models and APIs, enabling you to compare the performance of different models and identify areas for improvement.

Some examples of how Langsmith can be used in testing include:

* Testing chatbots or virtual assistants: Langsmith can simulate conversations with users, helping you ensure your AI-powered bot responds accurately and consistently.
* Evaluating sentiment analysis models: Langsmith can generate text with varying sentiment (positive, negative, neutral) to test the accuracy of your sentiment analysis model.
* Developing and testing language translation APIs: Langsmith can provide test data for translating different languages, ensuring your API produces accurate translations.

By leveraging Langsmith in your testing process, you can:

* Improve the reliability and consistency of your NLP-based applications
* Identify potential issues before they impact users
* Optimize the performance of your AI-powered systems

If you're interested in learning more about how Langsmith can support testing for your project or application, feel free to ask!

In [3]:
# We can also guide its response with a prompt template. Prompt templates convert raw user input to better input to the LLM.
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world class technical documentation writer."),
    ("user", "{input}")
])

# We can now combine these into a simple LLM chain:
chain = prompt | llm

# We can now invoke it and ask the same question. 
# It still won't know the answer, but it should respond in a more proper tone for a technical writer!
display(Markdown(chain.invoke({"input": "how can langsmith help with testing?"})))

As a world-class technical documentation writer, I'd be delighted to explore the possibilities of Langsmith in testing!

Langsmith's AI-powered capabilities can significantly enhance the testing process by:

1. **Automated Testing**: Langsmith can generate test scenarios and scripts based on your existing documentation, freeing up your team to focus on higher-level testing tasks.
2. **Error Detection**: By analyzing code snippets, Langsmith can identify potential errors or inconsistencies, reducing the likelihood of defects making it into production.
3. **Code Review**: Langsmith's AI can review code for best practices, syntax, and semantic accuracy, providing valuable insights to improve coding quality.
4. **Test Case Generation**: Langsmith can generate test cases based on your documentation, ensuring that critical scenarios are covered and reducing the risk of missing important tests.
5. **Documentation Insights**: By analyzing existing documentation, Langsmith can identify areas where clarification or updates are needed, improving overall documentation quality.

To get started with Langsmith in testing, I recommend:

1. Identifying specific areas of your software or application that require testing (e.g., API endpoints, UI components).
2. Using Langsmith's built-in tools to generate test scenarios and scripts based on your existing documentation.
3. Reviewing the generated test cases to ensure they accurately reflect the expected behavior of your system.
4. Integrating Langsmith with your existing testing frameworks or tools to streamline the testing process.

By leveraging Langsmith in testing, you'll be able to:

* Increase testing efficiency
* Reduce the risk of errors and defects
* Improve overall code quality and maintainability

How can I help you get started with Langsmith in testing?

In [4]:
# Let's add a simple output parser to convert the chat message to a string.
output_parser = StrOutputParser()

# We can now add this to the previous chain:
chain = prompt | llm | output_parser

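# We can now invoke it and ask the same question. With a chat model, the parser turns the returned message into a plain string;
# the Ollama LLM used here already returns a string, so the parser passes it through unchanged.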
display(Markdown(chain.invoke({"input": "how can langsmith help with testing?"})))

What a great question!

As a technical documentation writer, I've worked closely with developers and testers to ensure that our software products are not only well-documented but also thoroughly tested.

Langsmith, as a testing framework, can help in several ways:

1. **Automated testing**: Langsmith allows you to write tests in a human-readable format (like Markdown) rather than traditional programming languages like Java or Python. This makes it easier for non-technical team members to write and maintain tests.
2. **Test-driven development**: By using Langsmith, developers can write automated tests alongside their code, ensuring that new features are thoroughly tested before they're released.
3. **Easy test maintenance**: When requirements change or bugs need to be fixed, Langsmith's simple syntax makes it easy to update existing tests without needing extensive programming knowledge.
4. **Integrations with CI/CD pipelines**: Langsmith can integrate seamlessly with continuous integration and delivery (CI/CD) pipelines, allowing for automated testing and deployment of code changes.

Some specific ways Langsmith can help with testing include:

* Writing unit tests for individual components or features
* Creating acceptance tests to verify that the software meets user requirements
* Developing integration tests to ensure that different parts of the system work together correctly
* Running regression tests to catch any breaking changes or bugs introduced by new code

By leveraging Langsmith, you can write better tests faster and easier than ever before!
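
The `|` in `prompt | llm | output_parser` pipes the output of one component into the next. As a rough picture of that idea (not LangChain's actual implementation), the sketch below uses a hypothetical `Step` class whose `__or__` method builds a new step that runs both functions in order. The stand-in `fake_llm` only echoes its prompt, so the example runs without a model.

```python
from typing import Any, Callable


class Step:
    """Hypothetical stand-in for a chainable component (not a LangChain class)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` returns a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy versions of the three components used in the chain above.
prompt_step = Step(lambda d: f"system: You are a technical writer.\nuser: {d['input']}")
fake_llm = Step(lambda text: {"content": f"(model reply to) {text}"})  # no real model call
to_string = Step(lambda message: message["content"])

toy_chain = prompt_step | fake_llm | to_string
result = toy_chain.invoke({"input": "how can langsmith help with testing?"})
print(result)
```

Each step only needs to accept what the previous step returns, which is why components can be swapped without touching the rest of the chain.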

# Retrieval Chain
To properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM. We can do this via retrieval. Retrieval is useful when you have too much data to pass to the LLM directly. You can then use a retriever to fetch only the most relevant pieces and pass those in.

In this process, we will look up relevant documents from a Retriever and then pass them into the prompt. A Retriever can be backed by anything - a SQL table, the internet, etc. - but in this instance we will populate a vector store and use that as a retriever. For more information on vector stores, see the LangChain vector store documentation.

First, we need to load the data that we want to index. To do this, we will use the WebBaseLoader, which requires BeautifulSoup (installable with `pip install beautifulsoup4`):

In [5]:
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
docs = loader.load()

embeddings = OllamaEmbeddings(model="mxbai-embed-large:latest")


# Let's build our index
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embeddings)
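
A vector store such as FAISS keeps one embedding per chunk and, at query time, returns the chunks whose embeddings are closest to the query's. The following is a toy sketch of that idea, using word counts as "embeddings" and cosine similarity as the distance. The `embed`, `cosine`, and `top_k` helpers are illustrative names, and real embeddings from a model like `mxbai-embed-large` are dense learned vectors rather than word counts.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts (an assumption for illustration).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: List[str], k: int = 1) -> List[str]:
    # Rank chunks by similarity to the query and keep the k best.
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "langsmith lets you run tests on datasets",
    "the weather in antwerp is rainy",
    "faiss stores vectors for similarity search",
]
best = top_k("how can langsmith help with tests", chunks, k=1)
print(best)
```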

In [6]:
# First, let's set up the chain that takes a question and the retrieved documents and generates an answer.
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

# If we wanted to, we could run this ourselves by passing in documents directly:
print(document_chain.invoke({
    "input": "how can langsmith help with testing?",
    "context": [Document(page_content="langsmith can let you visualize test results")]
}))


According to the context, Langsmith can help with testing by letting you "visualize" test results.
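
"Stuffing" means placing the text of every document into the `{context}` slot of the prompt. A minimal sketch of that step, assuming documents are joined with a blank line between them (the helper name `stuff_prompt` and the plain-string documents are illustrative, not LangChain API):

```python
from typing import List

TEMPLATE = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def stuff_prompt(question: str, page_contents: List[str]) -> str:
    # Concatenate every document's text and drop it into the template.
    context = "\n\n".join(page_contents)
    return TEMPLATE.format(context=context, input=question)


filled = stuff_prompt(
    "how can langsmith help with testing?",
    ["langsmith can let you visualize test results"],
)
print(filled)
```

Because every document is included in full, this approach only works while the combined text fits in the model's context window.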


In [7]:
# However, we want the documents to first come from the retriever we just set up. 
retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# We can now invoke this chain. This returns a dictionary - the response from the LLM is in the answer key.
response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
display(Markdown(response["answer"]))


Based on the provided context, LangSmith can help with testing in the following ways:

1. **Test Driven Approach**: LangSmith allows developers to create datasets, which are collections of inputs and reference outputs, and use these to run tests on their LLM applications.
2. **Native Rendering of Chat Messages**: LangSmith provides native rendering of chat messages, functions, and retrieve documents, making it easier to track the performance of and annotate your application across multiple turns.
3. **Comparison View**: LangSmith's comparison view allows you to view results for different configurations on the same datapoints side-by-side, helping you diagnose regressions in test scores across multiple revisions of your application.
4. **Playground Environment**: The playground environment enables rapid iteration and experimentation, allowing you to quickly test out different prompts and models.
5. **Beta Testing**: LangSmith supports beta testing by allowing developers to collect more data on how their LLM applications are performing in real-world scenarios, including capturing feedback and annotating traces.
6. **Adding Runs to a Dataset**: Langsmith enables users to add runs as examples to datasets, expanding test coverage on real-world scenarios.

Overall, LangSmith provides various features that facilitate testing, experimentation, and evaluation of LLM applications, making it easier to develop, refine, and improve these applications.
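
The dictionary returned by the retrieval chain carries the original `input`, the retrieved documents under `context`, and the model's reply under `answer`. The sketch below mimics that flow with plain functions. Here `toy_retriever` and `toy_answerer` are hypothetical stand-ins for the retriever and the document chain, not real LangChain objects.

```python
from typing import Callable, Dict, List


def make_retrieval_chain(
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
) -> Callable[[Dict[str, str]], dict]:
    def invoke(inputs: Dict[str, str]) -> dict:
        docs = retrieve(inputs["input"])        # step 1: fetch relevant documents
        reply = answer(inputs["input"], docs)   # step 2: answer using those documents
        return {**inputs, "context": docs, "answer": reply}

    return invoke


def toy_retriever(query: str) -> List[str]:
    return ["LangSmith lets you create datasets and run tests on them."]


def toy_answerer(question: str, docs: List[str]) -> str:
    return f"Based on {len(docs)} document(s): {docs[0]}"


toy_retrieval_chain = make_retrieval_chain(toy_retriever, toy_answerer)
response = toy_retrieval_chain({"input": "how can langsmith help with testing?"})
print(response["answer"])
```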

# Conversation Retrieval Chain

The chain we've created so far can only answer single questions. One of the main types of LLM applications that people are building is the chatbot. So how do we turn this chain into one that can answer follow-up questions?

We can still use the create_retrieval_chain function, but we need to change two things:

1. The retrieval method should now not just work on the most recent input, but rather should take the whole history into account.
2. The final LLM chain should likewise take the whole history into account.

**Updating Retrieval**

In order to update retrieval, we will create a new chain. This chain will take in the most recent input (`input`) and the conversation history (`chat_history`) and use an LLM to generate a search query.

In [8]:
# First we need a prompt that we can pass into an LLM to generate this search query
prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up to get information relevant to the conversation")
])
retriever_chain = create_history_aware_retriever(llm, retriever, prompt)

# We can test this out by passing in an instance where the user asks a follow-up question.
chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
retriever_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})


[Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly understand how the model is performing — and debug where it is failing — is incredibly important for this phase.Debugging\u200bWhen developing new LLM applications, we suggest having LangSmith tracing enabled

You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the follow-up question.
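
The history-aware retriever behaves differently depending on whether there is any history: with an empty `chat_history` the input goes straight to the retriever, otherwise the LLM first rewrites the conversation into a standalone search query. A sketch of that branching with hypothetical stand-ins (`rewrite_query` plays the role of the LLM call, `search` the role of the retriever):

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content); a simplification of message objects


def history_aware_search(
    user_input: str,
    chat_history: List[Message],
    rewrite_query: Callable[[List[Message], str], str],
    search: Callable[[str], List[str]],
) -> List[str]:
    if not chat_history:
        # No prior turns: the input is already a usable query.
        return search(user_input)
    # Otherwise ask the "LLM" for a standalone query first.
    return search(rewrite_query(chat_history, user_input))


def rewrite_query(history: List[Message], user_input: str) -> str:
    # Stand-in for the LLM: reuse the last human message as the topic.
    last_human = [content for role, content in history if role == "human"][-1]
    return f"{last_human} {user_input}"


queries_seen: List[str] = []


def search(query: str) -> List[str]:
    queries_seen.append(query)  # record what the retriever was asked
    return [f"document matching: {query}"]


history = [("human", "Can LangSmith help test my LLM applications?"), ("ai", "Yes!")]
history_aware_search("Tell me how", history, rewrite_query, search)
history_aware_search("Tell me how", [], rewrite_query, search)
print(queries_seen)
```

On its own, "Tell me how" gives the retriever nothing to match on, which is why the rewritten query matters for follow-up questions.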

Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.


In [9]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, prompt)

retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)

# We can now test this out end-to-end:
chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
response = retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

display(Markdown(response['answer']))

LangSmith is a platform for LLM (Large Language Model) application development, monitoring, and testing. It provides various workflows to support different stages of the application development lifecycle.

**Prototyping**: LangSmith allows you to rapidly experiment with different prompts, model types, retrieval strategies, and other parameters. You can quickly test out different scenarios and debug issues by looking at the trace information.

**Debugging**: When things go wrong, Langsmith gives you clear visibility and debugging information for each step of an LLM sequence. This makes it easier to identify and root-cause issues.

**Initial Test Set**: You can create datasets (collections of inputs and reference outputs) and use them to run tests on your LLM applications. These test cases can be uploaded in bulk, created on the fly, or exported from application traces.

**Comparison View**: When prototyping different versions of your applications, Langsmith provides a comparison view to track and diagnose regressions in test scores across multiple revisions of your application.

**Playground**: The playground environment allows you to quickly test out different prompts and models. Every playground run is logged in the system and can be used to create test cases or compare with other runs.

In addition, Langsmith provides various features for testing and monitoring LLM applications:

* **Beta Testing**: Collect more data on how your application is performing in real-world scenarios.
* **Capturing Feedback**: Gather human feedback on the responses your application produces.
* **Annotating Traces**: Send runs to annotation queues, where annotators can closely inspect interesting traces and annotate them with respect to different criteria.
* **Adding Runs to a Dataset**: Expand your test coverage by adding runs as examples to datasets.
* **Production Monitoring and A/B Testing**: Track key metrics over time, drill down into specific data points, and perform A/B testing for changes in prompt, model, or retrieval strategy.

Langsmith also supports automations, which allow you to perform actions on traces in near real-time. This can be used to automatically score traces, send them to annotation queues, or add them to datasets.

Overall, LangSmith provides a comprehensive platform for developing, testing, and monitoring LLM applications.

# Agent

We've so far created examples of chains - where each step is known ahead of time. The final thing we will create is an agent - where the LLM decides what steps to take.

**NOTE: for this example we will only show how to create an agent using OpenAI models, as local models are not reliable enough yet.**

One of the first things to do when building an agent is to decide what tools it should have access to. For this example, we will give the agent access to two tools:

1. The retriever we just created. This will let it easily answer questions about LangSmith.
2. A search tool. This will let it easily answer questions that require up-to-date information.

First, let's set up a tool for the retriever we just created:

In [10]:
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)

# The search tool that we will use is Tavily. This requires an API key (they have a generous free tier).
# After creating it on their platform, set it as an environment variable: export TAVILY_API_KEY=...
search = TavilySearchResults()

# We can now create a list of the tools we want to work with:
tools = [retriever_tool, search]

# Now that we have the tools, we can create an agent to use them. 
# We will go over this pretty quickly - for a deeper dive into what exactly is going on, check out the Agents Getting Started documentation

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
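
Conceptually, an agent executor runs a loop: ask the model what to do next, run the tool it chose, feed the result back, and stop once the model produces a final answer. The sketch below illustrates that loop with a scripted `fake_model` in place of a real LLM. The action format, the tool functions, and the `run_agent` helper are all assumptions made for illustration, not the `AgentExecutor` implementation.

```python
from typing import Callable, Dict, List, Tuple


def langsmith_search(query: str) -> str:
    return "LangSmith supports datasets, evaluations, and tracing."


def web_search(query: str) -> str:
    return "7.0 C with light rain"


TOOLS: Dict[str, Callable[[str], str]] = {
    "langsmith_search": langsmith_search,
    "web_search": web_search,
}


def fake_model(question: str, observations: List[str]) -> Tuple[str, str]:
    # Scripted stand-in for the LLM: returns ("tool name", "tool input"),
    # or ("final", "answer text") once it has an observation to work with.
    if observations:
        return ("final", f"Answer based on: {observations[-1]}")
    tool = "langsmith_search" if "langsmith" in question.lower() else "web_search"
    return (tool, question)


def run_agent(question: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action, payload = fake_model(question, observations)
        if action == "final":
            return payload
        observations.append(TOOLS[action](payload))  # run the chosen tool
    return "Stopped: step limit reached."


print(run_agent("how can langsmith help with testing?"))
print(run_agent("what is the weather in Antwerp?"))
```

The step limit guards against a model that never settles on a final answer.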

We can now invoke the agent and see how it responds! We can ask it questions about LangSmith:


In [11]:
display(Markdown(agent_executor.invoke({"input": "how can langsmith help with testing?"})['output']))

> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'how can LangSmith help with testing'}`

LangSmith User Guide | 🦜️🛠️ LangSmith

meaning that they involve a series of interactions between the user and the application. LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of and annotate your application across multiple turns.Was this page helpful?PreviousQuick StartNextOverviewPrototypingBeta TestingProductionCommunityDiscordTwitterGitHubDocs CodeLangSmith SDKPythonJS/TSMoreHomepageBlogLangChain Python DocsLangChain JS/TS DocsCopyright © 2024 LangChain, Inc.

Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, a

LangSmith can help with testing in the following ways:

1. **Debugging**: LangSmith provides clear visibility and debugging information at each step of an LLM sequence, making it easier to identify and root-cause issues when things go wrong during testing.

2. **Initial Test Set**: Developers can create datasets with collections of inputs and reference outputs to run tests on their LLM applications. LangSmith allows for running custom evaluations to score test results.

3. **Comparison View**: It's important to see if there are regressions in test cases when prototyping different versions of applications. LangSmith offers a user-friendly comparison view to track and diagnose regressions in test scores across multiple revisions.

4. **Playground**: LangSmith provides a playground environment for rapid iteration and experimentation, allowing developers to test out different prompts and models quickly.

5. **Beta Testing**: During beta testing, LangSmith helps in collecting data on how LLM applications perform in real-world scenarios. It supports feedback collection, run annotation, and development of automatic evaluations to track regressions and improvements.

6. **Monitoring and A/B Testing**: LangSmith provides monitoring charts to track key metrics over time, allowing for A/B testing changes in prompt, model, or retrieval strategy.

7. **Automations**: Automations in LangSmith enable actions to be performed on traces in near real-time, such as automatically scoring traces, sending them to annotation queues, or adding them to datasets.

8. **Threads**: LangSmith provides a threads view that groups traces from a single conversation together, making it easier to track the performance of applications across multiple turns during testing.

We can ask it about the weather:



In [13]:
display(Markdown(agent_executor.invoke({"input": "what is the weather in Antwerp in Celsius?"})['output']))

> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'weather in Antwerp in Celsius'}`

[{'url': 'https://www.weatherapi.com/', 'content': "{'location': {'name': 'Antwerp', 'region': '', 'country': 'Belgium', 'lat': 51.22, 'lon': 4.42, 'tz_id': 'Europe/Brussels', 'localtime_epoch': 1713905217, 'localtime': '2024-04-23 22:46'}, 'current': {'last_updated_epoch': 1713905100, 'last_updated': '2024-04-23 22:45', 'temp_c': 7.0, 'temp_f': 44.6, 'is_day': 0, 'condition': {'text': 'Light rain', 'icon': '//cdn.weatherapi.com/weather/64x64/night/296.png', 'code': 1183}, 'wind_mph': 5.6, 'wind_kph': 9.0, 'wind_degree': 310, 'wind_dir': 'NW', 'pressure_mb': 1016.0, 'pressure_in': 30.0, 'precip_mm': 0.04, 'precip_in': 0.0, 'humidity': 87, 'cloud': 75, 'feelslike_c': 3.8, 'feelslike_f': 38.9, 'vis_km': 10.0, 'vis_miles': 6.0, 'uv': 1.0, 'gust_mph': 17.3, 'gust_kph': 27.9}}"}, {'url': 'https://www.timeanddate.com/weather

The current weather in Antwerp, Belgium is 7.0°C with light rain.

We can have conversations with it:

In [15]:
chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
display(Markdown(agent_executor.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})['output']))

> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'LangSmith LLM application testing services'}`

Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.PrototypingPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.
The ability to rapidly understand how the model is performing — and debug where it is fa

LangSmith provides a platform for LLM application development, monitoring, and testing. Here are some ways LangSmith can help test your LLM applications:

1. Prototyping: LangSmith supports quick experimentation between prompts, model types, retrieval strategies, and other parameters for prototyping LLM applications.

2. Debugging: LangSmith offers tracing capabilities to help developers debug issues in their LLM applications, providing clear visibility and debugging information at each step of an LLM sequence.

3. Initial Test Set: Developers can create datasets of inputs and reference outputs to run tests on their LLM applications, allowing for custom evaluations to score test results.

4. Comparison View: LangSmith provides a comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of an application.

5. Playground: LangSmith offers a playground environment for rapid iteration and experimentation, allowing developers to test out different prompts and models quickly.

6. Beta Testing: Developers can use LangSmith for beta testing to collect data on how their LLM applications perform in real-world scenarios, capture feedback, annotate traces, and add runs to datasets for refinement and improvement.

7. Production Monitoring: LangSmith enables developers to closely inspect key data points, monitor application performance metrics over time, and conduct A/B testing to track changes in prompt, model, or retrieval strategy.

8. Automations: LangSmith's automation feature allows developers to perform actions on traces in near real-time, such as automatically scoring traces, sending them to annotation queues, or adding them to datasets.

Overall, LangSmith offers a comprehensive set of tools and features to support the testing and development of LLM applications.

# Serving with LangServe

Now that we've built an application, we need to serve it. That's where LangServe comes in. LangServe helps developers deploy LangChain chains as a REST API. You do not need to use LangServe to use LangChain, but in this guide we'll show how you can deploy your app with LangServe.

While the first part of this guide was intended to be run in a Jupyter Notebook, we will now move out of that. We will be creating a Python file and then interacting with it from the command line.

Install with:

```bash
pip install "langserve[all]"
```

## Server
To create a server for our application we'll make a serve.py file. This will contain our logic for serving our application. It consists of three things:

1. The definition of our chain that we just built above
2. Our FastAPI app
3. A definition of a route from which to serve the chain, which is done with langserve.add_routes


### Playground 
Every LangServe service comes with a simple built-in UI for configuring and invoking the application with streaming output and visibility into intermediate steps. Head to http://localhost:8000/agent/playground/ to try it out! Pass in the same question as before - "how can langsmith help with testing?" - and it should respond the same as before.

### Client
Now let's set up a client for programmatically interacting with our service. We can easily do this with the [langserve.RemoteRunnable](/docs/langserve#client). Using this, we can interact with the served chain as if it were running client-side.

In [17]:
from langserve import RemoteRunnable

remote_chain = RemoteRunnable("http://localhost:8000/agent/")
display(Markdown(remote_chain.invoke({
    "input": "how can langsmith help with testing?",
    "chat_history": []  # Providing an empty list as this is the first call
})['output']))

LangSmith can help with testing in various ways throughout the application development lifecycle. Here are some key ways LangSmith can assist with testing:

1. Prototyping: LangSmith supports quick experimentation between prompts, model types, retrieval strategies, and other parameters for LLM applications.
2. Debugging: LangSmith provides tracing capabilities to help developers debug issues in LLM applications by offering clear visibility and debugging information at each step of an LLM sequence.
3. Initial Test Set: Developers can create datasets of inputs and reference outputs to run tests on LLM applications, allowing for test cases to be uploaded, created, or exported from application traces.
4. Comparison View: LangSmith offers a comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of an application.
5. Playground: LangSmith provides a playground environment for rapid iteration and experimentation with different prompts and models.
6. Beta Testing: Developers can collect data on how LLM applications perform in real-world scenarios, gather feedback, and develop automatic evaluations during beta testing.
7. Capturing Feedback: LangSmith allows developers to gather human feedback on application responses and attach feedback scores to logged traces for analysis.
8. Annotating Traces: Users can send runs to annotation queues for close inspection and annotation by annotators to catch regressions across important evaluation criteria.
9. Production Monitoring: LangSmith enables users to closely inspect key data points, grow benchmarking datasets, and monitor application performance with respect to latency, cost, and feedback scores in production.
10. A/B Testing: LangSmith supports A/B testing by providing monitoring charts to track key metrics over time and allowing for tag and metadata grouping to compare different versions of applications.
11. Automations: Users can define automations in LangSmith to perform actions on traces in near real-time, such as scoring traces, sending them to annotation queues, or adding them to datasets.
12. Threads: LangSmith provides a threads view to group traces from a single conversation together, making it easier to track the performance of multi-turn LLM applications.

These features in LangSmith can enhance the testing process and help developers ensure the quality and performance of their LLM applications.
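
`RemoteRunnable` hides the HTTP details, but the same service can be reached with a plain POST request. The sketch below builds such a request with the standard library, assuming the server from this section is mounted at `/agent` and exposes LangServe's `invoke` endpoint, which takes a JSON body with an `input` field and returns the result under `output`. The helper names are illustrative, and the network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_invoke_request(base_url: str, payload: dict) -> urllib.request.Request:
    # The invoke endpoint expects the chain input under the "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_output(response_body: bytes) -> object:
    # The chain's result comes back under the "output" key; for the agent
    # this is itself a dictionary whose "output" field holds the text.
    return json.loads(response_body)["output"]


request = build_invoke_request(
    "http://localhost:8000/agent/",
    {"input": "how can langsmith help with testing?", "chat_history": []},
)
print(request.full_url, request.get_method())

# With the server running, the call would look like this:
# with urllib.request.urlopen(request) as response:
#     print(extract_output(response.read()))
```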