In [1]:
# From langchain tutorial https://python.langchain.com/docs/get_started/quickstart/#client
from dotenv import load_dotenv
# Load the .env file
load_dotenv()

True

In [2]:
# With the API key loaded from the .env file, we can try the LLM.
# Let's ask it what LangSmith is. LangSmith wasn't present in the training data, so the model is unlikely to give an accurate answer.

In [3]:
# ChatOpenAI now lives in the langchain_openai package; importing it from langchain.chat_models is deprecated.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()
llm.invoke("Hello, world!")
llm.invoke("how can langsmith help with testing?")


AIMessage(content='Langsmith can help with testing in the following ways:\n\n1. Automated testing: Langsmith can be used to create test scripts and automate the testing process, making it faster and more reliable.\n\n2. Test data generation: Langsmith can be used to generate test data for various scenarios, helping to ensure comprehensive test coverage.\n\n3. Test case management: Langsmith can be used to organize and manage test cases, making it easier to track and document test results.\n\n4. Integration testing: Langsmith can be used to conduct integration testing to ensure that different components of a system work together seamlessly.\n\n5. Performance testing: Langsmith can be used to conduct performance testing to evaluate the speed, scalability, and stability of a system under different conditions.\n\nOverall, Langsmith can help streamline the testing process and improve the quality of software products.', response_metadata={'token_usage': {'completion_tokens': 166, 'prompt_tok

In [4]:
# We can also guide its response with a prompt template.
# Prompt templates convert raw user input to better input to the LLM.

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a world-class technical documentation writer."),
    ("user", "{input}")
])
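
To make the idea of a prompt template concrete, here is a minimal standard-library sketch of what "converting raw user input to better input" amounts to. The `render_messages` helper is hypothetical and is not how LangChain implements `ChatPromptTemplate`; it only shows placeholders being filled into a list of role/text pairs.

```python
# Hypothetical stand-in for a chat prompt template; not LangChain's implementation.
def render_messages(template, **variables):
    """Fill the {placeholders} in each (role, text) pair with the given variables."""
    return [(role, text.format(**variables)) for role, text in template]


template = [
    ("system", "You are a world-class technical documentation writer."),
    ("user", "{input}"),
]

messages = render_messages(template, input="how can langsmith help with testing?")
for role, text in messages:
    print(f"{role}: {text}")
```

The system message is fixed, while the user message is built from whatever the caller passes as `input`.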

In [5]:
#We can now combine these into a simple LLM chain:

In [6]:
chain = prompt | llm
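
The `|` operator above composes two steps so that the output of the first becomes the input of the second. The sketch below imitates that behavior with a hypothetical `Step` class built on Python's `__or__` hook. It is an illustration of the composition idea only, with a fake model, and makes no claim about LangChain's internals.

```python
class Step:
    """Hypothetical minimal 'runnable': wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # `a | b` builds a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for the prompt and the model; the "model" just upper-cases its input.
fill_prompt = Step(lambda inputs: f"Question: {inputs['input']}")
fake_llm = Step(lambda text: {"content": text.upper()})

toy_chain = fill_prompt | fake_llm
result = toy_chain.invoke({"input": "hi"})
print(result)
```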

In [7]:
#We can now invoke it and ask the same question. It still won't know the answer, but it should respond in a tone more appropriate for a technical writer!

In [8]:
chain.invoke({"input": "how can langsmith help with testing?"})

AIMessage(content="Langsmith is a powerful tool that can greatly assist with testing by providing automated language testing capabilities. Here are some ways in which Langsmith can help with testing:\n\n1. Language compatibility testing: Langsmith can be used to automatically test compatibility of code written in different programming languages or versions. This can help identify potential issues early on in the development process.\n\n2. Language syntax testing: Langsmith can be used to check the syntax of code written in various programming languages, ensuring that it follows the correct rules and conventions.\n\n3. Language error detection: Langsmith can help detect errors in code written in different programming languages, providing valuable feedback to developers for debugging and troubleshooting.\n\n4. Language performance testing: Langsmith can be used to test the performance of code written in different programming languages, helping developers optimize their code for better sp

In [9]:
#The output of a ChatModel (and therefore, of this chain) is a message. However, it's often much more convenient to work with strings.
#Let's add a simple output parser to convert the chat message to a string.

In [10]:
from langchain_core.output_parsers import StrOutputParser

output_parser = StrOutputParser()

# We can add this to the previous chain
chain = prompt | llm | output_parser

chain.invoke({"input": "how can langsmith help with testing?"})

'Langsmith can help with testing in several ways:\n\n1. **Automated Testing:** Langsmith can be used to generate test cases automatically, which can help in testing the functionality of the software. By generating a wide variety of test cases, Langsmith can help in ensuring comprehensive test coverage.\n\n2. **Test Data Generation:** Langsmith can generate realistic and diverse test data that can be used for testing different scenarios. This can help in identifying edge cases and potential issues that may not be caught with limited or synthetic test data.\n\n3. **Load Testing:** Langsmith can be used to generate a large volume of test data to simulate real-world usage scenarios and evaluate the performance of the system under different load conditions. This can help in identifying bottlenecks and optimizing system performance.\n\n4. **Regression Testing:** Langsmith can generate test cases that can be used for regression testing to ensure that new code changes do not introduce any unin

In [11]:
#To properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM. We can do this via retrieval. Retrieval is useful when you have too much data to pass to the LLM directly.
#You can then use a retriever to fetch only the most relevant pieces and pass those in.
#In this process, we will look up relevant documents from a Retriever and then pass them into the prompt.
#A Retriever can be backed by anything - a SQL table, the internet, etc - but in this instance we will populate a vector store and use that as a retriever.
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")

docs = loader.load()

In [12]:
#Next, we need to index it into a vectorstore. This requires a few components, namely an embedding model and a vectorstore.
#For the embedding model, this example uses OpenAI's embeddings through the API; a locally run model would also work.

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [13]:
#Now, we can use this embedding model to ingest documents into a vectorstore. For simplicity, we will use FAISS, a local vectorstore.
#pip install faiss-cpu
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter


text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embeddings)
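
The cell above splits the page into chunks, embeds each chunk, and stores the vectors for similarity search. The sketch below walks through the same three steps with the standard library only. Everything in it is a simplifying assumption: `split_text` is a fixed-size splitter rather than a recursive one, `embed` is a bag-of-words count rather than a learned embedding, and `search` is a linear scan rather than a FAISS index.

```python
import math
from collections import Counter


def split_text(text, chunk_size, overlap):
    """Cut text into fixed-size character chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy 'embedding': lower-cased word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, k=1):
    """Return the k chunks whose vectors are most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


print(split_text("abcdefghij", chunk_size=4, overlap=1))

chunks = [
    "LangSmith lets you create datasets and run tests",
    "Tracing shows each step of a chain",
    "The playground supports rapid prompt iteration",
]
index = [(chunk, embed(chunk)) for chunk in chunks]
top = search(index, "how do I run tests on datasets")
print(top)
```

A real embedding model places semantically similar text close together even when no words are shared, which this word-count toy cannot do.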

In [14]:
#Now that we have this data indexed in a vectorstore, we will create a retrieval chain.
#This chain will take an incoming question, look up relevant documents, then pass those documents along with the original question into an LLM and ask it to answer the original question.
from langchain.chains.combine_documents import create_stuff_documents_chain

prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

In [15]:
#If we wanted to, we could run this ourselves by passing in documents directly:

from langchain_core.documents import Document

document_chain.invoke({
    "input": "how can langsmith help with testing?",
    "context": [Document(page_content="langsmith can let you visualize test results")]
})

'Langsmith can help with testing by allowing users to visualize test results.'
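
The "stuff" strategy in the cell above can be pictured as plain string assembly: the document texts are joined and placed into the `{context}` slot of the prompt. The sketch below shows that assembly with a hypothetical `stuff_documents` helper; the separator and the helper name are assumptions for illustration, not LangChain's code.

```python
PROMPT = """Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}"""


def stuff_documents(docs, question):
    """Join all document texts and insert them into the prompt's context slot."""
    context = "\n\n".join(docs)
    return PROMPT.format(context=context, input=question)


filled = stuff_documents(
    ["langsmith can let you visualize test results"],
    "how can langsmith help with testing?",
)
print(filled)
```

Because every document is pasted into one prompt, this approach only works while the combined text fits in the model's context window.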

In [16]:
#However, we want the documents to first come from the retriever we just set up.
# That way, we can use the retriever to dynamically select the most relevant documents and pass those in for a given question.

from langchain.chains import create_retrieval_chain

retriever = vector.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [17]:
#We can now invoke this chain. This returns a dictionary - the response from the LLM is in the answer key

response = retrieval_chain.invoke({"input": "how can langsmith help with testing?"})
print(response["answer"])

LangSmith can help with testing by allowing developers to create datasets, run tests on their LLM applications using these datasets, and easily run custom evaluations to score test results. Developers can also upload test cases in bulk, create them on the fly, or export them from application traces. LangSmith also provides a comparison view for tracking and diagnosing regressions in test scores across multiple revisions of an application. Additionally, LangSmith offers a playground environment for rapid iteration and experimentation, as well as the ability to capture feedback from users and annotate traces for further analysis.
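
The data flow of the retrieval chain can be sketched in a few lines: look up documents for the input, hand them to the answering step as `context`, and return everything in one dictionary with the reply under `answer`. The helper and the fake retriever and answerer below are hypothetical stand-ins, not the library's implementation.

```python
def make_retrieval_chain(retrieve, answer):
    """Hypothetical sketch: fetch context for the input, then answer with it."""

    def invoke(inputs):
        docs = retrieve(inputs["input"])
        enriched = {**inputs, "context": docs}
        # The result keeps the original inputs and adds the context and the answer.
        return {**enriched, "answer": answer(enriched)}

    return invoke


def fake_retrieve(query):
    """Stand-in retriever that always returns the same document."""
    return ["LangSmith lets you create datasets and run tests."]


def fake_answer(inputs):
    """Stand-in for the LLM step: echoes the first context document."""
    return f"Based on {len(inputs['context'])} document(s): {inputs['context'][0]}"


toy_chain = make_retrieval_chain(fake_retrieve, fake_answer)
response = toy_chain({"input": "how can langsmith help with testing?"})
print(sorted(response))
print(response["answer"])
```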


In [18]:
#Conversation Retrieval Chain
#The chain we've created so far can only answer single questions. One of the main types of LLM applications that people are building is the chatbot.
# So how do we turn this chain into one that can answer follow-up questions?

#We can still use the create_retrieval_chain function, but we need to change two things:
# 1. The retrieval method should take the whole conversation history into account, not just the most recent input.
# 2. The final LLM chain should likewise take the whole history into account.

In [19]:
# Updating Retrieval

# In order to update retrieval, we will create a new chain.
#This chain will take in the most recent input (input) and the conversation history (chat_history) and use an LLM to generate a search query.

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder

# First we need a prompt that we can pass into an LLM to generate this search query

prompt = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    ("user", "Given the above conversation, generate a search query to look up to get information relevant to the conversation")
])
retriever_chain = create_history_aware_retriever(llm, retriever, prompt)

In [20]:
#We can test this out by passing in an instance where the user asks a follow-up question.

from langchain_core.messages import HumanMessage, AIMessage

chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
retriever_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

#You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the follow-up question.

[Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly understand how the model is performing — and debug where it is failing — is incredibly important for this phase.Debugging\u200bWhen developing new LLM applications, we suggest having LangSmith tracing enabled
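
The reason a follow-up like "Tell me how" still retrieves testing documents is that the search query is rebuilt from the conversation before retrieval. The sketch below shows that control flow with hypothetical stand-ins: `fake_rewrite` replaces the LLM call with naive string joining, and the shortcut of searching with the raw input when there is no history is an assumption of this sketch.

```python
def make_history_aware_retriever(rewrite_query, retrieve):
    """Hypothetical sketch: rewrite the query from the history, then retrieve."""

    def invoke(inputs):
        history = inputs.get("chat_history", [])
        # With no history there is nothing to resolve, so search with the raw input.
        query = rewrite_query(history, inputs["input"]) if history else inputs["input"]
        return retrieve(query)

    return invoke


def fake_rewrite(history, latest):
    """Stand-in for the LLM: joins earlier human turns with the new input."""
    human_turns = [text for role, text in history if role == "human"]
    return " ".join(human_turns + [latest])


searched = []  # records every query that reaches the retriever


def fake_retrieve(query):
    searched.append(query)
    return [f"documents matching: {query}"]


toy_retriever = make_history_aware_retriever(fake_rewrite, fake_retrieve)
history = [("human", "Can LangSmith help test my LLM applications?"), ("ai", "Yes!")]
toy_retriever({"chat_history": history, "input": "Tell me how"})
toy_retriever({"input": "Tell me how"})
print(searched)
```

The first query carries the topic from the earlier turn; the second, without history, is just the bare follow-up.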

In [21]:
#Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.
prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the user's questions based on the below context:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, prompt)

retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)

In [22]:
#We can now test this out end-to-end:

chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})

{'chat_history': [HumanMessage(content='Can LangSmith help test my LLM applications?'),
  AIMessage(content='Yes!')],
 'input': 'Tell me how',
 'context': [Document(page_content='Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.Prototyping\u200bPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.\nThe ability to rapidly understand how the model is performing — and debug wh

In [23]:
#Agent

#So far we've created chains, where each step is known ahead of time.
#The final thing we will create is an agent, where the LLM decides what steps to take.

In [24]:
#One of the first things to do when building an agent is to decide what tools it should have access to.
# For this example, we will give the agent access to two tools:

# 1. The retriever we just created. This will let it easily answer questions about LangSmith
# 2. A search tool. This will let it easily answer questions that require up-to-date information.

In [25]:
#First, let's set up a tool for the retriever we just created:

from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)

In [26]:
#The search tool that we will use is Tavily. This requires an API key (they have a generous free tier).
#After creating the key on their platform, set it as the TAVILY_API_KEY environment variable (for example, in the .env file loaded above):
from langchain_community.tools.tavily_search import TavilySearchResults

search = TavilySearchResults()
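
Since both the model and the search tool read their keys from the environment, a quick check before building the agent can save a confusing failure later. The `missing_keys` helper below is a hypothetical convenience, not part of LangChain or Tavily; the variable names are the ones these integrations are expected to read.

```python
import os


def missing_keys(required, env=os.environ):
    """Return the names from `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_keys(["OPENAI_API_KEY", "TAVILY_API_KEY"])
if missing:
    print("Set these in your .env file before running the agent:", ", ".join(missing))
else:
    print("All required API keys are set.")
```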

In [27]:
#We can now create a list of the tools we want to work with:
tools = [retriever_tool, search]

In [29]:
#Now that the tools are defined, we can create the agent. We start from a predefined prompt pulled from the LangChain Hub.
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")

# You need to set OPENAI_API_KEY environment variable or pass it as argument `api_key`.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
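
What distinguishes an agent from a chain is the loop: the model picks a tool, the executor runs it, and the result is fed back until the model produces a final answer. The sketch below imitates that loop with hypothetical pieces. `fake_model` is a keyword rule standing in for the LLM's decision, and both tools return canned strings instead of calling a retriever or a search API.

```python
def langsmith_search(query):
    """Stand-in for the retriever tool; returns a canned document."""
    return "LangSmith lets you create datasets and run tests."


def web_search(query):
    """Stand-in for the web search tool; returns a canned result."""
    return "Sunny, 18C (canned result)"


TOOLS = {"langsmith_search": langsmith_search, "web_search": web_search}


def fake_model(question, observations):
    """Stand-in for the LLM: returns either a tool call or a final answer."""
    if observations:
        return {"final": observations[-1]}
    tool = "langsmith_search" if "langsmith" in question.lower() else "web_search"
    return {"tool": tool, "query": question}


def run_agent(question, max_steps=3):
    """Loop: ask the model what to do, run the chosen tool, feed the result back."""
    observations = []
    for _ in range(max_steps):
        decision = fake_model(question, observations)
        if "final" in decision:
            return decision["final"]
        observations.append(TOOLS[decision["tool"]](decision["query"]))
    return "Stopped: step limit reached."


print(run_agent("how can LangSmith help with testing?"))
print(run_agent("what is the weather in SF?"))
```

The step limit is there so that a model which never settles on a final answer cannot loop forever.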

In [30]:
#We can now invoke the agent and see how it responds! We can ask it questions about LangSmith:

agent_executor.invoke({"input": "how can LangSmith help with testing?"})



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'How can LangSmith help with testing?'}`


Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.PrototypingPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.
The ability to rapidly understand how the model is performing — and debug where it is failing 

{'input': 'how can LangSmith help with testing?',
 'output': 'LangSmith can help with testing in various ways throughout the application development lifecycle. Here are some key ways LangSmith can assist with testing:\n\n1. Prototyping: LangSmith supports quick experimentation between prompts, model types, retrieval strategies, and other parameters to help developers prototype LLM applications efficiently.\n\n2. Debugging: LangSmith provides tracing capabilities that give clear visibility and debugging information at each step of an LLM sequence, making it easier to identify and root-cause issues when things go wrong.\n\n3. Initial Test Set: Developers can create datasets of inputs and reference outputs to run tests on LLM applications. LangSmith allows for running custom evaluations to score test results.\n\n4. Comparison View: LangSmith offers a user-friendly comparison view for test runs to track and diagnose regressions in test scores across multiple revisions of an application.\n\

In [31]:
#We can ask it about the weather:
agent_executor.invoke({"input": "what is the weather in SF?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'weather in San Francisco'}`


HTTPError('400 Client Error: Bad Request for url: https://api.tavily.com/search')
I apologize for the inconvenience. It seems there was an error in retrieving the weather information for San Francisco. You may want to check a reliable weather website or app for the current weather in San Francisco.

> Finished chain.


{'input': 'what is the weather in SF?',
 'output': 'I apologize for the inconvenience. It seems there was an error in retrieving the weather information for San Francisco. You may want to check a reliable weather website or app for the current weather in San Francisco.'}

In [32]:
#We can have conversations with it:

chat_history = [HumanMessage(content="Can LangSmith help test my LLM applications?"), AIMessage(content="Yes!")]
agent_executor.invoke({
    "chat_history": chat_history,
    "input": "Tell me how"
})



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'LangSmith LLM application testing services'}`


Skip to main contentLangSmith API DocsSearchGo to AppQuick StartUser GuideTracingEvaluationProduction Monitoring & AutomationsPrompt HubProxyPricingSelf-HostingCookbookUser GuideOn this pageLangSmith User GuideLangSmith is a platform for LLM application development, monitoring, and testing. In this guide, we’ll highlight the breadth of workflows LangSmith supports and how they fit into each stage of the application development lifecycle. We hope this will inform users how to best utilize this powerful platform or give them something to consider if they’re just starting their journey.PrototypingPrototyping LLM applications often involves quick experimentation between prompts, model types, retrieval strategy and other parameters.
The ability to rapidly understand how the model is performing — and debug where it is fa

{'chat_history': [HumanMessage(content='Can LangSmith help test my LLM applications?'),
  AIMessage(content='Yes!')],
 'input': 'Tell me how',
 'output': 'LangSmith is a platform that supports LLM application development, monitoring, and testing. Here are some ways LangSmith can help test your LLM applications:\n\n1. Prototyping: LangSmith allows for quick experimentation between prompts, model types, and retrieval strategies to understand how the model is performing during the prototyping phase.\n\n2. Debugging: LangSmith provides tracing capabilities to help debug issues such as unexpected end results, infinite loops, slow execution, and high token usage.\n\n3. Initial Test Set: Developers can create datasets of inputs and reference outputs to run tests on LLM applications and evaluate their performance.\n\n4. Comparison View: LangSmith offers a comparison view to track and diagnose regressions in test scores across different versions of your application.\n\n5. Playground: A playgrou