<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_07_4_more_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 7: LangChain: Agents**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 7 Material

* Part 7.1: Introduction to LangChain Agents [[Video]](https://www.youtube.com/watch?v=J5Vr___lSSs) [[Notebook]](t81_559_class_07_1_agents.ipynb)
* Part 7.2: Understanding LangChain Agent Tools [[Video]](https://www.youtube.com/watch?v=qMquBmteYw4) [[Notebook]](t81_559_class_07_2_tools.ipynb)
* Part 7.3: LangChain Retrieval and Search Tools [[Video]](https://www.youtube.com/watch?v=NB5qGPLoBBE) [[Notebook]](t81_559_class_07_3_search_tools.ipynb)
* **Part 7.4: Constructing LangChain Agents** [[Video]](https://www.youtube.com/watch?v=OJe5oHvrdHk) [[Notebook]](t81_559_class_07_4_more_agent.ipynb)
* Part 7.5: Custom Agents [[Video]](https://www.youtube.com/watch?v=IsJemVYSEdc) [[Notebook]](t81_559_class_07_5_custom_agent.ipynb)

# Google CoLab Instructions

The following code ensures that Google CoLab is running and maps Google Drive if needed.

In [None]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
    os.environ["TAVILY_API_KEY"] = userdata.get('TAVILY_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai langchain_experimental duckduckgo-search langchainhub sentence-transformers faiss-cpu
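The setup cell above copies API keys into environment variables only when running in CoLab. Outside CoLab, it can help to confirm the keys are present before building the agent. The following is a minimal sketch; the helper name `missing_keys` and the placeholder values are assumptions made for illustration and are not part of the course code.

```python
import os

def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

# Hypothetical stand-in for os.environ, with one key left empty
example_env = {"OPENAI_API_KEY": "sk-placeholder", "TAVILY_API_KEY": ""}
print(missing_keys(["OPENAI_API_KEY", "TAVILY_API_KEY"], example_env))
```

Calling `missing_keys(["OPENAI_API_KEY", "TAVILY_API_KEY"])` without the second argument checks the real environment.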

# 7.4: Constructing LangChain Agents


In this chapter, we build a complete LangChain agent that combines web search, retrieval, and memory. The agent answers questions using Retrieval-Augmented Generation (RAG) over a current copy of the LangSmith documentation, so it is not limited to what the foundation model learned during training. Web search lets the agent pull in recent information that is in neither the model nor the indexed documentation, and the memory component lets it recall earlier turns of a conversation and answer in context. By the end of this chapter, you will have seen how to assemble these pieces into a single agent.

We begin by importing the libraries our agent needs for web search, retrieval, and agent construction. As in previous chapters, we create an OpenAI chat model to serve as the foundation of the agent. The cells that follow also show two alternatives, a Groq-hosted model and a commented-out local Ollama model; whichever assignment to `llm` runs last is the model the agent uses.


In [1]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.tools.retriever import create_retriever_tool
from langchain.agents import create_tool_calling_agent
from langchain.agents import AgentExecutor

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [None]:
MODEL = 'gpt-4o-mini'

llm = ChatOpenAI(
        model=MODEL,
        temperature=0.2,
        n=1
    )

In [39]:
from langchain_groq import ChatGroq

llm = ChatGroq(model="mixtral-8x7b-32768")

In [34]:
# from langchain_ollama import ChatOllama

# llm = ChatOllama(model="llama3.1:8b", base_url="http://localhost:11434")


In this section, we use Retrieval-Augmented Generation (RAG) to give the agent access to current LangSmith documentation, which is more recent than the data the foundation model was trained on. We start by creating a WebBaseLoader instance to load content from the URL https://docs.smith.langchain.com/overview. Once the content is loaded, we use RecursiveCharacterTextSplitter to break the text into chunks of up to 1000 characters each, with an overlap of 200 characters to maintain context across chunk boundaries. These chunks are then converted into embeddings. The code below uses OllamaEmbeddings with a locally served mxbai-embed-large model; the commented-out line shows the OpenAIEmbeddings() alternative. We index the embeddings with FAISS (Facebook AI Similarity Search) for efficient similarity searching. Finally, we create a retriever from the FAISS index, so the agent can retrieve the most relevant chunks of documentation in response to user questions.

In [3]:
from langchain_ollama import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="mxbai-embed-large:latest", base_url="http://localhost:11434")

loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
# vector = FAISS.from_documents(documents, OpenAIEmbeddings())
vector = FAISS.from_documents(documents, embeddings)
retriever = vector.as_retriever()
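To make the `chunk_size` and `chunk_overlap` settings concrete, here is a simplified sketch of overlapping chunking in plain Python. It is an illustration only: the function `chunk_text` is hypothetical, and it slices at fixed character offsets, whereas RecursiveCharacterTextSplitter prefers to split on separators such as paragraph and line breaks.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

# Synthetic 2500-character sample text
sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])
```

The last 200 characters of each chunk reappear at the start of the next one, so a sentence that straddles a boundary is still seen whole in at least one chunk.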

We now submit a simple query to the retriever to see what data it returns. The result below is the first matching chunk from the website, not yet processed by an LLM.

In [4]:
retriever.invoke("how to upload a dataset")[0]

Document(metadata={'source': 'https://docs.smith.langchain.com/overview', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for building production-grade LLM applications. It allows you to closely monitor and evaluate your application, so you can ship quickly and with confidence. Use of LangChain is not necessary - LangSmith works on its own!', 'language': 'en'}, page_content='description="A sample dataset in LangSmith.")client.create_examples(    inputs=[        {"postfix": "to LangSmith"},        {"postfix": "to Evaluations in LangSmith"},    ],    outputs=[        {"output": "Welcome to LangSmith"},        {"output": "Welcome to Evaluations in LangSmith"},    ],    dataset_id=dataset.id,)# Define your evaluatordef exact_match(run, example):    return {"score": run.outputs["output"] == example.outputs["output"]}experiment_results = evaluate(    lambda input: "Welcome " + input[\'postfix\'], # Your AI system goes here    data=dataset_name,
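Conceptually, the retriever embeds the query and returns the stored chunks whose embeddings are closest to it. The sketch below shows that idea with tiny hand-written vectors and cosine similarity. The vectors, chunk labels, and helper names are made up for illustration, and the FAISS index in the notebook may use a different distance metric.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, indexed, k=2):
    """Return the texts of the k entries most similar to query_vec."""
    scored = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Hypothetical (text, embedding) pairs standing in for indexed chunks
indexed = [
    ("creating a dataset", [0.9, 0.1, 0.0]),
    ("tracing LLM calls", [0.1, 0.9, 0.1]),
    ("running evaluations", [0.6, 0.2, 0.7]),
]
print(top_k([1.0, 0.0, 0.1], indexed, k=2))
```

Real embeddings have hundreds or thousands of dimensions, but the ranking step works the same way.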

To give the agent access to the web data we just indexed, we wrap the retriever in a retriever tool using the create_retriever_tool function. We pass the retriever, the tool name "langsmith_search", and a short description that tells the model when to use the tool. The agent itself is created and wrapped in an AgentExecutor in a later cell.

In [5]:
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)

We also demonstrate a query to Tavily search. This tool lets the agent supplement both the foundation model and the RAG data source with ad hoc web searches.

In [6]:
search = TavilySearchResults()

search.invoke("what is the weather in SF")

[{'url': 'https://www.weatherapi.com/',
  'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.775, 'lon': -122.4183, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1729026028, 'localtime': '2024-10-15 14:00'}, 'current': {'last_updated_epoch': 1729025100, 'last_updated': '2024-10-15 13:45', 'temp_c': 15.6, 'temp_f': 60.0, 'is_day': 1, 'condition': {'text': 'Cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/119.png', 'code': 1006}, 'wind_mph': 6.7, 'wind_kph': 10.8, 'wind_degree': 252, 'wind_dir': 'WSW', 'pressure_mb': 1018.0, 'pressure_in': 30.06, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 92, 'cloud': 83, 'feelslike_c': 15.5, 'feelslike_f': 59.9, 'windchill_c': 15.5, 'windchill_f': 59.9, 'heatindex_c': 15.6, 'heatindex_f': 60.0, 'dewpoint_c': 14.3, 'dewpoint_f': 57.7, 'vis_km': 10.0, 'vis_miles': 6.0, 'uv': 5.1, 'gust_mph': 9.2, 'gust_kph': 14.8}}"},
 {'url': 'https://www.timeanddate.com/weather/us
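The search tool returns a list of dictionaries, each with a `url` and a `content` field, as the output above shows. The following sketch condenses such a list into one line per hit. The helper `summarize_hits` is hypothetical, and the sample hit is an abbreviated, hand-written stand-in shaped like the real output.

```python
def summarize_hits(hits, max_chars=60):
    """Return one 'url: snippet' line per search hit, trimming long content."""
    lines = []
    for hit in hits:
        snippet = hit["content"][:max_chars]
        if len(hit["content"]) > max_chars:
            snippet += "..."  # mark that the content was cut
        lines.append(f"{hit['url']}: {snippet}")
    return lines

# Hand-written sample in the same shape as the search output
sample_hits = [
    {"url": "https://www.weatherapi.com/", "content": "San Francisco, Cloudy, 60.0 F"},
]
for line in summarize_hits(sample_hits):
    print(line)
```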

Next we set up the LangChain agent by integrating various tools and configuring the agent to use them effectively. We begin by defining a list of tools that includes our web search tool and the previously created retriever tool. Next, we pull a pre-defined prompt from the LangChain hub using hub.pull("hwchase17/openai-functions-agent"), which provides structured messages that guide the agent's interactions. Using this prompt, we create a tool-calling agent with the create_tool_calling_agent function, passing in our language model (LLM), the list of tools, and the prompt. Finally, we initialize the agent executor with AgentExecutor, specifying the agent and tools, and enabling verbose mode for detailed logging. This setup ensures that our LangChain agent is well-equipped to leverage both the web search and retrieval functionalities to provide accurate and up-to-date responses.

In [40]:
tools = [search, retriever_tool]

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

agent = create_tool_calling_agent(llm, tools, prompt)

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
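It may help to picture what happens after the model decides to call a tool: the executor looks the tool up by name, runs it with the model's arguments, and feeds the result back to the model. The sketch below shows only the lookup-and-run step in plain Python. The `Tool` dataclass and `run_tool_call` function are hypothetical stand-ins, not LangChain classes, and the lambdas replace the real search and retriever calls.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str
    func: Callable[[str], str]

def run_tool_call(tools, tool_name, query):
    """Look up a tool by name and run it with the given query."""
    by_name = {tool.name: tool for tool in tools}
    if tool_name not in by_name:
        return f"Unknown tool: {tool_name}"
    return by_name[tool_name].func(query)

# Stand-ins for the notebook's two tools; the lambdas fake the real calls
tools = [
    Tool("langsmith_search", "Search LangSmith docs", lambda q: f"docs hit for '{q}'"),
    Tool("tavily_search_results_json", "Search the web", lambda q: f"web hit for '{q}'"),
]
print(run_tool_call(tools, "langsmith_search", "testing"))
```

The tool names and descriptions matter because they are what the model sees when deciding which tool, if any, to call.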

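We can see how the agent responds to a trivial prompt. In the run shown below, the model invokes the langsmith_search tool even for this greeting, and its answer is unrelated to the input.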

In [13]:
agent_executor.invoke({"input": "hi!"})



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'hi!'}`


Get started with LangSmith | 🦜️🛠️ LangSmith

"revision_id": "beta"    },)import { Client, Run, Example } from "langsmith";import { evaluate } from "langsmith/evaluation";import { EvaluationResult } from "langsmith/evaluation";const client = new Client();// Define dataset: these are your test casesconst datasetName = "Sample Dataset";const dataset = await client.createDataset(datasetName, {  description: "A sample dataset in LangSmith.",});await client.createExamples({  inputs: [    { postfix: "to LangSmith" },    { postfix: "to Evaluations in LangSmith" },  ],  outputs: [    { output: "Welcome to LangSmith" },    { output: "Welcome to Evaluations in LangSmith" },  ],  datasetId: dataset.id,});// Define your evaluatorconst exactMatch = async (  run: Run,  example: Example): Promise<EvaluationResult> => {  return {    key: "exact_match",    score: run.outp

{'input': 'hi!',
 'output': 'Based on the tool call response, here\'s an answer to your original question:\n\n"How do I use Traceable in LangSmith?"\n\nYou can use Traceable in LangSmith by auto-tracing LLM calls using the `@traceable` decorator. This will enable automatic tracing of the function or pipeline.\n\nHere is some sample code:\n\n```python\nimport openai\nfrom langsmith.wrappers import wrap_openai\nfrom langsmith import traceable\n\n# Auto-trace LLM calls in-context\nclient = wrap_openai(openai.Client())\n\n@traceable # Auto-trace this function\ndef pipeline(user_input: str):\n    result = client.chat.completions.create(\n        messages=[{"role": "user", "content": user_input}],\n        model="gpt-3.5-turbo"\n    )\n    return result.choices[0].message.content\n\npipeline("Hello, world!")\n```\n\nThis will enable automatic tracing of the `pipeline` function.\n\nYou can also use Traceable in TypeScript by importing it from `langsmith/traceable` and using the `@traceable` d

Next we present it with a question about LangSmith. We can see the agent invoke the langsmith_search tool to answer it.

In [46]:
agent_executor.invoke({"input": "how can langsmith help with testing?"})



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'testing features in LangSmith'}`


"revision_id": "beta"    },)import { Client, Run, Example } from "langsmith";import { evaluate } from "langsmith/evaluation";import { EvaluationResult } from "langsmith/evaluation";const client = new Client();// Define dataset: these are your test casesconst datasetName = "Sample Dataset";const dataset = await client.createDataset(datasetName, {  description: "A sample dataset in LangSmith.",});await client.createExamples({  inputs: [    { postfix: "to LangSmith" },    { postfix: "to Evaluations in LangSmith" },  ],  outputs: [    { output: "Welcome to LangSmith" },    { output: "Welcome to Evaluations in LangSmith" },  ],  datasetId: dataset.id,});// Define your evaluatorconst exactMatch = async (  run: Run,  example: Example): Promise<EvaluationResult> => {  return {    key: "exact_match",    score: run.outputs?.output === exa

{'input': 'how can langsmith help with testing?',
 'output': 'Based on the output of the provided tool, it appears that LangSmith is a platform for defining and testing AI systems. It includes features such as creating a dataset, defining evaluators, and running evaluations using that dataset and evaluators.\n\nIn this particular output, a sample dataset was created with the name "Sample Dataset". Two examples were added to the dataset, with inputs such as "to LangSmith" and "to Evaluations in LangSmith", and corresponding outputs such as "Welcome to LangSmith" and "Welcome to Evaluations in LangSmith". An evaluator was also defined, named "exact_match", which compares the output of a run to the expected output and returns a score of 1 if they match exactly.\n\nFinally, an evaluation was run using this dataset and evaluator, and the results were stored in a variable named "experiment\\_results".\n\nIt\'s worth noting that the tool output appears to be incomplete, as it seems to be cut 

We can also ask it a current events question, causing the agent to go to the search tool.

In [15]:
agent_executor.invoke({"input": "whats the weather in sf?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'weather in sf'}`


[{'url': 'https://forecast.weather.gov/zipcity.php?inputstring=94110', 'content': 'Local Forecast Office More Local Wx 3 Day History Hourly Weather Forecast. Extended Forecast for San Francisco CA . This Afternoon. High: 73 °F. Slight Chance Drizzle. Tonight. Low: 60 °F. Chance Drizzle. Sunday. High: 72 °F. ... San Francisco CA 37.75°N 122.4°W (Elev. 197 ft) Last Update: 12:30 pm PDT Oct 12, 2024. Forecast Valid: 1pm PDT ...'}, {'url': 'https://www.weather.gov/mtr/?os=shmmfp&ref=app', 'content': 'NWS Forecast Office San Francisco, CA. Weather.gov > San Francisco Bay Area, CA Current Hazards. Current Outlooks; Daily Briefing; Submit Report; Detailed Hazards; Tsunami ... National Weather Service San Francisco Bay Area, CA 21 Grace Hopper Ave, Stop 5 Monterey, CA 93943-5505 (831) 656-1725 Comments? Questions? Please Contact Us. ...'}, {'

{'input': 'whats the weather in sf?',
 'output': 'The current weather in San Francisco is mostly cloudy, with a high of 73°F and a low of 60°F. There is a slight chance of drizzle this afternoon and tonight, and a chance of rain showers on Sunday. Beach Hazards Statement is also in effect from October 13 to October 16.'}

## Adding Chat History/Memory

To enhance our LangChain agent's capabilities, we are adding chat memory to enable it to remember previous interactions and maintain context across a conversation. This addition allows the agent to provide more coherent and contextually aware responses. We start by importing necessary modules for managing chat message histories. We then create an instance of ChatMessageHistory to store the messages exchanged during the conversation. By wrapping our existing agent executor with RunnableWithMessageHistory, we integrate the chat memory functionality. This setup ensures that each session can maintain a history of messages, facilitating smoother and more natural interactions with the agent. The code snippet below illustrates this implementation:

In [41]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

message_history = ChatMessageHistory()

agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    lambda session_id: message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
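The lambda above returns the same history object for every `session_id`, which is fine for a single conversation. With several conversations, the factory would normally keep one history per session. The sketch below shows that pattern in plain Python. `SimpleHistory` is a hypothetical stand-in for a chat history class, not a LangChain class; in the notebook, the factory would return `ChatMessageHistory()` instances instead.

```python
class SimpleHistory:
    """Plain-Python stand-in for a chat history object; not a LangChain class."""

    def __init__(self):
        self.messages = []

    def add(self, role, content):
        self.messages.append((role, content))

_store = {}  # maps session_id -> SimpleHistory

def get_session_history(session_id):
    """Return the history for session_id, creating it on first use."""
    if session_id not in _store:
        _store[session_id] = SimpleHistory()
    return _store[session_id]

get_session_history("d123").add("human", "hi! I'm Jeff")
get_session_history("other").add("human", "hello")
print(len(get_session_history("d123").messages), len(get_session_history("other").messages))
```

A function like `get_session_history` could take the place of the lambda, so that each `session_id` passed in `config` gets its own conversation history.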

We will now test the memory. I begin by introducing myself.

In [43]:
agent_with_chat_history.invoke(
    {"input": "hi! I'm Jeff"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "d123"}},
)



> Entering new AgentExecutor chain...
Hello Jeff! How can I help you today? If needed, I can use tools to provide you with a more accurate and comprehensive response.

> Finished chain.


{'input': "hi! I'm Jeff",
 'chat_history': [HumanMessage(content="hi! I'm Jeff", additional_kwargs={}, response_metadata={}),
  AIMessage(content="Based on the search results, it seems that LangSmith is an open-source AI startup that provides a unified platform for developers to track the system-level performance of their applications, model/chain performance, debug issues, and establish a broad understanding of how users are interacting with their applications. It helps teams in five core ways: debugging, providing a unified platform, tracing and evaluating complex agent prompt chains, and reducing the time required to debug and refine prompts. LangSmith's ability to track down to a single API call makes it easier to see how different components fit together. It also provides a simple UI for understanding complex workflows and logging historical data, reducing barriers to entry for those without a software background.", additional_kwargs={}, response_metadata={})],
 'output': 'Hello J

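Next, I ask it a question about LangSmith. In this run the agent answers directly, without invoking a tool, and the returned chat_history shows the earlier turn of the conversation.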

In [44]:
agent_with_chat_history.invoke(
    {"input": "how can LangSmith help with testing"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "d123"}},
)



> Entering new AgentExecutor chain...
LangSmith can help with testing by providing a unified platform for tracking system-level performance of applications, which includes model/chain performance and debugging issues. LangSmith's tracing and evaluation capabilities can assist in understanding how users interact with applications and help to debug and refine prompts more efficiently. Additionally, LangSmith's ability to track down to a single API call makes it easier to see how different components fit together during testing. It also provides a simple UI for understanding complex workflows and logging historical data, which can be useful during the testing process.

> Finished chain.


{'input': 'how can LangSmith help with testing',
 'chat_history': [HumanMessage(content="hi! I'm Jeff", additional_kwargs={}, response_metadata={}),
  AIMessage(content="Based on the search results, it seems that LangSmith is an open-source AI startup that provides a unified platform for developers to track the system-level performance of their applications, model/chain performance, debug issues, and establish a broad understanding of how users are interacting with their applications. It helps teams in five core ways: debugging, providing a unified platform, tracing and evaluating complex agent prompt chains, and reducing the time required to debug and refine prompts. LangSmith's ability to track down to a single API call makes it easier to see how different components fit together. It also provides a simple UI for understanding complex workflows and logging historical data, reducing barriers to entry for those without a software background.", additional_kwargs={}, response_metadata={}

We can test the memory further by asking the agent to recall who I am and what I just asked.

In [45]:
agent_with_chat_history.invoke(
    {"input": "Who am I? What did I just ask you about?!"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "d123"}},
)



> Entering new AgentExecutor chain...
Hello! You are Jeff, and you asked me about how LangSmith can help with testing. I provided a response explaining that LangSmith can help with testing by providing a unified platform for tracking system-level performance of applications, including model/chain performance and debugging issues. LangSmith's tracing and evaluation capabilities can assist in understanding how users interact with applications and help to debug and refine prompts more efficiently. Additionally, LangSmith's ability to track down to a single API call and provide a simple UI for understanding complex workflows and logging historical data can be useful during the testing process.

> Finished chain.


{'input': 'Who am I? What did I just ask you about?!',
 'chat_history': [HumanMessage(content="hi! I'm Jeff", additional_kwargs={}, response_metadata={}),
  AIMessage(content="Based on the search results, it seems that LangSmith is an open-source AI startup that provides a unified platform for developers to track the system-level performance of their applications, model/chain performance, debug issues, and establish a broad understanding of how users are interacting with their applications. It helps teams in five core ways: debugging, providing a unified platform, tracing and evaluating complex agent prompt chains, and reducing the time required to debug and refine prompts. LangSmith's ability to track down to a single API call makes it easier to see how different components fit together. It also provides a simple UI for understanding complex workflows and logging historical data, reducing barriers to entry for those without a software background.", additional_kwargs={}, response_metad