<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_07_4_more_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 7: LangChain: Agents**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 7 Material

* Part 7.1: Introduction to LangChain Agents [[Video]](https://www.youtube.com/watch?v=J5Vr___lSSs) [[Notebook]](t81_559_class_07_1_agents.ipynb)
* Part 7.2: Understanding LangChain Agent Tools [[Video]](https://www.youtube.com/watch?v=qMquBmteYw4) [[Notebook]](t81_559_class_07_2_tools.ipynb)
* Part 7.3: LangChain Retrieval and Search Tools [[Video]](https://www.youtube.com/watch?v=NB5qGPLoBBE) [[Notebook]](t81_559_class_07_3_search_tools.ipynb)
* **Part 7.4: Constructing LangChain Agents** [[Video]](https://www.youtube.com/watch?v=OJe5oHvrdHk) [[Notebook]](t81_559_class_07_4_more_agent.ipynb)
* Part 7.5: Custom Agents [[Video]](https://www.youtube.com/watch?v=IsJemVYSEdc) [[Notebook]](t81_559_class_07_5_custom_agent.ipynb)

# Google CoLab Instructions

The following code ensures that Google CoLab is running and maps Google Drive if needed.

In [15]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# API keys (OpenAI and Tavily), read from CoLab secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
    os.environ["TAVILY_API_KEY"] = userdata.get('TAVILY_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai langchain_experimental duckduckgo-search langchainhub sentence-transformers faiss-cpu

Note: using Google CoLab


# 7.4: Constructing LangChain Agents


In this part, we build a complete LangChain agent equipped with web search, retrieval, and memory. The agent answers questions using current LangSmith documentation through Retrieval-Augmented Generation (RAG), so it is not limited to what the foundation model learned during training. A web search tool lets the agent look up information that is in neither the model nor the retrieved documents, and a memory component lets it recall earlier turns of the conversation and answer in context. By the end of this part, you will know how to assemble these pieces into a single agent.

We begin by importing the libraries needed for web search, retrieval, and agent construction. As in previous parts, we create an OpenAI chat model to serve as the foundation of the agent. The model interprets requests, decides when to call a tool, and generates the final response. We use a low temperature (0.2) so that responses are relatively consistent from run to run.


In [16]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.tools.retriever import create_retriever_tool
from langchain.agents import create_tool_calling_agent
from langchain.agents import AgentExecutor

# You can switch this to gpt-5-mini; however OpenAI will require you to validate
# your account.
# MODEL = 'gpt-5-mini'
MODEL = 'gpt-4o-mini'

llm = ChatOpenAI(
        model=MODEL,
        temperature=0.2,
        n=1
    )


In this section, we use Retrieval-Augmented Generation (RAG) to access the current LangSmith documentation, which may be more recent than the data the foundation model was trained on. We start by creating a WebBaseLoader instance to load content from https://docs.smith.langchain.com/. Once the content is loaded, we use RecursiveCharacterTextSplitter to break the text into chunks of up to 1000 characters, with an overlap of 200 characters so that context carries across chunk boundaries. These chunks are converted into embeddings using OpenAIEmbeddings() and indexed with FAISS (Facebook AI Similarity Search) for efficient similarity search. Finally, we create a retriever from the FAISS index so that the agent can retrieve the chunks most relevant to a user's question.

In [17]:
loader = WebBaseLoader("https://docs.smith.langchain.com/")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)
vector = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vector.as_retriever()
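
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a small standard-library sketch. This is not how RecursiveCharacterTextSplitter is implemented (that class also tries to split on separators such as paragraph and line breaks); the function name `split_with_overlap` and the sample text are hypothetical and only show how fixed windows overlap.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample: 2500 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample)
print(len(chunks), [len(c) for c in chunks])
```

With these settings each window advances 800 characters, so the last 200 characters of one chunk are repeated at the start of the next.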

We now submit a simple query to the retriever to see what data it returns. The returned data comes directly from the website and has not yet been processed by an LLM.

In [18]:
retriever.invoke("how to upload a dataset")[0]

Document(id='880936a2-dd57-4377-9580-86f42bd1a325', metadata={'source': 'https://docs.smith.langchain.com/', 'title': 'Get started with LangSmith - Docs by LangChain', 'language': 'en'}, page_content='Start tracingGain visibility into each step your application takes when handling a request to debug faster.Learn moreEvaluate your applicationMeasure quality of your applications over time to build more reliable AI applications.Learn moreTest your promptsIterate on prompts, with automatic version control and collaboration features.Learn moreSet up your workspaceSet up your workspace, configure admin settings, and invite your team to collaborate.Learn moreTrace an applicationAssistantResponses are generated using AI and may contain mistakes.Docs by LangChain home pagegithubxlinkedinyoutubeResourcesChangelogLangChain AcademyTrust CenterCompanyAboutCareersBloggithubxlinkedinyoutubePowered by Mintlify')
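
To build intuition for what the retriever does, here is a toy similarity search using only the standard library. It is a sketch under strong assumptions: word counts stand in for the dense vectors produced by OpenAIEmbeddings, a full sort stands in for the FAISS index, and the names `embed`, `cosine`, `retrieve`, and the sample chunks are all hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a dense vector)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks, loosely modeled on documentation snippets.
toy_chunks = [
    "upload a dataset from a csv file",
    "trace an application to debug faster",
    "invite your team to the workspace",
]
top = retrieve("how to upload a dataset", toy_chunks, k=1)
print(top)
```

Real embeddings capture meaning rather than exact word overlap, which is why the actual retriever can return relevant chunks even when the query and the text share few words.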

To give the agent access to the web data we just downloaded, we wrap the retriever in a retriever tool. The create_retriever_tool function takes the retriever we built, a tool name ("langsmith_search"), and a description that tells the model when to use the tool. The agent itself is constructed later; at this point we only define the tool it will call.

In [19]:
retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
)
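
Conceptually, a tool pairs a name and a description with something callable, and the model sees only the name and description when deciding which tool to use. The sketch below illustrates that idea with a plain dataclass. `SimpleTool`, `fake_retriever`, and `render_tools` are hypothetical names for illustration and are not part of LangChain.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a name, a description, and a function."""
    name: str
    description: str
    func: Callable[[str], str]

    def invoke(self, query: str) -> str:
        return self.func(query)


def fake_retriever(query: str) -> str:
    """Placeholder for a real retriever call."""
    return f"(documents matching: {query})"


def render_tools(tools: List[SimpleTool]) -> str:
    """Build the text a model would read when choosing among tools."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


docs_tool = SimpleTool(
    name="langsmith_search",
    description="Search for information about LangSmith.",
    func=fake_retriever,
)
print(render_tools([docs_tool]))
print(docs_tool.invoke("how to upload a dataset"))
```

This is why the description passed to create_retriever_tool matters: a vague description gives the model little basis for picking the right tool.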

We also demonstrate a query to Tavily search. This tool lets the agent run ad hoc web searches, supplementing both the foundation model's knowledge and the RAG data source.

In [20]:
search = TavilySearchResults()

search.invoke("what is the weather in SF")

[{'title': 'Saturday, August 30, 2025. San Francisco, CA - Weather Forecast',
  'url': 'https://weathershogun.com/weather/usa/ca/san-francisco/480/august/2025-08-30',
  'score': 0.9551898},
 {'title': 'Weather San Francisco in August 2025: Temperature & Climate',
  'url': 'https://en.climate-data.org/north-america/united-states-of-america/california/san-francisco-385/t/august-8/',
  'content': '| Humidity |\n ---  ---  ---  --- |\n| 28. August | Few clouds | 20 °C 68 °F | 15 °C 59 °F | 0 % | 13 km/h 8 mph | 0mm 0 in | 80% |\n| 29. August | Scattered clouds | 19 °C 66.2 °F | 14 °C 57.2 °F | 0 % | 13 km/h 8 mph | 0mm 0 in | 84% |\n| 30. August | Few clouds | 22 °C 71.6 °F | 14 °C 57.2 °F | 0 % | 12 km/h 7 mph | 0mm 0 in | 78% |\n| 31. August | Few clouds | 21 °C 69.8 °F | 14 °C 57.2 °F | 0 % | 13 km/h 8 mph | 0mm 0 in | 80% | [...] | 29. August | 18 °C | 64 °F | 24 °C | 74 °F | 14 °C | 57 °F | 14 °C | 58 °F | 0.0 mm | 0.0 inch. |\n| 30. August | 17 °C | 63 °F | 23 °C | 74 °F | 14 °C | 56

Next, we set up the LangChain agent by combining the tools and configuring the agent to use them. We begin by defining a list of tools that contains the web search tool and the retriever tool created earlier. We then pull a predefined prompt from the LangChain hub using hub.pull("hwchase17/openai-functions-agent"), which provides the structured messages that guide the agent's interactions. Using this prompt, we create a tool-calling agent with the create_tool_calling_agent function, passing in the language model (LLM), the list of tools, and the prompt. Finally, we wrap the agent in an AgentExecutor, specifying the agent and tools and enabling verbose mode for detailed logging.

In [21]:
tools = [search, retriever_tool]

# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
prompt.messages

agent = create_tool_calling_agent(llm, tools, prompt)

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
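
The loop that an agent executor runs can be sketched in a few lines: ask the model what to do, call the chosen tool if there is one, feed the result back, and stop when the model produces a final answer. The code below is a simplified illustration and not AgentExecutor's actual implementation. `fake_model`, `TOOLS`, and `run_agent` are hypothetical, and the keyword rules stand in for decisions a real LLM would make.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical tools: plain functions keyed by name.
TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": lambda q: f"web results for '{q}'",
    "langsmith_search": lambda q: f"doc chunks for '{q}'",
}


def fake_model(user_input: str, observation: Optional[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: returns ("tool", name) or ("final", answer)."""
    if observation is not None:
        return ("final", f"Answer based on: {observation}")
    if "langsmith" in user_input.lower():
        return ("tool", "langsmith_search")
    if "weather" in user_input.lower():
        return ("tool", "web_search")
    return ("final", "Hello! How can I help?")


def run_agent(user_input: str, max_steps: int = 3) -> str:
    """Alternate between model decisions and tool calls until a final answer."""
    observation = None
    for _ in range(max_steps):
        kind, value = fake_model(user_input, observation)
        if kind == "final":
            return value
        observation = TOOLS[value](user_input)  # run the chosen tool
    return "Stopped: step limit reached."


print(run_agent("hi!"))
print(run_agent("how can LangSmith help with testing?"))
```

A greeting needs no tool and returns immediately, while a LangSmith question triggers one tool call before the final answer.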



We can see how the agent responds to a trivial prompt.

In [22]:
agent_executor.invoke({"input": "hi!"})





> Entering new AgentExecutor chain...
Hello! How can I assist you today?

> Finished chain.


{'input': 'hi!', 'output': 'Hello! How can I assist you today?'}

Next, we ask a question about LangSmith. The agent calls the langsmith_search retriever tool to answer it.

In [23]:
agent_executor.invoke({"input": "how can langsmith help with testing?"})



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'how can LangSmith help with testing'}`


LangSmith is framework agnostic — you can use it with or without LangChain’s open source frameworks langchain and langgraph.

Get started with LangSmith - Docs by LangChainOur new LangChain Academy Course Deep Research with LangGraph is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KLangSmithPlatform for LLM observability and evaluationOverviewQuickstartsTrace an applicationEvaluate an applicationTest promptsAPI & SDKsAPI referencePython SDKJS/TS SDKPricingPlansPricing FAQOur new LangChain Academy Course Deep Research with LangGraph is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KAsk AIForumForumSearch...NavigationGet started with LangSmithGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGet startedObservabilityEvaluationPrompt engineeringSelf-hos

{'input': 'how can langsmith help with testing?',
 'output': 'LangSmith can assist with testing in several ways:\n\n1. **Testing Prompts**: LangSmith allows you to iterate on prompts with features like automatic version control and collaboration tools. This helps in refining and improving the prompts used in your applications.\n\n2. **Application Evaluation**: You can measure the quality of your applications over time, which helps in building more reliable AI applications. This evaluation process is crucial for ensuring that your application meets the desired performance standards.\n\n3. **Tracing Applications**: LangSmith provides visibility into each step your application takes when handling a request. This feature aids in debugging and helps you understand how your application behaves in real-time.\n\nOverall, LangSmith is designed to enhance the observability and evaluation of LLM (Large Language Model) applications, making it easier to test and ensure their reliability.'}

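We can also ask a current-events question, which causes the agent to call the search tool. In this run, the search results did not include the current weather, so the agent fell back on a general description.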

In [24]:
agent_executor.invoke({"input": "whats the weather in sf?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'current weather in San Francisco'}`



In late August, San Francisco usually experiences mild temperatures, often ranging from the mid-60s to low 70s Fahrenheit (about 18-24°C). The weather can be foggy in the mornings, with clearer skies in the afternoons. 

For the most accurate and up-to-date weather information, I recommend checking a reliable weather website or app. Would you like me to look up more specific information?

> Finished chain.


{'input': 'whats the weather in sf?',
 'output': 'It seems that the search results did not provide the current weather for San Francisco. However, I can provide a general idea of what the weather is typically like in San Francisco around this time of year.\n\nIn late August, San Francisco usually experiences mild temperatures, often ranging from the mid-60s to low 70s Fahrenheit (about 18-24°C). The weather can be foggy in the mornings, with clearer skies in the afternoons. \n\nFor the most accurate and up-to-date weather information, I recommend checking a reliable weather website or app. Would you like me to look up more specific information?'}

## Adding Chat History/Memory

To enhance our LangChain agent's capabilities, we are adding chat memory to enable it to remember previous interactions and maintain context across a conversation. This addition allows the agent to provide more coherent and contextually aware responses. We start by importing necessary modules for managing chat message histories. We then create an instance of ChatMessageHistory to store the messages exchanged during the conversation. By wrapping our existing agent executor with RunnableWithMessageHistory, we integrate the chat memory functionality. This setup ensures that each session can maintain a history of messages, facilitating smoother and more natural interactions with the agent. The code snippet below illustrates this implementation:

In [25]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

message_history = ChatMessageHistory()

agent_with_chat_history = RunnableWithMessageHistory(
    agent_executor,
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    lambda session_id: message_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)
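
The idea behind wrapping an agent with message history can be sketched without LangChain: look up the history for a session, pass the prior turns along with the new input, then record the new exchange. The code below is an illustration under assumptions and not RunnableWithMessageHistory itself. `get_session_history`, `with_history`, and `echo_agent` are hypothetical, and unlike the cell above (which returns the same history for every session id) this sketch keeps one history per session.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)
_store: Dict[str, List[Message]] = {}


def get_session_history(session_id: str) -> List[Message]:
    """Return the message list for a session, creating it on first use."""
    return _store.setdefault(session_id, [])


def with_history(respond: Callable[[str, List[Message]], str]):
    """Wrap a responder so that each call sees and then extends the session history."""
    def invoke(user_input: str, session_id: str) -> str:
        history = get_session_history(session_id)
        output = respond(user_input, list(history))  # pass a copy of prior turns
        history.append(("human", user_input))
        history.append(("ai", output))
        return output
    return invoke


def echo_agent(user_input: str, chat_history: List[Message]) -> str:
    """Placeholder agent that only reports how much history it received."""
    return f"seen {len(chat_history)} earlier messages"


chat = with_history(echo_agent)
first = chat("hi! I'm Jeff", session_id="x123")
second = chat("Who am I?", session_id="x123")
other = chat("hello", session_id="y456")
print(first, "|", second, "|", other)
```

Keying the store by session id is what would let several conversations share one agent without mixing their histories.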

We now test the memory. I begin by introducing myself.

In [26]:
agent_with_chat_history.invoke(
    {"input": "hi! I'm Jeff"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "x123"}},
)



> Entering new AgentExecutor chain...
Hi Jeff! How can I assist you today?

> Finished chain.


{'input': "hi! I'm Jeff",
 'chat_history': [],
 'output': 'Hi Jeff! How can I assist you today?'}

Next, I ask it a question. The chat_history field in the output shows that the earlier exchange was retained.

In [27]:
agent_with_chat_history.invoke(
    {"input": "how can LangSmith help with testing"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "x123"}},
)



> Entering new AgentExecutor chain...

Invoking: `langsmith_search` with `{'query': 'LangSmith testing'}`


LangSmith is framework agnostic — you can use it with or without LangChain’s open source frameworks langchain and langgraph.

Get started with LangSmith - Docs by LangChainOur new LangChain Academy Course Deep Research with LangGraph is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KLangSmithPlatform for LLM observability and evaluationOverviewQuickstartsTrace an applicationEvaluate an applicationTest promptsAPI & SDKsAPI referencePython SDKJS/TS SDKPricingPlansPricing FAQOur new LangChain Academy Course Deep Research with LangGraph is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KAsk AIForumForumSearch...NavigationGet started with LangSmithGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministration

{'input': 'how can LangSmith help with testing',
 'chat_history': [HumanMessage(content="hi! I'm Jeff", additional_kwargs={}, response_metadata={}),
  AIMessage(content='Hi Jeff! How can I assist you today?', additional_kwargs={}, response_metadata={})],
 'output': 'LangSmith is a platform designed to assist with testing and evaluating applications built with large language models (LLMs). Here are some key features that can help with testing:\n\n1. **Application Tracing**: LangSmith allows you to gain visibility into each step your application takes when handling a request. This helps in debugging and understanding the flow of your application.\n\n2. **Application Evaluation**: You can measure the quality of your applications over time, which is essential for building reliable AI applications. This feature helps ensure that your application meets performance standards.\n\n3. **Prompt Testing**: LangSmith provides tools to iterate on prompts, including automatic version control and coll

We can further review and test the memory.

In [28]:
agent_with_chat_history.invoke(
    {"input": "Who am I? What did I just ask you about?!"},
    # This is needed because in most real world scenarios, a session id is needed
    # It isn't really used here because we are using a simple in memory ChatMessageHistory
    config={"configurable": {"session_id": "x123"}},
)



> Entering new AgentExecutor chain...
You are Jeff, and you just asked about how LangSmith can help with testing. If you have any more questions or need further information, feel free to ask!

> Finished chain.


{'input': 'Who am I? What did I just ask you about?!',
 'chat_history': [HumanMessage(content="hi! I'm Jeff", additional_kwargs={}, response_metadata={}),
  AIMessage(content='Hi Jeff! How can I assist you today?', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='how can LangSmith help with testing', additional_kwargs={}, response_metadata={}),
  AIMessage(content='LangSmith is a platform designed to assist with testing and evaluating applications built with large language models (LLMs). Here are some key features that can help with testing:\n\n1. **Application Tracing**: LangSmith allows you to gain visibility into each step your application takes when handling a request. This helps in debugging and understanding the flow of your application.\n\n2. **Application Evaluation**: You can measure the quality of your applications over time, which is essential for building reliable AI applications. This feature helps ensure that your application meets performance standard