# Build a Tool-Calling Agentic AI Research Assistant with LangChain

This demo covers building AI agents with the legacy LangChain `AgentExecutor`. It is fine for getting started, but for more advanced agents and finer control, LangChain recommends using LangGraph, which we cover in other courses.

Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be. The results of those actions can then be fed back into the agent and it determines whether more actions are needed, or whether it is okay to stop.

![](https://i.imgur.com/1uVnBAm.png)



## Install OpenAI and LangChain dependencies

In [1]:
!pip install -qq langchain==0.3.14
!pip install -qq langchain-openai==0.3.0
!pip install -qq langchain-community==0.3.14

In [2]:
!pip install -qq markitdown python-dotenv

## Enter OpenAI API Key

In [None]:
# from getpass import getpass

# OPENAI_KEY = getpass('Enter Open AI API Key: ')

Enter Open AI API Key: ··········


## Enter Tavily Search API Key

Get a free API key from [here](https://tavily.com/#api)

In [None]:
# TAVILY_API_KEY = getpass('Enter Tavily Search API Key: ')

Enter Tavily Search API Key: ··········


## Enter WeatherAPI API Key

Get a free API key from [here](https://www.weatherapi.com/signup.aspx)

In [None]:
# WEATHER_API_KEY = getpass('Enter WeatherAPI API Key: ')

Enter WeatherAPI API Key: ··········


## Set Up Environment Variables

In [None]:
# import os

# os.environ['OPENAI_API_KEY'] = OPENAI_KEY
# os.environ['TAVILY_API_KEY'] = TAVILY_API_KEY

In [3]:
import os
from dotenv import load_dotenv
load_dotenv()

True
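A `True` result from `load_dotenv()` does not by itself confirm that all three keys used in this notebook are defined. A minimal sketch of a check follows; `missing_keys` is a hypothetical helper, and the variable names are assumed to match the ones used in the cells above.

```python
import os

# Names used in this notebook; adjust them if your .env file differs.
REQUIRED_KEYS = ["OPENAI_API_KEY", "TAVILY_API_KEY", "WEATHER_API_KEY"]


def missing_keys(required=REQUIRED_KEYS, env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing keys:", ", ".join(missing))
```

Running this right after loading the environment surfaces a missing key early instead of as an authentication error inside a tool call.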

## Create Tools

Here we create two custom tools, which are wrappers around the [Tavily API](https://tavily.com/#api) and [WeatherAPI](https://www.weatherapi.com/):

- Web Search tool with information extraction
- Weather tool

![](https://i.imgur.com/TyPAYXE.png)

In [38]:
WEATHER_API_KEY = os.getenv('WEATHER_API_KEY')

In [39]:
from langchain_core.tools import tool
from markitdown import MarkItDown
from langchain_community.tools.tavily_search import TavilySearchResults
from tqdm import tqdm
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import requests
import json
from warnings import filterwarnings
filterwarnings('ignore')

tavily_tool = TavilySearchResults(max_results=5,
                                  search_depth='advanced',
                                  include_answer=False,
                                  include_raw_content=True)
# certain websites won't let you crawl them unless you specify a user-agent
session = requests.Session()
session.headers.update({
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br"
})
md = MarkItDown(requests_session=session)

@tool
def search_web_extract_info(query: str) -> list:
    """Search the web for a query and extracts useful information from the search links."""
    print('Calling web search tool')
    results = tavily_tool.invoke(query)
    docs = []

    def extract_content(url):
        """Helper function to extract content from a URL."""
        extracted_info = md.convert(url)
        text_title = extracted_info.title.strip()
        text_content = extracted_info.text_content.strip()
        return text_title + '\n' + text_content
    # run each extraction in a worker thread so it can be timed out;
    # result() blocks, so the URLs are still processed one at a time
    with ThreadPoolExecutor() as executor:
        for result in tqdm(results):
            try:
                future = executor.submit(extract_content, result['url'])
                # Wait for up to 15 seconds for the task to complete
                content = future.result(timeout=15)
                docs.append(content)
            except TimeoutError:
                print(f"Extraction timed out for url: {result['url']}")
            except Exception as e:
                print(f"Error extracting from url: {result['url']} - {e}")

    return docs


from typing import Union

@tool
def get_weather(query: str) -> Union[dict, str]:
    """Search weatherapi to get the current weather of the queried location."""
    print('Calling weather tool')
    base_url = "https://api.weatherapi.com/v1/current.json"
    # let requests URL-encode the query (e.g. city names with spaces)
    params = {"key": WEATHER_API_KEY, "q": query}

    # a timeout keeps the agent from hanging on a slow response
    response = requests.get(base_url, params=params, timeout=10)
    data = response.json()
    if data.get("location"):
        return data
    else:
        return "Weather Data Not Found"
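Two details of `search_web_extract_info` are worth a second look: calling `.strip()` on a title that is `None` raises `'NoneType' object has no attribute 'strip'`, and waiting on each future right after submitting it means the URLs are handled one after another. The sketch below shows one possible alternative, not the notebook's own method. `join_title_and_text` and `extract_all` are hypothetical helpers, and `extract_fn` stands in for a function such as the `extract_content` helper above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeoutError


def join_title_and_text(title, text):
    """Join a page title and body, tolerating None for either one."""
    parts = [(title or "").strip(), (text or "").strip()]
    return "\n".join(part for part in parts if part)


def extract_all(urls, extract_fn, timeout=15, max_workers=5):
    """Run extract_fn over all URLs concurrently.

    Returns (docs, errors): docs in completion order, errors keyed by URL.
    The timeout bounds how long we wait for results; worker threads are
    not killed, and leaving the with-block still waits for running ones.
    """
    docs, errors = [], {}

    def collect(future, url):
        try:
            docs.append(future.result())
        except Exception as exc:
            errors[url] = str(exc)

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit everything first so the downloads overlap.
        futures = {executor.submit(extract_fn, url): url for url in urls}
        seen = set()
        try:
            for future in as_completed(futures, timeout=timeout):
                seen.add(future)
                collect(future, futures[future])
        except FuturesTimeoutError:
            # Deadline passed: keep late finishers, flag the rest.
            for future, url in futures.items():
                if future in seen:
                    continue
                if future.done():
                    collect(future, url)
                else:
                    future.cancel()  # only stops tasks that have not started
                    errors[url] = "timed out"
    return docs, errors
```

Because results arrive in completion order, the returned documents may not match the order of the search results; keep the URL alongside each document if ordering matters.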

In [40]:
get_weather.invoke("weather in kolkata now")

Calling weather tool


{'location': {'name': 'Kolkata',
  'region': 'West Bengal',
  'country': 'India',
  'lat': 22.5697,
  'lon': 88.3697,
  'tz_id': 'Asia/Kolkata',
  'localtime_epoch': 1752708144,
  'localtime': '2025-07-17 04:52'},
 'current': {'last_updated_epoch': 1752707700,
  'last_updated': '2025-07-17 04:45',
  'temp_c': 28.4,
  'temp_f': 83.1,
  'is_day': 0,
  'condition': {'text': 'Mist',
   'icon': '//cdn.weatherapi.com/weather/64x64/night/143.png',
   'code': 1030},
  'wind_mph': 7.4,
  'wind_kph': 11.9,
  'wind_degree': 168,
  'wind_dir': 'SSE',
  'pressure_mb': 1004.0,
  'pressure_in': 29.65,
  'precip_mm': 0.0,
  'precip_in': 0.0,
  'humidity': 94,
  'cloud': 75,
  'feelslike_c': 34.8,
  'feelslike_f': 94.6,
  'windchill_c': 27.3,
  'windchill_f': 81.2,
  'heatindex_c': 32.1,
  'heatindex_f': 89.8,
  'dewpoint_c': 25.2,
  'dewpoint_f': 77.4,
  'vis_km': 3.2,
  'vis_miles': 1.0,
  'uv': 0.0,
  'gust_mph': 11.5,
  'gust_kph': 18.5}}
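The tool returns the full response dictionary, which is more than most callers need. As a small illustration of how the nested structure shown above can be read, here is a sketch of a hypothetical `summarize_weather` helper; it assumes the `location` and `current` keys appear as in that output, and the sample values are copied from it.

```python
def summarize_weather(data):
    """Build a one-line summary from a WeatherAPI-style response dict."""
    if not isinstance(data, dict) or "location" not in data:
        return "Weather Data Not Found"
    location, current = data["location"], data["current"]
    return (
        f"{location['name']}, {location['country']}: "
        f"{current['temp_c']}°C, {current['condition']['text']}, "
        f"humidity {current['humidity']}%"
    )


# A trimmed copy of the response shown above.
sample = {
    "location": {"name": "Kolkata", "country": "India"},
    "current": {"temp_c": 28.4, "condition": {"text": "Mist"}, "humidity": 94},
}
print(summarize_weather(sample))  # Kolkata, India: 28.4°C, Mist, humidity 94%
```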

## Test Tool Calling with LLM

In [5]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [search_web_extract_info, get_weather]

chatgpt_with_tools = chatgpt.bind_tools(tools)

In [6]:
prompt = "Get details of Microsoft's earnings call Q4 2024"
response = chatgpt_with_tools.invoke(prompt)
response.tool_calls

[{'name': 'search_web_extract_info',
  'args': {'query': 'Microsoft earnings call Q4 2024 details'},
  'id': 'call_GV8xEXfSjode2c5fseFaqwW7',
  'type': 'tool_call'}]

In [7]:
prompt = "how is the weather in Bangalore today"
response = chatgpt_with_tools.invoke(prompt)
response.tool_calls

[{'name': 'get_weather',
  'args': {'query': 'Bangalore'},
  'id': 'call_ZbP60Z86ldINMwKOZoXQPiUS',
  'type': 'tool_call'}]
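Note that `tool_calls` only describes what the model wants to run; nothing has been executed yet. The sketch below shows, in plain Python, how such a list could be dispatched to functions. `run_tool_calls`, `fake_get_weather`, and the call id are hypothetical stand-ins; with real LangChain tools you would typically call `tool.invoke(call["args"])` instead of the plain function call used here.

```python
def run_tool_calls(tool_calls, registry):
    """Execute each requested tool call and pair the output with its call id."""
    results = []
    for call in tool_calls:
        fn = registry.get(call["name"])
        if fn is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = fn(**call["args"])
        results.append({"tool_call_id": call["id"], "output": output})
    return results


def fake_get_weather(query):
    """Stand-in for the real weather tool, so the sketch runs offline."""
    return {"location": {"name": query}, "current": {"temp_c": 21.3}}


# Same shape as the tool_calls output above, with a made-up id.
calls = [{"name": "get_weather", "args": {"query": "Bangalore"},
          "id": "call_demo_1", "type": "tool_call"}]
results = run_tool_calls(calls, {"get_weather": fake_get_weather})
print(results[0]["output"]["location"]["name"])  # Bangalore
```

Keeping the call id next to each output matters because the model matches tool results to its requests by id.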

## Build and Test AI Agent

Now that we have defined the tools and the LLM, we can create the agent. We will use a tool-calling agent, which binds the tools to the LLM together with a prompt. The prompt also includes an optional `history` placeholder so that past conversation turns can be supplied as memory.

In [8]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

SYS_PROMPT = """Act as a helpful assistant.
                You run in a loop of Thought, Action, PAUSE, Observation.
                At the end of the loop, you output an Answer.
                Use Thought to describe your thoughts about the question you have been asked.
                Use Action to run one of the actions available to you - then return PAUSE.
                Observation will be the result of running those actions.
                Repeat till you get to the answer for the given user query.

                Use the following workflow format:
                  Question: the input task you must solve
                  Thought: you should always think about what to do
                  Action: the action to take which can be any of the following:
                            - break it into smaller steps if needed
                            - see if you can answer the given task with your trained knowledge
                            - call the most relevant tools at your disposal mentioned below in case you need more information
                  Action Input: the input to the action
                  Observation: the result of the action
                  ... (this Thought/Action/Action Input/Observation can repeat N times)
                  Thought: I now know the final answer
                  Final Answer: the final answer to the original input question

                Tools at your disposal to perform tasks as needed:
                  - get_weather: whenever user asks get the weather of a place.
                  - search_web_extract_info: whenever user asks for specific information or if you don't know the answer.
             """

prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        MessagesPlaceholder(variable_name="history", optional=True),
        ("human", "{query}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

prompt_template.messages

[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template="Act as a helpful assistant.\n                You run in a loop of Thought, Action, PAUSE, Observation.\n                At the end of the loop, you output an Answer.\n                Use Thought to describe your thoughts about the question you have been asked.\n                Use Action to run one of the actions available to you - then return PAUSE.\n                Observation will be the result of running those actions.\n                Repeat till you get to the answer for the given user query.\n\n                Use the following workflow format:\n                  Question: the input task you must solve\n                  Thought: you should always think about what to do\n                  Action: the action to take which can be any of the following:\n                            - break it into smaller steps if needed\n                            - see if you can
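The order of entries in the template determines the order of messages the model sees: system prompt, then any history, then the user's query, then the agent's scratchpad of tool calls and results. A plain-Python sketch of that assembly follows; `build_messages` is a hypothetical helper that only mirrors the ordering and does not reproduce LangChain's implementation.

```python
def build_messages(system_prompt, query, history=None, scratchpad=None):
    """Assemble messages in the same order as the prompt template above."""
    messages = [("system", system_prompt)]
    messages.extend(history or [])     # optional: earlier conversation turns
    messages.append(("human", query))
    messages.extend(scratchpad or [])  # tool calls and results so far
    return messages


first_turn = build_messages("Act as a helpful assistant.", "which city is hotter?")
print(len(first_turn))  # 2
```

With no history supplied, the model sees only the system prompt and the current question, which is why a follow-up question has nothing to refer back to.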

Now we can initialize the agent with the LLM, the prompt, and the tools.

The agent is responsible for taking in input and deciding what actions to take.

Remember: the agent does not execute those actions; that is done by the `AgentExecutor`.

Note that we are passing in the model `chatgpt`, not `chatgpt_with_tools`.

That is because `create_tool_calling_agent` will call `.bind_tools` for us under the hood.

This should ideally be used with an LLM that supports tool/function calling.

In [9]:
from langchain.agents import create_tool_calling_agent

chatgpt = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [search_web_extract_info, get_weather]
agent = create_tool_calling_agent(chatgpt, tools, prompt_template)

Finally, we combine the `agent` (the brains) with the `tools` inside the `AgentExecutor` (which will repeatedly call the agent and execute tools).

In [14]:
from langchain.agents import AgentExecutor

agent_executor = AgentExecutor(agent=agent,
                               tools=tools,
                               early_stopping_method='force',
                               max_iterations=5)
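To make the division of labour concrete, here is a simplified sketch of the kind of loop an executor runs: ask the agent for a decision, execute the chosen tool, feed the observation back, and stop at a final answer or at the iteration cap. This is an illustration under stated assumptions, not `AgentExecutor`'s actual code; `run_agent_loop`, `toy_decide`, and the stop message are hypothetical.

```python
def run_agent_loop(decide, tools, query, max_iterations=5):
    """Drive a decide/act/observe loop with an iteration cap.

    decide(query, steps) returns ("call", tool_name, tool_input)
    or ("finish", answer).
    """
    steps = []  # (tool_name, tool_input, observation): the scratchpad
    for _ in range(max_iterations):
        decision = decide(query, steps)
        if decision[0] == "finish":
            return {"output": decision[1], "steps": steps}
        _, name, tool_input = decision
        observation = tools[name](tool_input)  # the executor runs the tool
        steps.append((name, tool_input, observation))
    # Cap reached without a final answer: stop with a fixed message.
    return {"output": "Stopped: iteration limit reached.", "steps": steps}


def toy_decide(query, steps):
    """Toy agent: call the weather tool once, then answer from its result."""
    if not steps:
        return ("call", "get_weather", "Dubai")
    return ("finish", f"It is {steps[-1][2]} in Dubai.")


result = run_agent_loop(toy_decide, {"get_weather": lambda city: "34.2°C"}, "weather?")
print(result["output"])  # It is 34.2°C in Dubai.
```

The cap is what keeps a confused agent from calling tools indefinitely, which is the role `max_iterations=5` plays in the cell above.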

In [15]:
query = """Summarize the key points discussed in Nvidia's Q4 2024 earnings call"""
response = chatgpt.invoke(query)
print(response.content)

As of my last update, Nvidia's Q4 2024 earnings call has not occurred. However, I can provide a general idea of what typically might be discussed in such a call based on past earnings calls. Key points usually include:

1. **Financial Performance**: Discussion of revenue, net income, and earnings per share compared to previous quarters and year-over-year. This includes insights into the performance of different business segments such as gaming, data center, professional visualization, and automotive.

2. **Market Trends**: Analysis of current market conditions affecting Nvidia's business, including demand for GPUs, AI, and data center products.

3. **Product Updates**: Announcements or updates on new products, technologies, or services, and their expected impact on future growth.

4. **Strategic Initiatives**: Information on strategic partnerships, acquisitions, or investments that Nvidia is pursuing to enhance its market position.

5. **Guidance**: Forward-looking statements regarding

In [27]:
query = """Summarize the key points discussed in latest Databricks DAIS 2025"""
response = agent_executor.invoke({"query": query})

Calling web search tool


 40%|████      | 2/5 [00:01<00:02,  1.29it/s]Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
 60%|██████    | 3/5 [00:03<00:02,  1.35s/it]

Error extracting from url: https://www.databricks.com/dataaisummit - 'NoneType' object has no attribute 'strip'


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
 80%|████████  | 4/5 [00:06<00:02,  2.04s/it]

Error extracting from url: https://www.databricks.com/blog/mosaic-ai-announcements-data-ai-summit-2025 - 'NoneType' object has no attribute 'strip'


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
100%|██████████| 5/5 [00:07<00:00,  1.50s/it]


Error extracting from url: https://www.youtube.com/watch?v=Kqx4eeSDtAI - 'NoneType' object has no attribute 'strip'


In [28]:
response

{'query': 'Summarize the key points discussed in latest Databricks DAIS 2025',
 'output': "The Databricks Data + AI Summit 2025 highlighted several key innovations and strategic insights that are shaping the future of data and analytics. Here are the main points:\n\n1. **Strategic Implications for Data Leaders**:\n   - Acceleration of insight delivery through tools like Databricks One, Genie AI/BI, and Agent Bricks.\n   - Enhanced semantic consistency with Unity Catalog Metrics.\n   - Empowerment of self-service analytics via Lakeflow Designer and Databricks Apps.\n   - Streamlined data modernization with Lakebridge and Lakebase Architecture.\n   - Strategic resource allocation through no-code and low-code platforms.\n\n2. **Major Announcements**:\n   - **Lakebase Architecture**: A serverless, fully managed Postgres-compatible OLTP database integrated into the lakehouse.\n   - **Genie AI/BI and Deep Research**: Offering conversational analytics and multi-turn reasoning for deep analysi

In [29]:
from IPython.display import display, Markdown

display(Markdown(response['output']))

The Databricks Data + AI Summit 2025 highlighted several key innovations and strategic insights that are shaping the future of data and analytics. Here are the main points:

1. **Strategic Implications for Data Leaders**:
   - Acceleration of insight delivery through tools like Databricks One, Genie AI/BI, and Agent Bricks.
   - Enhanced semantic consistency with Unity Catalog Metrics.
   - Empowerment of self-service analytics via Lakeflow Designer and Databricks Apps.
   - Streamlined data modernization with Lakebridge and Lakebase Architecture.
   - Strategic resource allocation through no-code and low-code platforms.

2. **Major Announcements**:
   - **Lakebase Architecture**: A serverless, fully managed Postgres-compatible OLTP database integrated into the lakehouse.
   - **Genie AI/BI and Deep Research**: Offering conversational analytics and multi-turn reasoning for deep analysis.
   - **Databricks One**: A unified UI for business users to access various tools without technical friction.
   - **Unity Catalog Metrics**: Supports centrally defined business metrics and governance.
   - **Lakeflow Designer**: An AI-powered, no-code ETL builder.
   - **Lakebridge Migration Framework**: An open-source toolkit for data warehouse migration.
   - **Agent Bricks**: A framework for creating production-grade AI agents.
   - **Databricks Apps**: Enables secure app development with built-in governance.
   - **Free Edition of Databricks**: Offers core capabilities without initial cost, supported by a $100M investment.

3. **Overall Direction**:
   - The summit emphasized a shift towards unified, intuitive platforms powered by AI, aiming to simplify experiences, ensure governed access, and enhance intelligent data consumption.

These announcements and strategic directions reflect Databricks' commitment to democratizing data and AI, reducing fragmentation, and enabling more intelligent and efficient data-driven decision-making.

In [30]:
query = """Summarize the key points discussed in Snowflake latest Summit"""
response = agent_executor.invoke({"query": query})
display(Markdown(response['output']))

Calling web search tool


 40%|████      | 2/5 [00:03<00:04,  1.53s/it]Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.
 60%|██████    | 3/5 [00:04<00:02,  1.30s/it]

Error extracting from url: https://www.ahead.com/resources/snowflake-summit-2023-ahead-takeaways/ - 'NoneType' object has no attribute 'strip'


100%|██████████| 5/5 [00:10<00:00,  2.10s/it]


The Snowflake Summit 2023 focused on several key areas, including new features, partnerships, and community engagement. Here are the main highlights:

1. **New Features and Announcements**:
   - Introduction of the Native Applications Framework, Managed Iceberg Tables, Snowpark Container Services, Dynamic Tables, Snowpipe Streaming API, Document AI, Snowflake Copilot, SnowflakeBudgets, Warehouse Utilization, and improvements in GeoSpatial and Time Series Analytics.
   - Emphasis on AI and large language models (LLM), with a notable partnership between Snowflake and Nvidia to integrate AI capabilities into the Data Cloud.

2. **Community and Networking**:
   - Recognition of the Snowflake Data Superhero community and their contributions.
   - Opportunities for networking with industry peers, partners, and Snowflake founders.

3. **Industry Solutions and Use Cases**:
   - Focus on leveraging Snowflake to solve specific business and industry problems.
   - Encouragement for existing clients to explore new solutions beyond traditional data warehousing.

4. **Developer and Partner Engagement**:
   - Emphasis on diversifying skills and knowledge in areas like LLM, Python, and Data Apps.
   - Recognition of partners like kipi.bi, which received the Americas System Integrator Growth: Partner of the Year Award.

5. **Event Experience**:
   - The summit was larger and better organized than previous years, with over 12,000 attendees and more than 500 sessions.
   - Enhanced opportunities for hands-on labs and partner interactions.

Overall, the Snowflake Summit 2023 was a significant event for showcasing Snowflake's advancements in data cloud technology, AI integration, and community building.

In [33]:
query = """which company's as discussed above future outlook looks to be better?
        """
response = agent_executor.invoke({"query": query})
display(Markdown(response['output']))

Could you please provide more context or specify which companies you are referring to? Without additional information, I won't be able to determine which company's future outlook looks better.

In [41]:
query = """how is the weather in Bangalore today?
           show detailed statistics
        """
response = agent_executor.invoke({"query": query})
display(Markdown(response['output']))

Calling weather tool


The current weather in Bangalore is as follows:

- **Temperature**: 21.3°C (70.3°F)
- **Condition**: Light drizzle
- **Wind**: 9.4 mph (15.1 kph) from the WSW
- **Pressure**: 1011.0 mb (29.85 in)
- **Precipitation**: 0.38 mm (0.01 in)
- **Humidity**: 94%
- **Cloud Cover**: 75%
- **Feels Like**: 21.3°C (70.3°F)
- **Dew Point**: 19.1°C (66.4°F)
- **Visibility**: 5.0 km (3.0 miles)
- **UV Index**: 0.0
- **Wind Gusts**: 14.7 mph (23.6 kph)

The weather is currently overcast with light drizzle, and the humidity is quite high.

In [42]:
query = """how is the weather in Dubai today?
           show detailed statistics
        """
response = agent_executor.invoke({"query": query})
display(Markdown(response['output']))

Calling weather tool


The weather in Dubai today is clear. Here are the detailed statistics:

- **Temperature**: 34.2°C (93.6°F)
- **Feels Like**: 42.2°C (107.9°F)
- **Wind**: 6.3 mph (10.1 kph) from the SSE
- **Wind Gusts**: Up to 12.7 mph (20.4 kph)
- **Humidity**: 50%
- **Pressure**: 997.0 mb (29.44 in)
- **Precipitation**: 0.0 mm (0.0 in)
- **Cloud Cover**: 0%
- **Visibility**: 10.0 km (6.0 miles)
- **UV Index**: 0.0
- **Dew Point**: 23.6°C (74.6°F)

The current conditions are updated as of 03:15 local time.

In [43]:
query = """which city is hotter?
        """
response = agent_executor.invoke({"query": query})
display(Markdown(response['output']))

Question: The question is asking which city is hotter, but it doesn't specify which cities to compare. I need more information to proceed.

Thought: I need to ask the user for the specific cities they want to compare in terms of temperature.

Final Answer: Please provide the names of the cities you would like to compare to determine which one is hotter.

The agent handles single-turn queries well, but it does not remember earlier turns of the conversation, which is why the follow-up questions above fail. We will add user-session-based memory to fix this and explore it in more depth in the next video.
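As a rough preview of the idea, session-based memory can be as simple as a store keyed by session id whose contents are passed in as the prompt's `history`. The sketch below is a plain-Python assumption of how that might look, not the approach the next video necessarily takes; `SessionMemory` and its methods are hypothetical names.

```python
from collections import defaultdict


class SessionMemory:
    """Keep the most recent messages for each session id."""

    def __init__(self, max_messages=20):
        self._store = defaultdict(list)
        self.max_messages = max_messages

    def history(self, session_id):
        """Return a copy of the stored messages for this session."""
        return list(self._store[session_id])

    def add_turn(self, session_id, query, answer):
        """Record one question/answer pair and trim old messages."""
        messages = self._store[session_id]
        messages.append(("human", query))
        messages.append(("ai", answer))
        del messages[:-self.max_messages]  # keep only the newest entries


memory = SessionMemory()
memory.add_turn("user-1", "weather in Dubai?", "34.2°C and clear.")
print(memory.history("user-1"))
```

The stored list for a session would then be supplied under the `history` key when invoking the executor, assuming the placeholder accepts `(role, text)` tuples, so that a question like "which city is hotter?" arrives with the earlier answers attached.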