# Project 3: **Ask-the-Web Agent**

Welcome to Project 3! In this project, you will learn how to use tool-calling LLMs, extend them with custom tools, and build a simplified *Perplexity-style* agent that answers questions by searching the web.

## Learning Objectives  
* Understand why tool calling is useful and how LLMs can invoke external tools.
* Implement a minimal loop that parses the LLM's output and executes a Python function.
* See how *function schemas* (docstrings and type hints) let us scale to many tools.
* Use **LangChain** to get function-calling capability.
* Combine an LLM with a web-search tool to build a simple ask-the-web agent.
* Connect to external tools using **MCP (Model Context Protocol)**, a universal standard for LLM‚Äëtool integration.
* Optionally build a UI using Chainlit to test your agent.

## Roadmap
0. Environment setup
1. Write simple tools and connect them to an LLM
2. Standardize tool calling with JSON schemas
3. Use LangChain for tool calling
4. Build a Perplexity-style web-search agent
5. (Optional) MCP: connect to external tool servers
6. (Optional) A minimal UI

## 0- Environment setup

### Step 1: Create your environment and install dependencies 
Before we start coding, you need a reproducible setup. Open a terminal in the same directory as this notebook, and use Conda or uv to install the project dependencies.

#### Option 1: Conda


```bash
# Create and activate the conda environment
conda env create -f environment.yml && conda activate web_agent

```

#### Option 2: UV (faster)

If you prefer [uv](https://docs.astral.sh/uv/) over Conda:

```bash
# Install uv (skip if already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv and install dependencies
uv venv .venv-web-agent-uv && source .venv-web-agent-uv/bin/activate
uv pip install -r requirements.txt
```

### Step 2: Register this environment as a Jupyter kernel
```bash
python -m ipykernel install --user --name=web_agent --display-name "web_agent"
```
Now open your notebook and switch to the `web_agent` kernel (Kernel → Change Kernel).

### Step 3: Set up Ollama

In this project, we use **Ollama** to load and use open-weight LLMs. We start with smaller models like `gemma3:1b` and then switch to larger models like `llama3.2:3b`.

Start the **Ollama** server in a terminal. This launches a local API endpoint that listens for LLM requests.

```bash
ollama serve
```

Download the models so you can run them locally without external API calls.
```bash
ollama pull gemma3:1b
ollama pull llama3.2:3b
```

You can explore other available models [here](https://ollama.com/library) and pull them to experiment with.

In [1]:
# Quick check: is Ollama running?
# If this fails, open a terminal and run: ollama serve

import httpx

response = httpx.get("http://localhost:11434/api/tags", timeout=5)
models = [m["name"] for m in response.json().get("models", [])]
print(f"Ollama is running. Installed models: {models}")

Ollama is running. Installed models: ['llama3.2:3b', 'gemma3:1b']
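
If you would rather have the check fail loudly when a model is missing, a small helper can compare the installed list against the models this project pulls. This is a minimal sketch: `REQUIRED_MODELS` and `missing_models` are names introduced here for illustration, and the installed list is hard-coded from the output above rather than queried from Ollama.

```python
# Models pulled in the setup step above.
REQUIRED_MODELS = ("gemma3:1b", "llama3.2:3b")

def missing_models(installed, required=REQUIRED_MODELS):
    """Return the required models that do not appear in the installed list."""
    return [name for name in required if name not in installed]

# Hard-coded for the sketch; in the notebook you would pass the `models` list
# built by the quick check instead.
installed = ["llama3.2:3b", "gemma3:1b"]
missing = missing_models(installed)
print("Missing models:", missing or "none")
```

If the list is not empty, run `ollama pull <model>` for each missing entry before continuing.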


## 1- Tool Calling

LLMs are strong at answering questions, but they cannot directly access external data such as live web results, APIs, or computations. In real applications, agents rarely rely only on their internal knowledge. They need to query APIs, retrieve data, or perform calculations to stay accurate and useful. Tool calling bridges this gap by allowing the LLM to request actions from the outside world.

<img src="assets/tools.png" width="700">

As shown below, we first implement a tool, then describe it as part of the model's prompt. When the model decides that a tool is needed, it emits a structured output. A parser detects this output, executes the corresponding function, and feeds the result back to the LLM so the conversation continues.

<img src="assets/tool_flow.png" width="700">

In this section, you will implement a simple `get_current_weather` function and teach the `gemma3:1b` model to use it when required.
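
Before wiring in a real model, the flow described above can be sketched end to end with a stand-in. Everything here is hypothetical and exists only to show the control flow: `fake_llm` is a scripted function rather than a model, and `echo_city` is a toy tool.

```python
import json
import re

def fake_llm(messages):
    """Hypothetical stand-in for a model: requests a tool once, then answers."""
    last = messages[-1]
    if last["role"] == "tool":
        return f"Here is what I found: {last['content']}"
    return 'TOOL_CALL: {"name": "echo_city", "args": {"city": "Paris"}}'

def echo_city(city: str) -> str:
    """Toy tool used only in this sketch."""
    return f"You asked about {city}."

TOOLS = {"echo_city": echo_city}

def run_turn(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    reply = fake_llm(messages)

    # The parser looks for the structured output emitted by the model.
    match = re.search(r"TOOL_CALL:\s*(\{.*\})", reply)
    if match:
        call = json.loads(match.group(1))
        result = TOOLS[call["name"]](**call["args"])
        # Feed the tool result back so the model can write the final answer.
        messages.append({"role": "assistant", "content": reply})
        messages.append({"role": "tool", "content": result})
        reply = fake_llm(messages)
    return reply

final_answer = run_turn("Tell me about Paris")
print(final_answer)
```

The steps below build the same pieces one at a time, with `gemma3:1b` taking the place of `fake_llm`.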

In [2]:
# ---------------------------------------------------------
# Step 1: Implement the tool
# ---------------------------------------------------------
# You can either:
#   (a) Call a real weather API (for example, OpenWeatherMap), or
#   (b) Create a dummy function that returns a fixed response (e.g., "It is 23°C and sunny in San Francisco.")
#
# Output:
#   • Return a short, human-readable sentence describing the weather.
#
# Example expected behavior:
#   get_current_weather("San Francisco") → "It is 23°C and sunny in San Francisco."
#

def get_current_weather(location: str) -> str:
    return f"It is 23°C and sunny in {location}."

In [18]:
# ----------------------------------------------------------------------
# Step 2: Create a prompt to teach the LLM when and how to use your tool
# ----------------------------------------------------------------------
# What to include:
#   • A SYSTEM_PROMPT that tells the model about the tool use and describes the tool
#   • A USER_QUESTION with a user query that should trigger the tool.
#       Example: "What is the weather in San Diego today?"

SYSTEM_PROMPT = """
You are a helpful assistant that can answer questions about the weather. You have access to the following tool:
Tool: get_current_weather(location)
Description: This tool takes a location as input and returns a short, human-readable sentence describing the current weather in that location.
To use the tool, respond with a message in the following format:
TOOL_CALL: {"name": "get_current_weather", "args": {"location": "<location_name>"}}
"""

USER_QUESTION = "What is the weather in San Diego today?"


Now that you have defined a tool and shown the model how to use it, the next step is to call the LLM using your prompt.

In [19]:
# ---------------------------------------------------------
# Step 3: Call the LLM with your prompt
# ---------------------------------------------------------
# Task:
#   Send SYSTEM_PROMPT + USER_QUESTION to the model.
#
# Steps:
#   1. Import `chat` from the ollama package
#   2. Use chat(...) to send your prompt to gemma3:1b
#   3. Print the response.
#
# Expected:
#   The model should return something like:
#   TOOL_CALL: {"name": "get_current_weather", "args": {"location": "San Diego"}}
# ---------------------------------------------------------
from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='gemma3:1b', messages=[
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_QUESTION}
])
print("LLM response:", response)
print(response['message']['content'])

LLM response: model='gemma3:1b' created_at='2026-02-06T15:35:23.133641Z' done=True done_reason='stop' total_duration=3541010000 load_duration=201637041 prompt_eval_count=125 prompt_eval_duration=1566408792 eval_count=26 eval_duration=1752033835 message=Message(role='assistant', content='TOOL_CALL: {"name": "get_current_weather", "args": {"location": "San Diego"}}\n', thinking=None, images=None, tool_name=None, tool_calls=None) logprobs=None
TOOL_CALL: {"name": "get_current_weather", "args": {"location": "San Diego"}}



In [28]:
# ---------------------------------------------------------
# Step 4: Manually parse the LLM output and call the tool
# ---------------------------------------------------------
# Task:
#   Detect when the model requests a tool, extract its name and arguments,
#   and execute the corresponding function.
#
# Steps:
#   1. Search for the text pattern "TOOL_CALL:{...}" in the model output.
#   2. Parse the JSON inside it to get the tool name and args.
#   3. Call the matching function (e.g., get_current_weather).
#
# Expected:
#   You should see a line like:
#       Calling tool `get_current_weather` with args {'location': 'San Diego'}
#       Result: It is 23°C and sunny in San Diego.
# ---------------------------------------------------------

import json
import re

# Matches "TOOL_CALL:" followed by a JSON object on the same line
tool_call_pattern = r"TOOL_CALL:\s*(\{.*\})"

def call_tool(response_content: str):
    """Run the tool requested in the model output, or return None if there is none."""
    match = re.search(tool_call_pattern, response_content)
    if not match:
        return None

    # Parse the JSON payload to get the tool name and its arguments
    tool_call = json.loads(match.group(1))
    tool_name = tool_call['name']
    tool_args = tool_call['args']

    if tool_name == "get_current_weather":
        return get_current_weather(**tool_args)
    return None  # unknown tool

res = call_tool(response['message']['content'])
print("Result:", res)

Result: It is 23°C and sunny in San Diego.
Done. It is 23°C and sunny in San Diego.


## 2- Standardize tool calling

So far, we handled tool calling manually: we wrote a function, taught the LLM about it by hand, and wrote a regex to parse the output. This approach does not scale, because every additional tool would mean more `if/else` blocks and more manual edits to the prompt.

To make the system flexible, we can standardize tool definitions by automatically reading each function's signature, converting it to a JSON schema, and passing that schema to the LLM. This way, the LLM can dynamically understand which tools exist and how to call them without requiring manual updates to prompts or conditional logic.
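
On the dispatch side, the `if/else` chain can be replaced by a lookup table keyed on the tool name. The sketch below is one possible way to do it, not the only one: `register_tool`, `TOOL_REGISTRY`, `dispatch`, and the two toy tools are all hypothetical names introduced for illustration.

```python
from typing import Callable

# Maps a tool name to the Python function that implements it.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {}

def register_tool(fn):
    """Decorator that adds a function to the registry under its own name."""
    TOOL_REGISTRY[fn.__name__] = fn
    return fn

@register_tool
def add_numbers(a: float, b: float) -> str:
    return str(a + b)

@register_tool
def shout(text: str) -> str:
    return text.upper()

def dispatch(tool_call: dict) -> str:
    """Look up the requested tool and call it with the provided arguments."""
    fn = TOOL_REGISTRY.get(tool_call.get("name"))
    if fn is None:
        return f"Error: unknown tool {tool_call.get('name')!r}"
    try:
        return fn(**tool_call.get("args", {}))
    except TypeError as exc:
        # Raised when the arguments do not match the function signature.
        return f"Error: bad arguments for {fn.__name__}: {exc}"

print(dispatch({"name": "add_numbers", "args": {"a": 2, "b": 3}}))
print(dispatch({"name": "shout", "args": {"text": "hello"}}))
print(dispatch({"name": "missing_tool", "args": {}}))
```

Returning an error string instead of raising lets the message be fed back to the model, which can then retry with a corrected call. Note that catching `TypeError` this broadly would also hide a `TypeError` raised inside the tool itself.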

Next, you will implement a small helper that extracts metadata from functions and builds a schema for each tool.

In [50]:
# ---------------------------------------------------------
# Generate a JSON schema for a tool automatically
# ---------------------------------------------------------
#
# Steps:
#   1. Rewrite the get_current_weather function with docstring and arg types
#   2. Use `inspect.signature` and `inspect.getdoc` to automatically get function parameters and docstring
#   3. For each argument, record its name, type, and description.
#   4. Build a schema containing: name, description, and parameters.
#   5. Test your helper on `get_current_weather` and print the result.
#
# Expected:
#   A dictionary describing the tool (its name, args, and types).
# ---------------------------------------------------------

from pprint import pprint
import inspect
import json
import re

def get_current_weather(city: str, unit: str = "celsius") -> str:
    """
    Get the current weather for a given city.

    Args:
        city (str): The city to get the weather for.
        unit (str, optional): The unit of temperature. Defaults to "celsius".

    Returns:
        str: A string describing the current weather.
    """
    # Dummy implementation: `unit` is accepted but not used yet.
    return f"It is 23°C and sunny in {city}."

def to_schema(fn):
    sig = inspect.signature(fn)
    doc = inspect.getdoc(fn) or ""

    # Collect per-argument descriptions from the "Args:" section of the docstring.
    descriptions = {}
    args_section = re.search(r"Args:\n(.*?)(?:\n\n|$)", doc, re.S)
    if args_section:
        for line in args_section.group(1).splitlines():
            if ":" in line:
                name, description = line.strip().split(":", 1)
                descriptions[name.split(" ")[0]] = description.strip()

    return {
        "name": fn.__name__,
        "description": doc.split("\n")[0],
        "parameters": [
            {
                "name": name,
                # Fall back to the string form for annotations without __name__.
                "type": getattr(param.annotation, "__name__", str(param.annotation)),
                "description": descriptions.get(name, ""),
            }
            for name, param in sig.parameters.items()
        ],
    }

tool_schema = to_schema(get_current_weather)
pprint(tool_schema)

{'description': 'Get the current weather for a given city.',
 'name': 'get_current_weather',
 'parameters': [{'description': 'The city to get the weather for.',
                 'name': 'city',
                 'type': 'str'},
                {'description': 'The unit of temperature. Defaults to '
                                '"celsius".',
                 'name': 'unit',
                 'type': 'str'}]}
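
The dictionary above uses a layout of our own. Native tool-calling APIs generally expect a JSON-Schema-based "function" layout instead, with JSON type names and an explicit list of required arguments. The sketch below shows one way to produce that shape from a signature. `to_function_schema`, `PY_TO_JSON`, and `convert_temperature` are hypothetical names for illustration, and the type mapping covers only a few basic Python types.

```python
import inspect

# Minimal mapping from Python type names to JSON Schema type names.
PY_TO_JSON = {"str": "string", "int": "integer", "float": "number", "bool": "boolean"}

def to_function_schema(fn) -> dict:
    """Describe a function in the JSON-Schema-based 'function' layout."""
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        type_name = getattr(param.annotation, "__name__", "str")
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": PY_TO_JSON.get(type_name, "string")}
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.split("\n")[0],
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }

def convert_temperature(value: float, to_unit: str = "fahrenheit") -> str:
    """Convert a temperature from Celsius to another unit."""
    if to_unit == "fahrenheit":
        return f"{value * 9 / 5 + 32:.1f} F"
    return f"{value:.1f} C"

schema = to_function_schema(convert_temperature)
print(schema["function"]["parameters"])
```

Marking `to_unit` as optional matters: without a `required` list, a model has no way to know which arguments it may leave out.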


In [57]:
# ---------------------------------------------------------
# Provide the tool schema to the model instead of prompt surgery
# ---------------------------------------------------------
# Goal:
#   Give the model a "menu" of available tools so it can choose
#   which one to call based on the user's question.
#
# Steps:
#   1. Add an extra system message (e.g., name="tool_spec")
#      containing the JSON schema(s) of your tools.
#   2. Include SYSTEM_PROMPT and the user question as before.
#   3. Send the messages to the model (e.g., gemma3:1b).
#   4. Print the model output to see if it picks the right tool.
#
# Expected:
#   The model should produce a structured TOOL_CALL indicating
#   which tool to use and with what arguments.
# ---------------------------------------------------------

tool_desc = json.dumps(to_schema(get_current_weather), indent=2)

print(tool_desc)

SYSTEM_PROMPT = f"""
You are a helpful assistant that can answer questions about the weather. You have access to the following tool:
{tool_desc}
To use the tool, respond with a message in the following format:
TOOL_CALL: {{"name": "<tool_name>", "args": <tool_args_json>}}
"""

from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='gemma3:1b', messages=[
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": USER_QUESTION}
])
print("LLM response:", response)
print(response['message']['content'])

{
  "name": "get_current_weather",
  "description": "Get the current weather for a given city.",
  "parameters": [
    {
      "name": "city",
      "type": "str",
      "description": "The city to get the weather for."
    },
    {
      "name": "unit",
      "type": "str",
      "description": "The unit of temperature. Defaults to \"celsius\"."
    }
  ]
}
LLM response: model='gemma3:1b' created_at='2026-02-06T16:35:22.270271Z' done=True done_reason='stop' total_duration=10494991875 load_duration=1108065166 prompt_eval_count=199 prompt_eval_duration=7774138917 eval_count=33 eval_duration=1523282918 message=Message(role='assistant', content='TOOL_CALL: [{"name": "get_current_weather", "args": {"city": "San Diego", "unit": "celsius"}}]', thinking=None, images=None, tool_name=None, tool_calls=None) logprobs=None
TOOL_CALL: [{"name": "get_current_weather", "args": {"city": "San Diego", "unit": "celsius"}}]
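
Notice that this time the model wrapped the call in a JSON list (`[{...}]`), while the regex from Step 4 of the previous section only matches a JSON object that starts with `{`. Small models often drift from the requested format like this. The sketch below is one possible, more tolerant parser that accepts both shapes; `parse_tool_calls` is a hypothetical helper, not part of any library.

```python
import json
import re

def parse_tool_calls(text: str) -> list[dict]:
    """Return every tool call found after a TOOL_CALL: marker, or an empty list."""
    match = re.search(r"TOOL_CALL:\s*(.+)", text, re.S)
    if not match:
        return []
    try:
        # raw_decode parses the first JSON value and ignores any trailing text.
        payload, _ = json.JSONDecoder().raw_decode(match.group(1).strip())
    except json.JSONDecodeError:
        return []
    # Accept both a single object and a list of objects.
    calls = payload if isinstance(payload, list) else [payload]
    return [call for call in calls if isinstance(call, dict) and "name" in call]

single = parse_tool_calls(
    'TOOL_CALL: {"name": "get_current_weather", "args": {"location": "San Diego"}}\n'
)
listed = parse_tool_calls(
    'TOOL_CALL: [{"name": "get_current_weather", "args": {"city": "San Diego", "unit": "celsius"}}]'
)
print(single)
print(listed)
print(parse_tool_calls("It is sunny in San Diego."))
```

Returning a list in every case keeps the calling code uniform: loop over the result and dispatch each call, whether the model asked for zero, one, or several tools.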


## 3- LangChain for Tool Calling

So far, you built a simple tool-calling pipeline. While this helps you understand the logic, it does not scale well when working with multiple tools, complex parsing, or multi-step reasoning. We have to write parsers and function-calling logic by hand and add tool responses back to the prompt ourselves.

LangChain simplifies this process. You only need to declare your tools, and its *Agent* abstraction handles when to call a tool, how to use it, and how to continue reasoning afterward. In this section, you will create a **ReAct** Agent (Reasoning + Acting). As shown below, the model alternates between reasoning steps and tool use without any manual work.

<img src="assets/react.png" width="500">

The following links might be helpful for completing this section:
- [Create Agents](https://docs.langchain.com/oss/python/langchain/agents)
- [LangChain Tools](https://docs.langchain.com/oss/python/langchain/tools)
- [Ollama](https://docs.langchain.com/oss/python/integrations/chat/ollama)
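
To make the reason-then-act alternation concrete, here is a rough sketch of the kind of loop an agent abstraction runs for you. It is not LangChain's implementation. `scripted_model` is a hypothetical stand-in that requests two tools in sequence and then answers, and the tools return made-up values for a fictional city.

```python
def lookup_population(city: str) -> str:
    """Toy tool: returns a made-up population for a fictional city."""
    return "1000"

def double(value: int) -> str:
    """Toy tool: doubles a number."""
    return str(value * 2)

TOOLS = {"lookup_population": lookup_population, "double": double}

def scripted_model(messages):
    """Hypothetical model: picks the next action from the tool results so far."""
    tool_results = [m["content"] for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"content": "", "tool_calls": [{"name": "lookup_population", "args": {"city": "Exampleville"}}]}
    if len(tool_results) == 1:
        return {"content": "", "tool_calls": [{"name": "double", "args": {"value": int(tool_results[0])}}]}
    return {"content": f"Twice the population is {tool_results[-1]}.", "tool_calls": []}

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        messages.append({"role": "assistant", **reply})
        # No tool requested: the model has produced its final answer.
        if not reply["tool_calls"]:
            return reply["content"]
        # Otherwise run each requested tool and append its result.
        for call in reply["tool_calls"]:
            result = TOOLS[call["name"]](**call["args"])
            messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."

answer = run_agent("What is twice the population of Exampleville?")
print(answer)
```

The `max_steps` guard is a design choice worth keeping in any hand-written loop: without it, a model that keeps requesting tools would never terminate.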

In [61]:
# ---------------------------------------------------------
# Step 1: Define tools for LangChain
# ---------------------------------------------------------
# Steps:
#   1. Keep your existing `get_current_weather` function as before.
#   2. Create a new function (e.g., get_weather) that calls it.
#   3. Add the `@tool` decorator so LangChain can register it automatically.
#
# Notes:
#   • The decorator converts your Python function into a standardized tool object.

from langchain_core.tools import tool

@tool
def get_weather(location: str) -> str:
    """
    Get the current weather for a given location.
    
    Args:
        location (str): The location to get the weather for.
    """
    return get_current_weather(location, unit="celsius")

In [66]:
# ---------------------------------------------------------
# Step 2: Create the Agent
# ---------------------------------------------------------
# Steps:
#   1. Create a gemma3:1b LLM instance 
#   2. Create the agent using create_agent
#   3. Test the agent with a natural question using agent.invoke

from langchain_ollama import ChatOllama
from langchain.agents import create_agent

llm = ChatOllama(model="llama3.2:3b")
agent = create_agent(llm, tools=[get_weather])

response = agent.invoke({"messages": [{"role": "user", "content": "What is the weather in San Diego today?"}]})
print("Agent response:", response)


Agent response: {'messages': [HumanMessage(content='What is the weather in San Diego today?', additional_kwargs={}, response_metadata={}, id='fde3b309-2bf4-4d00-84b1-a4a7347a6bce'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2:3b', 'created_at': '2026-02-06T16:49:41.862701Z', 'done': True, 'done_reason': 'stop', 'total_duration': 22615143334, 'load_duration': 2316782667, 'prompt_eval_count': 169, 'prompt_eval_duration': 17551987500, 'eval_count': 18, 'eval_duration': 2728353874, 'logprobs': None, 'model_name': 'llama3.2:3b', 'model_provider': 'ollama'}, id='lc_run--019c33db-a78e-7531-a27d-edc7ede5894e-0', tool_calls=[{'name': 'get_weather', 'args': {'location': 'San Diego'}, 'id': '506bcdcc-f3a4-4f6f-879e-3a20079707cb', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 169, 'output_tokens': 18, 'total_tokens': 187}), ToolMessage(content='It is 23°C and sunny in San Diego.', name='get_weather', id='1ce69b4c-62bb-4243-9242

### What just happened?
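If you created the agent with `gemma3:1b`, as step 1 asks, the run fails because `gemma3:1b` does not support native tool calling (function calling). LangChain expects the model to return a structured tool-call object, but `gemma3:1b` can only return plain text, so the tool invocation step breaks. The output shown above succeeded because that cell was run with `llama3.2:3b`.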

### Why did our manual approach work with any model?

In previous sections, we used **text-based tool calling**. We described the tool format in the system prompt. We asked the model to output `TOOL_CALL: {"name": ..., "args": ...}`. We then parsed this text with regex.

This works with **any model** (even small ones like `gemma3:1b`) because we're just asking the model to follow a certain structured output format.

### Why does LangChain require specific models?

LangChain relies on **native tool calling**: it expects the same structured tool-call format regardless of which model is used, and it enforces that format on the model's output. This requires a model trained specifically for function calling.

**Rule of thumb**: Models under 3B parameters typically lack native tool-calling capability. Support ultimately depends on how each model was trained and packaged, so look for the `tools` tag on the model's page in the Ollama library before relying on it.

| Model | Size | Native Tool Support | Notes |
|-------|------|---------------------|-------|
| `gemma3:1b` | 1B | No | Works for manual approach only |
| `llama3.2:1b` | 1B | No | Works for manual approach only |
| `llama3.2:3b` | 3B | Yes | Good balance of speed and capability |
| `gemma3` | 4B | Yes | Supports native tools |
| `mistral` | 7B | Yes | Strong tool support |

Let's fix the issue we observed in the previous cell.



In [67]:
# ---------------------------------------------------------
# Step 2 (retry): Re-create the Agent with a native tool-calling LLM
# ---------------------------------------------------------
# Steps:
#   1. Create a llama3.2:3b LLM instance 
#   2. Create a system prompt to teach react-style reasoning to the LLM
#   3. Create the agent using create_agent
#   4. Test the agent with a natural question using agent.invoke

from langchain_ollama import ChatOllama
from langchain.agents import create_agent

llm = ChatOllama(model="llama3.2:3b")
agent = create_agent(llm, tools=[get_weather])

response = agent.invoke({"messages": [{"role": "user", "content": "What is the weather in San Diego today?"}]})
print("Agent response:", response)

Agent response: {'messages': [HumanMessage(content='What is the weather in San Diego today?', additional_kwargs={}, response_metadata={}, id='cb653543-76a1-45f7-a7cc-5098d1e6fe8e'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2:3b', 'created_at': '2026-02-06T16:52:34.571732Z', 'done': True, 'done_reason': 'stop', 'total_duration': 14762581584, 'load_duration': 131596625, 'prompt_eval_count': 169, 'prompt_eval_duration': 12203372541, 'eval_count': 18, 'eval_duration': 2391090837, 'logprobs': None, 'model_name': 'llama3.2:3b', 'model_provider': 'ollama'}, id='lc_run--019c33de-68e0-73d0-b121-be5ccc271f15-0', tool_calls=[{'name': 'get_weather', 'args': {'location': 'San Diego'}, 'id': '41f00d17-5da5-466a-b81f-c0b7e3606fd3', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 169, 'output_tokens': 18, 'total_tokens': 187}), ToolMessage(content='It is 23°C and sunny in San Diego.', name='get_weather', id='acdfe922-f94a-44e4-a801-
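
The raw `response` printed above is hard to read. A small helper can turn the message list into a step-by-step trace. This sketch assumes each message exposes `type`, `content`, and (for model messages) `tool_calls`, which matches the objects printed above. `summarize_messages` is a hypothetical helper, and the demo uses `SimpleNamespace` stand-ins so it runs without LangChain.

```python
from types import SimpleNamespace

def summarize_messages(messages) -> list[str]:
    """Build one readable line per tool call and per non-empty message."""
    lines = []
    for msg in messages:
        kind = getattr(msg, "type", "unknown")
        # Model messages may carry structured tool calls instead of text.
        for call in getattr(msg, "tool_calls", None) or []:
            lines.append(f"{kind} -> call {call['name']}({call['args']})")
        content = getattr(msg, "content", "")
        if content:
            lines.append(f"{kind}: {content}")
    return lines

# Stand-ins that mimic the shape of the messages in the output above.
demo_messages = [
    SimpleNamespace(type="human", content="What is the weather in San Diego today?"),
    SimpleNamespace(
        type="ai",
        content="",
        tool_calls=[{"name": "get_weather", "args": {"location": "San Diego"}}],
    ),
    SimpleNamespace(type="tool", content="It is 23°C and sunny in San Diego."),
]

trace = summarize_messages(demo_messages)
for line in trace:
    print(line)
```

In the notebook, you would call `summarize_messages(response["messages"])` on the real agent output.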

## 4- Web Search Agent

Now that you know how to use LangChain with tools, let's build something useful. Instead of a toy `get_weather` tool, let's create an agent that searches the web and answers questions using real results. In this section, you will create a [DuckDuckGo](https://github.com/deedy5/ddgs) search tool and wire it into a ReAct agent.

In [70]:
# ---------------------------------------------------------
# Step 1: Write a web search tool
# ---------------------------------------------------------
# Steps:
#   1. Write a helper function (e.g., search_web) that:
#        • Takes a query string
#        • Uses DuckDuckGo (DDGS) to fetch top results (titles + URLs)
#        • Returns them as a formatted string
#   2. Wrap it with the @tool decorator to make it available to LangChain.


from ddgs import DDGS
from langchain_core.tools import tool

def search_web_call(query: str) -> str:
    ddgs = DDGS()
    results = ddgs.text(query, max_results=5)
    formatted_results = "\n".join([f"{res['title']}: {res['href']}" for res in results])
    return formatted_results

@tool
def search_web(query: str) -> str:
    """
    Search the web for a given query and return the top results.
    
    Args:
        query (str): The search query.
    Returns:
        str: A formatted string containing the top search results (titles and URLs).
    """
    return search_web_call(query)

print(search_web_call("What is the capital of France?"))

Paris - Wikipedia: https://en.wikipedia.org/wiki/Paris
List of capitals of France - Wikipedia: https://en.wikipedia.org/wiki/List_of_capitals_of_France
France - Wikipedia: https://en.wikipedia.org/wiki/France
Capital of France - Simple English Wikipedia, the free encyclopedia: https://simple.wikipedia.org/wiki/Capital_of_France
Paris facts: the capital of France in history: https://home.adelphi.edu/~ca19535/page+4.html
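
Titles and URLs alone give the model very little to answer from. DDGS text results also include a `body` snippet, so one option is to pass a truncated snippet along with each result. The sketch below shows a possible formatter. `format_results` and its `max_snippet_chars` parameter are hypothetical, and the sample results are placeholders rather than real search output.

```python
def format_results(results: list[dict], max_snippet_chars: int = 200) -> str:
    """Format search results as numbered blocks: title, URL, truncated snippet."""
    blocks = []
    for i, res in enumerate(results, start=1):
        snippet = (res.get("body") or "").strip()
        # Truncate long snippets to keep the prompt small.
        if len(snippet) > max_snippet_chars:
            snippet = snippet[:max_snippet_chars].rstrip() + "..."
        block = f"[{i}] {res.get('title', '')}\n{res.get('href', '')}\n{snippet}"
        blocks.append(block.rstrip())
    return "\n\n".join(blocks)

# Placeholder data with the same keys as DDGS text results.
sample_results = [
    {
        "title": "Example Domain",
        "href": "https://example.com",
        "body": "This domain is for use in illustrative examples in documents.",
    },
    {"title": "No snippet", "href": "https://example.org"},
]

formatted = format_results(sample_results, max_snippet_chars=30)
print(formatted)
```

Numbering the blocks also gives the model something to cite, which helps when you want answers that point back to their sources.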


In [71]:
# ---------------------------------------------------------
# Step 2: Initialize the web-search agent
# ---------------------------------------------------------
# Steps:
#   1. Create an LLM (e.g., ChatOllama).
#   2. Add your `search_web` tool to the tools list.
#   3. Create the agent using create_agent.
#
# Expected:
#   The agent should be ready to accept user queries
#   and use your web search tool when needed.
# ---------------------------------------------------------

from langchain_ollama import ChatOllama
from langchain.agents import create_agent

llm = ChatOllama(model="llama3.2:3b")
agent = create_agent(llm, tools=[get_weather, search_web])


In [73]:
# ---------------------------------------------------------
# Step 3: Test your Ask-the-Web agent using agent.invoke
# ---------------------------------------------------------
response = agent.invoke({"messages": [{"role": "user", "content": "Search on the web who won the last GPA tour?"}]})
print("Agent response:", response)

Agent response: {'messages': [HumanMessage(content='Search on the web who won the last GPA tour?', additional_kwargs={}, response_metadata={}, id='777ba162-c33f-46e6-a339-274aae981b02'), AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2:3b', 'created_at': '2026-02-06T17:01:05.144881Z', 'done': True, 'done_reason': 'stop', 'total_duration': 23207268250, 'load_duration': 131331750, 'prompt_eval_count': 241, 'prompt_eval_duration': 20273103417, 'eval_count': 20, 'eval_duration': 2788244335, 'logprobs': None, 'model_name': 'llama3.2:3b', 'model_provider': 'ollama'}, id='lc_run--019c33e6-1250-7572-819f-d36041c39d87-0', tool_calls=[{'name': 'search_web', 'args': {'query': 'last GPA tour winner'}, 'id': '01a2e11f-21d7-4a85-b07c-96eeabd36218', 'type': 'tool_call'}], invalid_tool_calls=[], usage_metadata={'input_tokens': 241, 'output_tokens': 20, 'total_tokens': 261}), ToolMessage(content='List of golfers with most PGA Tour wins - Wikipedia: https://en.wikipedia.

## 5- (Optional) MCP: Model Context Protocol

Up to now, every tool you used started as a Python function you wrote and registered yourself. **MCP (Model Context Protocol)** lets you skip that step. Tools come from an external *server*, and your code just connects to it. Think of it like USB for AI tools: any MCP client can plug into any MCP server and immediately use whatever tools it offers.

Below, we connect to `mcp-server-fetch` (a ready-made server that can retrieve any URL) using the Python MCP SDK. We launch the server, discover its tools, and call one, all without writing a single `@tool` function. To learn more, read: https://github.com/modelcontextprotocol/servers/tree/main/src/fetch

> **LangChain integration:** The `langchain-mcp-adapters` package can convert MCP tools into LangChain-compatible tools automatically, so you can drop them straight into a ReAct agent like the ones in section 4.

In [94]:
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import sys
# Create an MCP client session and connect it to mcp-server-fetch.
# Follow this link: https://github.com/modelcontextprotocol/servers/tree/main/src/fetch

exit_stack = AsyncExitStack()
server_params = StdioServerParameters(command=sys.executable, args=["-m", "mcp_server_fetch"], env=None)

stdio_transport = await exit_stack.enter_async_context(stdio_client(server_params))
stdio, write = stdio_transport
session = await exit_stack.enter_async_context(ClientSession(stdio, write))

print(stdio, write)

await session.initialize()

# List available tools
response = await session.list_tools()
tools = response.tools
print("\nConnected to server with tools:", [tool.name for tool in tools])


MemoryObjectReceiveStream(_state=_MemoryObjectStreamState(max_buffer_size=0, buffer=deque([]), open_send_channels=1, open_receive_channels=1, waiting_receivers=OrderedDict(), waiting_senders=OrderedDict()), _closed=False) MemoryObjectSendStream(_state=_MemoryObjectStreamState(max_buffer_size=0, buffer=deque([]), open_send_channels=1, open_receive_channels=1, waiting_receivers=OrderedDict(), waiting_senders=OrderedDict()), _closed=False)

Connected to server with tools: ['fetch']


Failed to parse JSONRPC message from server
Traceback (most recent call last):
  File "/Users/fabienhenon/miniconda/envs/web_agent/lib/python3.11/site-packages/mcp/client/stdio/__init__.py", line 155, in stdout_reader
    message = types.JSONRPCMessage.model_validate_json(line)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/fabienhenon/miniconda/envs/web_agent/lib/python3.11/site-packages/pydantic/main.py", line 766, in model_validate_json
    return cls.__pydantic_validator__.validate_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for JSONRPCMessage
  Invalid JSON: EOF while parsing a value at line 1 column 0 [type=json_invalid, input_value='', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/json_invalid
Failed to parse JSONRPC message from server
Traceback (most recent call last):
  File "/Users/fabienhenon/miniconda/envs/web_agent/lib/python3

In [None]:
from langchain_mcp_adapters.tools import load_mcp_tools

from langchain.agents import create_agent
from langchain_ollama import ChatOllama

# Load the tool using load_mcp_tools
# create agent with llm and tools same as before
# Fetch the content of a website like http://python.org
tools = await load_mcp_tools(session)

llm = ChatOllama(model="llama3.2:3b")
agent = create_agent(llm, tools=tools)

response = await agent.ainvoke({"messages": [{"role": "user", "content": "Fetch the content of https://www.service-public.gouv.fr/ and summarize it in one sentence."}]})
print("Agent response:", response)

[StructuredTool(name='fetch', description='Fetches a URL from the internet and optionally extracts its contents as markdown.\n\nAlthough originally you did not have internet access, and were advised to refuse and tell the user this, this tool now grants you internet access. Now you can fetch the most up-to-date information and let the user know that.', args_schema={'description': 'Parameters for fetching a URL.', 'properties': {'url': {'description': 'URL to fetch', 'format': 'uri', 'minLength': 1, 'title': 'Url', 'type': 'string'}, 'max_length': {'default': 5000, 'description': 'Maximum number of characters to return.', 'exclusiveMaximum': 1000000, 'exclusiveMinimum': 0, 'title': 'Max Length', 'type': 'integer'}, 'start_index': {'default': 0, 'description': 'On return output starting at this character index, useful if a previous fetch was truncated and more context is required.', 'minimum': 0, 'title': 'Start Index', 'type': 'integer'}, 'raw': {'default': False, 'description': 'Get th
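
The cells above enter the transport and the session on an `AsyncExitStack` but never close it, so the server process stays alive until the kernel stops. The sketch below shows the cleanup pattern with stand-in resources. `fake_connection` is hypothetical and only records when it opens and closes; with the real session you would call `await exit_stack.aclose()` once you are done.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events: list[str] = []

@asynccontextmanager
async def fake_connection(name: str):
    """Hypothetical stand-in for a transport or session."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")

async def main() -> None:
    exit_stack = AsyncExitStack()
    try:
        await exit_stack.enter_async_context(fake_connection("transport"))
        await exit_stack.enter_async_context(fake_connection("session"))
        events.append("use session")
    finally:
        # Closes every entered context in reverse order of entry.
        await exit_stack.aclose()

# In a script; inside a notebook cell, use `await main()` instead.
asyncio.run(main())
print(events)
```

Resources are released in reverse order, so the session closes before the transport it depends on.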

## 6- (Optional) A Minimal UI

[Chainlit](https://chainlit.io/) is a Python library designed specifically for building LLM and agent UIs. It provides:
- Built-in streaming support
- Message history
- Step visualization (see tool calls as they happen)
- No frontend code required

If you are interested, follow Chainlit's documentation to implement a simple UI for your agent. The process typically involves:

1. Write a Python file named `chainlit_app.py` with the agent creation logic as well as UI handlers (e.g., `@cl.on_message`)
2. Run the file in your terminal with `chainlit run chainlit_app.py`
3. A web UI opens automatically at `http://localhost:8000`

In [None]:
# ---------------------------------------------------------
# Chainlit Web Search Agent

import chainlit as cl
from langchain_ollama import ChatOllama
from langgraph.prebuilt import create_react_agent
from langchain_core.tools import tool
from langchain_core.messages import AIMessage, ToolMessage
from ddgs import DDGS


# ---------------------------------------------------------
# Define the web search tool
# ---------------------------------------------------------
def search_web_call(query: str) -> str:
    ddgs = DDGS()
    results = ddgs.text(query, max_results=5)
    formatted_results = "\n".join([f"{res['title']}: {res['href']}" for res in results])
    return formatted_results

@tool
def search_web(query: str) -> str:
    """
    Search the web for a given query and return the top results.

    Args:
        query (str): The search query.
    Returns:
        str: A formatted string containing the top search results (titles and URLs).
    """
    return search_web_call(query)

# ---------------------------------------------------------
# Create the agent (once at startup)
# ---------------------------------------------------------

llm = ChatOllama(model="llama3.2:3b")
agent = create_react_agent(llm, tools=[search_web])


# ---------------------------------------------------------
# Chainlit message handler
# ---------------------------------------------------------
@cl.on_message
async def handle_message(message: cl.Message):
    """Handle a user message and send back the agent's final response."""

    # Run the agent on the user message; the result holds the full message history
    result = await agent.ainvoke({"messages": [{"role": "user", "content": message.content}]})
    messages = result.get("messages", [])
    if messages:
        last_message = messages[-1]
        # The last message is normally the model's final answer (AIMessage).
        # If the run ended on a tool result, show that content instead.
        if isinstance(last_message, (AIMessage, ToolMessage)):
            await cl.Message(content=str(last_message.content)).send()


# ---------------------------------------------------------
# Welcome message
# ---------------------------------------------------------
@cl.on_chat_start
async def start(): 
    await cl.Message(content="Hello! Ask me anything and I'll search the web for you!").send()


Overwriting chainlit_app.py


## 🎉 Congratulations!

You have built a **web-enabled agent** from scratch: manual tool calling → JSON schemas → LangChain ReAct → web search → MCP → UI.

Next steps:
* Try adding more tools, such as news or finance APIs.
* Experiment with multiple tools, different models, and measure accuracy vs. hallucination.
* Explore the [MCP server registry](https://github.com/modelcontextprotocol/servers) for ready-made tool servers.

👏 **Great job!** Take a moment to celebrate. The techniques you implemented here power many production agents and chatbots.