# Project 3: **Ask‑the‑Web Agent**

Welcome to Project 3! In this project, you will learn how to use tool‑calling LLMs, extend them with custom tools, and build a simplified *Perplexity‑style* agent that answers questions by searching the web.

## Learning Objectives  
* Understand why tool calling is useful and how LLMs can invoke external tools.
* Implement a minimal loop that parses the LLM's output and executes a Python function.
* See how *function schemas* (docstrings and type hints) let us scale to many tools.
* Use **LangChain** to get function‑calling capability for free (ReAct reasoning, memory, multi‑step planning).
* Combine an LLM with a web‑search tool to build a simple ask‑the‑web agent.

## Roadmap
1. Environment setup
2. Write simple tools and connect them to an LLM
3. Standardize tool calling by writing `to_schema`
4. Use LangChain to augment an LLM with your tools
5. Build a Perplexity‑style web‑search agent
6. (Optional) A minimal backend and frontend UI

# 1- Environment setup

## 1.1- Conda environment

Before we start coding, you need a reproducible setup. Open a terminal in the same directory as this notebook and run:

```bash
# Create and activate the conda environment
conda env create -f environment.yml && conda activate web_agent

# Register this environment as a Jupyter kernel
python -m ipykernel install --user --name=web_agent --display-name "web_agent"
```
Once this is done, you can select “web_agent” from the Kernel → Change Kernel menu in Jupyter or VS Code.


> Behind the scenes:
> * Conda reads `environment.yml`, resolves the pinned dependencies, creates an isolated environment named `web_agent`, and activates it.


## 1.2- Ollama setup

In this project, we start with `gemma3:1b` because it is lightweight and runs on most machines. You can also try other models, such as `mistral:7b`, `phi3:mini`, or `llama3.2:1b`, to compare performance. Explore available models here: https://ollama.com/library

```bash
ollama pull gemma3:1b
```

`ollama pull` downloads the model so you can run it locally without API calls.
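
To confirm that the pull worked, you can ask the local Ollama server which models it has. The snippet below is a minimal sketch using only the standard library. It assumes the server is running at the default address `http://localhost:11434` and that its `/api/tags` endpoint returns a JSON object with a `models` list; the helper names `parse_model_names` and `list_local_models` are hypothetical and not part of the project code.

```python
import json
import urllib.request
from typing import List, Optional

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(tags_payload: dict) -> List[str]:
    """Extract model names from the JSON payload of Ollama's /api/tags endpoint."""
    return [model.get("name", "") for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 2.0) -> Optional[List[str]]:
    """Return the locally available model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(json.load(resp))
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts; ValueError covers bad JSON.
        return None


# Usage (requires a running Ollama server):
# models = list_local_models()
# print("server not reachable" if models is None else models)
```

If `gemma3:1b` is missing from the returned list, rerun `ollama pull gemma3:1b`.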


## 2- Tool Calling

LLMs are strong at answering questions from what they learned during training, but on their own they cannot fetch live web results, call APIs, or run exact computations. Real applications rarely rely on the model's internal knowledge alone: an agent must query APIs, retrieve data, or perform calculations to stay accurate and useful. Tool calling bridges this gap by letting the LLM request actions from the outside world.


We describe each tool’s interface in the model’s prompt, defining what it does and what arguments it expects. When the model decides that a tool is needed, it emits a structured output like: `TOOL_CALL: {"name": "get_current_weather", "args": {"city": "San Francisco"}}`. Your code will detect this output, execute the corresponding function, and feed the result back to the LLM so the conversation continues.

In this section, you will implement a simple `get_current_weather` function and teach the `gemma3:1b` model to call it when needed, in four steps:
1. Implement the tool
2. Create the instructions for the LLM
3. Call the LLM with the prompt
4. Parse the LLM output and call the tool

In [None]:
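from openai import OpenAI

# Ollama exposes an OpenAI-compatible API; the client requires an API key, but Ollama ignores its value.
client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")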

In [None]:
def get_current_weather(city: str, unit: str = "celsius") -> str:
    """Return a simple, human-readable description of the weather."""
    normalized_city = city.strip()
    if not normalized_city:
        raise ValueError("city must be provided")

    normalized_unit = unit.lower()
    if normalized_unit not in {"celsius", "fahrenheit"}:
        normalized_unit = "celsius"

    temperatures = {"celsius": "22°C", "fahrenheit": "72°F"}
    description = "partly cloudy"
    return f"It is {temperatures[normalized_unit]} and {description} in {normalized_city}."


In [None]:
SYSTEM_PROMPT = (
    "You are a weather assistant. You can call a single tool named `get_current_weather`\n"
    "whenever the user needs up-to-date weather information. The tool accepts a city\n"
    "(string) and an optional unit (either 'celsius' or 'fahrenheit'). When the tool\n"
    "is required, respond with a single line formatted exactly as:\n"
    "TOOL_CALL:{\"name\": \"get_current_weather\", \"args\": {\"city\": \"<city>\", \"unit\": \"<unit>\"}}\n"
    "If the question can be answered without the tool, reply normally."
)

USER_QUESTION = "What is the weather in San Diego today?"


Now that you have defined a tool and shown the model how to use it, the next step is to call the LLM using your prompt.

Start the **Ollama** server in a terminal with `ollama serve`. This launches a local API endpoint that listens for LLM requests. Once the server is running, return to the notebook and in the next cell send a query to the model.


In [None]:
response = client.chat.completions.create(
    model="gemma3:1b",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_QUESTION},
    ],
    temperature=0,
)

print(response)
model_output = response.choices[0].message.content or ""
print(model_output)


In [None]:
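import json
import re

# Look for a `TOOL_CALL:` marker followed by a JSON object anywhere in the reply.
match = re.search(r"TOOL_CALL:\s*(\{.*\})", model_output, flags=re.DOTALL)
if not match:
    print("No tool call requested by the model.")
else:
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError as exc:
        # Small models sometimes emit malformed JSON; report it instead of crashing.
        payload = None
        print(f"Could not parse the tool call: {exc}")

    if payload is not None:
        tool_name = payload.get("name")
        args = payload.get("args", {})
        print(f"Calling tool `{tool_name}` with args {args}")

        if tool_name == "get_current_weather":
            result = get_current_weather(**args)
            print("Result:", result)
        else:
            print(f"No implementation available for tool `{tool_name}`.")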
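
The cell above stops after executing the tool. To close the loop described at the start of this section, the tool result has to go back to the model so it can write the final answer. The sketch below shows one way to build that follow-up conversation. The helper `build_followup_messages` and the `TOOL_RESULT:` prefix are hypothetical conventions chosen for illustration; they are not part of any API.

```python
from typing import Dict, List


def build_followup_messages(
    system_prompt: str,
    user_question: str,
    model_output: str,
    tool_result: str,
) -> List[Dict[str, str]]:
    """Replay the conversation so far and append the tool result as a new user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
        # The model's own tool request is replayed as an assistant turn.
        {"role": "assistant", "content": model_output},
        # Hypothetical convention: hand the result back with a TOOL_RESULT prefix.
        {
            "role": "user",
            "content": (
                f"TOOL_RESULT: {tool_result}\n"
                "Use this result to answer the original question in plain language."
            ),
        },
    ]


# Usage with the objects defined in the cells above (requires a running Ollama server):
# followup = client.chat.completions.create(
#     model="gemma3:1b",
#     messages=build_followup_messages(SYSTEM_PROMPT, USER_QUESTION, model_output, result),
#     temperature=0,
# )
# print(followup.choices[0].message.content)
```

Replaying the assistant turn keeps the model's request and the tool's answer in the same context, so the second call can respond to the user directly.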


# 3- Standardize tool calling

So far, we handled tool calling manually with one regex and one hard-coded function. This approach does not scale: every new tool would need another `if/else` branch and a manual edit to `SYSTEM_PROMPT`.

To make the system flexible, we can standardize tool definitions by automatically reading each function’s signature, converting it to a JSON schema, and passing that schema to the LLM. This way, the LLM can dynamically understand which tools exist and how to call them without requiring manual updates to prompts or conditional logic.

Next, you will implement a small helper that extracts metadata from functions and builds a schema for each tool.

In [None]:
from pprint import pprint
import inspect


def to_schema(fn):
    """Generate a JSON schema for an arbitrary callable."""
    signature = inspect.signature(fn)
    description = inspect.getdoc(fn) or "No description provided."

    type_mapping = {
        str: "string",
        int: "integer",
        float: "number",
        bool: "boolean",
    }

    properties = {}
    required = []
    for name, param in signature.parameters.items():
        if param.kind not in (
            inspect.Parameter.POSITIONAL_ONLY,
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        ):
            continue

        annotation = param.annotation if param.annotation is not inspect.Parameter.empty else str
        schema_type = type_mapping.get(annotation, "string")
        properties[name] = {
            "type": schema_type,
            "description": f"Argument `{name}` for {fn.__name__}.",
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


tool_schema = to_schema(get_current_weather)
pprint(tool_schema)
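
A schema is also useful on the way back: before calling a tool, you can check the arguments the model produced against it. The sketch below assumes a schema shaped like the output of `to_schema` above; `validate_args`, `JSON_TO_PYTHON`, and the hand-written `weather_schema` are hypothetical names used only for this illustration.

```python
from typing import Any, Dict, List

# Maps JSON-schema type names (as emitted by a `to_schema`-style helper) to Python types.
JSON_TO_PYTHON = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate_args(schema: Dict[str, Any], args: Dict[str, Any]) -> List[str]:
    """Return a list of problems with `args`; an empty list means they match the schema."""
    params = schema["function"]["parameters"]
    properties = params.get("properties", {})
    problems = []

    for name in params.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")

    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        type_name = properties[name].get("type")
        expected = JSON_TO_PYTHON.get(type_name)
        if expected is None:
            continue  # Unknown type name: skip the type check.
        # bool is a subclass of int in Python, so reject it explicitly for non-boolean types.
        is_bool_mismatch = isinstance(value, bool) and expected is not bool
        if is_bool_mismatch or not isinstance(value, expected):
            problems.append(
                f"argument {name} should be {type_name}, got {type(value).__name__}"
            )
    return problems


# Hand-written schema with the same shape as `to_schema(get_current_weather)`.
weather_schema = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Return a simple, human-readable description of the weather.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "unit": {"type": "string"}},
            "required": ["city"],
        },
    },
}

print(validate_args(weather_schema, {"unit": "celsius"}))
```

Returning a list of problems rather than raising makes it easy to send the errors back to the model and ask it to retry.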


In [None]:
messages_with_schema = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "system",
        "name": "tool_spec",
        "content": json.dumps({"tools": [tool_schema]}),
    },
    {"role": "user", "content": USER_QUESTION},
]

response_with_schema = client.chat.completions.create(
    model="gemma3:1b",
    messages=messages_with_schema,
    temperature=0,
)

schema_model_output = response_with_schema.choices[0].message.content or ""
print(schema_model_output)
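
Once tools are described by schemas, the hard-coded `if/else` from Section 2 can be replaced by a lookup table keyed by function name. The sketch below is one possible dispatcher, not the project's reference solution. `TOOL_REGISTRY`, `dispatch_tool_call`, and the second tool `add_numbers` are hypothetical, and `get_current_weather` is a shortened stand-in so the block runs on its own.

```python
import json
import re
from typing import Callable, Dict, Optional


def get_current_weather(city: str, unit: str = "celsius") -> str:
    """Shortened stand-in for the weather tool defined earlier."""
    temperature = "72°F" if unit == "fahrenheit" else "22°C"
    return f"It is {temperature} and partly cloudy in {city}."


def add_numbers(a: float, b: float) -> str:
    """Hypothetical second tool, included only to show that dispatch scales."""
    return str(a + b)


# Adding a tool now means adding one entry here; no new if/else branch is needed.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    fn.__name__: fn for fn in (get_current_weather, add_numbers)
}

TOOL_CALL_PATTERN = re.compile(r"TOOL_CALL:\s*(\{.*\})", flags=re.DOTALL)


def dispatch_tool_call(model_output: str, registry: Dict[str, Callable[..., str]]) -> Optional[str]:
    """Run the tool requested in `model_output`; return None if no tool was requested."""
    match = TOOL_CALL_PATTERN.search(model_output)
    if not match:
        return None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError as exc:
        return f"Could not parse tool call: {exc}"

    name = payload.get("name")
    fn = registry.get(name)
    if fn is None:
        return f"Unknown tool: {name}"
    try:
        return fn(**payload.get("args", {}))
    except TypeError as exc:
        # Raised when the model supplies missing or unexpected argument names.
        return f"Bad arguments for {name}: {exc}"


print(dispatch_tool_call('TOOL_CALL: {"name": "add_numbers", "args": {"a": 2, "b": 3}}', TOOL_REGISTRY))
```

Errors are returned as strings instead of raised so that they can be fed back to the model as an observation.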


## 4- LangChain for Tool Calling
So far, you built a simple tool-calling pipeline manually. While this helps you understand the logic, it does not scale well when working with multiple tools, complex parsing, or multi-step reasoning.

LangChain simplifies this process. You only need to declare your tools, and its *Agent* abstraction handles when to call a tool, how to use it, and how to continue reasoning afterward.

In this section, you will use the **ReAct** Agent (Reasoning + Acting). It alternates between reasoning steps and tool use, producing clearer and more reliable results. We will explore reasoning-focused models in more depth next week.

The following links might be helpful:
- https://python.langchain.com/api_reference/langchain/agents/langchain.agents.initialize.initialize_agent.html
- https://python.langchain.com/docs/integrations/tools/
- https://python.langchain.com/docs/integrations/chat/ollama/
- https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.llms.LLM.html

In [None]:
from langchain_core.tools import tool


@tool
def get_weather(city: str, unit: str = "celsius") -> str:
    """Return a natural language weather report for a city."""
    return get_current_weather(city=city, unit=unit)


In [None]:
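# Import paths assume LangChain 0.x with the `langchain-ollama` package installed.
from langchain.agents import AgentType, initialize_agent
from langchain_ollama import ChatOllama

llm = ChatOllama(model="gemma3:1b", temperature=0)

langchain_tools = [get_weather]
weather_agent = initialize_agent(
    langchain_tools,
    llm,
    # `get_weather` takes two arguments (city, unit), so use the structured-chat ReAct agent;
    # CHAT_ZERO_SHOT_REACT_DESCRIPTION only accepts single-input tools.
    agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)

agent_result = weather_agent.invoke({"input": "Do I need an umbrella in Seattle today?"})
print(agent_result["output"])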


### What just happened?

The console log displays the **Thought → Action → Observation → …** loop until the agent produces its final answer. Because `verbose=True`, LangChain prints each intermediate reasoning step.

If you want to add more tools, simply append them to the `langchain_tools` list. LangChain will handle argument validation, schema generation, and tool-calling logic automatically.
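
To make the Thought → Action → Observation loop concrete, here is a stripped-down version in plain Python. It is a simplified illustration, not LangChain's actual implementation. The text format (`Action:`, `Action Input:`, `Final Answer:`), the function `react_loop`, and the scripted stand-in for the LLM are all assumptions made for this sketch.

```python
import re
from typing import Callable, Dict, List

# Expected reply format (an assumption of this sketch):
#   Thought: ...
#   Action: <tool name>
#   Action Input: <tool input>
# or a reply containing "Final Answer: ...".
ACTION_RE = re.compile(r"Action:\s*(.*?)\nAction Input:\s*(.*)")


def react_loop(
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    question: str,
    max_steps: int = 5,
) -> str:
    """Alternate between model replies and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"

        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()

        match = ACTION_RE.search(reply)
        if not match:
            transcript += "Observation: could not parse an action; follow the format.\n"
            continue

        name, tool_input = match.group(1).strip(), match.group(2).strip()
        tool = tools.get(name)
        observation = tool(tool_input) if tool else f"unknown tool {name!r}"
        # The observation is appended so the next model call can reason over it.
        transcript += f"Observation: {observation}\n"
    return "Stopped: step limit reached."


def make_scripted_llm(replies: List[str]) -> Callable[[str], str]:
    """Stand-in for a real model: returns canned replies in order."""
    iterator = iter(replies)
    return lambda prompt: next(iterator)


scripted = make_scripted_llm([
    "Thought: I should check the weather.\nAction: get_weather\nAction Input: Seattle",
    "Thought: I have what I need.\nFinal Answer: Yes, bring an umbrella.",
])
demo_tools = {"get_weather": lambda city: f"Rainy in {city}"}
print(react_loop(scripted, demo_tools, "Do I need an umbrella in Seattle today?"))
```

The `max_steps` cap matters in practice: a small model can loop on malformed actions, and the cap guarantees the loop terminates.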


## 5- Perplexity‑Style Web Search
Agents become much more powerful when they can look up real information on the web instead of relying only on their internal knowledge.

In this section, you will combine everything you have learned to build a simple Ask-the-Web Agent. You will integrate a web search tool (DuckDuckGo) and make it available to the agent using the same tool-calling approach as before.

This will let the model retrieve fresh results, reason over them, and generate an informed answer—similar to how Perplexity works.

The following link has usage examples:
- https://pypi.org/project/duckduckgo-search/

In [None]:
from duckduckgo_search import DDGS
from langchain_core.tools import tool


@tool
def web_search(query: str) -> str:
    """Search DuckDuckGo for the query and return a summary of the top results."""
    results = []
    try:
        with DDGS() as ddgs:
            for idx, entry in enumerate(ddgs.text(query, max_results=3), start=1):
                title = entry.get("title") or "Untitled result"
                snippet = entry.get("body") or ""
                url = entry.get("href") or entry.get("url") or ""
                summary = f"{idx}. {title}\n{snippet}".strip()
                if url:
                    summary += f"\n{url}"
                results.append(summary)
    except Exception as exc:
        return f"Web search failed: {exc}"

    if not results:
        return "No results found."

    return "\n\n".join(results)


In [None]:
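# Import under an alias so it does not shadow the `openai.OpenAI` client used earlier.
# Import paths assume LangChain 0.x with the `langchain-openai` package installed.
from langchain.agents import AgentType, initialize_agent
from langchain_openai import OpenAI as LangChainOpenAI

web_llm = LangChainOpenAI(
    temperature=0,
    model_name="gemma3:1b",
    openai_api_key="ollama",
    openai_api_base="http://localhost:11434/v1",
)

web_tools = [web_search]
web_agent = initialize_agent(
    web_tools,
    web_llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)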



Let’s see the agent in action with a real example.


In [None]:
question = "What are the current events in San Francisco this week?"
web_response = web_agent.invoke({"input": question})
print(web_response["output"])
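
One feature you may want to borrow from Perplexity-style tools is source citation. A simple way to encourage it is to number the search results and ask the model to cite those numbers. The sketch below builds such a prompt from result dictionaries with the same `title`, `body`, and `href` keys that `web_search` reads above. The function `build_grounded_prompt`, the prompt wording, and the sample result are hypothetical.

```python
from typing import Dict, List


def build_grounded_prompt(
    question: str,
    results: List[Dict[str, str]],
    max_snippet_chars: int = 300,
) -> str:
    """Format search results as numbered sources and ask for an answer that cites them."""
    blocks = []
    for idx, entry in enumerate(results, start=1):
        title = entry.get("title") or "Untitled result"
        # Truncate snippets so several sources fit in a small model's context window.
        snippet = (entry.get("body") or "")[:max_snippet_chars]
        url = entry.get("href") or ""
        blocks.append(f"[{idx}] {title}\n{snippet}\nURL: {url}")

    sources = "\n\n".join(blocks) if blocks else "(no sources found)"
    return (
        "Answer the question using only the sources below. "
        "Cite sources inline as [1], [2], and so on. "
        "If the sources are insufficient, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up result used purely to show the prompt layout.
sample_results = [
    {"title": "Example page", "body": "Example snippet text.", "href": "https://example.com"},
]
print(build_grounded_prompt("What is on the example page?", sample_results))
```

Whether a 1B-parameter model follows the citation instruction reliably is something to test; larger models tend to do better with this kind of formatting constraint.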



## 6- A minimal UI
This project includes a simple **React** front end that sends the user’s question to a FastAPI back end and streams the agent’s response in real time. To run the UI:

1- Open a terminal and start the Ollama server: `ollama serve`.

2- In a second terminal, navigate to the frontend folder and install dependencies: `npm install`.

3- In the same terminal, start the FastAPI back end: `uvicorn app:app --reload --port 8000`.

4- Open a third terminal, stay in the frontend folder, and start the React dev server: `npm run dev`

5- Visit `http://localhost:5173/` in your browser.



## 🎉 Congratulations!

* You have built a **web‑enabled agent**: tool calling → JSON schema → LangChain ReAct → web search → simple UI.
* Try adding more tools, such as news or finance APIs.
* Experiment with multiple tools and different models, and measure accuracy vs. hallucination.


👏 **Great job!** Take a moment to celebrate. The techniques you implemented here power many production agents and chatbots.