# LLM Tool Calling + MCP Concept Demo 🧠🛠️

This notebook explains, **step by step**, what really happens when an LLM:

1. Receives a user query.
2. Sees a list of tools (e.g., coming from an MCP server via an MCP client).
3. Decides to call a tool.
4. Returns only **tool name + JSON arguments** (not schemas).
5. Lets the **client** execute the tool and give results back.
6. Produces a final natural-language answer.

We will **simulate** the OpenAI + MCP behavior so that you can run this without any API keys.

The focus is on **clarity and intuition**, not on networking or real HTTP calls.


## 0. Big Picture: Who Does What?

We care about the path from:

> **User query → tool selection → tool execution → final answer**

In a real system with MCP, you have:

- **MCP Server** – defines tools (capabilities), e.g. `fake_weather(city)`.
- **MCP Client** – connects to the server and calls `list_tools()` and `call_tool(...)`.
- **LLM (OpenAI)** – sees tools as function schemas and decides *which* tool to call.

In this notebook, we simulate this flow:

1. A *fake MCP server* that just holds tool definitions in Python.
2. An *MCP client* that:
   - discovers tools,
   - converts them to **OpenAI-style tool schemas**,
   - executes the tools when the model asks.
3. A *fake LLM* that behaves like a tool-calling model:
   - first call: chooses a tool and JSON arguments,
   - second call: uses tool results to generate the final answer.

We will trace everything with prints so students can see **exactly** what is happening.


## 1. Simulated MCP Server: Defining Tools

In a real MCP server, you would write Python like:

```python
@mcp.tool()
async def fake_weather(city: str) -> str:
    ...
```

The MCP runtime would then expose a **tool description** (name, description, JSON schema for parameters) to clients.

Here, we simulate that with a simple Python list of dicts that represents what `list_tools()` might return.


In [None]:
# Simulated MCP "server" tool definitions (what list_tools() might give)

mcp_tools = [
    {
        "name": "fake_weather",
        "description": "Return a rough current temperature for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
    {
        "name": "echo",
        "description": "Echo back a message.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "message": {"type": "string", "description": "Message to echo"}
            },
            "required": ["message"],
        },
    },
]

print("Simulated MCP tools (list_tools() output):")
for t in mcp_tools:
    print(f"- {t['name']}: {t['description']}")

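Because each tool carries a JSON schema, a client can check arguments before running anything. The sketch below is a deliberately minimal check (required keys and primitive types only), not a full JSON Schema validator. `validate_args` and `JSON_TYPES` are hypothetical names introduced for illustration.

```python
# Hypothetical helper: covers only a tiny subset of JSON Schema.
JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_args(input_schema: dict, args: dict) -> list[str]:
    """Return a list of human-readable problems (an empty list means OK)."""
    problems = []
    properties = input_schema.get("properties", {})

    # 1. Every key listed under "required" must be present.
    for key in input_schema.get("required", []):
        if key not in args:
            problems.append(f"missing required argument '{key}'")

    # 2. Arguments that the schema describes must have the declared primitive type.
    for key, value in args.items():
        declared = properties.get(key, {}).get("type")
        expected = JSON_TYPES.get(declared)
        if expected is not None and not isinstance(value, expected):
            problems.append(f"argument '{key}' should be of type {declared}")
    return problems


weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string", "description": "City name"}},
    "required": ["city"],
}

print(validate_args(weather_schema, {"city": "London"}))  # valid call
print(validate_args(weather_schema, {}))                  # 'city' is missing
print(validate_args(weather_schema, {"city": 42}))        # wrong type for 'city'
```

In a production client you would more likely hand this job to a dedicated JSON Schema library; the point here is only that the schema is data the client can act on.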

## 2. MCP Client: Converting Tools → OpenAI Tool Schema

An MCP client does not send MCP protocol details to the model. Instead, it converts the tools into
the **OpenAI `tools` format** (or Anthropic `tools`, etc.).

This is what `OpenAIQueryHandler._get_tools_for_openai()` does conceptually:

- For each MCP tool:
  - Use `name`, `description`, and `inputSchema`.
  - Build a `{"type": "function", "function": {...}}` entry for OpenAI.

Let's simulate that transformation.


In [None]:
import json


def mcp_tools_to_openai_tools(mcp_tools: list[dict]) -> list[dict]:
    """Convert simulated MCP tools into OpenAI-style tools list."""
    openai_tools = []
    for tool in mcp_tools:
        openai_tools.append(
            {
                "type": "function",
                "function": {
                    "name": tool["name"],
                    "description": tool["description"],
                    # MCP's "inputSchema" becomes OpenAI's "parameters".
                    "parameters": tool["inputSchema"],
                },
            }
        )
    return openai_tools


openai_tools = mcp_tools_to_openai_tools(mcp_tools)

print("OpenAI tools schema that would be sent to the model:\n")
print(json.dumps(openai_tools, indent=2))

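The same MCP tool list can be reshaped for other providers. As a sketch, Anthropic's Messages API takes one flat entry per tool, with the schema under `input_schema` instead of a nested `function.parameters`. The function name `mcp_tools_to_anthropic_tools` and the sample data are made up for this notebook.

```python
import json


def mcp_tools_to_anthropic_tools(mcp_tools: list[dict]) -> list[dict]:
    """Convert MCP-style tool dicts into Anthropic-style tool entries."""
    return [
        {
            "name": tool["name"],
            "description": tool["description"],
            "input_schema": tool["inputSchema"],  # same JSON schema, different key
        }
        for tool in mcp_tools
    ]


# Hand-written sample in the same shape as the simulated list_tools() output.
sample_mcp_tools = [
    {
        "name": "echo",
        "description": "Echo back a message.",
        "inputSchema": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        },
    }
]

anthropic_tools = mcp_tools_to_anthropic_tools(sample_mcp_tools)
print(json.dumps(anthropic_tools, indent=2))
```

Only the wrapper changes between providers; the JSON schema itself passes through untouched.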

## 3. First Model Call: Choosing a Tool (Tool Call Decision)

Now imagine we send this to an OpenAI model:

- `messages = [{"role": "user", "content": "What is the weather in London?"}]`
- `tools = openai_tools` (from above)

The model will read the tools and think:

> *"User asked about weather. I see a tool called `fake_weather(city)`. I should call it with `city="London"`."*

The model then **does not send back the schema**.
Instead, it sends back **only the tool name + arguments** in a `tool_calls` structure.

We simulate this behavior with a simple function instead of actually calling OpenAI.


In [None]:
def fake_model_decide_tool(user_query: str, tools: list[dict]) -> dict:
    """Simulate a tool-calling model's first response.

    It looks at the text and decides which tool to call and with what arguments.
    For teaching, we implement a simple rule-based version.
    """
    user_query_lower = user_query.lower()

    if "weather" in user_query_lower:
        # Pretend the model parsed the city name "London" from the question.
        return {
            "role": "assistant",
            "tool_calls": [
                {
                    "id": "call_1",
                    "type": "function",
                    "function": {
                        "name": "fake_weather",
                        "arguments": json.dumps({"city": "London"}),
                    },
                }
            ],
            "content": "Let me check the weather using my tools.",
        }
    else:
        # No tools used, just answer directly.
        return {
            "role": "assistant",
            "content": "I don't think I need tools for this question.",
            "tool_calls": [],
        }

user_query = "What is the weather in London right now?"
first_model_message = fake_model_decide_tool(user_query, openai_tools)

print("First model response (note: tool name + arguments, no schema):\n")
print(json.dumps(first_model_message, indent=2))


👉 **Key observation:**

- The model output contains:
  - `tool_calls[0].function.name` → `"fake_weather"`
  - `tool_calls[0].function.arguments` → `'{"city": "London"}'` (a JSON-encoded **string**, not a dict)
- It does **not** return the JSON schema.
- It does **not** return MCP protocol frames.

Schemas only flowed **from client → model** (as part of the `tools` list).
The model's job is just to **pick a tool and fill arguments**.

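One detail is worth seeing in code: in the OpenAI-style response, `arguments` is a JSON string, so the client has to parse it before use. A minimal sketch, using a hand-written message shaped like the one printed above (the variable names are illustrative):

```python
import json

# Hand-written stand-in for the first model response.
sample_message = {
    "role": "assistant",
    "content": "Let me check the weather using my tools.",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "fake_weather", "arguments": '{"city": "London"}'},
        }
    ],
}

call = sample_message["tool_calls"][0]
raw_arguments = call["function"]["arguments"]
print(type(raw_arguments).__name__)  # the arguments arrive as text

# Parse the text into a dict before passing it to any Python function.
parsed_arguments = json.loads(raw_arguments)
print(call["function"]["name"], parsed_arguments["city"])
```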

## 4. MCP Client: Executing the Requested Tool

Now the **client** inspects `tool_calls` and says:

> *"The model wants to call `fake_weather` with `city="London"`. I will execute that tool."*

In a real MCP setup:

- The client would translate this into an MCP `call_tool` request over a transport such as stdio or HTTP.
- The MCP server would run the Python function and return the result.

Here, we simulate that by:

1. Defining local Python functions `tool_fake_weather` and `tool_echo`.
2. Writing a simple `execute_tool_call(...)` that:
   - looks up the function by name,
   - parses arguments,
   - executes the function,
   - returns a `role="tool"` message (like OpenAI expects).


In [None]:
from typing import Callable

# 1. Define the actual underlying tool implementations (like MCP server functions)

def tool_fake_weather(city: str) -> str:
    temps = {"London": 18, "Hyderabad": 32, "Bangalore": 26}
    temp = temps.get(city, 25)
    return f"It is roughly {temp}°C in {city} right now."


def tool_echo(message: str) -> str:
    return f"Echo: {message}"


# 2. Registry mapping tool names → functions
TOOL_REGISTRY: dict[str, Callable[..., str]] = {
    "fake_weather": tool_fake_weather,
    "echo": tool_echo,
}


def execute_tool_call(tool_call: dict) -> dict:
    """Simulate MCP client executing an MCP tool and returning a tool message.

    This is where, in reality, MCP protocol would be used.
    Here we just call local Python functions for clarity.
    """
    tool_name = tool_call["function"]["name"]
    tool_fn = TOOL_REGISTRY.get(tool_name)
    if tool_fn is None:
        content = f"Error: unknown tool '{tool_name}'"
    else:
        try:
            # Arguments arrive as a JSON string. Parse inside the try block so
            # malformed JSON is reported back to the model instead of crashing.
            args = json.loads(tool_call["function"]["arguments"] or "{}")
            content = tool_fn(**args)
        except Exception as e:  # noqa: BLE001
            content = f"Error executing tool {tool_name}: {e}"

    # This mimics an OpenAI "tool" message that would be appended to the chat.
    tool_message = {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "content": content,
    }
    return tool_message


# Execute all tool calls from the first model message
tool_messages = []
for tc in first_model_message.get("tool_calls", []):
    tm = execute_tool_call(tc)
    tool_messages.append(tm)

print("Tool messages produced by client (after executing tools):\n")
print(json.dumps(tool_messages, indent=2))

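For comparison, here is roughly what the client would put on the wire in a real MCP setup. MCP uses JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying the tool name and the already-parsed arguments. The helper `to_mcp_call_request` and the request id are assumptions for illustration; a real client library builds this frame for you.

```python
import json


def to_mcp_call_request(tool_call: dict, request_id: int) -> dict:
    """Translate an OpenAI-style tool call into an MCP-style JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # chosen by the client to match the response later
        "method": "tools/call",
        "params": {
            "name": tool_call["function"]["name"],
            # MCP expects a JSON object here, not the JSON string the model produced.
            "arguments": json.loads(tool_call["function"]["arguments"] or "{}"),
        },
    }


sample_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "fake_weather", "arguments": '{"city": "London"}'},
}

request = to_mcp_call_request(sample_call, request_id=1)
print(json.dumps(request, indent=2))
```

Note that the model never sees or produces this frame; it is purely a client-to-server detail.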

## 5. Second Model Call: Using Tool Results to Answer

Now the client has:

- Original user message.
- Assistant's tool call message ("Let me check the weather using my tools.").
- One or more `role="tool"` messages with the actual tool outputs.

In a real system, the client would send all this back to the LLM and ask:

> *"Here are the tool results. Now give a final answer to the user."*

We'll simulate that with another fake model function `fake_model_final_answer(...)`.


In [None]:
def fake_model_final_answer(user_query: str, tool_messages: list[dict]) -> dict:
    """Simulate the second model call that produces the final answer.

    Here we keep it very simple: just weave the tool content into a natural sentence.
    """
    if not tool_messages:
        return {
            "role": "assistant",
            "content": "I did not use any tools. (Demo)",
        }

    # Combine the outputs of all tool messages (just one in this demo).
    tool_content = " ".join(m["content"] for m in tool_messages)
    answer = (
        f"You asked: '{user_query}'.\n"
        f"Based on the tool result, here is the answer: {tool_content}"
    )
    return {"role": "assistant", "content": answer}


final_message = fake_model_final_answer(user_query, tool_messages)

print("Final assistant message (second model call simulation):\n")
print(json.dumps(final_message, indent=2))
print("\nAs plain text:\n")
print(final_message["content"])

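With a real model, the second call would receive the whole conversation in order: the user message, the assistant message that requested the tool, then one `role="tool"` message per call. A sketch of that assembly follows, where `build_followup_messages` is a hypothetical helper and the sample messages are hand-written. The check that every requested call id has a matching tool message reflects how OpenAI-style chat APIs generally behave, so treat it as an assumption for other providers.

```python
def build_followup_messages(
    user_query: str, assistant_message: dict, tool_messages: list[dict]
) -> list[dict]:
    """Assemble the message list for the second model call."""
    requested_ids = {tc["id"] for tc in assistant_message.get("tool_calls", [])}
    answered_ids = {tm["tool_call_id"] for tm in tool_messages}
    missing = requested_ids - answered_ids
    if missing:
        raise ValueError(f"No tool result for call id(s): {sorted(missing)}")

    # Order matters: user, then the assistant's tool request, then the tool results.
    return [{"role": "user", "content": user_query}, assistant_message, *tool_messages]


sample_assistant = {
    "role": "assistant",
    "content": "Let me check the weather using my tools.",
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "fake_weather", "arguments": '{"city": "London"}'},
        }
    ],
}
sample_tool_messages = [
    {"role": "tool", "tool_call_id": "call_1", "content": "It is roughly 18°C in London right now."}
]

followup = build_followup_messages(
    "What is the weather in London right now?", sample_assistant, sample_tool_messages
)
print([m["role"] for m in followup])
```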

## 6. Summary: Where Is MCP, Where Is the Model?

In this notebook, we simulated the entire flow **without any real network calls**, but the pattern is the same in real MCP+OpenAI setups:

1. **MCP server** defines tools and exposes `list_tools()` and `call_tool()`.
2. **MCP client**:
   - Calls `list_tools()` → gets tool names + JSON schemas.
   - Converts them to OpenAI `tools` format.
3. **First model call**:
   - Input: user message + tools.
   - Output: (optionally) `tool_calls` with tool name + JSON arguments.
   - ❌ No schemas returned by the model.
4. **Client executes tools**:
   - Uses MCP `call_tool` protocol in reality (here: local Python calls).
   - Builds `role="tool"` messages with the tool outputs.
5. **Second model call**:
   - Input: conversation so far + tool messages.
   - Output: final natural-language answer.

👉 The illusion that **"LLM talks MCP"** comes from the client hiding all protocol
details and giving the model just a **menu of tools** and the **results** of those tools.

The model:

- Never crafts MCP JSON-RPC frames.
- Never sends back JSON schemas.
- Only chooses **which tool** to call and **what arguments** to pass.

You can now use this notebook to walk students cell by cell through the entire story.
Encourage them to modify the tools, change the query text, or add new tools to see how
the flow generalizes.

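As a starting point for such experiments, the whole pattern fits in a short loop: ask the model, run any requested tools, feed the results back, and stop when the model answers without tool calls. The sketch below is self-contained and uses a toy stand-in for the model. `run_tool_loop`, `toy_model`, and `max_rounds` are illustrative names, not part of any SDK.

```python
import json
from typing import Callable


def toy_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM: request the echo tool once, then answer from its result."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"Answer: {last['content']}", "tool_calls": []}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {
                    "name": "echo",
                    "arguments": json.dumps({"message": last["content"]}),
                },
            }
        ],
    }


def run_tool_loop(
    user_query: str,
    model: Callable[[list[dict]], dict],
    registry: dict[str, Callable[..., str]],
    max_rounds: int = 5,
) -> list[dict]:
    """Run the model/tool loop and return the full message history."""
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_rounds):  # guard against a model that never stops calling tools
        reply = model(messages)
        messages.append(reply)
        if not reply.get("tool_calls"):
            break  # no tool requested: this reply is the final answer
        for call in reply["tool_calls"]:
            tool_fn = registry[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"] or "{}")
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": tool_fn(**args)}
            )
    return messages


history = run_tool_loop("hello", toy_model, {"echo": lambda message: f"Echo: {message}"})
for m in history:
    print(m["role"], "->", m.get("content"))
```

Swapping `toy_model` for a real API call and the registry lookup for an MCP `call_tool` request would leave the shape of the loop unchanged.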