# MCP Workshop with a Real LLM Agent

This notebook demonstrates the Model Context Protocol (MCP) using a live, lightweight language model (DistilGPT-2) to act as the decision-making agent.

It shows how an LLM can understand a user's request, select the appropriate tool from a list, and format the arguments to execute a function.
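
To make the target concrete, here is a minimal sketch of the kind of tool call the agent is expected to produce. The function `expected_tool_call` and the pattern `CAS_PATTERN` are hypothetical helpers for illustration only; they use a regular expression rather than an LLM, and simply show the `{"tool": ..., "arguments": ...}` shape that the agent code later in this notebook checks for, using the tool and parameter names registered in the final cell.

```python
import re
from typing import Any, Dict, Optional

# Loose pattern for CAS-like identifiers such as "64-17-5". This is an
# assumption made for the sketch; it does not verify the CAS check digit.
CAS_PATTERN = re.compile(r"\b\d{2,7}-\d{2}-\d\b")

def expected_tool_call(user_query: str) -> Optional[Dict[str, Any]]:
    """Hypothetical rule-based stand-in for the tool call an LLM should emit."""
    cas_numbers = CAS_PATTERN.findall(user_query)
    if len(cas_numbers) == 1:
        return {"tool": "get_mw", "arguments": {"cas_number": cas_numbers[0]}}
    if len(cas_numbers) == 2:
        return {
            "tool": "compare_mw",
            "arguments": {"cas_a": cas_numbers[0], "cas_b": cas_numbers[1]},
        }
    return None  # No clear mapping from the query to a tool.

print(expected_tool_call("What is the molecular weight for CAS 64-17-5?"))
print(expected_tool_call("Which one is heavier, CAS 7732-18-5 or CAS 67-56-1?"))
```

A rule like this only covers the two query shapes used in this workshop; the point of the LLM agent is to handle phrasing that fixed rules would miss.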

In [None]:
# This cell installs the Hugging Face transformers library (to load the LLM)
# and PyTorch (its backend) if they are missing, then imports all required modules.
# In a Colab notebook, run this cell first.
# Note: `from __future__` imports must be the first statement in the cell.
from __future__ import annotations

import json
import re
import subprocess
import sys
import uuid
from typing import Any, Callable, Dict, List, Optional, Tuple

try:
    import transformers
    import torch
except ImportError:
    print("Installing required libraries: transformers, torch. Please wait...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "transformers", "torch"])
    print("Installation complete.")

from transformers import pipeline, set_seed


# 2) Define Tools (The "Coffee Machines")
These are the backend functions the agent can choose to call.

In [None]:
# These are the backend functions the agent can choose to call.
CAS_DB = {
    "7732-18-5": {"name": "Water", "mw": 18.015},
    "64-17-5":   {"name": "Ethanol", "mw": 46.07},
    "67-56-1":   {"name": "Methanol", "mw": 32.04},
}

def tool_get_mw(cas_number: str) -> Dict[str, Any]:
    """Return molecular weight by CAS number."""
    data = CAS_DB.get(cas_number)
    if not data:
        raise ValueError(f"CAS not found: {cas_number}")
    return {"cas": cas_number, "name": data["name"], "mw": data["mw"]}

def tool_compare_mw(cas_a: str, cas_b: str) -> Dict[str, Any]:
    """Compare molecular weights for two CAS numbers."""
    a = CAS_DB.get(cas_a)
    b = CAS_DB.get(cas_b)
    if not a or not b:
        missing = [c for c, d in [(cas_a, a), (cas_b, b)] if not d]
        raise ValueError(f"Missing CAS: {missing}")
    heavier = cas_a if a["mw"] > b["mw"] else cas_b
    return {
        "cas_a": cas_a, "mw_a": a["mw"],
        "cas_b": cas_b, "mw_b": b["mw"],
        "heavier": heavier
    }
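
Before wiring the tools into a server, it can help to call one directly and see both the success and the failure path. The sketch below repeats a condensed copy of `CAS_DB` and `tool_get_mw` so that it runs by itself; the variable names `result` and `error_message` are only for illustration.

```python
from typing import Any, Dict

# Condensed copy of the lookup table and tool so the example is standalone.
CAS_DB = {
    "7732-18-5": {"name": "Water", "mw": 18.015},
    "64-17-5": {"name": "Ethanol", "mw": 46.07},
}

def tool_get_mw(cas_number: str) -> Dict[str, Any]:
    """Return molecular weight by CAS number."""
    data = CAS_DB.get(cas_number)
    if not data:
        raise ValueError(f"CAS not found: {cas_number}")
    return {"cas": cas_number, "name": data["name"], "mw": data["mw"]}

# Success path: a known CAS number returns a plain dict.
result = tool_get_mw("64-17-5")
print(result)

# Failure path: an unknown CAS number raises ValueError.
try:
    tool_get_mw("999-99-9")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The tools signal failure by raising an exception, which leaves it to the caller to decide how the error is reported.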


# 3) MCP Server (The Chef)
This class manages the tool registry and executes calls. It's the central hub that holds the "menu" of tools.

In [None]:
# This class manages the tool registry and executes calls.
class MCPServer:
    def __init__(self):
        self.registry: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, func: Callable, description: str, parameters: Dict[str, str]):
        self.registry[name] = {"name": name, "func": func, "description": description, "parameters": parameters}

    def list_tools(self) -> List[Dict[str, Any]]:
        return [{"name": meta["name"], "description": meta["description"], "parameters": meta["parameters"]} for meta in self.registry.values()]

    def _validate(self, tool: str, args: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
        meta = self.registry.get(tool)
        if not meta:
            return False, f"Unknown tool: {tool}"
        for k in meta["parameters"]:
            if k not in args:
                return False, f"Missing parameter: {k}"
        return True, None

    def execute(self, request: Dict[str, Any]) -> Dict[str, Any]:
        req_id = request.get("id", str(uuid.uuid4()))
        tool, arguments = request.get("tool"), request.get("arguments", {})
        ok, err = self._validate(tool, arguments)
        if not ok:
            return {"id": req_id, "status": "error", "error": err}
        try:
            output = self.registry[tool]["func"](**arguments)
            return {"id": req_id, "status": "ok", "output": output}
        except Exception as e:
            return {"id": req_id, "status": "error", "error": str(e)}
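
The server's `_validate` method only checks that each declared parameter is present. As a rough sketch of how validation could go one step further, the hypothetical helper below also checks the declared type label and rejects unexpected arguments. `validate_arguments` and `TYPE_MAP` are assumptions for illustration and are not part of the notebook's `MCPServer`.

```python
from typing import Any, Dict, Optional

# Assumed mapping from the type labels used at registration time to Python
# types. Only "string" appears in this notebook; other labels are skipped.
TYPE_MAP = {"string": str}

def validate_arguments(parameters: Dict[str, str], args: Dict[str, Any]) -> Optional[str]:
    """Return an error message, or None when args match the declared parameters."""
    for name, type_label in parameters.items():
        if name not in args:
            return f"Missing parameter: {name}"
        expected = TYPE_MAP.get(type_label)
        if expected is not None and not isinstance(args[name], expected):
            return f"Parameter {name} should be of type {type_label}"
    # Arguments the tool never declared would otherwise fail inside the call.
    extra = sorted(set(args) - set(parameters))
    if extra:
        return f"Unexpected parameters: {extra}"
    return None

schema = {"cas_number": "string"}
print(validate_arguments(schema, {"cas_number": "64-17-5"}))
print(validate_arguments(schema, {}))
print(validate_arguments(schema, {"cas_number": 64}))
print(validate_arguments(schema, {"cas_number": "64-17-5", "unit": "g/mol"}))
```

Catching these cases before the tool runs gives the agent a clearer error than the exception message the tool function would otherwise raise.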


# 4) MCP Client (The "Waiter")
This class provides a simple interface for the Agent to discover and call tools without needing to know the server's internal logic.

In [None]:
# This class provides a simple interface for calling tools.
class MCPClient:
    def __init__(self, server: MCPServer):
        self.server = server

    def discover(self) -> List[Dict[str, Any]]:
        return self.server.list_tools()

    def call(self, tool: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
        request = {"id": str(uuid.uuid4()), "tool": tool, "arguments": arguments}
        return self.server.execute(request)
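
The request dict built by `MCPClient.call` is a simplified stand-in for a wire format. As a rough sketch, assuming the JSON-RPC 2.0 framing and the `tools/call` method name described in the public MCP specification, the same request could be mapped as follows. The helper `to_jsonrpc_tool_call` is hypothetical and is not used elsewhere in this notebook.

```python
import json
from typing import Any, Dict

def to_jsonrpc_tool_call(request: Dict[str, Any]) -> Dict[str, Any]:
    """Map this notebook's request dict to a JSON-RPC 2.0 style message.

    Assumes the tool invocation method is named "tools/call" and that the
    tool name and arguments travel under "params".
    """
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "method": "tools/call",
        "params": {"name": request["tool"], "arguments": request["arguments"]},
    }

# Same shape as the dict that MCPClient.call builds, with a fixed id.
request = {"id": "demo-1", "tool": "get_mw", "arguments": {"cas_number": "64-17-5"}}
message = to_jsonrpc_tool_call(request)
print(json.dumps(message, indent=2))
```

The in-process classes in this workshop skip transport and serialization entirely, so this mapping is only meant to show where the simplified request would fit in a fuller setup.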


# 5) The LLM Agent (The "AI Decision-Maker")
This is the new Agent, powered by a real LLM. It builds a prompt, asks the LLM to choose a tool, and then parses the response to execute the tool call.
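
One thing worth experimenting with is the prompt itself. The sketch below is a hypothetical variant, `build_few_shot_prompt`, that names the two expected JSON keys explicitly and includes one worked example before the real query. It is an illustration of the prompting idea only; a small base model such as distilgpt2 may still fail to follow it.

```python
import json
from typing import Any, Dict, List

def build_few_shot_prompt(tools: List[Dict[str, Any]], user_query: str) -> str:
    """Build a prompt that states the output schema and shows one example."""
    # Hypothetical worked example included in the prompt.
    example_call = {"tool": "get_mw", "arguments": {"cas_number": "7732-18-5"}}
    lines = [
        "Convert the user query into a single JSON tool call.",
        'The JSON must have exactly two keys: "tool" and "arguments".',
        "",
        "AVAILABLE TOOLS:",
        json.dumps(tools),
        "",
        'USER QUERY: "What is the molecular weight of CAS 7732-18-5?"',
        "TOOL CALL JSON: " + json.dumps(example_call),
        "",
        f'USER QUERY: "{user_query}"',
        "TOOL CALL JSON:",
    ]
    return "\n".join(lines)

tools = [
    {"name": "get_mw", "description": "Get molecular weight by CAS number.",
     "parameters": {"cas_number": "string"}},
]
prompt = build_few_shot_prompt(tools, "What is the molecular weight for CAS 64-17-5?")
print(prompt)
```

Ending the prompt on `TOOL CALL JSON:` keeps the same completion-style cue as the agent's own prompt, so the two can be swapped and compared.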

In [None]:
# This is the new Agent, powered by a real LLM.
class LLMAgent:
    def __init__(self, client: MCPClient):
        self.client = client
        print("Initializing LLM Agent. This may take a moment to download the model...")
        # Use a text-generation pipeline from Hugging Face.
        # distilgpt2 is a small, fast model suitable for this demo.
        self.generator = pipeline('text-generation', model='distilgpt2')
        set_seed(42) # for reproducibility
        print("LLM Agent ready.")

    def _build_prompt(self, user_query: str) -> str:
        """Creates a prompt for the LLM to select and format a tool call."""
        tool_list = self.client.discover()
        formatted_tools = json.dumps(tool_list, indent=2)
        return f"""
You are an expert agent that converts user requests into tool calls.
Based on the user's query, select the single best tool from the available tools list and format the arguments as a JSON object.
Your response MUST be only the JSON object for the tool call, and nothing else.

USER QUERY: "{user_query}"

AVAILABLE TOOLS:
{formatted_tools}

TOOL CALL JSON:
"""

    def _parse_llm_output(self, text: str) -> Optional[Dict[str, Any]]:
        """Extracts the JSON object from the LLM's raw output."""
        match = re.search(r'\{.*\}', text, re.DOTALL)
        if match:
            try:
                return json.loads(match.group(0))
            except json.JSONDecodeError:
                return None
        return None

    def run(self, user_query: str) -> str:
        """The main agent loop: prompt -> LLM -> parse -> call."""
        prompt = self._build_prompt(user_query)
        raw_output = self.generator(prompt, max_length=512, num_return_sequences=1)[0]['generated_text']
        llm_response_text = raw_output[len(prompt):]

        print(f"\nDEBUG: LLM raw response -> {llm_response_text}")

        tool_call = self._parse_llm_output(llm_response_text)

        if not tool_call or "tool" not in tool_call or "arguments" not in tool_call:
            return f"Error: The LLM failed to generate a valid tool call. Response was: {llm_response_text}"

        tool_name = tool_call["tool"]
        arguments = tool_call["arguments"]
        print(f"INFO: LLM decided to call tool '{tool_name}' with args {arguments}")

        response = self.client.call(tool_name, arguments)

        if response["status"] == "ok":
            return f"Success! Tool '{tool_name}' output:\n{json.dumps(response['output'], indent=2)}"
        else:
            return f"Error from tool '{tool_name}': {response['error']}"
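
The parser above uses a greedy regular expression, which captures everything from the first `{` to the last `}` and therefore fails when the model adds stray braces after an otherwise valid object. A minimal alternative sketch, using the standard library's `json.JSONDecoder.raw_decode`, tries each `{` in turn and returns the first object that parses. The function name `extract_first_json_object` is hypothetical.

```python
import json
from typing import Any, Dict, Optional

def extract_first_json_object(text: str) -> Optional[Dict[str, Any]]:
    """Return the first valid JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object here; try the next opening brace.
            start = text.find("{", start + 1)
    return None

noisy = 'junk {"tool": "get_mw", "arguments": {"cas_number": "64-17-5"}} more } text'
print(extract_first_json_object(noisy))
print(extract_first_json_object("[\n\n{\n\n\n"))
```

This only makes extraction more tolerant of noise; the agent still needs to check that the parsed object contains the `tool` and `arguments` keys.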


# 6) Wire It Up and Run the Simulation!
This final cell initializes all the components and runs a few test cases to demonstrate the agent in action.

In [None]:
# This block will only run when the script is executed directly.
if __name__ == "__main__":
    # Setup server and register tools
    server = MCPServer()
    server.register("get_mw", tool_get_mw, "Get molecular weight by CAS number.", {"cas_number": "string"})
    server.register("compare_mw", tool_compare_mw, "Compare molecular weights for two CAS numbers.", {"cas_a": "string", "cas_b": "string"})

    # Setup client
    client = MCPClient(server)

    # Initialize the AI Agent
    agent = LLMAgent(client)

    # --- Let's test the agent ---
    print("\n" + "="*50)
    print("Test Case 1: Simple lookup")
    query1 = "What is the molecular weight for CAS 64-17-5?"
    print(f"USER QUERY: {query1}")
    print(agent.run(query1))
    print("="*50)

    print("\n" + "="*50)
    print("Test Case 2: Comparison")
    query2 = "Which one is heavier, CAS 7732-18-5 or CAS 67-56-1?"
    print(f"USER QUERY: {query2}")
    print(agent.run(query2))
    print("="*50)

    print("\n" + "="*50)
    print("Test Case 3: Invalid Input")
    query3 = "What's the MW of CAS 999-99-9?"
    print(f"USER QUERY: {query3}")
    print(agent.run(query3))
    print("="*50)


Initializing LLM Agent. This may take a moment to download the model...


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=512) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


LLM Agent ready.

Test Case 1: Simple lookup
USER QUERY: What is the molecular weight for CAS 64-17-5?





DEBUG: LLM raw response -> [

"Get molecular weight by CAS number.",
   "parameters": {
      "cas_b": "string"
     }
]
]
]
[... remaining output consists only of `]` and blank lines ...]
Error: The LLM failed to generate a valid tool call. Response was: [

"Get molecular weight by CAS number.",
   "parameters": {
      "cas_b": "string"
     }
]
]
]
[... remaining output consists only of `]` and blank lines ...]

Test Case 2: Comparison
USER QUERY: Which one is heavier, CAS 7732-18-5 or CAS 67-56-1?





DEBUG: LLM raw response -> [

{
[... remaining output consists only of blank lines ...]
Error: The LLM failed to generate a valid tool call. Response was: [

{
[... remaining output consists only of blank lines ...]

Test Case 3: Invalid Input
USER QUERY: What's the MW of CAS 999-99-9?

DEBUG: LLM raw response -> [

{
[... remaining output consists only of blank lines ...]
Error: The LLM failed to generate a