# Part 1 - Getting started

*Choose either Ollama or Google AI Studio setup*

## Ollama (Fully Local)

* Go to [https://ollama.com/](https://ollama.com/), download the binary, and install it.
* If prompted, install command line tools.
* If you have trouble with installation, go to the [Ollama GitHub repository](https://github.com/ollama/ollama) or [documentation](https://ollama.readthedocs.io/en/quickstart/) for more specific and up-to-date instructions.


### Downloading the Gemma model

Depending on your hardware, download one of the following models. We recommend 12b: it is quite capable, yet much smaller and faster than 27b.
Today's workshop uses Q4-quantised models with relatively small inputs, so we suggest the following amounts of VRAM:

| Model                            | VRAM  |
|:---------------------------------|-------|
| **PetrosStav/gemma3-tools:4b**   | 8 GB  |
| **PetrosStav/gemma3-tools:12b**  | 16 GB |
| **PetrosStav/gemma3-tools:27b**  | 32 GB |
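
As a rough sanity check on the table above, the weights of a model stored at `b` bits per parameter take about `parameters × b / 8` bytes. The sketch below applies that arithmetic. The function name and the 4-bit default are illustrative assumptions; real Q4 files use slightly more than 4 bits per weight, and the KV cache and runtime overhead come on top, which is why the suggested VRAM figures leave plenty of headroom.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the quantised weights in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Weights only: this ignores the KV cache, activations, and runtime overhead.
for size in (4, 12, 27):
    print(f"{size}b: ~{estimate_weight_gb(size):.1f} GB of weights")
```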

### Testing the Gemma model

```
ollama pull <model-name>
ollama run <model-name>
```

Ask the model to tell a joke, for example, and check that you get a response. Voilà, everything works!

## Google AI Studio (API-based)

* Go to [https://aistudio.google.com/](https://aistudio.google.com/) and sign in with your Google Account.
* Click “Get API key” in the top-right corner of the page.
* Click “Create API key” in the top-right corner of the page and then choose any available project in the dialog that appears.
* Once the key is generated, copy it and make it available to this notebook as the `AI_STUDIO_KEY` environment variable, which the client setup below reads.
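
The client cell further down reads the key from `os.environ["AI_STUDIO_KEY"]`. One way to provide it without hard-coding the key in the notebook is sketched below. The helper `ensure_api_key` is hypothetical, not part of any library; it prompts with `getpass` only when the variable is not already set.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_api_key(
    var_name: str = "AI_STUDIO_KEY",
    ask: Callable[[str], str] = getpass,
) -> str:
    """Return the key from the environment, prompting for it once if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        key = ask(f"Paste your {var_name}: ").strip()
        if not key:
            raise ValueError(f"{var_name} is empty")
        # Store it for the rest of the session so later cells can read it
        os.environ[var_name] = key
    return key
```

Run `ensure_api_key()` once before creating the client; the key then lives only in the current process environment.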

## Let's get coding

Create a virtual environment (or not) and install required dependencies:

In [None]:
!pip install openai mcp langfuse pydantic nest_asyncio

In [None]:
import json
import os
from openai import OpenAI
from openai.types.chat import ChatCompletionMessage, ChatCompletionMessageToolCall, ChatCompletionToolParam
from openai.types.chat.chat_completion_message_tool_call import Function
import subprocess
import re

Now let us initialize a client of your choice and test that it works:

In [None]:
### Local Ollama setup; make sure `ollama serve` is running beforehand
# MODEL_NAME = "PetrosStav/gemma3-tools:12b"
# client = OpenAI(
#     base_url='http://localhost:11434/v1',
#     api_key='ollama',  # unused for ollama
# )

### If you could not install Ollama, or it runs too slowly, you can continue the workshop with the Gemini model
MODEL_NAME = "gemini-2.0-flash"
client = OpenAI(
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key=os.environ["AI_STUDIO_KEY"],
)

In [None]:
response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "user", "content": "What is the determinant of [[1,2],[3,4]]?"}
    ]
).choices[0].message.content

print(response)

# Part 2 - Tool Calling

In [None]:
SYSTEM_MESSAGE = """You are a chatbot assisting the developer in day-to-day tasks."""

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": "List files in ~/Dev"}
    ]
)

print(json.dumps(response.choices[0].message.model_dump(), indent=4))

In [None]:
list_directory_tool_description = {
    'type': 'function',
    'function': {
        'name': 'list_directory',
        'description': 'List the files in a directory',
        'parameters': {
            'type': 'object',
            'properties': {
                'directory': {
                    'type': 'string',
                    'description': 'The path to the directory',
                },
            },
            'required': ['directory'],
        }
    },
}

SYSTEM_MESSAGE = """
You are a chatbot assisting the developer in day-to-day tasks.
"""

response = client.chat.completions.create(
    model=MODEL_NAME,
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": "List files in ~/Dev"}
    ],
    tools=[list_directory_tool_description]
)

print(json.dumps(response.choices[0].message.model_dump(), indent=4))
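
Note that the response above only *requests* a call to `list_directory`; nothing has been executed yet. Running the tool is the caller's job. A minimal sketch of that step follows, assuming a local `list_directory` implementation and a hypothetical `dispatch_tool_call` helper (neither is provided by the `openai` package).

```python
import json
import os


def list_directory(directory: str) -> str:
    """Local implementation of the tool described to the model."""
    path = os.path.expanduser(directory)  # so that "~/Dev" resolves to the home directory
    try:
        return "\n".join(sorted(os.listdir(path)))
    except OSError as e:
        return f"Error: {e}"


# Maps tool names from the tool descriptions to Python callables
TOOL_REGISTRY = {"list_directory": list_directory}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Look up the requested tool and call it with the JSON-encoded arguments."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    try:
        arguments = json.loads(arguments_json)
        return func(**arguments)
    except json.JSONDecodeError as e:
        return f"Error: invalid JSON arguments ({e})"
    except TypeError as e:
        # Arguments were not an object, or did not match the function signature
        return f"Error: bad arguments ({e})"
```

With the `response` from the cell above, each entry of `response.choices[0].message.tool_calls` can be passed on as `dispatch_tool_call(tool_call.function.name, tool_call.function.arguments)`.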

# Part 3 - CLI Agent

## Some miscellaneous setup

In [None]:
UNSAFE_COMMANDS = [
    "rm",
    "rmdir",
    "mv",
    "cp",
    "dd",
    "del",
    "sudo",
    "su",
    ">",
    ">>",
    "2>",
    "2>>",
    "chmod",
    "chown",
    "shutdown",
    "reboot",
    "curl",
    "wget",
]


def execute_command(command: str) -> str:
    """
    Execute a shell command and return the output.

    If the command contains any unsafe commands, ask for user approval first.

    Args:
        command: The shell command to execute

    Returns:
        The command output or an error message
    """
    # Check if command contains any unsafe commands
    needs_approval = False
    for unsafe in UNSAFE_COMMANDS:
        if unsafe in command.split():
            needs_approval = True
            break

    if needs_approval:
        approval = input(
            f"\nWARNING: The command '{command}' contains potentially dangerous operations.\nDo you want to proceed? (y/n): "
        ).lower()
        if approval != "y":
            return "Command execution cancelled by user."

    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,
        )

        if result.returncode == 0:
            return result.stdout
        else:
            return f"Command exited with code {result.returncode}, stderr:\n{result.stderr}"
    except subprocess.TimeoutExpired:
        return "Command timed out after 30 seconds."
    except Exception as e:
        return f"Error executing command: {str(e)}"


def try_to_parse_tool_call(
    message: ChatCompletionMessage,
) -> ChatCompletionMessageToolCall | None:
    # content is None when the model returns no text at all
    if not message.content or "```tool_call" not in message.content:
        return None

    tool_call_match = re.search(
        r"```tool_call(.*?)```", message.content, re.DOTALL
    ) or re.search(r"```tool_call(.*?)</tool_call>`", message.content, re.DOTALL)

    if not tool_call_match:
        return None

    tool_call_json = tool_call_match.group(1)
    
    try:
        tool_call_data = json.loads(tool_call_json)
    except json.JSONDecodeError:
        print(f"Error parsing tool call: {tool_call_json}")
        return None
    
    if "parameters" not in tool_call_data and "arguments" in tool_call_data:
        tool_call_data["parameters"] = tool_call_data["arguments"]
    
    return ChatCompletionMessageToolCall(
        id="random_id",
        type="function",
        function=Function(
            name=tool_call_data["name"],
            # Function.arguments must be a JSON string, not a dict
            arguments=json.dumps(tool_call_data["parameters"]),
        ),
    )
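
The approval check in `execute_command` compares whole whitespace-separated tokens, so it misses commands such as `echo hi >out.txt`, `ls;rm -rf x`, or `/bin/rm file`. The sketch below is one possible stricter heuristic. The name `needs_approval` and the `UNSAFE_PROGRAMS` set are assumptions for this example. It remains a heuristic rather than a sandbox, and it errs on the side of asking (for instance, `grep rm notes.txt` is also flagged).

```python
import os
import re
import shlex

UNSAFE_PROGRAMS = {
    "rm", "rmdir", "mv", "cp", "dd", "del", "sudo", "su",
    "chmod", "chown", "shutdown", "reboot", "curl", "wget",
}


def needs_approval(command: str) -> bool:
    """Return True if the command looks like it could modify the system."""
    # Any output redirection, with or without surrounding spaces
    if ">" in command:
        return True
    # Separate commands glued together with ;, &, |, or parentheses
    spaced = re.sub(r"[;&|()]", " ", command)
    try:
        tokens = shlex.split(spaced)
    except ValueError:
        # Unbalanced quotes: we cannot tokenize reliably, so ask the user
        return True
    # Compare base names so that /bin/rm is treated like rm
    return any(os.path.basename(token) in UNSAFE_PROGRAMS for token in tokens)
```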

## Writing tool description

In [None]:
command_tool_description: ChatCompletionToolParam = {
    "type": "function",
    "function": {
        "name": "execute_command",
        "description": "Execute a shell command and return the output",
        "parameters": {
            "type": "object",
            "properties": {
                "command": {
                    "type": "string",
                    "description": "The shell command to execute",
                },
            },
            "required": ["command"],
        },
    },
}


def convert_tool_call_to_command(tool_call: ChatCompletionMessageToolCall) -> str:
    return " ".join(json.loads(tool_call.function.arguments).values())

## Writing system message and stopping criteria

In [None]:
SYSTEM_MESSAGE = """
You are a CLI agent.
Your job is to interpret user requests and complete them by executing appropriate shell commands.
You can only interact with the system by calling the tool `execute_command`, which runs a shell command and returns the output.

Instructions:
- Translate the user's request into one or more shell commands.
- For each shell command, call `execute_command`.
- For each shell command call, provide a short (1-2 sentences) description of why you are calling it.
- Once all necessary commands have been executed, answer the user's query and finish with the word `DONE`.
- If you need to write something to a file, use the following syntax: `cat > filename.txt <<EOF\ncontent\nEOF`

Examples of tasks:
- If the user asks to clone a GitHub repository, call `execute_command` with a `git clone` command.
- If the user wants to install packages, call `execute_command` with a `pip install` or appropriate package manager command.
- If the task involves multiple steps (e.g., creating a directory and then initializing a Git repo inside it), call `execute_command` for each step.
"""

def is_done(message: ChatCompletionMessage) -> bool:
    return not message.tool_calls and (message.content or "").strip().endswith("DONE")

## Agent loop using your tool descriptions and system message

In [None]:
class CLIAgent:
    """CLI agent that uses an LLM to process and execute commands."""

    def __init__(
            self,
            model: str,
            client: OpenAI,
            tools: list[ChatCompletionToolParam],
            system_message: str,
            max_steps: int = 10,
    ):
        """
        Initialize the CLI Agent.

        Args:
            model: The name of the model to use
            client: The OpenAI client to use for chat completions
            tools: Tool descriptions passed to the model on every request
            system_message: The system message to use for the agent
            max_steps: Maximum number of steps in a conversation
        """
        self.model = model
        if self.model.startswith("PetrosStav"):
            self.system_message = "YOU MUST DELIMIT TOOL CALLS WITH <tool_call>...</tool_call> tags" + system_message
        else: 
            self.system_message = system_message
        self.max_steps = max_steps
        self.client = client
        self.tools = tools
        self.tool_names = {tool["function"]["name"] for tool in tools}

    def run_agent(self, user_input: str) -> str:
        """
        Run the agent with the given user input.

        Args:
            user_input: The user's input message

        Returns:
            The final response from the agent
        """
        messages = [
            {"role": "system", "content": self.system_message},
            {"role": "user", "content": user_input},
        ]

        for step in range(self.max_steps):
            try:
                response = self.client.chat.completions.create(
                    model=self.model,
                    messages=messages,
                    tools=self.tools,
                )

                message = response.choices[0].message

                # content can be None when the reply consists only of tool calls
                print(f"\n🤖 Step {step + 1}: \033[36m{(message.content or '').rstrip()}\033[0m")

                if step > 0 and is_done(message):
                    print(f"✨ \033[32mRequest fulfilled after {step + 1} steps.\033[0m")
                    return message.content
                
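                if not message.tool_calls:
                    # Fall back to parsing a tool call embedded in the text content
                    parsed_tool_call = try_to_parse_tool_call(message)
                    message.tool_calls = [parsed_tool_call] if parsed_tool_call else None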

                messages.append(message)

                if not message.tool_calls:
                    print("⚠️ \033[33mNo tool calls found\033[0m")
                    continue

                for tool_call in message.tool_calls:
                    if tool_call.function.name in self.tool_names:
                        try:
                            command = convert_tool_call_to_command(tool_call)

                            print(f"\n🔧 \033[35mExecuting command:\033[0m \033[1m{command}\033[0m")
                            tool_result = execute_command(command)
                            print(f"📝 \033[34mTOOL RESULT:\033[0m {tool_result}")

                            messages.append(
                                {
                                    "role": "tool",
                                    "tool_call_id": tool_call.id,
                                    "name": tool_call.function.name,
                                    "content": tool_result,
                                }
                            )
                        except json.JSONDecodeError:
                            error_msg = "❌ \033[31mError: Invalid JSON in tool arguments\033[0m"
                            messages.append(
                                {
                                    "role": "tool",
                                    "tool_call_id": tool_call.id,
                                    "name": tool_call.function.name,
                                    "content": error_msg,
                                }
                            )
                    else:
                        print(f"⚠️ \033[33mUnknown tool call {tool_call.function.name}. Skipping.\033[0m")

            except Exception as e:
                # `message` may be unbound if the request itself failed, so report only the error
                print(f"❌ \033[31mError during step {step + 1}: {str(e)}\033[0m")
                return f"An error occurred: {str(e)}"

        print(
            f"⚠️ \033[33mReached maximum number of steps ({self.max_steps}). Request may not be completely fulfilled.\033[0m"
        )
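        last_message = messages[-1]
        # Tool results are plain dicts, while model replies are ChatCompletionMessage objects
        return last_message["content"] if isinstance(last_message, dict) else last_message.content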

## Let's try it out!

In [None]:
agent = CLIAgent(
    max_steps=10,
    client=client,
    tools=[command_tool_description],
    system_message=SYSTEM_MESSAGE,
    model=MODEL_NAME,
)

while True:
    try:
        user_input = input("> ")

        if not user_input or user_input.lower() in ["exit", "quit"]:
            print("Exiting CLI Agent. Goodbye!")
            break

        agent.run_agent(user_input)
    except KeyboardInterrupt:
        print("\nExiting CLI Agent. Goodbye!")
        break
    except Exception as e:
        raise e

# Part 4 - MCP

In [None]:
import nest_asyncio

nest_asyncio.apply()
from mcp import ClientSession, StdioServerParameters, types
from mcp.client.stdio import stdio_client

WORK_DIR = "."

server_parameters = StdioServerParameters(
    command="python",
    args=["python-filesystem-mcp.py", WORK_DIR],
)

async with stdio_client(server_parameters) as (read, write):
    async with ClientSession(read, write) as session:
        await session.initialize()

        tool_result = await session.list_tools()
        tools = [
            {
                "type": "function",
                "function": {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": {
                        k: v
                        for k, v in tool.inputSchema.items()
                        if k not in ["additionalProperties", "$schema"]
                    },
                }}
            for tool in tool_result.tools
        ]

        print("\n\033[1;36m=== 🤖 MCP Agent ===\033[0m")
        print("\033[1;33mAvailable tools:\033[0m")
        for tool in tool_result.tools:
            print(f"\033[1;32m•\033[0m {tool.name}: {tool.description}")

        messages = []
        while True:
            try:
                user_input = input("> ")

                if not user_input or user_input.lower() in ["exit", "quit"]:
                    print("\n\033[1;36mExiting MCP Agent. Goodbye! 👋\033[0m")
                    break

                messages.append({"role": "user", "content": user_input})

                response = client.chat.completions.create(
                    model=MODEL_NAME,
                    messages=messages,
                    tools=tools,
                    tool_choice="auto"
                )

                choice = response.choices[0]

                if not choice.message.tool_calls:
                    # Fall back to parsing a tool call embedded in the text content
                    parsed_tool_call = try_to_parse_tool_call(choice.message)
                    if parsed_tool_call:
                        choice.message.tool_calls = [parsed_tool_call]
                        choice.finish_reason = "tool_calls"

                if choice.finish_reason != "tool_calls":
                    output = choice.message.content
                    messages.append({"role": "assistant", "content": output})
                    print(f"\n\033[1;34m🤖 Assistant:\033[0m {output}")
                    continue

                tool_call = choice.message.tool_calls[0]
                tool_name = tool_call.function.name
                tool_args = json.loads(tool_call.function.arguments)

                print(f"\n\033[1;33m🛠️  Using tool:\033[0m {tool_name}")
                print(f"\033[1;33m📝 Arguments:\033[0m {json.dumps(tool_args, indent=2)}")

                result = await session.call_tool(
                    name=tool_name,
                    arguments=tool_args
                )

                messages.append({
                    "role": "assistant",
                    "tool_calls": [tool_call.model_dump()]
                })
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })

                followup = client.chat.completions.create(
                    model=MODEL_NAME,
                    messages=messages
                )
                final = followup.choices[0].message.content
                messages.append({"role": "assistant", "content": final})
                print(f"\n\033[1;34m🤖 Agent:\033[0m {final}")

            except KeyboardInterrupt:
                print("\n\033[1;36mExiting MCP Agent. Goodbye! 👋\033[0m")
                break
            except Exception as e:
                print(f"\n\033[1;31m❌ Error:\033[0m {e}")
                break

## Extras for self study
- [Awesome MCP servers](https://mcpservers.org)
- [Prompt Engineering Guide](https://www.promptingguide.ai)
- [AI Agents course from Hugging Face](https://huggingface.co/learn/agents-course/unit0/introduction)
- [Koog: framework for writing agents in Kotlin](https://docs.koog.ai)
- [Agent Development Kit](https://google.github.io/adk-docs/)
- [Smolagents](https://huggingface.co/docs/smolagents/index)
- [LangGraph](https://langchain-ai.github.io/langgraph/)
