# Fully offline Agent!

_(no colab for this one since the point is to be run entirely locally)_

This tutorial shows how to run an agent fully locally and offline, i.e., with a local LLM and a local MCP server, so that no data leaves your machine!

This can be especially useful for privacy-sensitive applications or when you want to avoid any cloud dependencies.

In this example, we will showcase how to let an agent read and write in your local filesystem! Specifically, we will give the agent access to our codebase so that it can read the files and generate a README file that describes the project.

## Install Dependencies

any-agent uses the Python `asyncio` module to support async functionality. When running in Jupyter notebooks, this means we need to enable nested event loops. Below, we install any-agent and enable nested event loops using `nest_asyncio`.

In [None]:
%pip install 'any-agent[smolagents]' --quiet

import nest_asyncio

nest_asyncio.apply()

## Set up your own LLM locally

Regardless of which agent framework you choose in any-agent, all of them support LiteLLM, a proxy that lets us use any LLM inside the framework, whichever provider hosts it. For example, we could use a local model via llama.cpp or [llamafile](https://github.com/Mozilla-Ocho/llamafile), a Google-hosted Gemini model, or an AWS Bedrock-hosted Llama model. For this example, we will use [Ollama](https://ollama.com/) to run our LLM locally!



### Ollama setup

First, install Ollama by following their instructions: https://ollama.com/download

### Picking an LLM

Pick a model that your hardware can run locally; you will run it from your terminal in a later step. For example:

16-24GB RAM -> `granite3.3` or `deepseek-r1:8b` with ~20–35k context length

24+GB RAM -> `mistral-small3.2` or `devstral:24b` with ~40k+ context length


### Serving the model with the appropriate context length

By default, Ollama has a context length of 8192 tokens, which is not enough for our agent to work properly.
We therefore need to start Ollama with a larger context length, which we set through an environment variable. In a terminal, run:

```
OLLAMA_CONTEXT_LENGTH=24000 OLLAMA_DEBUG=1 ollama serve
```

All four of the models above have a maximum context length of 128k tokens, but setting the context length to 128k may cause memory issues if you have limited RAM. For this example, we will set it to 24,000 tokens and provide a relatively small codebase.
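To get a feel for whether a codebase is "relatively small" compared to the chosen context length, a rough estimate can help. The sketch below is not part of any-agent or Ollama: the helper names `estimate_tokens` and `fits_in_context` are hypothetical, the ratio of about 4 characters per token is only a rule of thumb that varies by model and tokenizer, and the 4,000-token reserve for prompts, tool calls and output is an arbitrary assumption.

```python
from pathlib import Path


def estimate_tokens(directory: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of all text files under a directory.

    Assumption: about 4 characters per token; real tokenizers differ by model.
    """
    total_chars = 0
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        try:
            total_chars += len(path.read_text(encoding="utf-8"))
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return int(total_chars / chars_per_token)


def fits_in_context(
    directory: str, context_length: int = 24_000, reserve: int = 4_000
) -> bool:
    """Return True if the estimate leaves `reserve` tokens free in the context."""
    return estimate_tokens(directory) + reserve <= context_length
```

Calling `fits_in_context("toy_codebase")` before running the agent gives an early hint that the context length may need to be raised, or the codebase trimmed. Keep in mind that the agent does not necessarily load every file at once, so this is a conservative estimate.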

### Load and run the model

In this tutorial, we will run `granite3.3` in a terminal, in parallel to our notebook. In another terminal, run:

```
ollama run granite3.3
```

For more information on setting environment variables in Ollama, please refer to their [documentation](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server).
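Before moving on, it can be useful to confirm from the notebook that the Ollama server is reachable and that the model has been downloaded. The following is a minimal sketch that assumes Ollama is listening on its default address `http://localhost:11434` and queries its `/api/tags` endpoint, which lists locally available models. The helper name `list_ollama_models` is hypothetical.

```python
import json
import urllib.request


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 3.0):
    """Return the names of locally available models, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers invalid JSON.
        return None
    return [model["name"] for model in payload.get("models", [])]
```

If `list_ollama_models()` returns `None`, the server is probably not running; if it returns a list without a `granite3.3` entry, the model has likely not been pulled yet.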

## Configure the Agent and the Tools

Instead of giving the agent read-write access to our whole filesystem, we will limit its scope to a single path that it's allowed to work in, which we will later pass as an argument to the filesystem tool.

In [None]:
from pathlib import Path

codebase_directory = "toy_codebase"
abs_path = str(Path(codebase_directory).resolve())
print(f"Codebase directory set: {abs_path}")
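Note that `Path.resolve()` does not check that the directory exists, while a Docker bind mount created with `--mount` fails if its source path is missing. A small guard, sketched below with the hypothetical helper name `require_directory`, surfaces that problem early with a clearer error message.

```python
from pathlib import Path


def require_directory(path: str) -> str:
    """Return the absolute path if it is an existing directory, else raise."""
    resolved = Path(path).resolve()
    if not resolved.is_dir():
        raise FileNotFoundError(f"Codebase directory not found: {resolved}")
    return str(resolved)
```

For example, `abs_path = require_directory(codebase_directory)` could replace the plain `resolve()` call.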

### Pick which tools to use

Since we want our agent to work fully locally/offline, we will not add any tools that require communication with remote servers. Instead, we will use a local MCP server for secure filesystem operations. We could also simply implement Python callable functions that perform these operations (e.g. using the `os` library), but we opt for an MCP server here to showcase how easy it would be to swap or add other MCP servers for this use case. To ensure that our agent doesn't go rogue and read/write in directories it's not supposed to access, we will run the MCP server in a Docker container that can only access the directory above, which we mount into it. Before running the code below, make sure you have [Docker](https://docs.docker.com/get-started/) running in the background.

In [None]:
from any_agent.config import MCPStdio

docker_destination = "/projects"

mcp_filesystem = MCPStdio(
    command="docker",
    args=[
        "run",
        "-i",
        "--rm",
        "--mount",
        f"type=bind,src={abs_path},dst={docker_destination}",
        "mcp/filesystem",
        docker_destination,
    ],
    tools=[
        "read_file",
        "read_multiple_files",
        "write_file",
        "list_allowed_directories",
        "list_directory",
        "search_files",
        "directory_tree",
    ],  # we only include the tools we need
)
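For comparison, here is a rough sketch of the alternative mentioned earlier: a plain Python callable instead of an MCP server. The factory `make_read_file_tool` is hypothetical and not part of any-agent. It only enforces the directory restriction in-process by resolving paths (it needs Python 3.9+ for `Path.is_relative_to`), so it does not provide the isolation that the Docker container gives.

```python
from pathlib import Path


def make_read_file_tool(allowed_directory: str):
    """Build a read-only file tool restricted to `allowed_directory`."""
    root = Path(allowed_directory).resolve()

    def read_file(relative_path: str) -> str:
        """Read a UTF-8 text file located inside the allowed directory.

        Args:
            relative_path: Path of the file, relative to the allowed directory.
        """
        # Resolve symlinks and ".." segments before checking containment.
        target = (root / relative_path).resolve()
        if not target.is_relative_to(root):
            raise PermissionError(f"Access outside {root} is not allowed")
        return target.read_text(encoding="utf-8")

    return read_file
```

A callable built with `make_read_file_tool(abs_path)` could presumably be listed in the agent's `tools` alongside, or instead of, `mcp_filesystem`. Swapping in a different MCP server, by contrast, only requires changing the configuration.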

Now that your LLM is running in the background (on a local server) and you have defined your tools, you need to pick an agent framework to build your agent. Note that the agent you build with any-agent can run across multiple agent frameworks (smolagents, TinyAgent, OpenAI, etc.) and across various LLMs (Llama, DeepSeek, Mistral, etc.). For this example, we will use the smolagents framework.

In [None]:
from any_agent import AgentConfig, AnyAgent
from any_agent.tools import show_plan

# Define the agent
agent = AnyAgent.create(
    "smolagents",
    AgentConfig(
        model_id="ollama/granite3.3",
        instructions="""
        You must use the available tools to find an answer.
        """,
        tools=[mcp_filesystem, show_plan],
        model_args={"tool_choice": "required"},
    ),
)

## Run the Agent


In [None]:
agent_trace = agent.run(
    "Go through the files in the allowed directory and any folder inside it recursively. "
    "Read the content of each file that might contain documentation or code and then "
    "create a summary in markdown format that summarizes what this project is about "
    "and then write it in a README.md file in the same directory."
)

## View the results

The `agent.run` method returns an `AgentTrace` object, which has a few convenient attributes for displaying information about the run.

In [None]:
print(agent_trace.final_output)  # Final answer
print(f"Duration: {agent_trace.duration.total_seconds():.2f} seconds")
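Since the container's `/projects` directory is a bind mount of the codebase directory, a README written by the agent should also appear on the host. The sketch below checks for it and previews its beginning. The helper name `preview_readme` is hypothetical, and the check assumes the agent followed the prompt and named the file `README.md`.

```python
from pathlib import Path


def preview_readme(directory: str, max_chars: int = 500):
    """Return the first `max_chars` characters of README.md, or None if missing."""
    readme = Path(directory) / "README.md"
    if not readme.is_file():
        return None
    return readme.read_text(encoding="utf-8")[:max_chars]
```

Running `print(preview_readme(abs_path))` prints `None` when no README was written, which is a quick signal that the run did not complete the task.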