██████╗ ██╗███████╗██████╗ █████╗ ████████╗ ██████╗██╗ ██╗
██╔══██╗██║██╔════╝██╔══██╗██╔══██╗╚══██╔══╝██╔════╝██║ ██║
██║ ██║██║███████╗██████╔╝███████║ ██║ ██║ ███████║
██║ ██║██║╚════██║██╔═══╝ ██╔══██║ ██║ ██║ ██╔══██║
██████╔╝██║███████║██║ ██║ ██║ ██║ ╚██████╗██║ ██║
╚═════╝ ╚═╝╚══════╝╚═╝ ╚═╝ ╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝
A local AI agent harness written in Python, built on Ollama, with tool calling, streaming, and a persistent memory system.
Dispatch does not intend to compete with Claude Code, DeepAgents, OpenCode, or other famous CLIs.
It is a tool I built for the love of the game: an easy-to-understand, easy-to-modify, lightweight, local CLI agent that you can study to understand how popular agentic systems and famous agentic AI CLIs work.
- Index - This section
- Installation - Setup and prerequisites
- Project Structure - Directory tree overview
- How It Works - Boot and main loop explained
- Tool Registry - Available tools
- Slash Command Registry - All slash commands
- What's Next - Planned features
- How to build on top of Dispatch - Adding custom tools/commands
- License
Developed and tested on macOS and Linux. Windows should work but is untested.
Before installing Dispatch, ensure you have:
1. Ollama installed and running
Download from ollama.com or install via package manager:

```bash
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows
# Download installer from https://ollama.com/download
```

Start the server:

```bash
ollama serve
```

2. A model that supports tool calling

```bash
# Recommended options:
ollama pull qwen3.5:9b   # Fast, excellent tool calling
ollama pull gemma4:e4b   # Strong reasoning
```

Install from PyPI:

```bash
pip install dispatch-agent
dispatch
```

Or install from source:

```bash
git clone https://github.com/santiagomora2/dispatch.git
cd dispatch
uv tool install --editable .
```

(Requires uv. Install with: `curl -LsSf https://astral.sh/uv/install.sh | sh`)
Update `config.json` in the dispatch directory to match your Ollama model:

```json
{
  "model": "qwen3.5:9b",
  "context_limit": 32000,
  "mode": "auto",
  "version": "0.1.2",
  "auto_compact_tools": true
}
```

Then run:

```bash
dispatch
```

Dispatch operates on your current directory while keeping memory and config at the project root.
```
dispatch/
├── agent/
│   ├── cmd/
│   │   ├── __init__.py        # registry + dispatch
│   │   ├── arg_completers.py  # arg_completer functions for commands
│   │   ├── files.py           # file commands (/tree, /ls, etc.)
│   │   ├── memory.py          # memory commands (/note, /forget, etc.)
│   │   ├── plan.py            # plan command
│   │   └── session.py         # session commands (/clear, /compact, /model, etc.)
│   ├── tools/
│   │   ├── __init__.py        # registry + dispatch + get_schemas
│   │   ├── files.py           # file tools (read_file, patch_file, tree, etc.)
│   │   ├── memory.py          # memory tools (add_fact, forget_fact, etc.)
│   │   ├── session.py         # compact conversation (not callable, handled in main loop)
│   │   ├── shell.py           # shell tools (run_shell)
│   │   └── web.py             # web search tools (web_search, fetch_url)
│   ├── plans/                 # directory where the agent's plans and statuses are logged
│   ├── __init__.py
│   ├── agent.py               # main loop
│   ├── completer.py           # slash command auto-completer
│   ├── fancy_banner.py        # fancy welcome banner, ways to say goodbye
│   ├── main.py                # CLI entrypoint (typer)
│   ├── paths.py               # ROOT-anchored file paths
│   └── system_prompt.py       # system prompt
├── README.md
├── config.json                # model, context_limit, mode
├── memory.md                  # persistent agent memory
├── pyproject.toml             # entry point: `dispatch` command
├── session.json               # last compact summary
└── uv.lock
```
- `dispatch` is invoked from anywhere in the terminal
- `main.py` calls `run()` in `agent.py`
- `agent.py` loads `config.json` (model, context limit, mode)
- `memory.md` is read and injected into the system prompt
- The message history is initialized with the system prompt
- The main loop starts
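In code, the boot sequence amounts to roughly the following sketch. The helper names here (`build_system_prompt`, the return value) are illustrative, not the real API; see `main.py` and `agent.py` for the actual code:

```python
import json
from pathlib import Path

def build_system_prompt(memory: str) -> str:
    # Illustrative stand-in for agent/system_prompt.py
    return f"You are Dispatch, a local coding agent.\n\n## Memory\n{memory}"

def run():
    config = json.loads(Path("config.json").read_text())   # model, context_limit, mode
    memory = Path("memory.md").read_text()                  # persistent agent memory
    system_prompt = build_system_prompt(memory)             # memory injected into the prompt
    messages = [{"role": "system", "content": system_prompt}]
    return config, messages                                 # the main loop takes over from here
```

The main loop itself: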
```
START
│
├── pending_tool_response = False?
│ YES ──> get user input
│ ├── "/" ──> run slash command ──> back to START
│ ├── empty ──> back to START
│ └── normal ──> append to messages
│
├── check token estimate > 80% limit?
│ YES ──> compact conversation (summarize history) ──> continue
│
├── call ollama.chat(stream=True, tools=get_schemas())
│ └── stream chunks to terminal as they arrive
│ ├── text content ──> print immediately
│ └── tool_calls ──> accumulate, execute after stream ends
│
├── tool_calls found?
│ YES ──> for each call:
│ ├── print [dim] tool: name(args)
│ ├── dispatch(name, args)
│ ├── append result to messages (role=tool)
│ └── set pending_tool_response = True ──> back to START (skip user input)
│
│ NO ──> response already streamed
wait for next user input ──> back to START
```
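The streaming turn in the middle of that diagram looks roughly like this. A simplified sketch using the `ollama` Python client and the `get_schemas()`/`dispatch()` helpers described in the next section; the exact message bookkeeping lives in `agent.py`:

```python
import json
import ollama

def agent_turn(model, messages):
    # One pass of the loop: stream a reply, collect tool calls as they
    # arrive, then execute them after the stream ends.
    response_text, tool_calls = "", []
    stream = ollama.chat(model=model, messages=messages,
                         tools=get_schemas(), stream=True)
    for chunk in stream:
        if chunk.message.content:                 # text content: print immediately
            print(chunk.message.content, end="", flush=True)
            response_text += chunk.message.content
        if chunk.message.tool_calls:              # accumulate, execute after stream ends
            tool_calls.extend(chunk.message.tool_calls)
    messages.append({"role": "assistant", "content": response_text})
    for call in tool_calls:
        result = dispatch(call.function.name, call.function.arguments)
        messages.append({"role": "tool", "content": json.dumps(result)})
    return response_text, bool(tool_calls)        # True -> pending_tool_response
```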
Every tool is a decorated Python function in `tools/`:

```python
@tool(schema_dict, lazy=False)
def read_file(path: str):
    ...
```

- The `@tool` decorator registers the function and its JSON schema into `TOOLS = {}`, or into `LAZY = {}` if `lazy=True` (tool disabled).
- The modules are imported at the bottom of `tools/__init__.py` so decorators run on startup.
- `get_schemas()` returns all schemas to pass to Ollama.
- `dispatch(name, args)` looks up and calls the function, always returning `{"error": "..."}` on failure instead of raising.
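The registry itself can be as small as the following sketch (same names as above; the real `tools/__init__.py` may differ in details):

```python
TOOLS, LAZY = {}, {}

def tool(schema, lazy=False):
    # Decorator factory: register the function and its schema at import time
    def register(fn):
        target = LAZY if lazy else TOOLS
        target[schema["function"]["name"]] = (fn, schema)
        return fn
    return register

def get_schemas():
    # Schemas for all enabled tools, in the format ollama.chat expects
    return [schema for _, schema in TOOLS.values()]

def dispatch(name, args):
    # Never raise into the main loop: surface failures as {"error": ...}
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    fn, _ = TOOLS[name]
    try:
        return fn(**args)
    except Exception as e:
        return {"error": str(e)}
```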
Every slash command is a decorated Python function in `cmd/`:

```python
@command("note", description="Append a fact to memory", usage="")
def cmd_note(arg, ctx):
    ...
```

- The `@command` decorator registers the function into `COMMANDS = {}` with its name, description, and usage hint.
- Modules are imported in `cmd/__init__.py` so decorators run on startup.
- `dispatch_command(raw_input, ctx)` parses the command name and argument, looks up the function, and calls it.
- `ctx` is a dict holding live references to `messages`, `model`, `config`, and `system_prompt` so commands can mutate agent state directly.
- The `SlashCompleter` reads from `COMMANDS` at runtime so any new command automatically appears in the `/` menu.

Flow note: if a command and a tool perform the same function, the command file should import from the corresponding tool file (example: `/note` uses the `save_memory()` tool).
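The command side works the same way; a minimal sketch, assuming the names above:

```python
COMMANDS = {}

def command(name, description="", usage="", arg_completer=None):
    # Decorator factory: register the command with its metadata
    def register(fn):
        COMMANDS[name] = {"fn": fn, "description": description,
                          "usage": usage, "arg_completer": arg_completer}
        return fn
    return register

def dispatch_command(raw_input, ctx):
    # "/note some fact" -> name="note", arg="some fact"
    name, _, arg = raw_input.lstrip("/").partition(" ")
    if name not in COMMANDS:
        print(f"Unknown command: /{name}")
        return
    COMMANDS[name]["fn"](arg.strip(), ctx)
```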
- `paths.py` uses `__file__` to resolve all Dispatch-internal files (config, memory, session) to the project root, regardless of where you invoke `dispatch` from.
- File operations the agent performs use `os.getcwd()` captured at startup as the working directory.
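A plausible sketch of `paths.py` under those two rules (exact names and the depth of the `parent` chain are assumptions):

```python
import os
from pathlib import Path

# Anchor Dispatch-internal files to the package, not to wherever the
# user happens to be when they run `dispatch`
ROOT = Path(__file__).resolve().parent.parent
CONFIG_PATH = ROOT / "config.json"
MEMORY_PATH = ROOT / "memory.md"
SESSION_PATH = ROOT / "session.json"

# File tools, by contrast, operate on the directory `dispatch` was run from
WORKING_DIR = os.getcwd()
```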
| Tool | File | What it does |
|---|---|---|
| `read_file` | `tools/files.py` | Reads a file, returns content with line numbers |
| `create_file` | `tools/files.py` | Creates a file with an initial skeleton |
| `patch_file` | `tools/files.py` | Replaces, inserts, or deletes content via `old_str`/`new_str` |
| `append_file` | `tools/files.py` | Appends content to the end of an existing file |
| `find_pattern` | `tools/files.py` | Glob search for files matching a pattern |
| `list_dir` | `tools/files.py` | Lists files and dirs at a path |
| `tree` | `tools/files.py` | Prints a directory tree up to a given depth |
| `update_memory` | `tools/memory.py` | Updates a section of the agent's persistent memory markdown file |
| `web_search` | `tools/web.py` | Searches the web for relevant URLs |
| `fetch_url` | `tools/web.py` | Fetches the content of a given URL and parses it as Markdown (jina + trafilatura fallback) |
| `run_shell` | `tools/shell.py` | Runs a shell command with human confirmation, streaming output line by line |
| Command | Usage | What it does |
|---|---|---|
| `/memory` | `/memory` | Print current memory.md |
| `/note` | `/note <text>` | Append a fact to memory directly |
| `/forget` | `/forget <section>` | Clear a memory section |
| `/clear` | `/clear` | Reset messages, keep memory |
| `/reset` | `/reset` | Reset messages and memory |
| `/compact` | `/compact` | Summarize session and replace history |
| `/compact_tools` | `/compact_tools` | Compact tool results into a summary |
| `/tools` | `/tools [enable/disable] <tool>` | List, enable, or disable tools |
| `/auto_compact_tools` | `/auto_compact_tools [enable/disable]` | Enable or disable auto tool compaction |
| `/model` | `/model [name]` | Show or switch the active Ollama model |
| `/tree` | `/tree <path> <depth>` | Print directory tree |
| `/ls` | `/ls <path>` | List directory contents |
| `/plan` | `/plan <task>` | Generate and execute a step-by-step plan |
| `/help` | `/help` | List all available commands |
| `/exit` | `/exit` | Quit Dispatch |
- `/mode` - toggle careful/auto HITL aggressiveness
- `/retry` - resend the last user message
- `/history` - print a condensed message log
- Maybe MCP servers and custom skills somewhere down the line
I know it may seem complicated, but bear with me: customizing it is actually simple.
- Create your tool as a Python function. Return a `dict` even if the function doesn't return anything, and wrap the body in a `try`/`except` block that returns an error `dict` on failure.

```python
def read_file(path: str):
    try:
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
        # or, if the function shouldn't return anything:
        # return {"read_file": path}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
- Add the `@tool` decorator with an adequate schema:
  - The description: decide what the agent needs to know: what the tool does, when to use it, which tools it may use before/after, examples of usage, etc.
  - The parameters the function receives: their type and description, and which ones are required.
  - Whether or not the tool is disabled on startup (`lazy`).
```python
@tool({
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file and return its contents with line numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path to the file"}
            },
            "required": ["path"]
        }
    }
}, lazy=False)
def read_file(path: str):
    try:
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
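Once registered and imported, the agent reaches it through the registry; you can also smoke-test it yourself (assuming `dispatch` is exported from `agent.tools` as described above):

```python
from agent.tools import dispatch

print(dispatch("read_file", {"path": "README.md"}))
# {"content": "1: ...\n2: ..."} on success, {"error": "..."} on failure
```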
It seems like a lot, but it's really quite simple.
- If your tool lives in a new file, add the import to `agent/tools/__init__.py`:

```python
from agent.tools import files, memory, your_new_file  # noqa
```

- If you need a human-in-the-loop confirmation, add the lines marked below before executing the action that needs to be confirmed:
```python
from rich.prompt import Confirm  # this,

@tool({...})
def read_file(path: str):
    try:
        if not Confirm.ask(f"Read file {path}?"):  # this
            return {"error": "aborted"}            # and this
        with open(path) as f:
            lines = f.readlines()
        numbered = "".join(f"{i+1}: {l}" for i, l in enumerate(lines))
        return {"content": numbered}
    except Exception as e:
        return {"error": f"An error occurred while reading the file: {str(e)}"}
```
Works very similarly to tool creation.
- Create your command as a Python function; a command usually calls some function and then prints the result (or a confirmation that it ran correctly).

Note: if a command and a tool perform the same function, just import the function from the corresponding tool file (example: `/read_file <path>` uses the `read_file(path)` tool we just defined).
```python
from agent.tools.files import read_file
from rich.console import Console

console = Console()

def cmd_read_file(path: str):
    content = read_file(path).get("content")
    console.print(content)
```

- Add the `@command` decorator with an adequate description:
  - The description is shown to the user by `/help` and when the auto-completer suggests `/` commands.
  - The usage (optional) is also shown to the user and is basically the argument (or arguments) of your function.
  - `arg_completer` (also optional) is a function that gives the user the options they can choose as their argument(s).
@command("read", description="Print a file's content", usage="<path>",
arg_completer=None)
def cmd_read_file(path: str):
content = read_file(path).get("content")
console.print(content)- If it was in a new file, add the import to
/agent/cmd/__init__.py
from agent.cmd import memory, files, your_new_file #noqaThis was maybe a bad example because this command isn't actually implemented (why print it in console if you can just open the file) but you get the gist.
It is, again, a function that gives the user the options they can choose as their argument(s). For example, the models they can choose when using `/model`:
```python
import ollama

def get_available_models():
    try:
        return [m.model for m in ollama.list().models]
    except Exception:
        return []

@command("model",
         description="Switch the active Ollama model",
         usage="<model>",
         arg_completer=get_available_models)
def cmd_model(arg, ctx):
    ctx["model"] = arg
    console.print(f"[green]Switched to: {arg}[/green]")

# Oversimplified version for illustrative purposes;
# check agent/cmd/session.py for the actual implementation
```

Prefer putting completers in `arg_completers.py` unless they are definable as a `lambda` inside the decorator.
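For a small fixed set of options, a `lambda` in the decorator is enough. A hypothetical example (`/mode` is still on the roadmap, so this is illustrative only):

```python
@command("mode",
         description="Toggle careful/auto HITL aggressiveness",
         usage="[careful|auto]",
         arg_completer=lambda: ["careful", "auto"])
def cmd_mode(arg, ctx):
    ctx["config"]["mode"] = arg   # hypothetical: mutate the shared config via ctx
    console.print(f"[green]Mode set to: {arg}[/green]")
```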
All of the other stuff, like the welcome banner in `fancy_banner.py` or the main loop in `agent.py`, can also be modified to make your own version of Dispatch and to learn how agentic AI and tools work.
I really enjoyed implementing `/plan`.
Because of the natural constraints of running models locally on a computer without much memory capacity, Mixture of Experts (MoE) models are an attractive option due to their speed and lower memory usage, with only some parameters active at inference.
A possible problem when trying to implement a big change all at once is prompt trajectory: the early tokens heavily influence which experts activate, and a poorly structured initial prompt locks you into a suboptimal expert path for the entire generation.
Hence `/plan`, which breaks the task into a guided reasoning sequence before any action is taken:
1. Understand & Decompose: Restate the task and break it into discrete subtasks.
2. Sequence & Assess Risks: Order the subtasks and note potential risks.
3. Finalize Plan: Output a structured markdown plan with steps and placeholders for decisions/artifacts.
4. Execute Sequentially: For each step, run it in isolation, allowing tool calls, and update the plan file with progress and artifacts.
I've found it works pretty well with both dense and MoE models.
The plan file exists to avoid filling up the KV cache too quickly while still giving the sub-agents that implement the task a shared memory (and leaving logs of implemented plans); for MoE models, it also lets each step activate the appropriate experts for its task.
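Step 4 amounts to something like this sketch, with hypothetical helper names, reusing the `agent_turn` sketch from the main-loop section (the real logic lives in `agent/cmd/plan.py`, with logs under `agent/plans/`):

```python
def execute_plan(steps, plan_path, ctx):
    # Run each step in isolation; state is shared only through the plan
    # file on disk, which keeps per-step context (and KV cache) small.
    for i, step in enumerate(steps, start=1):
        plan_md = plan_path.read_text()          # shared memory between steps
        messages = [
            {"role": "system", "content": ctx["system_prompt"]},
            {"role": "user", "content": f"Current plan and progress:\n{plan_md}\n\n"
                                        f"Execute step {i} only: {step}"},
        ]
        result, _ = agent_turn(ctx["model"], messages)   # tool calls allowed here
        plan_path.write_text(plan_md + f"\n- [x] Step {i}: {result[:200]}")  # log progress
```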
MIT License. Use, modify, teach, learn. It's all yours.
Built with ❤️, as always.