Picoagent

Lightweight, secure AI agent (~1100 lines of Python) that smart-routes messages from Telegram, WhatsApp, and a password-protected web chat to Claude (cloud) or Ollama (local) based on message complexity. Includes shell execution, web browsing, and persistent conversation memory. Runs in a locked-down Docker container on Mac or Raspberry Pi.

How It Works

You send a message via Telegram, WhatsApp, or the web chat UI (accessible from any browser). The message flows through an ngrok tunnel into a Docker container running Picoagent. A regex-based router classifies the message as simple, complex, or tool-requiring, and sends it to either a local Ollama model (free, fast) or Claude (powerful, paid). The response comes back through the same channel.

You (Telegram / WhatsApp / Browser)
     |
     v
ngrok tunnel (HTTPS → localhost:8443, single shared port)
     |
     ├── POST /telegram/{id}  → Telegram webhook
     ├── GET  /               → Web chat UI (login page)
     ├── POST /api/login      → Password auth → session cookie
     ├── POST /api/chat       → Web chat messages
     ├── GET  /api/history    → Load conversation from SQLite
     └── POST /api/logout     → Invalidate session
     |
     v
Docker container (picoagent)
     |
     v
SecurityGate ──→ reject if not in allowlist / rate limited
     |
     v
Agent (ReAct loop, max 5 tool rounds)
     |
     v
SmartRouter (regex classifier, zero-latency)
     |
     ├── "simple" ──→ Ollama (llama3.2, 3B, runs on your Mac)
     |                    └── if fails → auto-escalate to Claude
     |
     ├── "complex" ──→ Claude (claude-sonnet via Anthropic API)
     |
     └── "tool" ──→ Claude (needs function calling)
                        ├── shell tool (run commands in container)
                        └── web_fetch tool (fetch web pages)
     |
     v
Memory (SQLite in Docker volume, 20-msg sliding window per user)
     |
     v
Reply sent back to Telegram / WhatsApp / Browser

Smart Router

The router classifies every message before any LLM call using pure regex pattern matching — zero latency, zero cost.

Classification Logic

Classification	What Triggers It	Routed To	Example
tool	Words like `run`, `execute`, `shell`, `directory`, `file`, `fetch`, `install`	Claude	"what directory am I in?"
complex	`explain`, `analyze`, `write code`, `debug`, code blocks, `step by step`, `architecture`, `api`, `docker`, or messages >50 words	Claude	"write a Python sort function"
simple	Greetings, thanks, yes/no, jokes, definitions, short messages (<=5 words) with no complex signals	Ollama	"hello", "tell me a joke"

The classifier uses a scoring system: each matching regex adds +1 to either a complex or simple score. Long messages and multi-line messages add to the complex score. If in doubt, it defaults to Claude (the safer choice).

Auto-Escalation

If Ollama fails (connection refused, timeout) or returns an empty response, the request automatically escalates to Claude. The user never sees an error — routing is invisible.

Why Not Use an LLM to Classify?

The regex classifier runs in microseconds. An LLM-based classifier would add 200-500ms latency to every single message just for routing. The regex approach is good enough for the routing decision and keeps things instant.

Models

Cloud: Claude (Anthropic)

Property	Value
Model	`claude-sonnet-4-20250514` (configurable)
Provider	Anthropic API
Strengths	Code generation, complex reasoning, tool use (function calling)
Cost	Pay-per-token via Anthropic API
Used for	Complex queries, tool calls, code, analysis

Local: Ollama (llama3.2)

Property	Value
Model	`llama3.2` — Meta's Llama 3.2, 3B parameters
Size	2.0 GB on disk
Source	Downloaded from ollama.com/library (pre-quantized GGUF)
Runs on	Your Mac/Pi via Ollama, served on `localhost:11434`
Strengths	Fast responses (~100ms), free, fully private
Cost	Zero — runs entirely on your hardware
Used for	Greetings, simple chat, basic Q&A, jokes

Alternative local models (change model in config.yaml):

Model	Size	Notes
`llama3.2` (default)	2 GB	Fastest, good for simple chat
`llama3.1:8b`	4.7 GB	Better quality, slower
`mistral`	4.1 GB	Good balance of speed and quality
`gemma2:9b`	5.4 GB	Strong reasoning for its size

Security (6 Layers)

Layer	What It Does	Details
Docker-only execution	Refuses to start outside a container	Checks for `/.dockerenv` at startup
User allowlist	Only authorized users can use the bot	Per-channel allowlists in config.yaml; web chat uses password auth
Rate limiting	20 messages/user/minute	Sliding window, configurable
Command blocklist	Blocks dangerous shell commands	`rm -rf`, `sudo`, `mkfs`, `dd`, `curl\|sh`, `chmod 777`, `shutdown`, `nc -e`
Secret exfiltration prevention	Blocks attempts to read API keys	`env`, `printenv`, `cat .env`, `$API_KEY`, `$TOKEN`, `/proc/*/environ`
Container isolation	Locked-down Docker container	Non-root user, all capabilities dropped, 50 PID limit, 512MB memory, no privilege escalation

Web Chat

A password-protected browser UI that lets you chat with the same agent from anywhere in the world. No app install needed — just open the ngrok URL in any browser.

Password auth: Set WEBCHAT_PASSWORD in .env, log in via browser
Session cookies: HttpOnly; SameSite=Strict, 24h TTL, in-memory (restart = re-login)
Persistent history: Conversations stored in SQLite, reload the page and history loads back
Port sharing: Web chat and Telegram webhook share port 8443 on the same aiohttp server (works with ngrok free tier's single tunnel)
No new dependencies: Uses stdlib hashlib, secrets, uuid + existing aiohttp

Setup

Add to .env:
```
WEBCHAT_PASSWORD=your-secret-password
```

Add to config.yaml:

webchat:
  password: "${WEBCHAT_PASSWORD}"

Run ./start.sh and open the ngrok URL in your browser.

Static Domain (Optional)

By default, ngrok assigns a random URL on every restart. Claim a free static domain at dashboard.ngrok.com/domains, then add to .env:

NGROK_DOMAIN=your-name.ngrok-free.dev

The URL will stay the same across restarts.

Tools

The agent can use two tools via the ReAct loop (Claude only — local models don't get tools):

Tool	Description	Limits
`shell`	Execute shell commands inside the container	30s timeout, 4000 char output, blocked commands rejected
`web_fetch`	Fetch web page content as text	15s timeout, 4000 char output

The ReAct loop allows up to 5 tool rounds per message. The LLM calls a tool, observes the result, and decides whether to call another tool or respond.

Persistent Memory

Conversation history is stored in SQLite inside a Docker named volume:

Docker volume: picoagent_picoagent-data
  └── /data/picoagent.db
       └── messages table (conv_key, role, content, tool_calls, timestamp)

Survives container restarts and image rebuilds
20-message sliding window per user (older messages trimmed)
Isolated inside Docker — not directly accessible from the host
Wipe with: ./stop.sh --wipe

Use Cases

Picoagent turns any device with a browser or Telegram into a secure terminal to your own AI — here's what people are building with it.

Personal Productivity

Use Case	How It Works
Morning briefing from bed	Text your bot "briefing" from Telegram on your phone — it fetches weather, calendar, news via `web_fetch` and summarizes
Quick captures on the go	Send ideas, notes, or reminders to the bot from anywhere — persistent memory means nothing is lost
Research assistant	"Summarize this paper: [URL]" — Claude fetches and distills while you're in a meeting
Expense logging	"Lunch $14.50 with client" — simple messages handled locally by Ollama, zero API cost
Draft emails and messages	"Write a polite follow-up to the vendor about delayed shipment" — Claude drafts, you copy-paste

Personal & Side Projects

Use Case	How It Works
Code from your phone	Ask the bot to write, debug, or explain code — then `shell` tool runs it in the container to verify
Server monitoring	"Check if my site is up" → `web_fetch` pings your URL; "Show disk usage" → `shell` runs `df -h`
Learning assistant	Study a new language or framework — ask questions, get examples, run them live
Home automation hub	Expose shell scripts that control your smart home — "turn on the lights" triggers a curl to your Home Assistant API
Raspberry Pi projects	Run Picoagent on a Pi — a private AI assistant on $35 hardware, accessible from anywhere

Chip Design & Hardware Engineering

Use Case	How It Works
RTL quick checks	"Does this Verilog have a latch?" — paste a module, Claude analyzes combinational logic and flags issues
Timing analysis helper	"Explain this slack violation on the reg2reg path" — describe the scenario, get debugging steps
Script generation	"Write a TCL script to report all clock domain crossings" — Claude generates EDA tool scripts (Synopsys, Cadence, Mentor)
Datasheet lookup	"Fetch the AXI4 spec summary from ARM" → `web_fetch` grabs it, Claude extracts what you need
Design review prep	"List common pitfalls in FIFO design for clock domain crossing" — get a structured checklist from your phone before a review
Regression triage	"Run `grep -c FAIL results.log`" → `shell` tool counts failures; follow up with "show me the first 5 failing tests"
Spice/simulation helper	"Write a testbench for a 2-stage op-amp with 60dB gain target" — Claude drafts the netlist, you iterate

Why These Work on Picoagent

Any device — Phone (Telegram), laptop (browser), tablet — no app install, no IDE required
Smart routing saves money — Quick questions ("what's AXI burst length?") stay on free local Ollama; complex analysis goes to Claude
Shell + web tools — Not just chat: fetch URLs, run scripts, parse logs — all from a text message
Security — 6 layers mean you can expose it to the internet without losing sleep
Persistent memory — Start a conversation on your phone, continue on your laptop — history follows you

Build Your Own Picoagent (Step by Step)

Prerequisites

You need these installed on your Mac (or Linux/Pi):

Tool	Install	What It's For
Docker Desktop	docker.com/products/docker-desktop	Runs Picoagent in a secure container
ngrok	`brew install ngrok`	Tunnels Telegram webhooks to your local machine
Ollama	`brew install ollama`	Runs local LLM (llama3.2) for simple queries
Git	Pre-installed on Mac	Clone the repository

You also need these accounts/keys:

Account	Where to Get It	What It's For
Telegram bot token	@BotFather on Telegram	Your bot's identity
Your Telegram user ID	@userinfobot on Telegram	Allowlist (only you can use the bot)
Anthropic API key	console.anthropic.com	Claude API access
ngrok account	ngrok.com (free tier works)	Authenticate ngrok

Step 1: Create Your Telegram Bot

Open Telegram and search for @BotFather
Send /newbot
Choose a name (e.g., "My Picoagent")
Choose a username (must end in bot, e.g., my_picoagent_bot)
BotFather gives you a bot token like 8267213402:AAG...DrA — save this

Step 2: Get Your Telegram User ID

Open Telegram and search for @userinfobot
Send any message
It replies with your user ID (a number like 1015155520) — save this

Step 3: Get Your Anthropic API Key

Go to console.anthropic.com
Sign up / log in
Go to API Keys and create a new key
Save the key (starts with sk-ant-...)

Step 4: Set Up ngrok

Go to ngrok.com and create a free account
Get your auth token from the ngrok dashboard
Run:

ngrok config add-authtoken YOUR_NGROK_AUTH_TOKEN

Step 5: Install and Start Ollama

# Install
brew install ollama

# Start the server
ollama serve &

# Pull the model (2 GB download)
ollama pull llama3.2

# Verify it works
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2","messages":[{"role":"user","content":"say hi"}]}'

Step 6: Clone and Configure Picoagent

# Clone the repo
git clone https://github.com/profitmonk/picoagent.git
cd picoagent

# Create .env with your secrets
cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-your-key-here
TELEGRAM_BOT_TOKEN=your-bot-token-here
TELEGRAM_WEBHOOK_URL=
WEBCHAT_PASSWORD=your-secret-password
# NGROK_DOMAIN=your-name.ngrok-free.dev
EOF

# Create config.yaml from the template
cp config.example.yaml config.yaml

Now edit config.yaml — replace 123456789 with your Telegram user ID:

providers:
  - type: claude
    api_key: "${ANTHROPIC_API_KEY}"
    model: "claude-sonnet-4-20250514"

  - type: ollama
    base_url: "http://host.docker.internal:11434/v1"
    model: "llama3.2"

telegram:
  token: "${TELEGRAM_BOT_TOKEN}"
  webhook_url: "${TELEGRAM_WEBHOOK_URL}"
  webhook_port: 8443

webchat:
  password: "${WEBCHAT_PASSWORD}"

security:
  allowlist:
    telegram:
      - YOUR_TELEGRAM_USER_ID    # <-- replace this
  rate_limit: 20
  rate_window: 60

Step 7: Start Everything

./start.sh

This single command:

Starts an ngrok tunnel to port 8443
Updates .env with the new ngrok URL
Builds the Docker image and starts the container
Registers the webhook with Telegram

You should see:

=== Picoagent Running ===
Tunnel:    https://xxxx.ngrok-free.dev
Web Chat:  https://xxxx.ngrok-free.dev  (open in browser)
Container: Up 3 seconds (healthy)

Step 8: Test It

Open the ngrok URL in your browser to use the web chat, or send messages to your Telegram bot:

Test	Send This	Expected
Simple (Ollama)	`hello`	Fast reply from local model
Complex (Claude)	`explain how Docker works`	Detailed reply from Claude
Tool use (Claude)	`what directory are you in?`	Calls `pwd`, returns `/home/agent/app`
Security	`cat .env`	Blocked by security policy
Web fetch	`what's on example.com?`	Fetches and summarizes the page

Watch the logs in another terminal:

docker compose logs -f

You'll see lines like:

Router: complexity=simple for: hello...
Router: handled locally
Router: complexity=complex for: explain how Docker works...
Router: complexity=tool for: what directory are you in?...

Daily Usage

./start.sh           # Start ngrok + Docker (one command)
./stop.sh            # Stop everything, keep conversation history
./stop.sh --wipe     # Stop everything + delete all conversation history

Managing the Container

docker compose ps          # Check status
docker compose logs -f     # Watch live logs
docker compose down        # Stop container only
docker compose up -d       # Restart container only (no ngrok)

Project Structure

picoagent/
├── picoagent/                    # Core Python package (~1100 lines)
│   ├── __init__.py        (2)   - Package version string
│   ├── main.py          (156)   - Config loading, .env → ${VAR} substitution, Docker enforcement, startup
│   ├── agent.py          (92)   - ReAct loop: LLM → tool call → observe → repeat (max 5 rounds)
│   ├── router.py        (181)   - Smart routing: regex classifier + SmartRouter (cloud/local dispatch)
│   ├── providers.py     (148)   - ClaudeProvider + OpenAICompatibleProvider + ProviderChain (fallback)
│   ├── channels.py      (165)   - TelegramChannel (webhook + polling, shared_app support) + WhatsAppChannel
│   ├── webchat.py       (119)   - WebChatChannel: password auth, session cookies, chat/history API
│   ├── webchat.html     (200)   - Single-page chat UI: login, message bubbles, history, mobile-responsive
│   ├── memory.py         (80)   - SQLite conversation store: save/load/trim (20-msg window)
│   ├── tools.py          (76)   - Shell executor (30s timeout) + web fetcher (15s timeout)
│   └── security.py       (81)   - SecurityGate: allowlist, rate limit, command blocklist, secret protection
├── tests/
│   └── test_picoagent.py (693)  - 79 tests: security, tools, providers, memory, agent, router, config, webchat
├── config.example.yaml          - Annotated config template (copy to config.yaml)
├── Dockerfile                   - python:3.11-slim, non-root user, curl/wget/jq/git, /data volume
├── docker-compose.yml           - Security: cap_drop ALL, pids 50, mem 512m, host.docker.internal
├── requirements.txt             - pyyaml, aiohttp, python-telegram-bot, anthropic, openai
├── start.sh                     - One-command launch: ngrok tunnel (static domain support) + Docker build + start
├── stop.sh                      - Teardown: Docker + ngrok (--wipe to delete history)
└── CLAUDE.md                    - Project conventions for AI assistants

How the Files Connect

main.py
  ├── loads .env → os.environ
  ├── loads config.yaml (resolves ${VAR} references)
  ├── creates SecurityGate (from config allowlist/rates)
  ├── creates SmartRouter or ProviderChain (auto-detects local providers)
  ├── creates Memory (SQLite at /data/picoagent.db)
  ├── creates Agent (wires router + security + memory)
  ├── creates shared aiohttp app (if webchat configured)
  ├── creates WebChatChannel (registers routes on shared app)
  ├── creates TelegramChannel (registers webhook on shared app)
  └── starts single HTTP server on port 8443

WebChatChannel (webchat.py)
  └── POST /api/chat → agent.handle_message(user_id, "webchat", text)

TelegramChannel (channels.py)
  └── POST /telegram/{id} → agent.handle_message(user_id, "telegram", text)

Agent (agent.py)
  ├── security.authorize() → reject if not allowed
  ├── memory.save() → persist user message
  ├── router.complete(messages, tools) → get LLM response
  │     ├── classify(text) → "simple" / "complex" / "tool"
  │     ├── simple → Ollama (local, free) → fallback to Claude if fails
  │     └── complex/tool → Claude (cloud, paid)
  ├── if tool_calls → execute tools → save results → loop (max 5)
  └── memory.save() → persist assistant reply

Tests

# Install dependencies (needed for running tests outside Docker)
pip install -r requirements.txt

# Run all 79 tests
python -m pytest tests/ -v

Tests cover: security allowlist/blocklist/rate limiting, shell tool execution/timeout/truncation, provider chain config/fallback, SQLite memory persistence/trimming, agent ReAct loop/tool dispatch/security enforcement, router classification/routing/escalation, config loading, and web chat login/session/chat/history/logout.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
picoagent		picoagent
tests		tests
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
config.example.yaml		config.example.yaml
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt
start.sh		start.sh
stop.sh		stop.sh

Folders and files

Latest commit

History

Repository files navigation

Picoagent

How It Works

Smart Router

Classification Logic

Auto-Escalation

Why Not Use an LLM to Classify?

Models

Cloud: Claude (Anthropic)

Local: Ollama (llama3.2)

Security (6 Layers)

Web Chat

Setup

Static Domain (Optional)

Tools

Persistent Memory

Use Cases

Personal Productivity

Personal & Side Projects

Chip Design & Hardware Engineering

Why These Work on Picoagent

Build Your Own Picoagent (Step by Step)

Prerequisites

Step 1: Create Your Telegram Bot

Step 2: Get Your Telegram User ID

Step 3: Get Your Anthropic API Key

Step 4: Set Up ngrok

Step 5: Install and Start Ollama

Step 6: Clone and Configure Picoagent

Step 7: Start Everything

Step 8: Test It

Daily Usage

Managing the Container

Project Structure

How the Files Connect

Tests

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages