Lightweight, secure AI agent (~1100 lines of Python) that smart-routes messages from Telegram, WhatsApp, and a password-protected web chat to Claude (cloud) or Ollama (local) based on message complexity. Includes shell execution, web browsing, and persistent conversation memory. Runs in a locked-down Docker container on Mac or Raspberry Pi.
You send a message via Telegram, WhatsApp, or the web chat UI (accessible from any browser). The message flows through an ngrok tunnel into a Docker container running Picoagent. A regex-based router classifies the message as simple, complex, or tool-requiring, and sends it to either a local Ollama model (free, fast) or Claude (powerful, paid). The response comes back through the same channel.
```
You (Telegram / WhatsApp / Browser)
  |
  v
ngrok tunnel (HTTPS → localhost:8443, single shared port)
  |
  ├── POST /telegram/{id} → Telegram webhook
  ├── GET  /              → Web chat UI (login page)
  ├── POST /api/login     → Password auth → session cookie
  ├── POST /api/chat      → Web chat messages
  ├── GET  /api/history   → Load conversation from SQLite
  └── POST /api/logout    → Invalidate session
  |
  v
Docker container (picoagent)
  |
  v
SecurityGate ──→ reject if not in allowlist / rate limited
  |
  v
Agent (ReAct loop, max 5 tool rounds)
  |
  v
SmartRouter (regex classifier, zero-latency)
  |
  ├── "simple"  ──→ Ollama (llama3.2, 3B, runs on your Mac)
  |                 └── if fails → auto-escalate to Claude
  |
  ├── "complex" ──→ Claude (claude-sonnet via Anthropic API)
  |
  └── "tool"    ──→ Claude (needs function calling)
                    ├── shell tool (run commands in container)
                    └── web_fetch tool (fetch web pages)
  |
  v
Memory (SQLite in Docker volume, 20-msg sliding window per user)
  |
  v
Reply sent back to Telegram / WhatsApp / Browser
```
The router classifies every message before any LLM call using pure regex pattern matching — zero latency, zero cost.
| Classification | What Triggers It | Routed To | Example |
|---|---|---|---|
| tool | Words like run, execute, shell, directory, file, fetch, install | Claude | "what directory am I in?" |
| complex | explain, analyze, write code, debug, code blocks, step by step, architecture, api, docker, or messages >50 words | Claude | "write a Python sort function" |
| simple | Greetings, thanks, yes/no, jokes, definitions, short messages (<=5 words) with no complex signals | Ollama | "hello", "tell me a joke" |
The classifier uses a scoring system: each matching regex adds +1 to either a complex or simple score. Long messages and multi-line messages add to the complex score. If in doubt, it defaults to Claude (the safer choice).
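A minimal sketch of this kind of scoring classifier (the pattern lists and thresholds here are illustrative stand-ins, not the exact ones in router.py):

```python
import re

# Illustrative pattern lists -- router.py defines its own sets
TOOL_PATTERNS = [r"\brun\b", r"\bexecute\b", r"\bshell\b", r"\bdirectory\b",
                 r"\bfile\b", r"\bfetch\b", r"\binstall\b"]
COMPLEX_PATTERNS = [r"\bexplain\b", r"\banalyze\b", r"\bwrite code\b",
                    r"\bdebug\b", r"\bstep by step\b", r"```"]
SIMPLE_PATTERNS = [r"^(hi|hello|hey|thanks|thank you)\b", r"\bjoke\b",
                  r"^(yes|no)\b"]

def classify(text: str) -> str:
    t = text.lower()
    # Tool keywords short-circuit: these messages need function calling
    if any(re.search(p, t) for p in TOOL_PATTERNS):
        return "tool"
    complex_score = sum(bool(re.search(p, t)) for p in COMPLEX_PATTERNS)
    simple_score = sum(bool(re.search(p, t)) for p in SIMPLE_PATTERNS)
    # Long or multi-line messages lean complex
    if len(t.split()) > 50:
        complex_score += 1
    if "\n" in t:
        complex_score += 1
    # Very short messages with no complex signals lean simple
    if len(t.split()) <= 5 and complex_score == 0:
        simple_score += 1
    # Ties and unknowns default to the more capable model
    return "simple" if simple_score > complex_score else "complex"
```

Pure regex means the decision costs nothing measurable per message.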
If Ollama fails (connection refused, timeout) or returns an empty response, the request automatically escalates to Claude. The user never sees an error — routing is invisible.
The regex classifier runs in microseconds. An LLM-based classifier would add 200-500ms latency to every single message just for routing. The regex approach is good enough for the routing decision and keeps things instant.
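The escalation path can be sketched like this (`local` and `cloud` are any async callables returning a reply string; the real SmartRouter in router.py differs):

```python
import asyncio

async def route(complexity: str, text: str, local, cloud) -> str:
    """Dispatch simple messages to the local model first; escalate on failure."""
    if complexity == "simple":
        try:
            reply = await asyncio.wait_for(local(text), timeout=10.0)
            if reply and reply.strip():
                return reply  # local model answered: free and fast
        except (ConnectionError, asyncio.TimeoutError):
            pass  # invisible escalation: the user never sees the failure
    # "complex"/"tool" messages, and failed local attempts, go to the cloud model
    return await cloud(text)
```

An empty local reply is treated the same as a connection failure, so the caller always gets a real answer.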
| Property | Value |
|---|---|
| Model | claude-sonnet-4-20250514 (configurable) |
| Provider | Anthropic API |
| Strengths | Code generation, complex reasoning, tool use (function calling) |
| Cost | Pay-per-token via Anthropic API |
| Used for | Complex queries, tool calls, code, analysis |
| Property | Value |
|---|---|
| Model | llama3.2 — Meta's Llama 3.2, 3B parameters |
| Size | 2.0 GB on disk |
| Source | Downloaded from ollama.com/library (pre-quantized GGUF) |
| Runs on | Your Mac/Pi via Ollama, served on localhost:11434 |
| Strengths | Fast responses (~100ms), free, fully private |
| Cost | Zero — runs entirely on your hardware |
| Used for | Greetings, simple chat, basic Q&A, jokes |
Alternative local models (change model in config.yaml):
| Model | Size | Notes |
|---|---|---|
| `llama3.2` (default) | 2 GB | Fastest, good for simple chat |
| `llama3.1:8b` | 4.7 GB | Better quality, slower |
| `mistral` | 4.1 GB | Good balance of speed and quality |
| `gemma2:9b` | 5.4 GB | Strong reasoning for its size |
| Layer | What It Does | Details |
|---|---|---|
| Docker-only execution | Refuses to start outside a container | Checks for /.dockerenv at startup |
| User allowlist | Only authorized users can use the bot | Per-channel allowlists in config.yaml; web chat uses password auth |
| Rate limiting | 20 messages/user/minute | Sliding window, configurable |
| Command blocklist | Blocks dangerous shell commands | `rm -rf`, `sudo`, `mkfs`, `dd`, `curl \| sh`, `chmod 777`, `shutdown`, `nc -e` |
| Secret exfiltration prevention | Blocks attempts to read API keys | `env`, `printenv`, `cat .env`, `$API_KEY`, `$TOKEN`, `/proc/*/environ` |
| Container isolation | Locked-down Docker container | Non-root user, all capabilities dropped, 50 PID limit, 512MB memory, no privilege escalation |
A password-protected browser UI that lets you chat with the same agent from anywhere in the world. No app install needed — just open the ngrok URL in any browser.
- Password auth: Set `WEBCHAT_PASSWORD` in `.env`, log in via browser
- Session cookies: `HttpOnly; SameSite=Strict`, 24h TTL, in-memory (restart = re-login)
- Persistent history: Conversations stored in SQLite; reload the page and history loads back
- Port sharing: Web chat and Telegram webhook share port 8443 on the same aiohttp server (works with ngrok free tier's single tunnel)
- No new dependencies: Uses stdlib `hashlib`, `secrets`, `uuid` + existing `aiohttp`

To enable it:

- Add to `.env`: `WEBCHAT_PASSWORD=your-secret-password`
- Add to `config.yaml`:

  ```yaml
  webchat:
    password: "${WEBCHAT_PASSWORD}"
  ```

- Run `./start.sh` and open the ngrok URL in your browser.
By default, ngrok assigns a random URL on every restart. Claim a free static domain at dashboard.ngrok.com/domains, then add to `.env`:

```
NGROK_DOMAIN=your-name.ngrok-free.dev
```

The URL will stay the same across restarts.
The agent can use two tools via the ReAct loop (Claude only — local models don't get tools):
| Tool | Description | Limits |
|---|---|---|
| `shell` | Execute shell commands inside the container | 30s timeout, 4000 char output, blocked commands rejected |
| `web_fetch` | Fetch web page content as text | 15s timeout, 4000 char output |
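The `shell` limits in the table map onto a small subprocess wrapper; a minimal sketch (not the actual tools.py implementation):

```python
import subprocess

MAX_OUTPUT = 4000  # chars, per the limits above

def run_shell(cmd: str, timeout: float = 30.0) -> str:
    """Run a command, capping both runtime and output size."""
    try:
        proc = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return f"error: command timed out after {timeout:.0f}s"
    out = (proc.stdout + proc.stderr).strip()
    if len(out) > MAX_OUTPUT:
        out = out[:MAX_OUTPUT] + "\n[output truncated]"
    return out or f"(exit code {proc.returncode}, no output)"
```

Returning errors as strings (rather than raising) matters here: the LLM sees the observation and can adjust its next tool call.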
The ReAct loop allows up to 5 tool rounds per message. The LLM calls a tool, observes the result, and decides whether to call another tool or respond.
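The loop itself is small; a sketch with a stubbed LLM interface (the message and reply shapes here are invented for illustration; agent.py's wiring differs):

```python
MAX_ROUNDS = 5  # tool rounds per message, as above

def react_loop(llm, tools: dict, messages: list) -> str:
    """Call the LLM, execute any requested tool, feed the result back, repeat."""
    for _ in range(MAX_ROUNDS):
        reply = llm(messages)  # {"text": ...} or {"tool": name, "args": {...}}
        if "tool" not in reply:
            return reply["text"]  # plain answer: we're done
        result = tools[reply["tool"]](**reply["args"])
        # Record the observation so the next LLM call can see it
        messages.append({"role": "tool", "name": reply["tool"], "content": result})
    return "Stopped after max tool rounds."
```

The hard cap on rounds is the safety valve: a model that keeps requesting tools can never loop forever.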
Conversation history is stored in SQLite inside a Docker named volume:
```
Docker volume: picoagent_picoagent-data
└── /data/picoagent.db
    └── messages table (conv_key, role, content, tool_calls, timestamp)
```
- Survives container restarts and image rebuilds
- 20-message sliding window per user (older messages trimmed)
- Isolated inside Docker — not directly accessible from the host
- Wipe with: `./stop.sh --wipe`
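The sliding-window trim comes down to a pair of SQL statements; a sketch against a reduced version of the schema above (memory.py's actual code and full schema differ):

```python
import sqlite3

WINDOW = 20  # messages kept per conversation key

def save_message(db: sqlite3.Connection, conv_key: str, role: str, content: str) -> None:
    """Append a message, then trim the conversation to the newest WINDOW rows."""
    db.execute(
        "INSERT INTO messages (conv_key, role, content) VALUES (?, ?, ?)",
        (conv_key, role, content),
    )
    # Delete everything for this conversation except the newest WINDOW rows
    db.execute(
        """DELETE FROM messages
           WHERE conv_key = ? AND rowid NOT IN (
               SELECT rowid FROM messages
               WHERE conv_key = ? ORDER BY rowid DESC LIMIT ?)""",
        (conv_key, conv_key, WINDOW),
    )
    db.commit()
```

Trimming on every write keeps the table bounded without a separate cleanup job.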
Picoagent turns any device with a browser or Telegram into a secure terminal to your own AI — here's what people are building with it.
| Use Case | How It Works |
|---|---|
| Morning briefing from bed | Text your bot "briefing" from Telegram on your phone — it fetches weather, calendar, news via web_fetch and summarizes |
| Quick captures on the go | Send ideas, notes, or reminders to the bot from anywhere — persistent memory means nothing is lost |
| Research assistant | "Summarize this paper: [URL]" — Claude fetches and distills while you're in a meeting |
| Expense logging | "Lunch $14.50 with client" — simple messages handled locally by Ollama, zero API cost |
| Draft emails and messages | "Write a polite follow-up to the vendor about delayed shipment" — Claude drafts, you copy-paste |
| Use Case | How It Works |
|---|---|
| Code from your phone | Ask the bot to write, debug, or explain code — then shell tool runs it in the container to verify |
| Server monitoring | "Check if my site is up" → web_fetch pings your URL; "Show disk usage" → shell runs df -h |
| Learning assistant | Study a new language or framework — ask questions, get examples, run them live |
| Home automation hub | Expose shell scripts that control your smart home — "turn on the lights" triggers a curl to your Home Assistant API |
| Raspberry Pi projects | Run Picoagent on a Pi — a private AI assistant on $35 hardware, accessible from anywhere |
| Use Case | How It Works |
|---|---|
| RTL quick checks | "Does this Verilog have a latch?" — paste a module, Claude analyzes combinational logic and flags issues |
| Timing analysis helper | "Explain this slack violation on the reg2reg path" — describe the scenario, get debugging steps |
| Script generation | "Write a TCL script to report all clock domain crossings" — Claude generates EDA tool scripts (Synopsys, Cadence, Mentor) |
| Datasheet lookup | "Fetch the AXI4 spec summary from ARM" → web_fetch grabs it, Claude extracts what you need |
| Design review prep | "List common pitfalls in FIFO design for clock domain crossing" — get a structured checklist from your phone before a review |
| Regression triage | "Run grep -c FAIL results.log" → shell tool counts failures; follow up with "show me the first 5 failing tests" |
| Spice/simulation helper | "Write a testbench for a 2-stage op-amp with 60dB gain target" — Claude drafts the netlist, you iterate |
- Any device — Phone (Telegram), laptop (browser), tablet — no app install, no IDE required
- Smart routing saves money — Quick questions ("what's AXI burst length?") stay on free local Ollama; complex analysis goes to Claude
- Shell + web tools — Not just chat: fetch URLs, run scripts, parse logs — all from a text message
- Security — 6 layers mean you can expose it to the internet without losing sleep
- Persistent memory — Start a conversation on your phone, continue on your laptop — history follows you
You need these installed on your Mac (or Linux/Pi):
| Tool | Install | What It's For |
|---|---|---|
| Docker Desktop | docker.com/products/docker-desktop | Runs Picoagent in a secure container |
| ngrok | `brew install ngrok` | Tunnels Telegram webhooks to your local machine |
| Ollama | `brew install ollama` | Runs local LLM (llama3.2) for simple queries |
| Git | Pre-installed on Mac | Clone the repository |
You also need these accounts/keys:
| Account | Where to Get It | What It's For |
|---|---|---|
| Telegram bot token | @BotFather on Telegram | Your bot's identity |
| Your Telegram user ID | @userinfobot on Telegram | Allowlist (only you can use the bot) |
| Anthropic API key | console.anthropic.com | Claude API access |
| ngrok account | ngrok.com (free tier works) | Authenticate ngrok |
- Open Telegram and search for @BotFather
- Send `/newbot`
- Choose a name (e.g., "My Picoagent")
- Choose a username (must end in `bot`, e.g., `my_picoagent_bot`)
- BotFather gives you a bot token like `8267213402:AAG...DrA` — save this
- Open Telegram and search for @userinfobot
- Send any message
- It replies with your user ID (a number like `1015155520`) — save this
- Go to console.anthropic.com
- Sign up / log in
- Go to API Keys and create a new key
- Save the key (starts with `sk-ant-...`)
- Go to ngrok.com and create a free account
- Get your auth token from the ngrok dashboard
- Run:

  ```bash
  ngrok config add-authtoken YOUR_NGROK_AUTH_TOKEN
  ```

```bash
# Install
brew install ollama

# Start the server
ollama serve &

# Pull the model (2 GB download)
ollama pull llama3.2

# Verify it works
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama3.2","messages":[{"role":"user","content":"say hi"}]}'
```

```bash
# Clone the repo
git clone https://github.com/profitmonk/picoagent.git
cd picoagent

# Create .env with your secrets
cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-your-key-here
TELEGRAM_BOT_TOKEN=your-bot-token-here
TELEGRAM_WEBHOOK_URL=
WEBCHAT_PASSWORD=your-secret-password
# NGROK_DOMAIN=your-name.ngrok-free.dev
EOF

# Create config.yaml from the template
cp config.example.yaml config.yaml
```

Now edit `config.yaml` and replace the placeholder Telegram user ID with your own:
```yaml
providers:
  - type: claude
    api_key: "${ANTHROPIC_API_KEY}"
    model: "claude-sonnet-4-20250514"
  - type: ollama
    base_url: "http://host.docker.internal:11434/v1"
    model: "llama3.2"

telegram:
  token: "${TELEGRAM_BOT_TOKEN}"
  webhook_url: "${TELEGRAM_WEBHOOK_URL}"
  webhook_port: 8443

webchat:
  password: "${WEBCHAT_PASSWORD}"

security:
  allowlist:
    telegram:
      - YOUR_TELEGRAM_USER_ID  # <-- replace this
  rate_limit: 20
  rate_window: 60
```

Run `./start.sh`. This single command:
- Starts an ngrok tunnel to port 8443
- Updates `.env` with the new ngrok URL
- Builds the Docker image and starts the container
- Registers the webhook with Telegram
You should see:
```
=== Picoagent Running ===
Tunnel:    https://xxxx.ngrok-free.dev
Web Chat:  https://xxxx.ngrok-free.dev (open in browser)
Container: Up 3 seconds (healthy)
```
Open the ngrok URL in your browser to use the web chat, or send messages to your Telegram bot:
| Test | Send This | Expected |
|---|---|---|
| Simple (Ollama) | `hello` | Fast reply from local model |
| Complex (Claude) | `explain how Docker works` | Detailed reply from Claude |
| Tool use (Claude) | `what directory are you in?` | Calls `pwd`, returns `/home/agent/app` |
| Security | `cat .env` | Blocked by security policy |
| Web fetch | `what's on example.com?` | Fetches and summarizes the page |
Watch the logs in another terminal:
```bash
docker compose logs -f
```

You'll see lines like:

```
Router: complexity=simple for: hello...
Router: handled locally
Router: complexity=complex for: explain how Docker works...
Router: complexity=tool for: what directory are you in?...
```
```bash
./start.sh        # Start ngrok + Docker (one command)
./stop.sh         # Stop everything, keep conversation history
./stop.sh --wipe  # Stop everything + delete all conversation history
```

```bash
docker compose ps      # Check status
docker compose logs -f # Watch live logs
docker compose down    # Stop container only
docker compose up -d   # Restart container only (no ngrok)
```

```
picoagent/
├── picoagent/                  # Core Python package (~1100 lines)
│   ├── __init__.py   (2)     - Package version string
│   ├── main.py       (156)   - Config loading, .env → ${VAR} substitution, Docker enforcement, startup
│   ├── agent.py      (92)    - ReAct loop: LLM → tool call → observe → repeat (max 5 rounds)
│   ├── router.py     (181)   - Smart routing: regex classifier + SmartRouter (cloud/local dispatch)
│   ├── providers.py  (148)   - ClaudeProvider + OpenAICompatibleProvider + ProviderChain (fallback)
│   ├── channels.py   (165)   - TelegramChannel (webhook + polling, shared_app support) + WhatsAppChannel
│   ├── webchat.py    (119)   - WebChatChannel: password auth, session cookies, chat/history API
│   ├── webchat.html  (200)   - Single-page chat UI: login, message bubbles, history, mobile-responsive
│   ├── memory.py     (80)    - SQLite conversation store: save/load/trim (20-msg window)
│   ├── tools.py      (76)    - Shell executor (30s timeout) + web fetcher (15s timeout)
│   └── security.py   (81)    - SecurityGate: allowlist, rate limit, command blocklist, secret protection
├── tests/
│   └── test_picoagent.py (693) - 79 tests: security, tools, providers, memory, agent, router, config, webchat
├── config.example.yaml    - Annotated config template (copy to config.yaml)
├── Dockerfile             - python:3.11-slim, non-root user, curl/wget/jq/git, /data volume
├── docker-compose.yml     - Security: cap_drop ALL, pids 50, mem 512m, host.docker.internal
├── requirements.txt       - pyyaml, aiohttp, python-telegram-bot, anthropic, openai
├── start.sh               - One-command launch: ngrok tunnel (static domain support) + Docker build + start
├── stop.sh                - Teardown: Docker + ngrok (--wipe to delete history)
└── CLAUDE.md              - Project conventions for AI assistants
```
```
main.py
├── loads .env → os.environ
├── loads config.yaml (resolves ${VAR} references)
├── creates SecurityGate (from config allowlist/rates)
├── creates SmartRouter or ProviderChain (auto-detects local providers)
├── creates Memory (SQLite at /data/picoagent.db)
├── creates Agent (wires router + security + memory)
├── creates shared aiohttp app (if webchat configured)
├── creates WebChatChannel (registers routes on shared app)
├── creates TelegramChannel (registers webhook on shared app)
└── starts single HTTP server on port 8443
```
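The `${VAR}` resolution step in the startup flow can be sketched as a small recursive walk over the parsed YAML tree (a generic pattern; main.py's version may differ):

```python
import os
import re

_VAR = re.compile(r"\$\{([A-Z0-9_]+)\}")

def resolve(node):
    """Replace ${VAR} references in a parsed config tree with os.environ values."""
    if isinstance(node, dict):
        return {k: resolve(v) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v) for v in node]
    if isinstance(node, str):
        # Unset variables are left as-is rather than silently blanked
        return _VAR.sub(lambda m: os.environ.get(m.group(1), m.group(0)), node)
    return node
```

This keeps secrets out of `config.yaml` itself: the file can be committed as a template while real values live only in `.env`.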
```
WebChatChannel (webchat.py)
└── POST /api/chat → agent.handle_message(user_id, "webchat", text)

TelegramChannel (channels.py)
└── POST /telegram/{id} → agent.handle_message(user_id, "telegram", text)

Agent (agent.py)
├── security.authorize() → reject if not allowed
├── memory.save() → persist user message
├── router.complete(messages, tools) → get LLM response
│   ├── classify(text) → "simple" / "complex" / "tool"
│   ├── simple → Ollama (local, free) → fallback to Claude if fails
│   └── complex/tool → Claude (cloud, paid)
├── if tool_calls → execute tools → save results → loop (max 5)
└── memory.save() → persist assistant reply
```
```bash
# Install dependencies (needed for running tests outside Docker)
pip install -r requirements.txt

# Run all 79 tests
python -m pytest tests/ -v
```

Tests cover: security allowlist/blocklist/rate limiting, shell tool execution/timeout/truncation, provider chain config/fallback, SQLite memory persistence/trimming, agent ReAct loop/tool dispatch/security enforcement, router classification/routing/escalation, config loading, and web chat login/session/chat/history/logout.
MIT