Lightweight observability and security dashboard for a production WhatsApp bot running on a single-VPS Docker stack. Parses the bot's own log file and the nginx access log without touching the bot's code, exposes a read-only FastAPI endpoint, and renders a no-framework HTML dashboard that runs comfortably on a 1 GB VPS.
📸 Add a screenshot at `docs/screenshot.png` after your first deploy.
The bot was already in production when observability became a need: message throughput, error rate, p50/p95 response latency, and suspicious access patterns against order-link URLs (enumeration, fuzzing of Redis IDs).
Rather than instrumenting the bot (intrusive, risky, and blocked on a release cycle), this project sits next to it on the same host and derives every metric by parsing files the bot is already writing.
Constraints it was designed against:
- Zero changes to the monitored bot.
- Under 10% of one CPU core on a small VPS, idle most of the time.
- No external services (no Prometheus, no Grafana, no cloud agent).
- PII-safe by default — sender phone numbers are HMAC-hashed, raw numbers never leave the process.
- System metrics — CPU, RAM, swap, disk, load average, uptime via `psutil`.
- Docker stack health — per-container state, CPU%, memory, with one-click `start`/`stop`/`restart` actions.
- SQL Server liveness probe — runs `SELECT @@VERSION` through `sqlcmd` inside the container, throttled to once per minute.
- Bot message throughput — messages per minute over the last hour, sparkline rendered in pure SVG.
- Live message feed — with client-side pause + filter, and configurable privacy policy (`full`/`hash`/`anon`/`agg`).
- Approximate bot latency — pairs each incoming Twilio message with the next outgoing response to the same sender, exposes p50/p95.
- Error tracker — counts `ERROR`/`CRITICAL`/`Traceback` in the last hour and shows the most recent ones with context.
- Order-link enumeration detection — flags IPs hitting more than N distinct `/menu/`, `/pago/`, `/t/…` paths within a short window, and pushes the alert into the existing security tracker.
- Suspicious IP dashboard — scores IPs hitting `/.env`, `/.git`, `/wp-admin`, etc., or using scanner user-agents (`sqlmap`, `nikto`, `nuclei`, …).
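The latency approximation above can be sketched in a few lines. The helper below is illustrative only — `pair_latencies` and the `(timestamp, direction, sender)` event shape are assumptions, not the bot's actual log schema:

```python
import statistics

def pair_latencies(events):
    """Pair each inbound message with the next outbound reply to the
    same sender; return the elapsed seconds for each pair.
    `events` is an iterable of (timestamp, direction, sender) tuples,
    where direction is 'in' or 'out'."""
    pending = {}      # sender -> timestamp of latest unanswered inbound
    latencies = []
    for ts, direction, sender in events:
        if direction == "in":
            pending[sender] = ts
        elif direction == "out" and sender in pending:
            latencies.append(ts - pending.pop(sender))
    return latencies

def p50_p95(latencies):
    """Median and 95th-percentile cut point (None when no pairs yet)."""
    if not latencies:
        return None, None
    qs = statistics.quantiles(latencies, n=20)   # cut points at 5% steps
    return statistics.median(latencies), qs[18]  # p50, p95

events = [(0.0, "in", "a"), (1.2, "out", "a"),
          (2.0, "in", "b"), (2.5, "out", "b")]
# pair_latencies(events) -> [1.2, 0.5]
```

Keeping only the latest unanswered inbound message per sender means a burst of messages before one reply counts as a single pair, which biases the estimate low but keeps the state bounded.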
```
┌─────────────────┐  tail + regex            ┌──────────────────┐
│ panchi-bot.log  │ ────────────────────────▶│  PanchiStats     │
└─────────────────┘  (inode+offset persist)  │  (1h windows,    │
                                             │   HMAC privacy)  │
┌─────────────────┐  docker logs nginx       │                  │
│ nginx access    │ ────────────────────────▶│  LinkStats       │──┐
└─────────────────┘                          │  SecurityTracker │  │
                                             └──────────────────┘  │
                                                                   ▼
                                                            /api/panchi
                                                            /api/security
                                                            /api/system …
                                                                   │
                                                                   ▼
                                                          ┌───────────────┐
                                                          │  index.html   │
                                                          │ (no framework)│
                                                          └───────────────┘
```
All data lives in bounded in-memory deques (`maxlen` or 1-hour sliding window). No database, no persistence beyond a tiny `panchi_offset.json` so the log reader survives restarts.
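The incremental reader with rotation detection can be sketched as follows — a minimal version for illustration; the function name and state-file layout are assumptions, not the exact code in `panchi_stats.py`:

```python
import json
import os

def read_new_lines(log_path, state_path="panchi_offset.json"):
    """Yield only the lines appended since the last call.
    Persists (inode, offset) to state_path; an inode change means the
    log was rotated, so reading restarts from the new file's start."""
    try:
        with open(state_path) as f:
            state = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        state = {"inode": None, "offset": 0}

    st = os.stat(log_path)
    if st.st_ino != state["inode"]:
        state = {"inode": st.st_ino, "offset": 0}   # rotation detected

    with open(log_path, "r", errors="replace") as f:
        f.seek(state["offset"])          # skip everything already seen
        lines = f.readlines()
        state["offset"] = f.tell()

    with open(state_path, "w") as f:     # tiny write each tick (~80 B)
        json.dump(state, f)
    return lines
```

Each tick therefore costs only the bytes appended since the previous tick, regardless of total log size.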
| Decision | Why |
|---|---|
| Incremental log reader with inode + offset | Re-reading 100 MB every tick is wasteful; this reads only what the bot appended since last tick, and detects rotation by comparing inodes. |
| TTL cache on every shell-out | `docker stats` costs ~800 ms. A dict-based `cached(key, ttl, fn)` wrapper cuts daemon hits by ~90% with zero dependencies. |
| HMAC with salt for sender IDs | Plain SHA-256 is brute-forceable over a small input space (phone numbers). A salt kept only in `.env` makes hashes unlinkable across deployments. |
| No JS framework | Dashboard is 600 lines of vanilla HTML/JS; SVG sparkline in 15 lines. Loads in <100 ms, no build step. |
| Single uvicorn worker bound to localhost | Protected by an SSH tunnel instead of public TLS + WAF. Smaller attack surface, simpler ops. |
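The `cached(key, ttl, fn)` wrapper mentioned above can be sketched as below — a minimal dict-based version; the actual implementation in `main.py` may differ in details:

```python
import time

_cache = {}   # key -> (expires_at, value)

def cached(key, ttl, fn):
    """Return fn()'s cached result if it is younger than ttl seconds;
    otherwise call fn once and store the fresh value."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit is not None and hit[0] > now:
        return hit[1]                       # cache hit, no shell-out
    value = fn()                            # expensive call (e.g. docker stats)
    _cache[key] = (now + ttl, value)
    return value

# usage: every dashboard tick and every open browser tab share one
# `docker stats` invocation for 20 seconds
# stats = cached("docker_stats", 20, run_docker_stats)
```

Because the value is computed lazily inside the wrapper, concurrent readers within the TTL window never trigger more than one underlying call per expiry.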
```bash
git clone https://github.com/<you>/vps_control.git
cd vps_control
cp .env.example .env
# edit .env: set VPS_CONTROL_PASS and PANCHI_HASH_SALT
./run.sh
```

Then open http://localhost:8080 (user/password from `.env`).
Requires Python 3.10+ and Docker available on the host.
| Variable | Default | Purpose |
|---|---|---|
| `VPS_CONTROL_USER` / `VPS_CONTROL_PASS` | `admin` / `changeme` | HTTP basic auth. Change the password. |
| `VPS_CONTROL_PORT` | `8080` | Port uvicorn binds to. |
| `PANCHIBOT_DIR` | — | Directory where the monitored bot lives (used to find its `.env` for SQL password fallback). |
| `PANCHIBOT_LOG` | — | Absolute path to `panchi-bot.log`. |
| `STACK_SERVICES` | `app,redis,sqlserver,worker-1,worker-2,nginx` | Compose service names to highlight on the Docker card. |
| `PANCHI_PRIVACY` | `hash` | One of `full`, `hash`, `anon`, `agg`. See below. |
| `PANCHI_HASH_SALT` | — | 32 bytes of random hex. Generate with `openssl rand -hex 32`. |
| `PANCHI_TEXT_MAX` | `160` | Message text truncation in the feed. |
| `PANCHI_LINK_PREFIXES` | `/menu/,/pago/,/t/` | URL prefixes that count as "order link". |
| `PANCHI_ENUM_THRESHOLD` | `15` | Distinct links per IP before flagging as enumerator. |
| `PANCHI_ENUM_WINDOW_SEC` | `300` | Time window for the check above. |
| Mode | Sender shown as | Message text |
|---|---|---|
| `full` | `whatsapp:+34600123456` | full |
| `hash` (default) | `a1b2c3d4` (HMAC-SHA256, first 8 hex chars) | truncated to `PANCHI_TEXT_MAX` |
| `anon` | `+34 *** *** *56` | truncated |
| `agg` | — | not stored at all, only counters |
```bash
# on the VPS (as root or via sudo)
rsync -avz --exclude venv --exclude .env ./ vps:/opt/vps_control/
ssh vps
cd /opt/vps_control
python3 -m venv venv && ./venv/bin/pip install -r requirements.txt
cp .env.example .env && nano .env   # fill in real values
```

Bind uvicorn to 127.0.0.1 by editing the last line of `main.py`:

```python
uvicorn.run("main:app", host="127.0.0.1", port=PORT, reload=False)
```

Access the dashboard via SSH tunnel (never expose 8080 publicly):

```bash
ssh -N -L 8080:127.0.0.1:8080 vps
# then open http://localhost:8080 in your browser
```

Save as `/etc/systemd/system/vps-control.service`:
```ini
[Unit]
Description=VPS Control · Panchi Bot dashboard
After=docker.service network.target
Requires=docker.service

[Service]
Type=simple
User=root
WorkingDirectory=/opt/vps_control
ExecStart=/opt/vps_control/venv/bin/python main.py
Restart=on-failure
RestartSec=5
MemoryMax=256M
CPUQuota=50%

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now vps-control
sudo systemctl status vps-control
sudo journalctl -u vps-control -f   # follow logs
```
⚠️ Running as `root` is a convenience for initial testing — it gives the service automatic access to the kernel journal, UFW logs, Docker socket, and root-owned files monitored by the FIM module. For a hardened deployment, create a dedicated user, add it to the `docker`, `systemd-journal` and `adm` groups, then switch `User=` in the unit above.
The host-security and network-security modules read logs and kernel state
that require access to system resources. Running as root (above) satisfies
all of them automatically. The following steps still need to be verified or
enabled on the host itself:
1. Enable UFW logging — the network module parses firewall events to detect port scans, floods and blocked connections. UFW doesn't log dropped packets by default:

   ```bash
   # check current state
   sudo ufw status verbose | grep Logging

   # enable — 'medium' is the sweet spot (logs blocked packets without
   # drowning the journal in accepted-traffic noise)
   sudo ufw logging medium

   # verify events are landing in the journal
   sudo journalctl -k -t kernel --grep='\[UFW' -n 20 --no-pager
   ```

   The parser reads from `journalctl -k` rather than `/var/log/ufw.log` so it works the same on any distro (Ubuntu, Debian, rsyslog or journald only).
2. Verify the service user can read the kernel journal:

   ```bash
   journalctl -k -n 5 --no-pager
   journalctl -u ssh -n 5 --no-pager
   ```

   Both must return output. If they don't, either run the service as root (quick path) or add the service user to the `systemd-journal` group:

   ```bash
   sudo usermod -aG systemd-journal <user>
   # re-login required for group change to apply
   ```

3. Verify Docker access — the service needs to talk to the Docker daemon to read container state and logs. Running as root covers this; for a non-root user, they must be in the `docker` group (`sudo usermod -aG docker <user>`).
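The UFW kernel-log lines from step 1 can be parsed along these lines. The regex assumes the standard `[UFW ACTION] ... SRC=... DPT=...` layout and is abbreviated — real lines carry many more `key=value` fields:

```python
import re

# Typical kernel line emitted by UFW logging (fields abbreviated):
# "[UFW BLOCK] IN=eth0 OUT= SRC=203.0.113.9 DST=10.0.0.1 ... DPT=23 ..."
UFW_RE = re.compile(
    r"\[UFW (?P<action>\w+)\].*?SRC=(?P<src>\S+).*?DPT=(?P<dpt>\d+)"
)

def parse_ufw(line):
    """Extract (action, source IP, destination port) from a UFW kernel
    log line, or None if the line is not a UFW event with a port."""
    m = UFW_RE.search(line)
    if not m:
        return None
    return m.group("action"), m.group("src"), int(m.group("dpt"))
```

Feeding every parsed `BLOCK` event into a per-IP counter is enough to surface port scans (many distinct `dpt` values from one `src` in a short window).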
Measured on a 2-core VPS with a full bot stack already running, refresh interval 10 s, one browser tab open:
| Resource | Usage |
|---|---|
| CPU | ~6% of one core average (peaks ~20% during `docker stats` ticks) |
| RAM | 55–75 MB RSS, stable (bounded deques) |
| Network (to browser) | ~5 MB/hour, ~4 MB/hour over `ssh -C` |
| Disk writes | `panchi_offset.json` (~80 B each tick), suspicious events appended to `security_events.jsonl` |
The expensive operations (`docker stats`, `sqlcmd`, `docker logs`) are each behind a TTL cache (20 s / 60 s / 20 s), so multiple browser tabs do not multiply the cost.
```
vps_control/
├── main.py            # FastAPI app, auth, endpoints, SecurityTracker, TTL cache
├── panchi_stats.py    # Incremental bot-log parser + privacy + aggregator
├── nginx_stats.py     # Nginx access-log classifier + enumerator detector
├── static/
│   └── index.html     # Single-file dashboard (vanilla HTML/CSS/JS + SVG)
├── tests/
│   └── test_panchi_stats.py
├── requirements.txt
├── run.sh             # venv bootstrap + launch (dev)
├── .env.example
├── PLAN_NIVEL_1.md    # Original design doc for the observability section
└── README.md
```
```bash
python -m unittest discover tests
```

Tests cover the pure aggregators (`PanchiStats`, `LinkStats`) — they don't need a running server. See `tests/test_panchi_stats.py`.
- Backend: Python 3.10+, FastAPI, uvicorn, psutil, python-dotenv
- Frontend: Vanilla HTML + CSS + JS, SVG sparklines, no build step
- Ops: systemd, SSH tunnel, Docker CLI (no SDK)
MIT — see LICENSE.
