jorgearma/atalaya

VPS Control · Panchi Bot

Lightweight observability and security dashboard for a production WhatsApp bot running on a single-VPS Docker stack. Parses the bot's own log file and the nginx access log without touching the bot's code, exposes a read-only FastAPI endpoint, and renders a no-framework HTML dashboard that runs comfortably on a 1 GB VPS.

📸 Add a screenshot at docs/screenshot.png after your first deploy.


Why this exists

The bot was already in production when observability became a need: message throughput, error rate, p50/p95 response latency, and suspicious access patterns against order-link URLs (enumeration, fuzzing of Redis IDs).

Rather than instrumenting the bot (intrusive, risky, and blocked on a release cycle), this project sits next to it on the same host and derives every metric by parsing files the bot is already writing.

Constraints it was designed against:

  • Zero changes to the monitored bot.
  • Under 10% of one CPU core on a small VPS, idle most of the time.
  • No external services (no Prometheus, no Grafana, no cloud agent).
  • PII-safe by default — sender phone numbers are HMAC-hashed, raw numbers never leave the process.

Features

  • System metrics — CPU, RAM, swap, disk, load average, uptime via psutil.
  • Docker stack health — per-container state, CPU%, memory, with one-click start / stop / restart actions.
  • SQL Server liveness probe — runs SELECT @@VERSION through sqlcmd inside the container, throttled to once per minute.
  • Bot message throughput — messages per minute over the last hour, sparkline rendered in pure SVG.
  • Live message feed — with client-side pause + filter, and configurable privacy policy (full / hash / anon / agg).
  • Approximate bot latency — pairs each incoming Twilio message with the next outgoing response to the same sender, exposes p50/p95.
  • Error tracker — counts ERROR / CRITICAL / Traceback in the last hour and shows the most recent ones with context.
  • Order-link enumeration detection — flags IPs hitting more than N distinct /menu/, /pago/, /t/… paths within a short window, and pushes the alert into the existing security tracker.
  • Suspicious IP dashboard — scores IPs hitting /.env, /.git, /wp-admin, etc., or using scanner user-agents (sqlmap, nikto, nuclei, …).
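
The latency pairing described in the feature list can be sketched roughly as follows. This is an illustration, not the project's actual code: the function names, the (timestamp, direction, sender) event shape, the FIFO pairing of unanswered messages, and the nearest-rank percentile are all assumptions.

```python
import math
from collections import deque


def pair_latencies(events):
    """Pair each incoming message with the next outgoing one to the same sender.

    events: iterable of (timestamp_seconds, direction, sender), sorted by time,
    where direction is "in" or "out" (hypothetical shape for this sketch).
    Returns the list of response latencies in seconds.
    """
    pending = {}    # sender -> deque of timestamps of unanswered incoming messages
    latencies = []
    for ts, direction, sender in events:
        if direction == "in":
            pending.setdefault(sender, deque()).append(ts)
        elif direction == "out" and pending.get(sender):
            # Assumption: the oldest unanswered message is the one being answered.
            latencies.append(ts - pending[sender].popleft())
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile; returns None when there is no data."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

Because the pairing is inferred from log order rather than from a correlation ID, the result is only approximate, which is why the feature is labelled "approximate bot latency".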

Architecture

┌─────────────────┐      tail + regex         ┌──────────────────┐
│ panchi-bot.log  │ ─────────────────────────▶│  PanchiStats     │
└─────────────────┘   (inode+offset persist)  │  (1h windows,    │
                                              │   HMAC privacy)  │
┌─────────────────┐   docker logs nginx       │                  │
│ nginx access    │ ─────────────────────────▶│  LinkStats       │──┐
└─────────────────┘                           │  SecurityTracker │  │
                                              └──────────────────┘  │
                                                                    ▼
                                                           /api/panchi
                                                           /api/security
                                                           /api/system …
                                                                    │
                                                                    ▼
                                                          ┌──────────────┐
                                                          │ index.html   │
                                                          │ (no framework)│
                                                          └──────────────┘

All data lives in bounded in-memory deques (maxlen or 1-hour sliding window). No database, no persistence beyond a tiny panchi_offset.json so the log reader survives restarts.
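
A minimal sketch of how an offset-persisting log reader of this kind might work. The function name and the JSON state layout are hypothetical; the real panchi_offset.json format is not documented here.

```python
import json
import os


def read_new_lines(log_path, state_path):
    """Return the complete lines appended to log_path since the previous call.

    The reader's position is persisted in state_path as
    {"inode": ..., "offset": ...} (assumed layout for this sketch).
    """
    try:
        with open(state_path) as fh:
            state = json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        state = {"inode": None, "offset": 0}

    st = os.stat(log_path)
    # A different inode means the log was rotated; a size below the saved
    # offset means it was truncated in place. Either way, restart from 0.
    if state["inode"] != st.st_ino or st.st_size < state["offset"]:
        state = {"inode": st.st_ino, "offset": 0}

    with open(log_path, "rb") as fh:
        fh.seek(state["offset"])
        chunk = fh.read()

    # Consume only complete lines; a partial trailing line waits for the next tick.
    end = chunk.rfind(b"\n") + 1
    state["offset"] += end
    with open(state_path, "w") as fh:
        json.dump(state, fh)
    return chunk[:end].decode("utf-8", errors="replace").splitlines()
```

Persisting the offset after every read is what lets the reader resume where it left off after a restart instead of replaying the whole file.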

Key design decisions

| Decision | Why |
| --- | --- |
| Incremental log reader with inode + offset | Re-reading 100 MB every tick is wasteful; this reads only what the bot appended since the last tick, and detects rotation by comparing inodes. |
| TTL cache on every shell-out | docker stats costs ~800 ms. A dict-based cached(key, ttl, fn) wrapper cuts daemon hits by ~90% with zero dependencies. |
| HMAC with salt for sender IDs | A plain SHA-256 of a phone number can be brute-forced because the input space is small. Keying the hash with a salt kept only in .env prevents that and makes hashes unlinkable across deployments. |
| No JS framework | Dashboard is 600 lines of vanilla HTML/JS; SVG sparkline in 15 lines. Loads in <100 ms, no build step. |
| Single uvicorn worker bound to localhost | Protected by an SSH tunnel instead of public TLS + WAF. Smaller attack surface, simpler ops. |
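
The dict-based TTL cache from the table could look something like this. Only the cached(key, ttl, fn) signature comes from the text above; the body, the module-level _cache dict, and the extra now parameter (added here so the clock can be faked in tests) are assumptions.

```python
import time

_cache = {}  # key -> (expires_at, value)


def cached(key, ttl, fn, now=time.monotonic):
    """Return the cached value for key, calling fn() only when it has expired.

    ttl is in seconds. `now` is a hypothetical injection point for a clock.
    """
    t = now()
    entry = _cache.get(key)
    if entry is not None and entry[0] > t:
        return entry[1]
    value = fn()
    _cache[key] = (t + ttl, value)
    return value


# Example use (not executed here): wrap an expensive shell-out so that several
# browser tabs polling the API trigger at most one call every 20 seconds.
#
#   import subprocess
#   out = cached("docker_stats", 20, lambda: subprocess.run(
#       ["docker", "stats", "--no-stream"], capture_output=True, text=True).stdout)
```

With a single uvicorn worker there is one process and one dict, so no locking or shared store is needed for this sketch to be effective.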

Quick start (local dev)

git clone https://github.com/<you>/vps_control.git
cd vps_control
cp .env.example .env
# edit .env: set VPS_CONTROL_PASS and PANCHI_HASH_SALT
./run.sh

Then open http://localhost:8080 (user/password from .env).

Requires Python 3.10+ and Docker available on the host.


Configuration (.env)

| Variable | Default | Purpose |
| --- | --- | --- |
| VPS_CONTROL_USER / VPS_CONTROL_PASS | admin / changeme | HTTP basic auth. Change the password. |
| VPS_CONTROL_PORT | 8080 | Port uvicorn binds to. |
| PANCHIBOT_DIR | | Directory where the monitored bot lives (used to find its .env for SQL password fallback). |
| PANCHIBOT_LOG | | Absolute path to panchi-bot.log. |
| STACK_SERVICES | app,redis,sqlserver,worker-1,worker-2,nginx | Compose service names to highlight on the Docker card. |
| PANCHI_PRIVACY | hash | One of full, hash, anon, agg. See below. |
| PANCHI_HASH_SALT | | 32 bytes of random hex. Generate with openssl rand -hex 32. |
| PANCHI_TEXT_MAX | 160 | Message text truncation in the feed. |
| PANCHI_LINK_PREFIXES | /menu/,/pago/,/t/ | URL prefixes that count as "order link". |
| PANCHI_ENUM_THRESHOLD | 15 | Distinct links per IP before flagging as enumerator. |
| PANCHI_ENUM_WINDOW_SEC | 300 | Time window for the check above. |
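
To make the three enumeration settings concrete, here is a rough sketch of a sliding-window detector driven by them. The class and method names are hypothetical and the real LinkStats logic may differ; the defaults mirror the table above.

```python
from collections import defaultdict, deque


class EnumerationDetector:
    """Flag IPs that request too many distinct order-link paths in a time window."""

    def __init__(self, prefixes=("/menu/", "/pago/", "/t/"),
                 threshold=15, window_sec=300):
        self.prefixes = tuple(prefixes)   # PANCHI_LINK_PREFIXES
        self.threshold = threshold        # PANCHI_ENUM_THRESHOLD
        self.window_sec = window_sec      # PANCHI_ENUM_WINDOW_SEC
        self._hits = defaultdict(deque)   # ip -> deque of (timestamp, path)

    def observe(self, ip, path, ts):
        """Record one request; return True if ip now exceeds the threshold."""
        if not path.startswith(self.prefixes):
            return False
        hits = self._hits[ip]
        hits.append((ts, path))
        # Drop hits that have fallen out of the sliding window.
        while hits and hits[0][0] < ts - self.window_sec:
            hits.popleft()
        distinct = {p for _, p in hits}
        return len(distinct) > self.threshold
```

Counting distinct paths rather than raw requests means a customer who reloads the same order link many times is not flagged, while a client walking through many different IDs is.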

Privacy modes

| Mode | Sender shown as | Message text |
| --- | --- | --- |
| full | whatsapp:+34600123456 | full |
| hash (default) | a1b2c3d4 (HMAC-SHA256, first 8 hex chars) | truncated to PANCHI_TEXT_MAX |
| anon | +34 *** *** *56 | truncated |
| agg | not stored at all, only counters | |
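
The four modes could be implemented along these lines. This is a sketch under assumptions: the function name is hypothetical, the HMAC is taken over the whole sender string, and the anon mask simply keeps the first three and last two characters of the number, which matches the example above but assumes a two-digit country code.

```python
import hashlib
import hmac


def present_sender(sender, mode, salt):
    """Return the sender as it should appear in the feed for a privacy mode.

    sender: e.g. "whatsapp:+34600123456"; salt: bytes from PANCHI_HASH_SALT.
    Returns None in "agg" mode, where senders are not stored at all.
    """
    if mode == "full":
        return sender
    if mode == "hash":
        digest = hmac.new(salt, sender.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()[:8]
    if mode == "anon":
        number = sender.split(":")[-1]          # "+34600123456"
        return f"{number[:3]} *** *** *{number[-2:]}"
    if mode == "agg":
        return None
    raise ValueError(f"unknown privacy mode: {mode!r}")
```

In hash mode the same sender always maps to the same 8-character tag within one deployment, so conversations can still be followed in the feed without exposing the number.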

Deployment (VPS)

# from your local machine: copy the project to the VPS
rsync -avz --exclude venv --exclude .env ./ vps:/opt/vps_control/

# then on the VPS (as root or via sudo)
ssh vps
cd /opt/vps_control
python3 -m venv venv && ./venv/bin/pip install -r requirements.txt
cp .env.example .env && nano .env   # fill in real values

Bind uvicorn to 127.0.0.1 by editing the last line of main.py:

uvicorn.run("main:app", host="127.0.0.1", port=PORT, reload=False)

Access the dashboard via SSH tunnel (never expose 8080 publicly):

ssh -N -L 8080:127.0.0.1:8080 vps
# then open http://localhost:8080 in your browser

systemd unit (running as root — initial testing)

Save as /etc/systemd/system/vps-control.service:

[Unit]
Description=VPS Control · Panchi Bot dashboard
After=docker.service network.target
Requires=docker.service

[Service]
Type=simple
User=root
WorkingDirectory=/opt/vps_control
ExecStart=/opt/vps_control/venv/bin/python main.py
Restart=on-failure
RestartSec=5
MemoryMax=256M
CPUQuota=50%

[Install]
WantedBy=multi-user.target

Enable and start:

sudo systemctl daemon-reload
sudo systemctl enable --now vps-control
sudo systemctl status vps-control
sudo journalctl -u vps-control -f   # follow logs

⚠️ Running as root is a convenience for initial testing — it gives the service automatic access to the kernel journal, UFW logs, Docker socket, and root-owned files monitored by the FIM module. For a hardened deployment, create a dedicated user and add it to docker, systemd-journal and adm groups instead, then switch User= in the unit above.

Host prerequisites

The host-security and network-security modules read logs and kernel state that require access to system resources. Running as root (above) satisfies all of them automatically. The following steps still need to be verified or enabled on the host itself:

1. Enable UFW logging — the network module parses firewall events to detect port scans, floods and blocked connections. UFW doesn't log dropped packets by default:

# check current state
sudo ufw status verbose | grep Logging

# enable — 'medium' is the sweet spot (logs blocked packets without
# drowning the journal in accepted-traffic noise)
sudo ufw logging medium

# verify events are landing in the journal
sudo journalctl -k -t kernel --grep='\[UFW' -n 20 --no-pager

The parser reads from journalctl -k rather than /var/log/ufw.log, so it behaves the same on Ubuntu and Debian whether the host runs rsyslog or journald only.

2. Verify the service user can read the kernel and SSH journals (run these as that user):

sudo -u <user> journalctl -k -n 5 --no-pager
sudo -u <user> journalctl -u ssh -n 5 --no-pager

Both must return output. If they don't, either run the service as root (quick path) or add the service user to the systemd-journal group:

sudo usermod -aG systemd-journal <user>
# re-login required for group change to apply

3. Verify Docker access — the service needs to talk to the Docker daemon to read container state and logs. Running as root covers this; for a non-root user, they must be in the docker group (sudo usermod -aG docker <user>).


Resource footprint

Measured on a 2-core VPS with a full bot stack already running, refresh interval 10 s, one browser tab open:

| Resource | Usage |
| --- | --- |
| CPU | ~6% of one core average (peaks ~20% during docker stats ticks) |
| RAM | 55–75 MB RSS, stable (bounded deques) |
| Network (to browser) | ~5 MB/hour, ~4 MB/hour over ssh -C |
| Disk writes | panchi_offset.json (~80 B each tick), suspicious events appended to security_events.jsonl |

The expensive operations (docker stats, sqlcmd, docker logs) are each behind a TTL cache (20 s / 60 s / 20 s), so multiple browser tabs do not multiply the cost.


Project layout

vps_control/
├── main.py              # FastAPI app, auth, endpoints, SecurityTracker, TTL cache
├── panchi_stats.py      # Incremental bot-log parser + privacy + aggregator
├── nginx_stats.py       # Nginx access-log classifier + enumerator detector
├── static/
│   └── index.html       # Single-file dashboard (vanilla HTML/CSS/JS + SVG)
├── tests/
│   └── test_panchi_stats.py
├── requirements.txt
├── run.sh               # venv bootstrap + launch (dev)
├── .env.example
├── PLAN_NIVEL_1.md      # Original design doc for the observability section
└── README.md

Tests

python -m unittest discover tests

Tests cover the pure aggregators (PanchiStats, LinkStats) — they don't need a running server. See tests/test_panchi_stats.py.


Tech stack

  • Backend: Python 3.10+, FastAPI, uvicorn, psutil, python-dotenv
  • Frontend: Vanilla HTML + CSS + JS, SVG sparklines, no build step
  • Ops: systemd, SSH tunnel, Docker CLI (no SDK)

License

MIT — see LICENSE.

About

Lightweight self-hosted observability & security dashboard for a single-VPS Docker stack: real-time log tailing via inotify, SSH audit, file integrity monitor, nginx threat scoring and outbound C2 detection. No external DB. FastAPI + vanilla JS.
