Dashboard
A React/Vite SPA, bundled into the wheel and served by Argus at dashboard_path
(default /) on the bot's event loop. It is a single pane: Overview, Interactions,
Gateway, Grafana, and Analytics. Disable with Argus(bot, dashboard=False).
build_app(registry, metrics_path, dashboard=..., middlewares=...) mounts:
| Route | Purpose |
|---|---|
| GET {dashboard_path} | SPA index.html |
| GET {dashboard_path}assets/* | hashed JS/CSS/fonts |
| GET /api/config | {namespace, metrics_path, grafana_url, analytics_enabled, version, auth_required} |
| GET /api/stream | SSE: one metric snapshot immediately, then every dashboard_interval seconds |
| GET /api/analytics/interaction-volume | per-guild daily counts (analytics) |
| GET /api/analytics/command-stats | per-guild count + avg ms per command |
| GET /api/analytics/avg-duration | per-guild overall avg ms |
| GET /metrics, /healthz | unchanged |

/api/stream is text/event-stream. Each event is data: <json>\n\n where the
JSON is build_snapshot(registry):

    {"metrics": {"discord_guilds": {"type": "gauge",
      "samples": [{"name": "discord_guilds", "labels": {"cluster": "default"}, "value": 3.0}]}}}

It walks CollectorRegistry.collect(), so gauges are still read live at scrape
time and every exposed metric is covered automatically. Note prometheus_client
strips _total from a counter's family name; the samples keep it (e.g. family
discord_interactions, sample discord_interactions_total).
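
The snapshot shape and the SSE framing described above can be sketched as follows. This is a minimal illustration, not Argus's actual build_snapshot: the names build_snapshot_sketch and sse_event are hypothetical, and the code only assumes that the registry's collect() yields metric families exposing name, type, and samples (with name, labels, and value on each sample), as prometheus_client's CollectorRegistry does.

```python
import json
from typing import Any


def build_snapshot_sketch(registry: Any) -> dict:
    """Walk registry.collect() and return the {"metrics": {...}} shape.

    Hypothetical stand-in for build_snapshot; `registry` is anything with a
    collect() method, e.g. a prometheus_client CollectorRegistry.
    """
    metrics = {}
    for family in registry.collect():
        # Keyed by family name: for counters this has no _total suffix,
        # while the individual sample names keep it.
        metrics[family.name] = {
            "type": family.type,
            "samples": [
                {"name": s.name, "labels": dict(s.labels), "value": float(s.value)}
                for s in family.samples
            ],
        }
    return {"metrics": metrics}


def sse_event(snapshot: dict) -> bytes:
    """Frame one snapshot as a single SSE message: 'data: <json>' plus a blank line."""
    payload = json.dumps(snapshot, separators=(",", ":"))
    return b"data: " + payload.encode("utf-8") + b"\n\n"
```

Because json.dumps never emits a raw newline, the whole snapshot fits on one data: line, which keeps the framing trivial for an EventSource client.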
The SPA prefers SSE and falls back to polling /metrics (parsed client-side into
the same shape). It keeps a rolling buffer of recent snapshots to draw rates and
sparklines (uPlot), and computes histogram quantiles client-side.
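
For readers unfamiliar with deriving quantiles from Prometheus-style histograms, here is a rough sketch of the usual approach: find the cumulative bucket that contains the target rank and interpolate linearly inside it. It is written in Python for consistency with the rest of the project; the SPA's own implementation is not shown in this document, so the function name and the exact interpolation are assumptions.

```python
import math
from typing import Sequence, Tuple


def histogram_quantile(q: float, buckets: Sequence[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with (math.inf, total),
    mirroring the le="..." samples of a Prometheus histogram.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    if not buckets or buckets[-1][1] <= 0:
        return math.nan  # no observations yet

    rank = q * buckets[-1][1]
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile lies in the +Inf bucket: report the highest finite bound.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            # Linear interpolation within the bucket that contains the rank.
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

To get a quantile over a recent window rather than the process lifetime, the bucket counts passed in would be the differences between two buffered snapshots.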
If dashboard_auth_token is set, an aiohttp middleware requires
Authorization: Bearer <token> or ?token= (EventSource cannot set headers),
compared with hmac.compare_digest. /healthz and the metrics path stay open so
a Prometheus scraper does not need the token. The SPA reads the token from
?token= or localStorage and shows a prompt on 401.
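
The token check described above might look roughly like this. It is a sketch of the comparison logic only, not Argus's middleware: extract_token and is_authorized are hypothetical names, and plain mappings stand in for the aiohttp request's headers and query.

```python
import hmac
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the bearer token from the Authorization header, else the ?token= value."""
    # Note: a plain dict is case-sensitive; aiohttp's request.headers is not.
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    # EventSource cannot set headers, so the query string is the fallback.
    return query.get("token")


def is_authorized(headers: Mapping[str, str], query: Mapping[str, str], expected: str) -> bool:
    """Constant-time comparison of the supplied token against the expected one."""
    supplied = extract_token(headers, query)
    if supplied is None:
        return False
    # Compare bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))
```

A middleware built on this would return 401 when is_authorized is false and skip the check entirely for /healthz and the metrics path.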
Analytics is shown only when /api/config.analytics_enabled is true (i.e. both
enable_per_guild and clickhouse_dsn are set). Enter a guild id to load command
stats (count + avg ms), overall average duration, and interaction volume. The
API fails closed (403) without a token. See History and ClickHouse.
When grafana_url is set, the Grafana view links to and embeds the four
provisioned dashboards (argus-overview, argus-interactions, argus-gateway,
argus-health); otherwise it shows an empty state with setup guidance.
The self-host stack ships Prometheus rules in prometheus/rules/argus.rules.yml
(loaded via rule_files in prometheus/prometheus.yml and mounted by
docker-compose.yml). They are conservative starting points; tune the thresholds
and each alert's "for" window to your bot:
- Recording rules: per-cluster app/prefix command error ratio (divide-by-zero guarded) and p95 app command latency, reused by the Health dashboard.
- Alerts: ArgusDown, ArgusSubsystemDown, ArgusInstrumentationErrors, ArgusHistoryEventsDropped, DiscordShardsDisconnected, DiscordHighCommandErrorRatio, DiscordRateLimited.
The Argus - Health Grafana dashboard (argus-health) visualises argus_up,
argus_subsystem_up, instrumentation-error and history-drop rates, the command
error ratio, and p95 latency.
With no token the dashboard exposes the same operational data as /metrics to
anyone who can reach the port. For anything publicly reachable, set
dashboard_auth_token, and either bind to localhost or put the port behind a
reverse proxy.
The SPA lives in frontend/ (React 19, Vite, uPlot, lucide-react, Nimble
tokens). npm run dev proxies /api and /metrics to localhost:9191.
npm run build outputs to frontend/dist; the hatch build hook copies it into
src/argus/dashboard/static/ at wheel build (end users need no Node). Only two
subset woff2 fonts ship; .ttf is excluded from the wheel.
The same SPA bundle also powers the Fleet control plane: when /api/config
reports fleet: true, the app renders the Global -> Fleet -> Cluster drill-down
(with per-cluster trend sparklines) instead of the per-process dashboard. The
per-process view above is unchanged.