Dashboard

The dashboard is a React/Vite SPA that ships inside the wheel and is served by Argus at dashboard_path (default /) on the bot's event loop. It is a single pane with five sections: Overview, Interactions, Gateway, Grafana, and Analytics. To disable it, pass Argus(bot, dashboard=False).

Routes

build_app(registry, metrics_path, dashboard=..., middlewares=...) mounts:

| Route | Purpose |
| --- | --- |
| `GET {dashboard_path}` | SPA `index.html` |
| `GET {dashboard_path}assets/*` | hashed JS/CSS/fonts |
| `GET /api/config` | `{namespace, metrics_path, grafana_url, analytics_enabled, version, auth_required}` |
| `GET /api/stream` | SSE: a metric snapshot immediately, then one every `dashboard_interval` seconds |
| `GET /api/analytics/interaction-volume` | per-guild daily counts (analytics) |
| `GET /api/analytics/command-stats` | per-guild count + avg ms per command |
| `GET /api/analytics/avg-duration` | per-guild overall avg ms |
| `GET /metrics`, `GET /healthz` | unchanged |

Snapshot protocol

/api/stream is text/event-stream. Each event is data: <json>\n\n where the JSON is build_snapshot(registry):

{"metrics": {"discord_guilds": {"type": "gauge",
  "samples": [{"name": "discord_guilds", "labels": {"cluster": "default"}, "value": 3.0}]}}}

It walks CollectorRegistry.collect(), so gauges are still read live at scrape time and every exposed metric is covered automatically. Note prometheus_client strips _total from a counter's family name; the samples keep it (e.g. family discord_interactions, sample discord_interactions_total).
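
The walk described above can be sketched in a few lines. This is an illustration, not the actual build_snapshot: the names build_snapshot_sketch and sse_event are hypothetical, and the sketch only assumes that each family yielded by collect() exposes name, type, and samples, and that each sample exposes name, labels, and value (as prometheus_client's Metric and Sample do). A stand-in registry is used so the block runs without prometheus_client installed.

```python
import json
from types import SimpleNamespace
from typing import Any


def build_snapshot_sketch(registry: Any) -> dict:
    """Flatten every metric family the registry yields into a JSON-ready dict.

    Hypothetical re-implementation for illustration; the real build_snapshot
    may differ in detail.
    """
    metrics = {}
    for family in registry.collect():
        metrics[family.name] = {
            "type": family.type,
            "samples": [
                {"name": s.name, "labels": dict(s.labels), "value": float(s.value)}
                for s in family.samples
            ],
        }
    return {"metrics": metrics}


def sse_event(snapshot: dict) -> str:
    """Frame one snapshot as a single server-sent event: data: <json> plus a blank line."""
    return "data: " + json.dumps(snapshot, separators=(",", ":")) + "\n\n"


# Stand-in for a CollectorRegistry holding one counter. Note the family name
# has no _total suffix while the sample name keeps it.
_sample = SimpleNamespace(
    name="discord_interactions_total", labels={"cluster": "default"}, value=7
)
_family = SimpleNamespace(name="discord_interactions", type="counter", samples=[_sample])
fake_registry = SimpleNamespace(collect=lambda: [_family])

snapshot = build_snapshot_sketch(fake_registry)
event = sse_event(snapshot)
```

Because the function only iterates whatever collect() yields, any newly registered metric would appear in the snapshot without further changes, which is the property the paragraph above relies on.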

The SPA prefers SSE and falls back to polling /metrics (parsed client-side into the same shape). It keeps a rolling buffer of recent snapshots to draw rates and sparklines (uPlot), and computes histogram quantiles client-side.
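
The arithmetic behind the rates and quantiles can be illustrated as follows. The SPA does this in the browser; the Python below is only a sketch of the idea, with hypothetical function names, and it assumes Prometheus-style linear interpolation within a bucket, which may not match the SPA's exact method.

```python
import math
from collections import deque


def per_second_rate(buffer):
    """Rate between the oldest and newest (timestamp, counter_value) points."""
    if len(buffer) < 2:
        return 0.0
    (t0, v0), (t1, v1) = buffer[0], buffer[-1]
    if t1 <= t0 or v1 < v0:
        # Clock glitch or counter reset: no meaningful rate for this window.
        return 0.0
    return (v1 - v0) / (t1 - t0)


def histogram_quantile(q, buckets):
    """Estimate quantile q from cumulative (upper_bound, count) buckets.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not buckets or not 0.0 <= q <= 1.0:
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Rank lands in the +Inf bucket: report the highest finite bound.
                return prev_bound
            if count == prev_count:
                return upper
            # Interpolate linearly inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


# Rolling buffer of recent (timestamp_seconds, counter_value) points.
buffer = deque(maxlen=60)
for point in [(0.0, 100.0), (2.0, 104.0), (4.0, 110.0)]:
    buffer.append(point)
rate = per_second_rate(buffer)

# Cumulative bucket counts as they appear in a histogram's _bucket samples.
p95 = histogram_quantile(0.95, [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)])
```

A bounded deque keeps memory constant while still giving enough history to draw a sparkline; the buffer length of 60 here is an arbitrary choice for the example.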

Auth

If dashboard_auth_token is set, an aiohttp middleware requires Authorization: Bearer <token> or ?token= (EventSource cannot set headers), compared with hmac.compare_digest. /healthz and the metrics path stay open so a Prometheus scraper does not need the token. The SPA reads the token from ?token= or localStorage and shows a prompt on 401.
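
The check described above could look roughly like the sketch below. It is framework-free so it runs with the standard library alone; the helper names (extract_token, is_authorized, needs_auth) are hypothetical and are not Argus's middleware.

```python
import hmac
from typing import Mapping, Optional


def extract_token(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Prefer the Authorization: Bearer header, then fall back to ?token=."""
    # A plain dict is case-sensitive; aiohttp's request.headers is not.
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    return query.get("token")


def is_authorized(
    headers: Mapping[str, str], query: Mapping[str, str], expected_token: str
) -> bool:
    """Compare the presented token with the expected one in constant time."""
    presented = extract_token(headers, query)
    if presented is None:
        return False
    # Encode to bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))


def needs_auth(path: str, metrics_path: str) -> bool:
    """/healthz and the metrics path stay open for probes and scrapers."""
    return path not in {"/healthz", metrics_path}


header_ok = is_authorized({"Authorization": "Bearer s3cret"}, {}, "s3cret")
query_ok = is_authorized({}, {"token": "s3cret"}, "s3cret")
missing = is_authorized({}, {}, "s3cret")
```

Using hmac.compare_digest rather than == avoids leaking how many leading characters of a guess were correct through response timing.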

Analytics section

Shown only when /api/config.analytics_enabled is true (i.e. enable_per_guild together with clickhouse_dsn). Enter a guild id to load command stats (count + avg ms), overall average duration, and interaction volume. The API fails closed (403) without a token. See History and ClickHouse.

Grafana section

When grafana_url is set, this section links to and embeds the four provisioned dashboards (argus-overview, argus-interactions, argus-gateway, argus-health); otherwise it shows an empty state with setup guidance.

Alerting and recording rules

The self-host stack ships Prometheus rules in prometheus/rules/argus.rules.yml (loaded via rule_files in prometheus/prometheus.yml and mounted by docker-compose.yml). They are conservative starting points, so tune the thresholds and each alert's `for` duration to your bot:

Recording rules: per-cluster app/prefix command error ratio (multi-window: 5m/30m/1h/6h) and p95 app command latency, reused by the Health dashboard.
Alerts: ArgusDown, ArgusSubsystemDown, ArgusInstrumentationErrors, ArgusHistoryEventsDropped, DiscordShardsDisconnected, DiscordHighCommandErrorRatio, DiscordRateLimited.
SLO burn-rate alerts (99% app-command success, 1% budget): a fast burn (ArgusAppCommandErrorBudgetBurnFast, pages) and a slow burn (ArgusAppCommandErrorBudgetBurnSlow, tickets), following the Google SRE multi-window multi-burn-rate pattern.
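
To make the burn-rate idea concrete, here is a small worked sketch. The burn rate is the observed error ratio divided by the error budget (1% for a 99% target). The thresholds 14.4 and 6 and the window pairings (1h with 5m, 6h with 30m) are assumptions taken from the Google SRE workbook's example, not values read from argus.rules.yml; check the rules file for the real ones.

```python
def burn_rate(error_ratio: float, error_budget: float = 0.01) -> float:
    """How many times faster than 'exactly on budget' errors are occurring."""
    return error_ratio / error_budget


def should_fire(
    long_window_ratio: float,
    short_window_ratio: float,
    threshold: float,
    error_budget: float = 0.01,
) -> bool:
    """Multi-window check: both the long and the short window must exceed the threshold.

    The long window shows the burn is sustained; the short window shows it is
    still happening, so the alert resolves soon after recovery.
    """
    return (
        burn_rate(long_window_ratio, error_budget) >= threshold
        and burn_rate(short_window_ratio, error_budget) >= threshold
    )


# Assumed thresholds (SRE workbook example values, not read from the rules file).
FAST_BURN = 14.4  # paired here with the 1h and 5m error ratios
SLOW_BURN = 6.0   # paired here with the 6h and 30m error ratios

# 20% errors over 1h and 25% over 5m: burning 20x and 25x the budget.
fast_fires = should_fire(long_window_ratio=0.20, short_window_ratio=0.25, threshold=FAST_BURN)

# 3% errors over 6h and 2% over 30m: burning 3x and 2x, below the slow threshold.
slow_fires = should_fire(long_window_ratio=0.03, short_window_ratio=0.02, threshold=SLOW_BURN)
```

At a constant burn rate of 1 the budget lasts exactly the SLO period, so a rate of 14.4 would exhaust a 30-day budget in roughly two days, which is why a fast burn is treated as page-worthy.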

The Argus - Health Grafana dashboard (argus-health) visualises argus_up, argus_subsystem_up, instrumentation-error and history-drop rates, the command error ratio, and p95 latency, plus a Golden signals (RED) row: traffic (command throughput), errors (error ratio), and saturation (dropped events).

The rules are unit-tested with promtool test rules in CI, so a broken alert or recording rule fails the build.

Security

With no token set, the dashboard exposes the same operational data as /metrics to anyone who can reach the port. For anything publicly reachable, set dashboard_auth_token and either bind to localhost or put the dashboard behind a reverse proxy.

Building the SPA (contributors)

The SPA lives in frontend/ (React 19, Vite, uPlot, lucide-react, Nimble tokens). npm run dev proxies /api and /metrics to localhost:9191. npm run build outputs to frontend/dist; the hatch build hook copies that output into src/argus/dashboard/static/ when the wheel is built, so end users do not need Node. Only two subset woff2 fonts ship; .ttf files are excluded from the wheel.

Fleet mode

The same SPA bundle also powers the Fleet control plane: when /api/config reports fleet: true, the app renders the Global -> Fleet -> Cluster drill-down (with per-cluster trend sparklines) instead of the per-process dashboard. The per-process view above is unchanged.
