Tutorial Single Bot
A complete, copy-pasteable walkthrough for adding production-grade observability to one discord.py bot. By the end you will have Prometheus metrics, a live dashboard, and (optionally) Grafana boards, all from one line of code.
If you run many processes or regions, do this first, then see the Tutorial Fleet.
- Python 3.10 or newer.
- A discord.py 2.4+ bot (upstream discord.py; see Compatibility for forks).
- A bot token. Nothing else: no Prometheus, no Docker, no config files.
pip install argus-dpy
Take any discord.py bot and add Argus(bot) after you construct the bot.
# bot.py
import os
import discord
from discord.ext import commands
from argus import Argus
intents = discord.Intents.default()
intents.members = True # so cached_users is meaningful
bot = commands.Bot(command_prefix="!", intents=intents)
Argus(bot) # the whole integration
@bot.event
async def on_ready() -> None:
    print(f"{bot.user} is ready; metrics on http://localhost:9191/metrics")
bot.run(os.environ["DISCORD_TOKEN"])
Run it:
DISCORD_TOKEN=your-token python bot.py
That is it. Argus registered its listeners, then started an aiohttp server on the bot's own loop once the bot logged in.
- Metrics: http://localhost:9191/metrics (Prometheus text format).
- Dashboard: http://localhost:9191/ (a live SPA).
- Health: http://localhost:9191/healthz.
The dashboard shows shard latency, interaction and command throughput with a success/error split, command duration, gateway throughput, rate-limit pressure, and cache sizes, updating live. See Metrics Reference for the full list and Dashboard for the SPA details.
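To check the metrics endpoint from a script rather than a browser, a minimal sketch along these lines may help. It parses the Prometheus text format into a dictionary keyed by series. The helper names parse_samples and fetch_samples are hypothetical, and the parser assumes samples carry no trailing timestamp, which is typical of client-library output but not guaranteed. Real Argus series may also carry labels that this page does not list.

```python
from urllib.request import urlopen


def parse_samples(text: str) -> dict[str, float]:
    """Map each series (metric name plus any labels) to its sample value."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        # The value is the last space-separated field (assumes no timestamp).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_samples(url: str = "http://localhost:9191/metrics") -> dict[str, float]:
    """Scrape the endpoint once and parse the response body."""
    with urlopen(url, timeout=5) as response:
        return parse_samples(response.read().decode("utf-8"))
```

With the bot running, fetch_samples() returns the current samples, so a non-zero argus_instrumentation_errors_total can be spotted without opening the dashboard.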
If the host is reachable by anyone, set one environment variable. Argus picks it up; nothing else to wire.
ARGUS_DASHBOARD_AUTH_TOKEN=your-secret DISCORD_TOKEN=... python bot.py
Now / and every /api/* route require the token; /metrics stays scrapeable
for Prometheus. Open the dashboard once with ?token=your-secret and the browser
remembers it:
http://your-host:9191/?token=your-secret
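Scripts can pass the same token in two ways: as the ?token= query parameter shown above, or as an Authorization: Bearer header. The sketch below builds both with the standard library. The function names are hypothetical, and it only constructs the URL and the request object; it does not assume any particular /api/* route.

```python
from urllib.parse import urlencode
from urllib.request import Request


def dashboard_url(base: str, token: str) -> str:
    """Dashboard URL carrying the token as a query parameter."""
    query = urlencode({"token": token})
    return base.rstrip("/") + "/?" + query


def authorized_request(url: str, token: str) -> Request:
    """Request object that sends the token as a bearer header instead."""
    return Request(url, headers={"Authorization": f"Bearer {token}"})
```

The header form keeps the secret out of URLs, which tend to end up in shell history and proxy logs.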
The repo ships a provisioned stack. From a clone:
docker compose up -d # Prometheus on :9090, Grafana on :3000 (admin/admin)
Point prometheus/prometheus.yml at your bot (it defaults to
host.docker.internal:9191). Then link your bot's dashboard to Grafana:
Argus(bot, grafana_url="http://localhost:3000")
Three dashboards are pre-provisioned. See Clustering for the scrape config and OTLP if you would rather push to an OpenTelemetry collector.
Per-guild and per-user questions never go to Prometheus, because guild and user IDs as labels would create too many time series (high cardinality). To answer them, enable the analytical path, which drains events to ClickHouse:
pip install "argus-dpy[clickhouse]"
Argus(
    bot,
    enable_per_guild=True,
    clickhouse_dsn="http://user:pass@clickhouse:8123",
    dashboard_auth_token="your-secret", # the analytics API fails closed without this
)
The dashboard's Analytics section then serves per-guild command counts and average durations. See History and ClickHouse.
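The cardinality concern is easy to see with arithmetic: the number of time series one metric can produce is bounded by the product of the distinct values of each label. The sketch below is illustrative only. The label names command and status and all the counts are made-up assumptions, not Argus's actual label set.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Hypothetical numbers: 50 commands, 2 outcomes, 10,000 guilds.
bounded = series_count({"command": 50, "status": 2})
with_guilds = series_count({"command": 50, "status": 2, "guild_id": 10_000})
```

Under these assumptions a single guild_id label turns 100 series into 1,000,000, which is why those questions are routed to ClickHouse instead.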
Every option is a kwarg on Argus(bot) and a matching ARGUS_* env var. Kwargs
win over env, which wins over defaults. The most common:
Argus(
    bot,
    port=9191,
    cluster_id="default",
    dashboard_auth_token=None,
    grafana_url=None,
)
The repo's .env.example lists every variable, and examples/config_kwargs.py
shows every kwarg with its default. Full table and parsing rules:
Configuration.
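The precedence rule (kwargs, then environment, then defaults) can be pictured with a small resolver. This is a sketch of the rule as described here, not Argus's actual implementation: resolve_option is a hypothetical helper, and treating None as "not passed" is a simplifying assumption.

```python
import os
from typing import Mapping, Optional


def resolve_option(
    kwarg: Optional[str],
    env_name: str,
    default: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the kwarg if given, else the env var, else the default."""
    if kwarg is not None:
        return kwarg  # an explicit kwarg always wins
    if env_name in env:
        return env[env_name]  # environment beats the built-in default
    return default
```

For example, with ARGUS_DASHBOARD_AUTH_TOKEN set in the environment and no kwarg passed, the environment value is used; passing dashboard_auth_token="other" would override it.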
- Set dashboard_auth_token anywhere the host is not strictly localhost. Keep /metrics open so Prometheus can scrape it; the token only gates the UI/APIs.
- Enable the members intent if you want cached_users to mean anything; it is off by default in discord.py.
- Pin a version in production (argus-dpy==x.y.z, or a pinned GHCR image tag) so a mid-development change can never reach a deployment.
- Leave instrumentation alone. It is fail-open: any hook error is counted in argus_instrumentation_errors_total and swallowed, never raised into your bot. A non-zero counter is your signal to investigate, not an outage.
- Do not add high-cardinality labels. Argus forbids guild_id/user_id/channel_id on Prometheus by construction; route those questions to the analytical path instead.
- Give each process a distinct cluster_id the moment you run more than one (sharding, blue/green, multiple bots). Then graduate to the Tutorial Fleet.
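The fail-open behaviour in the list above follows a common pattern: wrap each hook, count any exception, and swallow it. The sketch below shows that pattern in general terms. It is not Argus's code: fail_open and the errors counter are hypothetical stand-ins (the counter plays the role of argus_instrumentation_errors_total), and the hook is synchronous for brevity, whereas discord.py listeners are coroutines.

```python
import functools
from collections import Counter
from typing import Any, Callable

# Stand-in for the argus_instrumentation_errors_total counter.
errors: Counter[str] = Counter()


def fail_open(hook: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a hook so its failures are counted and never propagate."""
    @functools.wraps(hook)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return hook(*args, **kwargs)
        except Exception:
            errors[hook.__name__] += 1  # record the failure for later inspection
            return None  # swallow: the caller keeps running
    return wrapper


@fail_open
def broken_hook() -> None:
    raise RuntimeError("instrumentation bug")
```

Calling broken_hook() returns None instead of raising, and errors["broken_hook"] increases by one, which mirrors why a non-zero counter signals something to investigate rather than an outage.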
- Dashboard is blank / "waiting for the first sample": the bot has not logged in yet, or it just started; the first snapshot appears within a few seconds.
- 401 on the dashboard: a token is set; open with ?token=... or send the Authorization: Bearer header.
- cached_users is 0: enable the members intent, both in code and in the Discord developer portal.
- Prometheus shows exported_cluster: you set a cluster target label that clashes with Argus's own; remove it (see Clustering).