Summary
Expose the Asynq queue metrics defined in docs/10-observability.md §5.3 (queue depth, active jobs, job duration, retries, failures, DLQ size, and processing lag). These metrics drive the worker HPA (#81) and the §12 SLO alerts on processing lag and DLQ growth.
Design reference
- docs/10-observability.md §5.3 (Background jobs section)
- docs/12-jobs-cron.md §11.3 (Metrics)
Acceptance criteria
- gonext_asynq_queue_depth{queue}, gonext_asynq_active_jobs{queue} gauges
- gonext_asynq_job_duration_seconds{task_type} histogram
- gonext_asynq_job_retries_total{task_type} counter
- gonext_asynq_job_failed_total{task_type, kind} counter (kind: error/panic/timeout)
- gonext_asynq_dlq_size gauge
- gonext_asynq_processing_lag_seconds{queue} gauge (age of the oldest pending job)
- gonext_jobs_enqueued_total{type,queue}, gonext_jobs_idempotency_skips_total{type}, gonext_jobs_unique_conflicts_total{type} counters
- task_type cardinality bounded (~30), enforced by the registry (#176: Public-form CSRF tokens (HMAC + anon-cookie binding))
Dependencies
#150, #176
Complexity
M
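Implementation sketch
The acceptance criteria leave three label and value rules implicit: task_type must stay within a bounded set, kind must be one of error/panic/timeout, and processing lag is the age of the oldest pending job. The following is a minimal, standard-library-only sketch of those rules. It is not the project's implementation. Every name in it (taskTypeRegistry, failureKind, processingLagSeconds, the "other" fallback label, and the sample task types) is hypothetical, and mapping context.DeadlineExceeded to "timeout" is an assumption about how timeouts surface in handlers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// otherTaskType is a hypothetical fallback label for unregistered task types.
const otherTaskType = "other"

// taskTypeRegistry holds the closed set of task types allowed as label values.
type taskTypeRegistry struct {
	allowed map[string]struct{}
}

func newTaskTypeRegistry(types ...string) *taskTypeRegistry {
	allowed := make(map[string]struct{}, len(types))
	for _, t := range types {
		allowed[t] = struct{}{}
	}
	return &taskTypeRegistry{allowed: allowed}
}

// label returns taskType if it is registered and otherTaskType otherwise,
// so the number of distinct label values never exceeds len(allowed)+1.
func (r *taskTypeRegistry) label(taskType string) string {
	if _, ok := r.allowed[taskType]; ok {
		return taskType
	}
	return otherTaskType
}

// failureKind maps a handler outcome to the kind label: panic, timeout, or error.
// Assumption: a timeout surfaces as an error wrapping context.DeadlineExceeded.
func failureKind(err error, panicked bool) string {
	switch {
	case panicked:
		return "panic"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	default:
		return "error"
	}
}

// processingLagSeconds returns the age of the oldest pending job. It returns 0
// when the queue is empty (zero time) or when the enqueue time is in the future.
func processingLagSeconds(now, oldestEnqueuedAt time.Time) float64 {
	if oldestEnqueuedAt.IsZero() || now.Before(oldestEnqueuedAt) {
		return 0
	}
	return now.Sub(oldestEnqueuedAt).Seconds()
}

func main() {
	reg := newTaskTypeRegistry("email:send", "report:build")
	fmt.Println(reg.label("email:send"), reg.label("user-supplied-123"))

	wrapped := fmt.Errorf("handler: %w", context.DeadlineExceeded)
	fmt.Println(failureKind(wrapped, false))

	now := time.Now()
	fmt.Println(processingLagSeconds(now, now.Add(-90*time.Second)))
}
```

Collapsing unknown task types into a single fallback value keeps the series count fixed even if a producer enqueues an unregistered type. Whether such tasks should instead be rejected at enqueue time is a decision for the registry work in #176.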