Add Observability With Observe
Important
Status: implemented — shipped in src/developer-workflows/commands/observe.md (v0.1.0).
Note
Goal: Wire structured telemetry (logging, RED metrics, distributed tracing, symptom-based alerts) into a feature or service as you build it, using the /observe command to enforce the "instrument as you build" discipline before any code ships to production.
Prereqs: the developer-workflows plugin installed at a version that ships /observe (Install crickets plugins); a working tree with code that runs in production (test-only changes are out of scope for /observe).
/observe enforces four disciplines in order: structured logging (log events not strings), RED metrics (Request rate, Error rate, Duration), OpenTelemetry tracing, and symptom-based alerting (alert on symptoms, not causes). It is triggered whenever you are adding telemetry or shipping anything to production.
- Invoke the command at the point you are about to add telemetry or ship to production. Pass a component or service name, or leave the argument empty to default to the current uncommitted diff: `/observe <component-or-service>`. The command identifies every production path in scope (HTTP handlers, background jobs, queue consumers, scheduled tasks, outbound API calls). Test-only paths are excluded (`observe.md` lines 59–61).
- Add structured log events at the entry and exit of each significant operation. The command enforces the rule "log events, not strings": every entry must carry event name, entity IDs, outcome, and duration. Secrets and PII are prohibited. Use `INFO` for normal events and `ERROR` for failures; do not use `WARN` for errors or `DEBUG` in production (`observe.md` lines 63–66).
- Wire RED metrics for every external-facing surface or background job: request rate, error rate, and duration as a histogram (not a gauge). The command treats the introduction of a second metrics library as a red flag; extend the existing one instead (`observe.md` lines 68–70).
- Wrap each significant operation in an OpenTelemetry span. Name the span after the operation, set `ERROR` status on failures, attach the error message as a span attribute, and propagate trace context across async boundaries and outbound calls (`observe.md` lines 72–75).
- Define symptom-based alerts for each RED metric: alert on user-visible symptoms (error rate above threshold, duration above SLO), not on infrastructure causes (CPU, memory, disk). Include a one-sentence runbook comment in each alert definition (`observe.md` lines 77–80).
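
The logging and metrics steps above can be sketched with the Python standard library alone. This is a minimal illustration, not what `/observe` generates: the `observed` context manager, the `RedMetrics` class, the `duration_ms` field name, and the bucket bounds are all hypothetical names chosen for the example. It emits only the exit event (an entry event would follow the same shape), and a real service would record into its existing metrics library rather than an in-memory object.

```python
import json
import logging
import time
from bisect import bisect_left
from contextlib import contextmanager

logger = logging.getLogger("observe_sketch")  # hypothetical logger name

# Hypothetical histogram bucket upper bounds, in seconds.
BUCKETS = (0.01, 0.05, 0.1, 0.5, 1.0, 5.0)


class RedMetrics:
    """In-memory stand-in for a real metrics client (assumption for this sketch)."""

    def __init__(self):
        self.requests = 0  # Rate: total operations seen
        self.errors = 0    # Errors: operations that raised
        # Duration: one counter per bucket, plus a final overflow slot.
        self.duration_buckets = [0] * (len(BUCKETS) + 1)

    def record(self, duration_s, failed):
        self.requests += 1
        if failed:
            self.errors += 1
        # bisect_left returns the first bucket whose upper bound is >= duration_s.
        self.duration_buckets[bisect_left(BUCKETS, duration_s)] += 1


@contextmanager
def observed(event, metrics, **entity_ids):
    """Emit one structured log event per operation and record RED metrics for it."""
    start = time.monotonic()
    outcome = "success"
    try:
        yield
    except Exception:
        outcome = "failure"
        raise  # never swallow the error; only observe it
    finally:
        duration_s = time.monotonic() - start
        failed = outcome == "failure"
        metrics.record(duration_s, failed)
        record = {
            "event": event,
            "outcome": outcome,
            "duration_ms": round(duration_s * 1000, 3),
            **entity_ids,
        }
        # INFO for normal events, ERROR for failures.
        logger.log(logging.ERROR if failed else logging.INFO, json.dumps(record))
```

A handler would then wrap its work in `with observed("order.charge", metrics, order_id="o-123"): ...`, so the log event and the three RED measurements come from the same timing and cannot drift apart.
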
Before committing, confirm the six-item checklist from observe.md lines 100–105:
- Every production path has a structured log event at entry and exit of each significant operation.
- RED metrics (request rate, error rate, duration histogram) are wired for every external-facing surface.
- Trace spans wrap significant operations and propagate across async boundaries.
- Alerts target user-visible symptoms, not infrastructure causes.
- No log entry contains secrets or PII.
- Observability is verifiable locally before committing: metrics emit, spans record, logs appear.
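
Two of the checklist items (required fields present, no secrets or PII) lend themselves to a quick local check. The sketch below assumes log lines are JSON objects. The `duration_ms` field name, the `FORBIDDEN_KEYS` deny-list, and the `check_log_line` helper are hypothetical and would need tailoring to a real codebase. Entity IDs are not checked because their names vary by service.

```python
import json

# Fields every log entry should carry under the "log events, not strings" rule.
# "duration_ms" is an assumed field name for the duration.
REQUIRED_FIELDS = {"event", "outcome", "duration_ms"}

# Hypothetical deny-list of keys that tend to carry secrets or PII.
FORBIDDEN_KEYS = {"password", "token", "authorization", "email", "ssn"}


def check_log_line(line):
    """Return a list of problems found in one log line (an empty list means OK)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return ["not structured: line is not valid JSON"]
    if not isinstance(record, dict):
        return ["not structured: line is not a JSON object"]

    problems = []
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        problems.append("missing fields: " + ", ".join(sorted(missing)))
    leaked = FORBIDDEN_KEYS & {key.lower() for key in record}
    if leaked:
        problems.append("forbidden keys: " + ", ".join(sorted(leaked)))
    return problems
```

Running the service locally, capturing its log output, and passing each line through a check like this gives a concrete pass/fail signal for the "verifiable locally" item. A key-name deny-list only catches obvious leaks, so it complements review and does not replace it.
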
The command surfaces common rationalizations as red flags (observe.md lines 84–96):
| Symptom | Cause | Fix |
|---|---|---|
| `/observe` flags the diff even though "the feature works" | Observability is missing; "works" and "observable" are separate properties | Add structured logging, RED metrics, and trace spans before committing |
| Alert fires on CPU / memory / disk | Alert targets an infrastructure cause, not a user symptom | Rewrite alert to target error rate or latency SLO |
| Trace context lost across an async boundary or outbound call | Context not propagated | Propagate OTel context through the async handoff or outbound call |
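
The last row is easier to reason about with the wire format in view. In practice OpenTelemetry's propagators inject and extract trace context for you. The standard-library sketch below only illustrates what has to travel with an outbound call, using the W3C Trace Context `traceparent` header (version `00` only). The helper names `inject`, `extract`, and `new_span_id` are hypothetical, and headers are modelled as a plain dict with a lowercase key.

```python
import re
import secrets

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_span_id():
    """Generate a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)


def inject(headers, trace_id, span_id, sampled=True):
    """Write the current trace context into outbound request headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"
    return headers


def extract(headers):
    """Read trace context from inbound headers; None means the context was lost."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, parent_span_id, flags = match.groups()
    # All-zero IDs are invalid under the specification.
    if set(trace_id) == {"0"} or set(parent_span_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "sampled": bool(int(flags, 16) & 1),
    }
```

If `extract` returns `None` on the receiving side, the downstream span starts a new trace, which is exactly the "context lost" symptom in the table. The fix is to make sure the sender calls the equivalent of `inject` on every outbound request and async handoff.
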
- Developer Workflows plugin: the plugin that ships `/observe`.
- How to run a pre-launch readiness gate with /launch: the companion gate that confirms observability is wired before first production rollout.
- Manifest schema: command primitive frontmatter reference.