Skip to content

Observability improvements

Choose a tag to compare

@csommerauer csommerauer released this 20 Apr 09:40
· 1 commit to main since this release

Highlights

This release adds a full observability layer to the ledger: telemetry events
covering command, OCC, transaction, account, and instance lifecycles, plus an
optional trace_context field on commands for distributed tracing
correlation.

Added

  • Telemetry instrumentation across the ledger (OBS-1). New
    DoubleEntryLedger.Telemetry module emits 14 point events and one span
    covering command enqueue/claim/retry/dead_letter/idempotency_hit, OCC
    retries, transaction created/posted/archived, account created/updated,
    instance created, and InstanceProcessor start/stop. Metadata uses
    instance_id consistently to keep hot paths DB-free, and emissions are
    wrapped in try/rescue so observational code can never crash business
    logic. Optional telemetry_metrics dependency with dashboard_metrics/0
    for Phoenix LiveDashboard integration. See pages/Telemetry.md for the
    event catalog and integration guide. (2f2a005)
  • trace_context field for distributed tracing (OBS-2). Optional
    string-valued map on Command, TransactionCommandMap, and
    AccountCommandMap. Vendor-neutral — carries W3C traceparent/tracestate,
    Datadog trace IDs, or custom correlation IDs — persisted to its own
    column and propagated through Logger metadata via the Traceable
    protocol. Validated as a flat string map with a configurable max key
    count (:max_trace_context_keys, default 10). Ships with v3 migration.
    (3eddb6c)
  • Telemetry summary in load tests. LoadTesting.TelemetryCollector
    collects latency samples and lifecycle counters during a load run;
    output now includes P50/P95/P99 latency, OCC retry count, dead letter
    count, and transaction breakdown. (469f213)

Changed

  • Dead-letter transitions now log at error level. Intermediate
    states (OCC timeouts, command processing failures) remain at warning
    since they trigger retries; only terminal dead-lettering is elevated.
    Addresses FINDING-OBS-3. (5c6062f)
  • mix tidewave alias honors PORT. Defaults to 4000; use
    PORT=4020 mix tidewave when running alongside another app already on
    4000. (3a78550)

Docs

  • Expanded custom telemetry handler guidance in pages/Telemetry.md with
    a worked dead-letter alert example covering attach_many,
    Task.Supervisor for blocking I/O, and handler lifecycle gotchas.
    (2a74738)

Migration Notes

Run the v3 migration to add the trace_context column to the commands
table. The column is unindexed — consumers who query by trace context
should add their own index.

Full Changelog: v0.2.0...v0.3.0