feat: Observability — OutboxProcessorListener + okapi-micrometer module (KOJAK-44)#27
Merged
Add Micrometer to version catalog, register okapi-micrometer module in settings and BOM. Module depends on okapi-core with micrometer-core as compileOnly.
OutboxProcessor accepts an optional listener and clock. After each entry is processed, it emits a sealed OutboxProcessingEvent (Delivered, Retried, Failed) with per-entry Duration. After the batch, it calls onBatchProcessed. Exceptions in the listener are caught and logged — they never break processing.
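A minimal sketch of the listener contract described above, assuming hypothetical shapes for the event type and the callbacks (the parameters of `onBatchProcessed` and the helpers `notifySafely` and `label` are illustrative and are not taken from okapi-core):

```kotlin
import java.time.Duration

// Hypothetical shapes; the real okapi-core signatures may differ.
sealed interface OutboxProcessingEvent {
    val duration: Duration

    data class Delivered(override val duration: Duration) : OutboxProcessingEvent
    data class Retried(override val duration: Duration) : OutboxProcessingEvent
    data class Failed(override val duration: Duration) : OutboxProcessingEvent
}

// Default no-op methods let implementors override only what they need.
interface OutboxProcessorListener {
    fun onEntryProcessed(event: OutboxProcessingEvent) {}
    fun onBatchProcessed(count: Int, duration: Duration) {}
}

// Runs a listener callback and swallows anything it throws, so a faulty
// listener cannot interrupt processing. Returns false if the callback threw.
fun notifySafely(
    listener: OutboxProcessorListener?,
    call: (OutboxProcessorListener) -> Unit,
): Boolean {
    if (listener == null) return true
    return try {
        call(listener)
        true
    } catch (e: Exception) {
        System.err.println("Outbox listener failed: ${e.message}")
        false
    }
}

// Exhaustive `when`: the compiler flags this expression if a subtype is added
// to the sealed hierarchy without a matching branch.
fun label(event: OutboxProcessingEvent): String = when (event) {
    is OutboxProcessingEvent.Delivered -> "delivered"
    is OutboxProcessingEvent.Retried -> "retried"
    is OutboxProcessingEvent.Failed -> "failed"
}
```

The processor would wrap each callback, for example `notifySafely(listener) { it.onEntryProcessed(event) }`, so that a throwing listener is logged and skipped.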
Implements OutboxProcessorListener with Micrometer counters for delivered/retried/failed entries and a timer for batch duration.
Registers count-per-status and lag-per-status gauges that poll OutboxStore on each Prometheus scrape. Gauge suppliers are wrapped in an optional TransactionRunner (required for Exposed-backed stores) with try-catch returning NaN on failure.
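The NaN-on-failure wrapping can be sketched without Micrometer on the classpath. The `TransactionRunner` interface below is a hypothetical stand-in (its real signature in okapi may differ), and `safeGaugeSupplier` is an illustrative name:

```kotlin
// Hypothetical stand-in for okapi's TransactionRunner.
interface TransactionRunner {
    fun <T> runInTransaction(block: () -> T): T
}

// Wraps a store query so that it runs inside a transaction when a runner is
// supplied, and yields NaN instead of throwing when the query fails.
fun safeGaugeSupplier(runner: TransactionRunner?, query: () -> Number): () -> Double = {
    try {
        val value = if (runner != null) runner.runInTransaction(query) else query()
        value.toDouble()
    } catch (e: Exception) {
        Double.NaN
    }
}
```

A supplier built this way could then be handed to a gauge registration, so a database outage shows up as a missing sample rather than as an exception during the scrape.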
…K-44) Add MicrometerConfiguration inner class that creates MicrometerOutboxListener and MicrometerOutboxMetrics beans when MeterRegistry is on the classpath. OutboxProcessor bean now accepts an optional OutboxProcessorListener.
"Retried" (past tense) implied the retry already happened, but the event is emitted when a failed delivery attempt is rescheduled for another try — even on the very first attempt. "RetryScheduled" is semantically accurate regardless of the attempt number. Renamed across: sealed event, OutboxProcessor mapping, MicrometerOutboxListener counter (okapi.entries.retried → okapi.entries.retry_scheduled), and all tests.
…y (KOJAK-44) outboxProcessor bean now injects ObjectProvider<Clock>, consistent with all other beans in OutboxAutoConfiguration. Previously it silently fell back to Clock.systemUTC() even when a custom Clock bean was present. Per-entry duration now captures only the delivery attempt time (entryProcessor.process), excluding store.updateAfterProcessing(). This prevents DB write latency from inflating delivery metrics.
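The timing boundary can be illustrated with an injected `Clock`. This is a sketch under assumptions: `processTimed` and `MutableClock` are hypothetical names, and `deliver` and `persist` stand in for `entryProcessor.process` and `store.updateAfterProcessing()`:

```kotlin
import java.time.Clock
import java.time.Duration
import java.time.Instant
import java.time.ZoneId
import java.time.ZoneOffset

// Clock that only moves when told to (hypothetical helper for the example).
class MutableClock(private var now: Instant) : Clock() {
    fun advance(amount: Duration) { now = now.plus(amount) }
    override fun instant(): Instant = now
    override fun getZone(): ZoneId = ZoneOffset.UTC
    override fun withZone(zone: ZoneId): Clock = this
}

// Measures only the delivery attempt: the end timestamp is taken before the
// persistence step runs, so store latency is not part of the result.
fun <T> processTimed(clock: Clock, deliver: () -> T, persist: (T) -> Unit): Duration {
    val start = clock.instant()
    val result = deliver()
    val elapsed = Duration.between(start, clock.instant())
    persist(result)
    return elapsed
}
```

Injecting the clock also makes the measurement deterministic in tests, which is one reason to honour a custom `Clock` bean instead of falling back to `Clock.systemUTC()`.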
…eMock (KOJAK-44) Verifies the full observability pipeline against real infrastructure:
- Retry-then-succeed: RetryScheduled counter + Delivered counter + gauges
- Permanent failure: Failed counter + gauge reflects FAILED status
- Batch duration: timer records realistic HTTP delivery time (50ms stub)
- Lag gauge: reflects real time difference for pending entries in Postgres
…el (KOJAK-44) Inner @Configuration classes inside @AutoConfiguration do not reliably see beans from other autoconfigurations via @ConditionalOnBean. This caused MicrometerConfiguration to never activate because MeterRegistry was not yet available when the condition was evaluated. Fix: extract to a separate top-level @AutoConfiguration with its own @AutoConfigureAfter targeting the correct Spring Boot 4 package (org.springframework.boot.micrometer.metrics.autoconfigure).
…n (KOJAK-44) Add Observability section with metrics table and quick-start snippet. Update module diagram and table to include okapi-micrometer.
Rename okapi.entries.retry_scheduled to okapi.entries.retry.scheduled (dots-only follows Micrometer naming convention). Clarify README observability section, add tag names to gauge descriptions, document duration excludes DB write, single-listener note, autoconfig override.
ramafasa
reviewed
Apr 16, 2026
ramafasa
reviewed
Apr 16, 2026
ramafasa
approved these changes
Apr 16, 2026
…ics (KOJAK-44) Addresses PR #27 review comments by ramafasa: gauges previously called countByStatuses() and findOldestCreatedAt(setOf(status)) once per status, producing N queries per scrape with inconsistent snapshots between status tags.

Switch from supplier-per-status (pull) to MultiGauge + push refresh, the canonical Micrometer pattern for DB-backed gauges (per Micrometer docs):
- MicrometerOutboxMetrics now exposes refresh(), which performs a single transaction containing both store queries and atomically registers all status rows on each MultiGauge: one query per metric per refresh, snapshot-consistent across status tags.
- OutboxMetricsRefresher (new, framework-agnostic): single-thread daemon scheduler for non-Spring users (Ktor, plain JVM). Wraps refresh().
- okapi-spring-boot autoconfig wires a refresher bean with start/close lifecycle; refresh interval configurable via okapi.metrics.refresh-interval (Duration, default PT15S). No @EnableScheduling required.

okapi-core untouched. okapi-micrometer has zero Spring dependencies.

Multi-instance behaviour documented in README: each instance publishes identical gauge values (shared DB state); aggregate with max by (status) in PromQL, not sum. Polling cost: 2 queries per refresh-interval per pod.
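A rough sketch of the push-refresh scheduling for non-Spring users, using only the JDK. The class name `MetricsRefresher` and its constructor are assumptions for illustration; the actual `OutboxMetricsRefresher` may be shaped differently:

```kotlin
import java.time.Duration
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledExecutorService
import java.util.concurrent.TimeUnit

// Hypothetical refresher: calls `refresh` on a single daemon thread at a fixed delay.
class MetricsRefresher(
    private val interval: Duration,
    private val refresh: () -> Unit,
) : AutoCloseable {
    private val scheduler: ScheduledExecutorService =
        Executors.newSingleThreadScheduledExecutor { runnable ->
            // Daemon thread, so the scheduler never keeps the JVM alive on its own.
            Thread(runnable, "metrics-refresher").apply { isDaemon = true }
        }

    fun start() {
        scheduler.scheduleWithFixedDelay(
            {
                // An exception escaping the task would cancel all future runs,
                // so it is caught and logged here.
                try {
                    refresh()
                } catch (e: Exception) {
                    System.err.println("Metrics refresh failed: ${e.message}")
                }
            },
            0L,
            interval.toMillis(),
            TimeUnit.MILLISECONDS,
        )
    }

    override fun close() {
        scheduler.shutdownNow()
    }
}
```

In a Ktor or plain JVM application this would be started once at boot, for example `MetricsRefresher(Duration.ofSeconds(15)) { metrics.refresh() }.start()`, and closed on shutdown; `metrics` here stands for an already constructed `MicrometerOutboxMetrics`.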
endrju19 added a commit that referenced this pull request Apr 24, 2026
…terval (#29)

## Summary

Follow-up to #27. Two fixes discovered while testing the merged observability changes against a standalone Spring Boot demo app.

### 1. `okapi-micrometer` was not being published

The module's `build.gradle.kts` was missing `id("buildsrc.convention.publish")`. Without this, the module compiles and ships in source but is **not published to Maven Central**, so downstream users declaring `com.softwaremill.okapi:okapi-micrometer:0.1.0` would get an unresolvable dependency.

Verified by reproducing the issue with `./gradlew publishToMavenLocal`: before the fix, every other module appeared in `~/.m2/repository/com/softwaremill/okapi/` except `okapi-micrometer`. After the fix, all modules publish.

### 2. `okapi.metrics.refresh-interval` lacked IDE autocomplete metadata

The new property was not registered in `spring-configuration-metadata.json`, so users in IntelliJ / VS Code wouldn't get autocomplete or hover docs in `application.yml`. Same pattern as existing `okapi.processor.*` and `okapi.purger.*` entries.

Also added KDoc on `OkapiMetricsProperties` and a Configuration table to the README Observability section.

## Test plan

- [x] `./gradlew publishToMavenLocal -PskipSigning=true` produces `okapi-micrometer-0.1.0.jar` containing both `MicrometerOutboxMetrics` and `OutboxMetricsRefresher`
- [x] Demo app at consumer-side (Spring Boot + Postgres + Prometheus actuator) successfully imports `okapi-micrometer:0.1.0` and renders all expected metrics on `/actuator/prometheus`:
  - Counters: `okapi_entries_delivered_total`, `okapi_entries_retry_scheduled_total`, `okapi_entries_failed_total`
  - Timer: `okapi_batch_duration_seconds_*`
  - MultiGauge: `okapi_entries_count{status=...}`, `okapi_entries_lag_seconds{status=...}` (3 status rows each, one query per refresh)
- [x] `./gradlew ktlintCheck` clean

## Notes

The publish-plugin omission would surface at the next Maven Central release of okapi (i.e. when `0.1.0` artifacts are pushed). Worth catching before then.
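
For orientation, the publishing fix amounts to one line in the module's `build.gradle.kts`. The fragment below is a sketch: only the `id("buildsrc.convention.publish")` line comes from the commit message, and whatever other plugins the module applies are not shown in this thread and are omitted here.

```kotlin
// okapi-micrometer/build.gradle.kts (fragment; other plugins omitted)
plugins {
    // Convention plugin that registers the module for publication.
    id("buildsrc.convention.publish")
}
```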
## Summary

- Sealed `OutboxProcessingEvent` (`Delivered`, `RetryScheduled`, `Failed`) in `okapi-core`: enables exhaustive `when` in Kotlin, compiler warns on missing handlers
- `OutboxProcessorListener` interface with default no-op methods: callbacks for per-entry and per-batch processing events
- `OutboxProcessor` accepts optional listener + clock: notifies with try-catch isolation (listener exceptions never break processing)
- `okapi-micrometer` module: `MicrometerOutboxListener` (counters + timer) and `MicrometerOutboxMetrics` (gauges polling OutboxStore with TransactionRunner + NaN on failure)
- `OkapiMicrometerAutoConfiguration`: top-level Spring Boot autoconfiguration, auto-detects MeterRegistry, wires read-only TransactionRunner for gauge queries

## Metrics

- `okapi.entries.delivered`
- `okapi.entries.retry_scheduled`
- `okapi.entries.failed`
- `okapi.batch.duration`
- `okapi.entries.count`
- `okapi.entries.lag.seconds`

## Design decisions

- `java.time.Duration` over Long millis: type-safe, and accepted natively by Micrometer's `Timer.record(Duration)`
- Per-entry duration excludes `store.updateAfterProcessing()`
- Top-level `OkapiMicrometerAutoConfiguration`: inner `@Configuration` classes don't reliably see beans from other autoconfigs via `@ConditionalOnBean` in Spring Boot 4
- `RetryScheduled` name (not `Retried`): semantically correct even on first attempt ("attempt failed, scheduled for retry")

## Test plan

- `OutboxProcessorTest`: listener events (Delivered, RetryScheduled, Failed), exception isolation, null listener, retry exhaustion → Failed
- `MicrometerOutboxListenerTest`: counters per event type, batch timer
- `MicrometerOutboxMetricsTest`: gauges per status, lag calculation, TransactionRunner wrapping, store exception → NaN
- `OutboxProcessorAutoConfigurationTest`: listener autowired when MeterRegistry present
- `ObservabilityEndToEndTest`: full pipeline on live Postgres + WireMock (Testcontainers)