v0.5.21
What's new in v0.5.21
v0.5.21 instruments the runtime ack endpoints introduced in v0.5.20 with two Prometheus counters on /metrics, so operators running the daemon ack workflow get trend lines on operator-driven activity and a structured signal for failure modes (auth misconfiguration, store saturation, oversized payloads). No HTTP-shape change ships: POST / DELETE / GET /api/findings/{signature}/ack and GET /api/acks keep their v0.5.20 status codes and JSON shapes byte-for-byte.
The two new counters are perf_sentinel_ack_operations_total{action="ack"|"unack"} for successful operations and perf_sentinel_ack_operations_failed_total{action,reason} for failures. The reason label covers nine documented values, with file_too_large (per-daemon JSONL saturation) and entry_too_large (per-request oversized by or reason payload) intentionally separated so an operator dashboard can dispatch the two failure modes to different runbooks. Pre-warming covers the fifteen reachable (action, reason) combinations at startup (two success series, thirteen failure series), so dashboards can be built on rate() queries without absent() guards. Impossible combinations such as action="ack",reason="not_acked" are left out, because publishing them would create series that can never grow and would mislead operators.
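The pre-warming contract described above can be pictured with a small std-only Rust sketch. It is not the daemon's code: the real counters are prometheus IntCounterVec series, while a BTreeMap stands in for them here. The variant names other than Unauthorized, NoStore, FileTooLarge and EntryTooLarge are inferred from the label strings, and the choice of which five reasons are reachable on unack is an assumption, since the notes only give the counts (eight on ack, five on unack) and name not_acked as impossible on ack.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AckAction {
    Ack,
    Unack,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AckFailureReason {
    AlreadyAcked,
    NotAcked,
    Unauthorized,
    NoStore,
    InvalidSignature,
    LimitReached,
    FileTooLarge,
    EntryTooLarge,
    InternalError,
}

impl AckAction {
    pub const ALL: [AckAction; 2] = [AckAction::Ack, AckAction::Unack];

    /// Stable string for the `action` label.
    pub fn as_str(self) -> &'static str {
        match self {
            AckAction::Ack => "ack",
            AckAction::Unack => "unack",
        }
    }
}

impl AckFailureReason {
    pub const ALL: [AckFailureReason; 9] = [
        AckFailureReason::AlreadyAcked,
        AckFailureReason::NotAcked,
        AckFailureReason::Unauthorized,
        AckFailureReason::NoStore,
        AckFailureReason::InvalidSignature,
        AckFailureReason::LimitReached,
        AckFailureReason::FileTooLarge,
        AckFailureReason::EntryTooLarge,
        AckFailureReason::InternalError,
    ];

    /// Stable string for the `reason` label (the nine documented values).
    pub fn as_str(self) -> &'static str {
        match self {
            AckFailureReason::AlreadyAcked => "already_acked",
            AckFailureReason::NotAcked => "not_acked",
            AckFailureReason::Unauthorized => "unauthorized",
            AckFailureReason::NoStore => "no_store",
            AckFailureReason::InvalidSignature => "invalid_signature",
            AckFailureReason::LimitReached => "limit_reached",
            AckFailureReason::FileTooLarge => "file_too_large",
            AckFailureReason::EntryTooLarge => "entry_too_large",
            AckFailureReason::InternalError => "internal_error",
        }
    }
}

/// Returns true when the (action, reason) pair can actually occur.
pub fn is_reachable(action: AckAction, reason: AckFailureReason) -> bool {
    match action {
        // Eight of nine reasons: everything except `not_acked`.
        AckAction::Ack => reason != AckFailureReason::NotAcked,
        // ASSUMPTION: the notes say five reasons apply to unack but do not
        // list them; this particular set is a guess for illustration.
        AckAction::Unack => matches!(
            reason,
            AckFailureReason::NotAcked
                | AckFailureReason::Unauthorized
                | AckFailureReason::NoStore
                | AckFailureReason::InvalidSignature
                | AckFailureReason::InternalError
        ),
    }
}

/// Builds the failure series with every reachable pair initialised to zero.
pub fn prewarm_failures() -> BTreeMap<(&'static str, &'static str), u64> {
    let mut series = BTreeMap::new();
    for action in AckAction::ALL {
        for reason in AckFailureReason::ALL {
            if is_reachable(action, reason) {
                series.insert((action.as_str(), reason.as_str()), 0);
            }
        }
    }
    series
}
```

Under these assumptions the map holds thirteen zero-valued failure series, which together with the two success series gives the fifteen pre-warmed combinations.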
The release also ships a small internal refactor: check_ack_preconditions factors the auth-then-store-presence guard shared by handle_ack and handle_unack, register_int_counter_vec deduplicates the create-clone-register boilerplate across the three IntCounterVec registration sites, and #[inline] hints land on the counter-bumping helpers (including the v0.5.19 record_otlp_reject for consistency). Counter-bumping stays branchless on the success path (cached IntCounter children, single relaxed atomic add per call); the failure path takes the label-hashmap lookup, since failures are by definition rare. The release-binary size target is relaxed from < 10 MB to < 15 MB to account for the musl Linux statically-linked binary with mimalloc reaching 10.1 MB after recent additions. lto = "thin", strip = true, and panic = "abort" remain unchanged.
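The split between a cached success path and a looked-up failure path can be sketched with the standard library alone. This is an illustration under stated assumptions, not the daemon's implementation: AtomicU64 cells stand in for the cached IntCounter children, an RwLock-guarded HashMap stands in for the label lookup, and the struct name AckMetrics, its fields, and the two read accessors are hypothetical. Labels are passed as plain strings for brevity.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

#[derive(Clone, Copy)]
pub enum AckAction {
    Ack,
    Unack,
}

/// Hypothetical std-only stand-in for the metrics state.
pub struct AckMetrics {
    // Success path: one cached cell per action, no map lookup.
    ack_ok: AtomicU64,
    unack_ok: AtomicU64,
    // Failure path: keyed by (action, reason), looked up per call.
    failed: RwLock<HashMap<(&'static str, &'static str), AtomicU64>>,
}

impl AckMetrics {
    /// Creates the state with the given failure pairs pre-warmed to zero.
    pub fn new(prewarmed: &[(&'static str, &'static str)]) -> Self {
        let failed: HashMap<_, _> = prewarmed
            .iter()
            .map(|&pair| (pair, AtomicU64::new(0)))
            .collect();
        Self {
            ack_ok: AtomicU64::new(0),
            unack_ok: AtomicU64::new(0),
            failed: RwLock::new(failed),
        }
    }

    /// Selects the cached cell for the action, then does one relaxed add.
    #[inline]
    pub fn record_ack_success(&self, action: AckAction) {
        let cell = match action {
            AckAction::Ack => &self.ack_ok,
            AckAction::Unack => &self.unack_ok,
        };
        cell.fetch_add(1, Ordering::Relaxed);
    }

    /// Looks the series up under the read lock; falls back to the write
    /// lock only if the pair was not pre-warmed.
    #[inline]
    pub fn record_ack_failure(&self, action: &'static str, reason: &'static str) {
        {
            let map = self.failed.read().unwrap();
            if let Some(cell) = map.get(&(action, reason)) {
                cell.fetch_add(1, Ordering::Relaxed);
                return;
            }
        } // read guard dropped here, before the write lock is taken
        let mut map = self.failed.write().unwrap();
        map.entry((action, reason))
            .or_insert_with(|| AtomicU64::new(0))
            .fetch_add(1, Ordering::Relaxed);
    }

    /// Reads the current success count for one action.
    pub fn success_count(&self, action: AckAction) -> u64 {
        match action {
            AckAction::Ack => self.ack_ok.load(Ordering::Relaxed),
            AckAction::Unack => self.unack_ok.load(Ordering::Relaxed),
        }
    }

    /// Reads the current failure count; unknown pairs read as zero.
    pub fn failure_count(&self, action: &'static str, reason: &'static str) -> u64 {
        let map = self.failed.read().unwrap();
        map.get(&(action, reason))
            .map_or(0, |cell| cell.load(Ordering::Relaxed))
    }
}
```

The design point the sketch tries to capture is that the common case never touches the map or a lock, while the rare case pays for a hashed lookup under a shared read lock.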
Helm chart 0.2.24 ships in lockstep, bumping the default daemon image tag to ghcr.io/robintra/perf-sentinel:0.5.21.
Added
- Two Prometheus counters on the daemon /metrics endpoint:
  - perf_sentinel_ack_operations_total{action} for successful ack and unack operations. Cached IntCounter children at struct level, branchless match on the two handles, single relaxed atomic add per call.
  - perf_sentinel_ack_operations_failed_total{action,reason} for failures. The reason label covers already_acked, not_acked, unauthorized, no_store, invalid_signature, limit_reached, file_too_large, entry_too_large, internal_error. Failures are rare, so there is no hot-path child cache; the lookup goes through prometheus's label hashmap per call (still O(1) on the read lock).
- Pre-warmed series for the fifteen reachable (action, reason) combinations at startup: two success (action=ack, action=unack) plus thirteen failure (eight reasons on action=ack, five on action=unack). Pre-warming makes absent() guards unnecessary in rate() queries; impossible combinations are intentionally not pre-warmed.
- AckFailureReason::EntryTooLarge as a distinct variant from FileTooLarge. entry_too_large flags per-request misuse (the caller-supplied by or reason field exceeds the 4 KiB per-record cap), while file_too_large flags per-daemon saturation (the next append would push the JSONL above the 64 MiB cap and a restart-time compaction is needed). Both still return HTTP 507 Insufficient Storage, so the operator-visible HTTP status is unchanged.
- AckAction::as_str and AckFailureReason::as_str helpers returning &'static str for stable Prometheus label strings, mirroring the v0.5.19 OtlpRejectReason::as_str pattern.
- #[inline] on counter-bumping methods: record_ack_success, record_ack_failure, and the v0.5.19 record_otlp_reject for consistency. The compiler likely inlines them already at opt-level = 3; the explicit annotation matches the project inlining policy on critical helpers and pre-empts future regressions.
- register_int_counter_vec helper in crates/sentinel-core/src/report/metrics.rs factors the create-clone-register pattern across the three IntCounterVec registration sites (otlp_rejected_total, ack_operations_total, ack_operations_failed_total).
- check_ack_preconditions helper in crates/sentinel-core/src/daemon/query_api.rs factors the auth-then-store-presence guard shared by handle_ack and handle_unack. Records the matching AckFailureReason (Unauthorized or NoStore) before returning, so every error exit stays observable in /metrics.
- Six unit tests in crates/sentinel-core/src/report/metrics.rs: as_str round-trip across all variants, success-path increments per action, failure-path increments per (action, reason), pre-warmed-zero contract on both counters, impossible-combinations-not-pre-warmed contract, and the /metrics rendered-output contract.
- Three integration tests in crates/sentinel-core/src/daemon/query_api.rs: a no-store failure increments reason="no_store", a TOML conflict bumps the same reason="already_acked" series as a daemon-side double-ack, and a malformed signature increments reason="invalid_signature". The four pre-existing ack tests gain counter assertions on the success and unauthorized paths.
- docs/METRICS.md and docs/FR/METRICS-FR.md: new "Ack metrics (since 0.5.21)" section with the label table, the per-reason HTTP-status mapping, the pre-warming contract, and three sample PromQL queries (trend rate, per-reason failure rate, alert-worthy combinations).
- docs/HELM-DEPLOYMENT.md and docs/FR/HELM-DEPLOYMENT-FR.md: new "Daemon ack runtime store" subsection covering the four operator decisions when running the ack store under Kubernetes (api_key when bound non-loopback, persistence path remap to a PVC, securityContext mode-floor caveat, TOML ConfigMap mount). The existing ### StatefulSet block is repurposed from "reserved for future use" to the live ack-persistence guidance. A new ServiceMonitor warning paragraph notes the v0.5.20 default-filter behavior change for dashboards that scrape /api/findings.
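The auth-then-store-presence guard from the check_ack_preconditions item can be sketched as follows. Only the function name, the ordering of the two checks, and the Unauthorized and NoStore reasons come from the notes; the AckContext struct, its fields, the callback parameter, and the return type are hypothetical. Plain equality is used for brevity, whereas the release gates auth on subtle::ConstantTimeEq::ct_eq.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AckFailureReason {
    Unauthorized,
    NoStore,
}

/// Hypothetical view of the daemon state the guard needs.
pub struct AckContext<'a> {
    /// Configured API key, if any. `None` means auth is not enforced.
    pub api_key: Option<&'a str>,
    /// Whether a runtime ack store is available.
    pub store_present: bool,
}

/// Runs the auth check first, then the store-presence check. The failure
/// reason is recorded before returning, so each error exit is counted.
pub fn check_ack_preconditions(
    ctx: &AckContext<'_>,
    presented_key: Option<&str>,
    record_failure: &mut dyn FnMut(AckFailureReason),
) -> Result<(), AckFailureReason> {
    if let Some(expected) = ctx.api_key {
        // ASSUMPTION: plain equality stands in for a constant-time compare.
        if presented_key != Some(expected) {
            record_failure(AckFailureReason::Unauthorized);
            return Err(AckFailureReason::Unauthorized);
        }
    }
    if !ctx.store_present {
        record_failure(AckFailureReason::NoStore);
        return Err(AckFailureReason::NoStore);
    }
    Ok(())
}
```

Sharing one guard between the ack and unack handlers means a new early exit cannot be added to one handler without also being counted.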
Changed
- AckError::FileTooLarge and AckError::EntryTooLarge no longer fold into the same metric label. The handler match arms in handle_ack map them to distinct reason="file_too_large" and reason="entry_too_large" series. The HTTP error message text also differentiates them ("ack file size cap reached" vs "ack entry size cap reached"). The HTTP status (507 Insufficient Storage) stays the same on both.
- Binary size target relaxed from < 10 MB to < 15 MB in docs/LIMITATIONS.md, docs/design/02-NORMALIZATION.md, and the FR mirrors. The musl Linux statically-linked binary with mimalloc currently sits at 10.1 MB and the previous target was tight enough that small additions (the new counters, the ack store, the v0.5.20 query API surface) would have pushed it over.
- Helm chart 0.2.23 to 0.2.24, appVersion 0.5.20 to 0.5.21, default daemon image tag points at ghcr.io/robintra/perf-sentinel:0.5.21. The artifacthub.io/images annotation is updated in lockstep.
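The split of the two storage-cap failures can be illustrated with a std-only sketch. The status code, label strings, message strings, and the 4 KiB and 64 MiB caps come from the notes; the function names, the CapFailure struct, the order of the two checks, and the reading of the per-record cap as a limit on the serialized record length are assumptions.

```rust
/// Per-record cap (4 KiB) and per-file cap (64 MiB) quoted in the notes.
pub const ENTRY_CAP_BYTES: u64 = 4 * 1024;
pub const FILE_CAP_BYTES: u64 = 64 * 1024 * 1024;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AckError {
    FileTooLarge,
    EntryTooLarge,
}

/// Hypothetical check run before appending one record to the JSONL file.
/// ASSUMPTION: the entry cap is tested first, then the file cap.
pub fn classify_append(file_len: u64, record_len: u64) -> Result<(), AckError> {
    if record_len > ENTRY_CAP_BYTES {
        return Err(AckError::EntryTooLarge); // per-request misuse
    }
    if file_len.saturating_add(record_len) > FILE_CAP_BYTES {
        return Err(AckError::FileTooLarge); // per-daemon saturation
    }
    Ok(())
}

/// What a handler would expose for each storage-cap failure.
#[derive(Debug, PartialEq, Eq)]
pub struct CapFailure {
    pub status: u16,
    pub reason: &'static str,
    pub message: &'static str,
}

/// Same HTTP status for both errors, distinct metric label and message.
pub fn map_cap_error(err: AckError) -> CapFailure {
    match err {
        AckError::FileTooLarge => CapFailure {
            status: 507,
            reason: "file_too_large",
            message: "ack file size cap reached",
        },
        AckError::EntryTooLarge => CapFailure {
            status: 507,
            reason: "entry_too_large",
            message: "ack entry size cap reached",
        },
    }
}
```

A client that branches on the status code sees no difference between the two arms; only dashboards and log readers see the split.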
Behavior
- No HTTP-shape change. The three ack endpoints (POST/DELETE /api/findings/{signature}/ack and GET /api/acks) plus GET /api/findings, GET /api/findings/{trace_id}, GET /api/explain/{trace_id}, GET /api/correlations, GET /api/status, GET /api/export/report keep their v0.5.20 JSON shapes byte-for-byte. The only HTTP-visible delta is the error-message text on the two storage-cap failures, where "ack file size cap reached" (which previously covered both FileTooLarge and EntryTooLarge) splits into two distinct strings. Clients matching on the status code (507) are unaffected.
- /metrics endpoint authentication is unchanged. Default --listen-address stays 127.0.0.1. Operators who bind the metrics endpoint to a non-loopback address for cluster-wide scraping should keep the NetworkPolicy plus Prometheus-side mTLS posture from v0.5.19. The new counters carry no PII, no signature labels, no by field. Only the bounded action and reason enum strings.
- Auth-presence inference via the unauthorized series. perf_sentinel_ack_operations_failed_total{reason="unauthorized"} is pre-warmed to zero unconditionally at startup, but only ever increments when [daemon.ack] api_key is set and a request fails auth. A non-zero value therefore confirms api_key is configured. Documented in docs/METRICS.md. Mitigated by the loopback-by-default posture and the Prometheus-side network-policy guidance.
- Constant-time X-API-Key comparison preserved. The check_ack_auth body is unchanged, the subtle::ConstantTimeEq::ct_eq call still gates auth. The check_ack_preconditions refactor extracts the call but keeps the comparison itself untouched, the counter increment fires strictly after the comparison returns its result, no new timing side channel is introduced.
- Counter integrity under authenticated abuse. A holder of the api_key could trigger record_ack_failure arbitrarily and skew dashboards. Pre-existing risk, not a v0.5.21 regression: any holder can already write or revoke acks at will.
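The ordering claim in the constant-time item, that the counter is only touched after the comparison has produced its result, can be sketched with the standard library. This is an illustration only: the hand-rolled ct_eq_sketch below is not hardened against compiler optimisations and returns early on a length mismatch, so real code should rely on an audited crate such as subtle, which the release uses. Both function names and the AtomicU64 counter are hypothetical.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative comparison that inspects every byte pair instead of
/// stopping at the first mismatch. Not a substitute for `subtle`.
pub fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // the length itself is not hidden in this sketch
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences without branching on them
    }
    diff == 0
}

/// Compares first, then bumps the failure counter. The metric update
/// happens after the comparison has finished, so it cannot influence
/// how long the comparison itself takes.
pub fn check_key(presented: &str, expected: &str, unauthorized: &AtomicU64) -> bool {
    let ok = ct_eq_sketch(presented.as_bytes(), expected.as_bytes());
    if !ok {
        unauthorized.fetch_add(1, Ordering::Relaxed);
    }
    ok
}
```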
Documentation
- New "Ack metrics (since 0.5.21)" section in docs/METRICS.md and docs/FR/METRICS-FR.md with the full label set, the reason-to-HTTP-status mapping including the entry_too_large distinction, the pre-warming contract, three sample PromQL queries, and a paragraph on the auth-presence inference signal.
- New "Daemon ack runtime store" subsection in docs/HELM-DEPLOYMENT.md and docs/FR/HELM-DEPLOYMENT-FR.md covering the four operator decisions for running the ack store under Kubernetes, plus a ServiceMonitor warning on the v0.5.20 default-filter behavior of /api/findings.
- Binary size target relaxed from < 10 MB to < 15 MB in docs/LIMITATIONS.md, docs/design/02-NORMALIZATION.md, and the FR mirrors.
Install
Prebuilt binaries (Linux amd64 / arm64, macOS arm64, Windows amd64):
curl -LO https://github.com/robintra/perf-sentinel/releases/download/v0.5.21/perf-sentinel-linux-amd64
chmod +x perf-sentinel-linux-amd64
sudo mv perf-sentinel-linux-amd64 /usr/local/bin/perf-sentinel
Linux binaries are statically linked against musl and run on any distribution (Alpine, Debian, RHEL, Ubuntu any version) regardless of glibc version, and inside FROM scratch images.
From crates.io:
cargo install perf-sentinel --version 0.5.21
Docker:
docker run --rm -p 4317:4317 -p 4318:4318 \
  ghcr.io/robintra/perf-sentinel:0.5.21 watch --listen-address 0.0.0.0
Helm chart 0.2.24 ships alongside, see the matching chart-v0.2.24 release for the chart-side details.
Full Changelog: v0.5.20...v0.5.21