v0.5.25
What's new in v0.5.25
v0.5.25 surfaces Scaphandre scraper outcomes on the daemon /metrics endpoint and locks the long-standing contract that acknowledgments survive service restarts. Operators dogfooding Scaphandre energy attribution can now alert on degraded scrape success ratios without log-grepping daemon stdout, and the endpoint component of every ack signature is verified by unit tests to depend solely on http.route, never on trace_id or span_id. Two complementary axes, one release.
The headline addition is two Prometheus counters on the daemon-side Scaphandre scraper. perf_sentinel_scaphandre_scrape_total{status} partitions every scrape attempt into success or failed, with the success and failed children cached as IntCounter fields on the daemon MetricsState for a lock-free fetch_add per tick. perf_sentinel_scaphandre_scrape_failed_total{reason} partitions each failure into one of six closed-enum reasons (unreachable, timeout, http_error, body_read_error, request_error, invalid_utf8), looked up via with_label_values only on the cold failure path. The full reason set is pre-warmed to zero at daemon startup so dashboards can plot rate() queries without absent() guards. Cardinality is bounded at 8 series (2 status values plus 6 reasons): all label values are &'static str constants from compile-time enums, and no remote-controlled input feeds a label. Both counters are gated behind the daemon feature, the same gate that already protects the Scaphandre module itself. See docs/METRICS.md for the label spec and sample PromQL queries (success ratio over 5 minutes, dominant failure reason over 1 hour).
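The counter layout described above can be approximated with the standard library alone. The following is a minimal sketch, not the daemon's code: ScrapeCounters, FailureReason, and their methods are hypothetical names, and plain AtomicU64 values stand in for the Prometheus IntCounter children. It only illustrates the closed label set, the zero pre-warming, and the fixed 8-series cardinality.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Closed set of failure reasons, mirroring the six labels listed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FailureReason {
    Unreachable,
    Timeout,
    HttpError,
    BodyReadError,
    RequestError,
    InvalidUtf8,
}

impl FailureReason {
    pub const ALL: [FailureReason; 6] = [
        FailureReason::Unreachable,
        FailureReason::Timeout,
        FailureReason::HttpError,
        FailureReason::BodyReadError,
        FailureReason::RequestError,
        FailureReason::InvalidUtf8,
    ];

    /// Label values are compile-time constants; no runtime input reaches a label.
    pub fn as_label(self) -> &'static str {
        match self {
            FailureReason::Unreachable => "unreachable",
            FailureReason::Timeout => "timeout",
            FailureReason::HttpError => "http_error",
            FailureReason::BodyReadError => "body_read_error",
            FailureReason::RequestError => "request_error",
            FailureReason::InvalidUtf8 => "invalid_utf8",
        }
    }
}

/// Hypothetical stand-in for the scrape counters held by the daemon's metrics state.
/// `Default` starts every series at zero, which plays the role of pre-warming.
#[derive(Default)]
pub struct ScrapeCounters {
    success: AtomicU64,
    failed: AtomicU64,
    by_reason: [AtomicU64; 6],
}

impl ScrapeCounters {
    /// Hot path: a single lock-free increment.
    pub fn record_success(&self) {
        self.success.fetch_add(1, Ordering::Relaxed);
    }

    /// Cold path: bump the status counter and the per-reason counter.
    pub fn record_failure(&self, reason: FailureReason) {
        self.failed.fetch_add(1, Ordering::Relaxed);
        self.by_reason[reason as usize].fetch_add(1, Ordering::Relaxed);
    }

    /// Success ratio over the process lifetime; `None` before the first scrape.
    pub fn success_ratio(&self) -> Option<f64> {
        let ok = self.success.load(Ordering::Relaxed);
        let total = ok + self.failed.load(Ordering::Relaxed);
        if total == 0 { None } else { Some(ok as f64 / total as f64) }
    }

    /// Every exposed series with its current value: always 8 entries.
    pub fn series(&self) -> Vec<(String, u64)> {
        let mut out = vec![
            (
                "perf_sentinel_scaphandre_scrape_total{status=\"success\"}".to_string(),
                self.success.load(Ordering::Relaxed),
            ),
            (
                "perf_sentinel_scaphandre_scrape_total{status=\"failed\"}".to_string(),
                self.failed.load(Ordering::Relaxed),
            ),
        ];
        for &reason in FailureReason::ALL.iter() {
            out.push((
                format!(
                    "perf_sentinel_scaphandre_scrape_failed_total{{reason=\"{}\"}}",
                    reason.as_label()
                ),
                self.by_reason[reason as usize].load(Ordering::Relaxed),
            ));
        }
        out
    }
}
```

Because every series exists from construction, a freshly started instance already reports all 8 series at zero, which is the property that lets rate() queries work without absent() guards.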
A second non-obvious change is the durable contract that acknowledgments survive service restarts. The compute_signature helper in crates/sentinel-core/src/acknowledgments.rs composes <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix-of-template>, deliberately excluding trace_id and span_id. v0.5.25 ships eleven new tests that lock this contract end-to-end. signature_stable_across_trace_id_changes and three companions verify the ack signature stays identical when only the trace id varies, when the endpoint sanitization edge cases (BiDi, slashes, spaces) trigger, and when the four other attributes drift. The endpoint extraction precedence is now also locked across all three ingest paths: OTLP reads http.route > http.url > url.full from the parent span (3 new tests), Jaeger and Zipkin read http.route > http.target from the current span tags (2 tests each). The signature is stable as long as the producer-side instrumentation emits http.route. Spring Boot 3+ with the OpenTelemetry Java agent, ASP.NET Core with the .NET SDK, and Express.js / Fastify / Koa with @opentelemetry/instrumentation-* all do this automatically.
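To make the contract concrete, here is a minimal sketch of the two rules described above: endpoint attribute precedence and signature composition. It is not the code in acknowledgments.rs. The Finding struct, pick_endpoint, and signature are hypothetical names, the endpoint is assumed to be already sanitized, and the template hash prefix is taken as an input rather than computed.

```rust
use std::collections::HashMap;

/// OTLP lookup order described above: http.route, then http.url, then url.full.
const OTLP_ENDPOINT_KEYS: [&str; 3] = ["http.route", "http.url", "url.full"];
/// Jaeger / Zipkin lookup order: http.route, then http.target.
const TAG_ENDPOINT_KEYS: [&str; 2] = ["http.route", "http.target"];

/// Returns the value of the first key in `keys` that is present in `attrs`.
fn pick_endpoint<'a>(attrs: &'a HashMap<String, String>, keys: &[&str]) -> Option<&'a str> {
    keys.iter()
        .find_map(|key| attrs.get(*key).map(String::as_str))
}

/// Hypothetical finding record; only the first four fields feed the signature.
struct Finding {
    finding_type: String,
    service: String,
    sanitized_endpoint: String,
    template_hash_prefix: String,
    /// Present on the finding but deliberately ignored by `signature`.
    trace_id: String,
}

/// Composes <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix-of-template>.
fn signature(f: &Finding) -> String {
    format!(
        "{}:{}:{}:{}",
        f.finding_type, f.service, f.sanitized_endpoint, f.template_hash_prefix
    )
}
```

Two findings that differ only in trace_id yield the same string, which is the cross-restart invariant the new tests lock. When only http.url or url.full is available, the picked endpoint carries concrete path values, so the signature changes with every distinct URL.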
The carbon scope documentation is realigned to match the daemon's actual semantics. Earlier audit framing of the cumulative perf_sentinel_avoidable_io_ops_total counter as a "double-count cross-restart bug" was misleading. The counter does what a Prometheus counter is supposed to do: it accumulates every distinct event, with intra-batch dedup keyed on (trace_id, template, source_endpoint) to prevent double-counting within a batch. Distinct traces (including those produced after a restart) are real distinct request executions, each carrying its own avoidable I/O. The waste ratio (avoidable / total) stays stable because numerator and denominator grow together per request. docs/ACK-WORKFLOW.md and the FR mirror now state this explicitly with a "Carbon scoring scope" section pointing operators to rate() over short windows for trend dashboards.
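The counting rule can be sketched as follows, assuming a hypothetical AvoidableOp record and a count_avoidable helper (neither name comes from the codebase). Duplicates inside one batch collapse on the (trace_id, template, source_endpoint) key, while a later batch with a new trace id adds to the cumulative total.

```rust
use std::collections::HashSet;

/// Hypothetical record for one avoidable I/O operation reported in a batch.
struct AvoidableOp {
    trace_id: String,
    template: String,
    source_endpoint: String,
}

/// Counts avoidable operations in one batch, deduplicating on
/// (trace_id, template, source_endpoint) so that a repeated entry
/// within the same batch is counted once.
fn count_avoidable(batch: &[AvoidableOp]) -> u64 {
    let mut seen = HashSet::new();
    let mut count = 0;
    for op in batch {
        let key = (
            op.trace_id.as_str(),
            op.template.as_str(),
            op.source_endpoint.as_str(),
        );
        // `insert` returns true only the first time a key is seen.
        if seen.insert(key) {
            count += 1;
        }
    }
    count
}

/// Convenience constructor for the examples.
fn op(trace_id: &str, template: &str, source_endpoint: &str) -> AvoidableOp {
    AvoidableOp {
        trace_id: trace_id.to_string(),
        template: template.to_string(),
        source_endpoint: source_endpoint.to_string(),
    }
}
```

Feeding the per-batch result into a monotonically increasing total reproduces the behavior described above: the same template observed under a new trace id after a restart is a new request execution and increments the counter again.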
The CLI gains a completions subcommand following the cargo, gh, and rustup pattern. perf-sentinel completions <shell> writes a completion script to stdout for bash, zsh, fish, powershell, or elvish. No release artifact, no installer wiring: pipe to your shell's completion path. The subcommand is documented in docs/CLI.md and the FR mirror, and locked by two parse tests covering the accepted shell set plus rejection of unknown shells.
The release pre-flight script scripts/check-tag-version.sh is extended to validate intra-workspace dependency pins. The single literal pin in the workspace today is perf-sentinel-core = { version = "0.5.25", path = "../sentinel-core" } in crates/sentinel-cli/Cargo.toml, required for cargo publish to resolve sibling deps from the registry. If [workspace.package].version is bumped without bumping these pins, cargo publish either fails (registry not yet propagated) or silently publishes a binary linked to the previous core version. The script now scans every crates/*/Cargo.toml for any sibling-crate pin and aborts the release on mismatch. Strict-eq pins (version = "=0.5.25") are stripped before comparison. CONTRIBUTING.md "Release process" lists this as a bump target with a grep recipe to enumerate pinned crates.
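The real check lives in a shell script. The Rust sketch below mirrors only the comparison rule described above, under stated assumptions: pinned_version and pin_is_current are hypothetical helpers, and only the single-line inline table form is handled.

```rust
/// Extracts the `version = "..."` value from a single-line dependency entry
/// and strips a leading strict-eq `=` so that "=0.5.25" compares as "0.5.25".
/// Returns `None` when the line carries no version pin.
fn pinned_version(dep_line: &str) -> Option<&str> {
    let key = "version";
    let after_key = &dep_line[dep_line.find(key)? + key.len()..];
    // The value sits between the first pair of double quotes after the key.
    let start = after_key.find('"')? + 1;
    let len = after_key[start..].find('"')?;
    let raw = &after_key[start..start + len];
    Some(raw.strip_prefix('=').unwrap_or(raw).trim_start())
}

/// True when the line has no pin, or when its pin equals the workspace version.
fn pin_is_current(dep_line: &str, workspace_version: &str) -> bool {
    match pinned_version(dep_line) {
        Some(pin) => pin == workspace_version,
        None => true,
    }
}
```

A release gate built on this rule would abort as soon as pin_is_current returns false for any sibling-crate entry, which is the failure mode the script is meant to catch before cargo publish runs.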
A repo-versioned pre-commit hook ships in scripts/hooks/pre-commit plus an idempotent installer at scripts/install-hooks.sh. The hook scans staged changes via gitleaks git --staged (the modern syntax, since gitleaks 8.16). Missing tooling never blocks a commit: the hook skips silently if gitleaks is not on PATH or is older than 8.16, surfacing a one-line install or upgrade hint. The installer detects an existing core.hooksPath (e.g. dotfiles setups with a personal global hook) and aborts with two actionable remediation paths instead of silently writing a symlink that git would not execute. CI already runs gitleaks on every push, the local hook only catches issues earlier.
Operator guides receive a wave of editorial polish picked up during the v0.5.25 audit pass. docs/INTEGRATION.md is split conceptually with a redirection layer to docs/INSTRUMENTATION.md and docs/CI.md, the carbon-scope subsection of docs/ACK-WORKFLOW.md is rewritten to reflect cumulative semantics, the design notes under docs/design/ are condensed, and code comments across the workspace are pruned in favor of the .md docs. No behavior change.
Helm chart 0.2.28 ships in lockstep, bumping appVersion to 0.5.25 and the default daemon image tag to ghcr.io/robintra/perf-sentinel:0.5.25. No chart-level template change beyond the image tag, the v0.5.25 surface is a pure runtime addition on the daemon side and a CLI-only addition on the binary side.
Added
- Scaphandre scrape counters on the daemon /metrics endpoint: perf_sentinel_scaphandre_scrape_total{status="success|failed"} and perf_sentinel_scaphandre_scrape_failed_total{reason=...}. The 6 reason labels are unreachable, timeout, http_error, body_read_error, request_error, invalid_utf8. Pre-warmed to zero so dashboards build with rate() without absent() guards. Cached IntCounter children for the success / failed status counters (lock-free hot path), with_label_values lookup on the cold failure path. Gated behind the daemon feature.
- perf-sentinel completions <shell> subcommand emits a completion script for bash, zsh, fish, powershell, or elvish to stdout. Locked by completions_subcommand_accepts_known_shells and completions_subcommand_rejects_unknown_shell. Documented in docs/CLI.md and the FR mirror.
- Signature stability lock: 4 new tests in crates/sentinel-core/src/acknowledgments.rs::tests covering the cross-restart invariant (signature_stable_across_trace_id_changes) and three diff sentinels on endpoint, service, and finding_type. The signature format <finding_type>:<service>:<sanitized_endpoint>:<sha256-prefix-of-template> is now a public stability contract for .perf-sentinel-acknowledgments.toml files already deployed.
- OTLP route precedence lock: 3 new tests in crates/sentinel-core/src/ingest/otlp.rs::tests covering http.route > http.url > url.full from the parent HTTP span. Critical for ack signature stability when the producer emits both attributes (the route template wins).
- Jaeger and Zipkin route precedence lock: 2 tests each covering http.route > http.target from the current span tags. Same canonical priority across all three ingest paths.
- scripts/check-tag-version.sh intra-workspace pin validation: second loop scans every crates/*/Cargo.toml for sibling-crate dependency pins and aborts the release on mismatch with [workspace.package].version. Strict-eq pins (version = "=X.Y.Z") are stripped before comparison. Single-line table form supported, multi-line is documented as a future extension point.
- scripts/hooks/pre-commit and scripts/install-hooks.sh: zero-dep gitleaks pre-commit. The hook uses gitleaks git --staged and skips gracefully if gitleaks is absent or older than 8.16. The installer is idempotent (ln -sf symlink in .git/hooks/) and detects a non-default core.hooksPath configured globally with two clear remediation paths.
- CONTRIBUTING.md sections for "Git hooks" (one-line bash scripts/install-hooks.sh after clone, gitleaks 8.16+ requirement, --no-verify bypass) and an extended "Release process" listing intra-workspace dependency pins as a bump target.
- docs/ACK-WORKFLOW.md "Signature stability and service restarts" plus FR mirror: documents the four signature components, the critical dependency on http.route, the http.url and url.full fallbacks, the curl recipe to verify producer-side instrumentation, and a "Carbon scoring scope" subsection clarifying that the cumulative counters reflect distinct request executions (use rate() for trend dashboards).
- docs/METRICS.md "Scaphandre scrape counters" plus FR mirror: full label spec, pre-warming behavior, and two sample PromQL queries (5-minute success ratio, dominant failure reason over 1 hour).
Changed
- Helm chart 0.2.27 to 0.2.28, appVersion 0.5.24 to 0.5.25. The artifacthub.io/images annotation is updated in lockstep. No chart template change.
- docs/INTEGRATION.md slimmed by ~170 lines, redirecting operators to the focused docs/INSTRUMENTATION.md (producer-side) and docs/CI.md (consumer-side) entry points. The carbon-scope text in docs/ACK-WORKFLOW.md is rewritten to align with the cumulative Prometheus counter semantics rather than the earlier "double-count cross-restart" framing.
- Operator guides condensation: verbose sections in docs/INTEGRATION.md, docs/LIMITATIONS.md, docs/CI.md, the design notes under docs/design/, and code comments across the workspace are condensed in favor of the .md sources. No behavior change.
- clap_complete = "4.6" to "4.6.3" explicit pin in crates/sentinel-cli/Cargo.toml to surface the resolved version in the manifest itself rather than only in Cargo.lock.
Notes
- The signature stability contract is conditional on http.route being emitted by the producer. Standard OpenTelemetry agents (Spring Boot 3+ Java agent, ASP.NET Core .NET SDK, Express.js / Fastify / Koa @opentelemetry/instrumentation-*) emit it automatically. When http.route is missing, perf-sentinel falls back to http.url then url.full, ack signatures churn proportional to URL cardinality, deferred findings reappear at every new request id. The fallback is documented as a usability backstop, not a recommended posture.
- The Scaphandre scrape counters are operational metadata, not GreenOps inputs. They never feed IIS, Waste Ratio, top_offenders, green_summary, or any SCI v1.0 component. Daemon RSS impact is below 1 KB resident for the 8 fixed series. Throughput target above 100k events/sec is unaffected.
- Carbon counter semantics: perf_sentinel_findings_total and perf_sentinel_avoidable_io_ops_total accumulate monotonically over the daemon process lifetime, with intra-batch dedup keyed on (trace_id, template, source_endpoint). Distinct traces (including those produced after a service restart) contribute separately because they represent distinct request executions. Use rate() over short windows for trend dashboards, the absolute cumulative value is intentional Prometheus counter semantics.
- scripts/install-hooks.sh does not modify your git config. If core.hooksPath is set globally (common in dotfiles setups), the installer reports the situation and exits without writing anything. You either git config --local --unset core.hooksPath for this repo, or chain the gitleaks invocation from scripts/hooks/pre-commit into your global hook manually.
Install
Pre-built static binaries are attached to this release for linux-amd64, linux-arm64, macos-arm64, and windows-amd64. Verify the SHA256 from SHA256SUMS.txt before extracting. Crate consumers can cargo install perf-sentinel --version 0.5.25 once the workflow finishes propagating.
Helm operators bump to chart 0.2.28 for the matching appVersion, no values.yaml change required.
Full Changelog: v0.5.24...v0.5.25