v0.5.13

@github-actions github-actions released this 29 Apr 20:04
· 580 commits to main since this release

What's new in v0.5.13

v0.5.12 shipped the Carbon scoring chip in the dashboard banner via green_summary.scoring_config, but the GreenOps tab in the rendered HTML only lights up when green_summary.co2 is non-null. On the daemon snapshot path (/api/export/report), the handler returned GreenSummary::disabled(0) and patched only scoring_config, so an operator piping curl /api/export/report | perf-sentinel report --input - got an HTML dashboard with no GreenOps tab and no chip. The 0.5.12 audit-trail promise was honored on analyze --format html over a trace file but not in daemon mode, which is the product's primary target. v0.5.13 closes that gap.

/api/export/report now serves a live green_summary refreshed by the event loop after every batch (regions, top offenders, avoidable I/O ratio, CO2 numbers, transport). The chip banner introduced in v0.5.12 surfaces naturally on the rendered HTML for any daemon configured with Electricity Maps once at least one batch has been processed. Plumbing detail: a new Arc<tokio::sync::RwLock<GreenSummary>> cell is built at daemon startup, mutated by the event loop after each score_green call, and read by the snapshot handler. scoring_config is re-applied on the read side from the daemon's startup config so the audit-trail metadata cannot drift; the cell itself holds whatever score_green produced.
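The write-then-read pattern described above can be sketched as follows. This is a minimal illustration, not the daemon's code: it uses std::sync::RwLock as a dependency-free stand-in for tokio::sync::RwLock, and the struct fields, ScoringConfig, publish and snapshot are hypothetical names chosen for the example.

```rust
use std::sync::{Arc, RwLock};

// Hypothetical, heavily simplified stand-ins for the real types.
#[derive(Clone, Debug, PartialEq)]
struct ScoringConfig {
    provider: String,
}

#[derive(Clone, Debug, PartialEq)]
struct GreenSummary {
    total_io_ops: u64,
    co2_grams: Option<f64>,
    scoring_config: Option<ScoringConfig>,
}

impl GreenSummary {
    // Mirrors the idea of GreenSummary::disabled(n): no CO2 data available.
    fn disabled(total_io_ops: u64) -> Self {
        Self { total_io_ops, co2_grams: None, scoring_config: None }
    }
}

// Assumption: std RwLock here; the release notes name tokio::sync::RwLock.
type GreenCell = Arc<RwLock<GreenSummary>>;

/// Writer side: the event loop overwrites the cell after each scored batch.
fn publish(cell: &GreenCell, latest: GreenSummary) {
    *cell.write().expect("lock poisoned") = latest;
}

/// Reader side: clone the latest summary under a read lock, then re-apply
/// scoring_config from the startup config so that field cannot drift.
fn snapshot(cell: &GreenCell, startup: Option<&ScoringConfig>) -> GreenSummary {
    let mut summary = cell.read().expect("lock poisoned").clone();
    summary.scoring_config = startup.cloned();
    summary
}
```

In this sketch the cell never stores scoring_config. Only the reader attaches it, which is one way to keep the metadata tied to the startup config regardless of what the writer publishes.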

The CLI input default max_payload_size jumps from 1 MiB to 16 MiB. A 1000-finding ringbuffer snapshot from /api/export/report already exceeds 1 MiB on a modest cluster (about 1.15 MiB), causing the canonical pipeline above to fail silently at the previous default. The new value sits exactly at the upper inclusive boundary of the comfort zone (warn_unusual_daemon_limits uses ..=16 * 1024 * 1024), so the default does not trigger a startup warning. The 100 MiB hard cap is unchanged. Configs with an explicit smaller max_payload_size value are unaffected.
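The boundary arithmetic can be checked with a small sketch. Only the ..=16 * 1024 * 1024 range, the 1 MiB and 16 MiB defaults and the 100 MiB hard cap come from the notes above; the constant and function names are hypothetical, and the real check may also have a lower bound that is not shown here.

```rust
const MIB: usize = 1024 * 1024;

// Values taken from the release notes.
const OLD_DEFAULT_MAX_PAYLOAD_SIZE: usize = MIB;
const NEW_DEFAULT_MAX_PAYLOAD_SIZE: usize = 16 * MIB;
const HARD_CAP: usize = 100 * MIB;

/// Hypothetical helper: true when the configured limit sits inside the
/// comfort zone, i.e. would not trigger a startup warning.
fn in_comfort_zone(max_payload_size: usize) -> bool {
    // Inclusive upper bound, so exactly 16 MiB is still inside.
    (..=16 * MIB).contains(&max_payload_size)
}

/// Hypothetical helper: true when a payload of `len` bytes is accepted
/// under the configured limit.
fn payload_fits(len: usize, max_payload_size: usize) -> bool {
    len <= max_payload_size
}

/// Approximate size of the 1000-finding snapshot quoted above (1.15 MiB).
fn example_snapshot_len() -> usize {
    (1.15 * MIB as f64) as usize
}
```

With these definitions, the example snapshot is rejected under the old default and accepted under the new one, and the new default is the largest value that stays inside the comfort zone.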

The cold-start guard on /api/export/report is slightly tightened. The previous guard fired only on events_processed_total == 0, which left a window (up to trace_ttl_ms / 2, default 15 seconds) where events had been ingested but the first eviction tick had not yet fired and the green_summary cell was still disabled(0). Returning a meaningless 200 in that window confused operators piping the snapshot through perf-sentinel report --input - immediately after starting the daemon (the GreenOps tab would not render). The new guard waits until events_processed_total > 0 AND traces_analyzed_total > 0 before serving 200, so the cell is provably populated by a real batch on the read path.
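The difference between the two guards comes down to one extra condition. A minimal sketch, with hypothetical function and enum names and the two counters passed in as plain integers:

```rust
#[derive(Debug, PartialEq)]
enum SnapshotStatus {
    Ok200,
    ServiceUnavailable503,
}

/// Previous behavior as described above: only the event counter is checked.
fn old_guard(events_processed_total: u64) -> SnapshotStatus {
    if events_processed_total == 0 {
        SnapshotStatus::ServiceUnavailable503
    } else {
        SnapshotStatus::Ok200
    }
}

/// New behavior: 200 only once events were ingested AND at least one
/// batch of traces has been analyzed.
fn new_guard(events_processed_total: u64, traces_analyzed_total: u64) -> SnapshotStatus {
    if events_processed_total > 0 && traces_analyzed_total > 0 {
        SnapshotStatus::Ok200
    } else {
        SnapshotStatus::ServiceUnavailable503
    }
}
```

The window that confused operators is the state where events have arrived but no batch has been scored yet: old_guard(10) answers 200, while new_guard(10, 0) answers 503.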

A small terminal hardening rounds out the change. top_offenders[].endpoint, top_offenders[].service and regions[].region flow through the CLI print_green_summary renderer, originating from OTLP span attributes that an attacker-controlled sender or a hostile --input baseline can set freely. They were printed verbatim. Mirroring the 0.5.10/0.5.11 hardening, the three fields are now wrapped in sanitize_for_terminal at the print sink, defending against ANSI / OSC 8 / control-byte injection into the operator's terminal. The wrap matches the existing treatment of intensity_estimation_method and the Electricity Maps endpoint string.
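To illustrate the class of defense involved, here is a minimal sketch of a print-sink sanitizer. It is not the project's sanitize_for_terminal, whose exact behavior is not described in these notes. The function name sanitize_sketch and the choice of '?' as the replacement character are assumptions made for the example.

```rust
/// Hypothetical sanitizer: replaces every Unicode control character
/// (including ESC 0x1B and BEL 0x07, which start and terminate ANSI and
/// OSC 8 sequences) with '?', so the terminal never interprets them.
fn sanitize_sketch(input: &str) -> String {
    input
        .chars()
        .map(|c| if c.is_control() { '?' } else { c })
        .collect()
}

/// Hypothetical print sink: untrusted strings are sanitized at the last
/// moment, right before they are formatted for the terminal.
fn render_offender_line(endpoint: &str, service: &str) -> String {
    format!("{} ({})", sanitize_sketch(endpoint), sanitize_sketch(service))
}
```

Sanitizing at the sink rather than at ingestion leaves the stored values untouched for the JSON and HTML outputs, which have their own escaping.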

The README JSON example was also refreshed for the audit-grade shape: code_location, suggested_fix, green_summary.scoring_config, per_endpoint_io_ops are now visible in the example, the CO2 model is updated to io_proxy_v3, the region resolves to eu-west-3 with monthly_hourly intensity, and the quality gate threshold matches the current default. A reproduction snippet under the example shows the minimal TOML config that produces an audit-grade JSON output from the demo fixture.

Added

  • Live green_summary on /api/export/report. The snapshot endpoint now serves a GreenSummary refreshed by the event loop after each batch, instead of GreenSummary::disabled(0). The chip banner introduced in v0.5.12 is now visible in the HTML rendered from a live daemon snapshot, not only in analyze --format html on a trace file.
  • New shared cell QueryApiState.green_summary: Arc<tokio::sync::RwLock<GreenSummary>>. Initialized to disabled(0) at daemon startup, mutated by the event loop after each score_green call (or after the disabled-branch disabled(total_io_ops) build when green_enabled = false), read by handle_export_report on every snapshot request. tokio::sync::RwLock was chosen over Mutex because the access pattern is asymmetric: writes happen at batch frequency (a few per second), reads at human or CI poll frequency (typically less than once per minute), and the read path benefits from concurrent access.
  • Test process_traces_publishes_green_summary_to_cell asserts the per-batch contract: each batch overwrites the cell so live snapshots pick up the latest CO2 picture.
  • Test handle_export_report_serves_live_green_summary_after_batch asserts that a value written into the cell flows back through the handler verbatim, with scoring_config patched on top.
  • Test handle_export_report_returns_503_when_events_in_but_no_batch_yet locks the new cold-start guard.

Changed

  • Default max_payload_size raised from 1 MiB to 16 MiB. A 1000-finding ringbuffer snapshot from /api/export/report already exceeds 1 MiB on a modest cluster, causing curl /api/export/report | perf-sentinel report --input - to fail silently at the previous default. The new default sits at the upper inclusive boundary of the comfort zone (warn_unusual_daemon_limits uses ..=16 MiB), so it does not trigger a startup warning. The 100 MiB hard cap is unchanged. A coupling comment near warn_outside_comfort_zone documents that a future bump of the default must also raise the ceiling, otherwise every fresh daemon would log a startup warning.
  • Doc-comment of handle_export_report rewritten. Now explicitly states that every numeric field under green_summary (total_io_ops, avoidable_io_ops, io_waste_ratio, co2.*, regions, top_offenders, transport_gco2) reflects the most recent batch only, not a daemon-lifetime aggregate. The analysis.events_processed and analysis.traces_analyzed fields stay lifetime counters for context. Operators wanting cumulative GreenOps numbers should scrape /metrics (Prometheus counters total_io_ops, avoidable_io_ops, io_waste_ratio).

Behavior

  • Cold-start guard slightly tightened. Returns 503 while either events_processed_total == 0 OR traces_analyzed_total == 0. The previous guard fired only on events_processed_total == 0, leaving a window (up to trace_ttl_ms / 2, default 15 seconds) where the cell was still disabled(0) while the handler returned 200. The new guard waits until at least one batch has been scored.
  • scoring_config continues to surface on snapshots whenever Electricity Maps is configured at daemon startup (introduced in 0.5.12). It is now applied on top of the live green summary in the handler: the event loop publishes the per-batch summary, the handler stitches scoring_config back from the daemon's startup config so the audit-trail metadata cannot drift.
  • Backward compat for explicit configs. A max_payload_size = 1048576 line in TOML still works exactly as before. The new default only applies when the field is absent. Three pre-existing tests (default_config_has_safe_defaults, parse_empty_toml_gives_defaults, parse_partial_toml) and one CLI test (load_config_returns_default_when_no_file) were updated in place. The e2e test cli_analyze_rejects_oversized_file now pins max_payload_size = 1048576 via a TOML config so the test stays cheap (writing a 16 MiB file just to trip the guard would balloon the test fixture).
  • No SARIF format change. No wire-format change to analyze --format json either: the same field that was already emitted in batch mode is now also emitted on the snapshot path.

Security

  • top_offenders[].endpoint, top_offenders[].service and regions[].region wrapped in sanitize_for_terminal in the CLI print_green_summary renderer. The three strings originate from OTLP span attributes (source.endpoint, service.name, cloud.region) that an attacker-controlled OTLP sender can set freely, or from a --input JSON baseline that bypasses the OTLP boundary validation. They were printed verbatim before this release, opening a path for ANSI / OSC 8 / control-byte injection into the operator's CI terminal log. The wrap mirrors the 0.5.10 and 0.5.11 treatment of intensity_estimation_method and the Electricity Maps endpoint string. JSON output is unaffected (serde_json auto-escapes) and HTML output is unaffected (textContent and setAttribute auto-escape); only the colored terminal renderer is hardened.

Install

Prebuilt binaries (Linux amd64 / arm64, macOS arm64, Windows amd64):

curl -LO https://github.com/robintra/perf-sentinel/releases/download/v0.5.13/perf-sentinel-linux-amd64
chmod +x perf-sentinel-linux-amd64
sudo mv perf-sentinel-linux-amd64 /usr/local/bin/perf-sentinel

Linux binaries are statically linked against musl, so they run on any distribution (Alpine, Debian, RHEL, or any version of Ubuntu) regardless of glibc version, and also inside FROM scratch images.

From crates.io:

cargo install perf-sentinel --version 0.5.13

Docker:

docker run --rm -p 4317:4317 -p 4318:4318 \
  ghcr.io/robintra/perf-sentinel:0.5.13 watch --listen-address 0.0.0.0

Also available on Docker Hub: robintrassard/perf-sentinel:0.5.13.

Helm (chart 0.2.16 ships 0.5.13 as its appVersion default):

helm install perf-sentinel oci://ghcr.io/robintra/charts/perf-sentinel \
  --version 0.2.16 \
  --namespace observability --create-namespace

Verify the binary against SHA256SUMS.txt:

curl -LO https://github.com/robintra/perf-sentinel/releases/download/v0.5.13/SHA256SUMS.txt
sha256sum -c SHA256SUMS.txt --ignore-missing

Full diff: v0.5.12...v0.5.13