Skip to content

feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration)#545

Open
staging-devin-ai-integration[bot] wants to merge 4 commits into
mainfrom
devin/1780157746-oneshot-service-label
Open

feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration)#545
staging-devin-ai-integration[bot] wants to merge 4 commits into
mainfrom
devin/1780157746-oneshot-service-label

Conversation

@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor

@staging-devin-ai-integration staging-devin-ai-integration Bot commented May 30, 2026

Summary

  • Adds a general, operator-configurable facility for attaching bounded labels to request metrics, sourced from trusted request headers — instead of hardcoding service names in the Rust binary. New dimensions/values become a skit.toml edit, not a recompile.
  • Config lives at [server.metrics] as a list of request_labels, each { name, header, allowed[], fallback }. The header value is trimmed + lowercased and matched against allowed; anything outside it — or a missing header — collapses to fallback (default "other"). This bounds cardinality: clients pick a bucket, they can't create new ones.
  • A small reusable resolver (apps/skit/src/metrics_labels.rs) is applied at two consumers to prove it's general: the shared http.server.* request middleware, and oneshot_pipeline.duration (all record sites — streaming ok / incomplete-drop, and all three early error paths). The middleware resolves once and stashes the result in request extensions; the oneshot handler reuses it (no double parsing).
  • Default config ships the service label from X-StreamKit-Service with allowlist {tts, stt}oneshot_pipeline.duration gains service ∈ {tts, stt, other} out of the box. The gateway should set this header per endpoint (sibling effort).
  • Config is validated at load: label names that collide with built-in keys (status, http.method, http.route, http.status_code) or duplicate each other are rejected (a duplicate label key makes Prometheus drop the series). Allowlists are normalized once at load so the per-request path only normalizes the incoming header.

Example to add a new dimension without touching Rust:

[[server.metrics.request_labels]]
name = "service"
header = "X-StreamKit-Service"
allowed = ["tts", "stt"]   # missing/unknown -> "other"

Review & Validation

  • Cardinality is bounded by config (value emitted only if normalized-in-allowed, else fallback; empty allowed ⇒ always fallback).
  • Reserved/duplicate label names fail config load (MetricsConfig::validate).
  • service appears at every oneshot_pipeline.duration record site and on http.server.*.
  • Default config (no [server.metrics] block) still yields service ∈ {tts,stt,other} — see sample_config_test.
  • just lint-skit and cargo test -p streamkit-server (metrics_labels + oneshot + config) pass.

Notes

  • Open scope decision (default-on, all routes): because the labels apply to the global HTTP middleware, the non-empty default adds a low-cardinality service dimension to http.server.* for everyone on upgrade, not just oneshot. Aggregate PromQL is unaffected, but it is a schema change. Discussing with the requester whether to keep default-on or make it opt-in (empty default).
  • Header parse failures (non-UTF-8) are treated as absent → fallback.
  • Scope: Rust under apps/skit/ only. Dashboard, docs, and the gateway header-setting are owned by sibling sessions; samples/skit.toml unchanged since defaults carry the behavior.

Link to Devin session: https://staging.itsdev.in/sessions/b5750eb705c84d388f3af4f5b6d6940b
Requested by: @streamer45


Devin Review

Status Commit
🕐 Outdated 3e84014 (HEAD is 010ca67)

Run Devin Review

Open in Devin Review (Staging)

Source a bounded `service` label ({tts,stt,other}) from the trusted X-StreamKit-Service request header so TTS, STT, and other oneshot requests are distinguishable without leaking arbitrary user-submitted pipeline names into metric cardinality.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

Copy link
Copy Markdown
Contributor Author

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread apps/skit/src/server/oneshot.rs Outdated
}
self.recorded = true;
let labels = [KeyValue::new("status", status)];
let labels = [KeyValue::new("status", status), KeyValue::new("service", self.service)];
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Existing aggregate PromQL should continue to work with the added label

The change adds a second label to oneshot_pipeline.duration observations at all metric recording sites in this handler. I did not flag this as a bug because the existing dashboard queries aggregate away all non-grouped labels: the error-rate query uses sum(rate(oneshot_pipeline_duration_count{status="error"}[5m])) and the latency query uses sum(rate(oneshot_pipeline_duration_bucket{status="ok"}[5m])) by (le) in samples/grafana-dashboard.json:164 and samples/grafana-dashboard.json:279, so adding service will increase series cardinality but should not change those aggregate panels' values.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread apps/skit/src/server/oneshot.rs Outdated
Comment on lines +52 to +57
fn classify_service(header: Option<&str>) -> &'static str {
match header.map(|h| h.trim().to_ascii_lowercase()).as_deref() {
Some("tts") => "tts",
Some("stt") => "stt",
_ => "other",
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Service header is intentionally low-cardinality and advisory

The new header classifier lowercases and trims the value, allows only tts and stt, and maps missing, invalid UTF-8, empty, or unknown values to other. That means clients can choose the service bucket for observability, but cannot create unbounded metric cardinality. I did not treat the unauthenticated/advisory nature of X-StreamKit-Service as a security bug because the label is only used for metrics and is not fed into authorization or pipeline behavior.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 94.53552% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.00%. Comparing base (6abe3ba) to head (010ca67).

Files with missing lines Patch % Lines
apps/skit/src/server/oneshot.rs 70.00% 9 Missing ⚠️
apps/skit/src/config.rs 98.82% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #545      +/-   ##
==========================================
+ Coverage   79.96%   80.00%   +0.03%     
==========================================
  Files         234      235       +1     
  Lines       68061    68234     +173     
  Branches     1846     1933      +87     
==========================================
+ Hits        54428    54591     +163     
- Misses      13627    13637      +10     
  Partials        6        6              
Flag Coverage Δ
backend 79.77% <94.53%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
core 85.37% <ø> (ø)
engine 82.94% <ø> (-0.10%) ⬇️
api 89.96% <ø> (ø)
nodes 76.84% <ø> (ø)
server 80.43% <94.53%> (+0.10%) ⬆️
plugin-native 83.70% <ø> (ø)
plugin-wasm 92.20% <ø> (ø)
ui-services 84.69% <ø> (ø)
ui-components 60.49% <ø> (ø)
Files with missing lines Coverage Δ
apps/skit/src/metrics_labels.rs 100.00% <100.00%> (ø)
apps/skit/src/server/mod.rs 80.72% <100.00%> (+0.11%) ⬆️
apps/skit/src/config.rs 97.70% <98.82%> (+0.16%) ⬆️
apps/skit/src/server/oneshot.rs 88.02% <70.00%> (-0.61%) ⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Replace the hardcoded {tts,stt,other} allowlist with a general, operator-configurable [server.metrics.request_labels] facility: each label is sourced from a trusted request header, bounded by a configured allowlist, with a fallback (default "other"). A reusable resolver applies it to both oneshot_pipeline.duration and the shared http.server.* request metrics, so new dimensions need only a config edit, not a recompile. Defaults preserve the {tts,stt,other} service label.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@staging-devin-ai-integration staging-devin-ai-integration Bot changed the title feat(oneshot): add bounded service label to oneshot_pipeline.duration feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration) May 30, 2026
streamer45 and others added 2 commits May 31, 2026 10:53
…esolution

Address review on request-metric labels:
- Reject configured label names that collide with built-in metric keys (status, http.method, http.route, http.status_code) or duplicate each other, at config load (duplicate Prometheus label keys break scrape).
- Pre-normalize allowlist entries once at config load so the per-request hot path only normalizes the incoming header.
- Resolve labels once in metrics_middleware and stash them in request extensions; the oneshot handler reuses them (falls back to resolving when the layer is absent, e.g. unit tests).
- Record oneshot_pipeline.duration on the routing-task-aborted error arm for consistent error-path coverage.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants