feat: add Prometheus metrics and Grafana dashboard for observability #7
Merged
added 2 commits
April 6, 2026 19:17
Add yasha:* custom metrics (request latency, errors, model load time, per-usecase timing, client disconnects, cleanup errors) via ray.serve.metrics, gated behind YASHA_METRICS env var with zero-overhead no-op stubs when disabled. Include pre-built Grafana dashboard, /health endpoint, and documentation.
… Ray

All metrics exported via Ray's metrics agent are prefixed with ray_, but the Grafana dashboard and docs referenced unprefixed names. This updates all queries to use the actual exported names (ray_yasha_*, ray_vllm_*, ray_serve_*).

- Route vLLM native metrics through Ray via RayPrometheusStatLogger
- Fix Ray Serve Internals panels to use metrics that exist in Ray 2.54
- Fix model load time panel to work for one-shot events (avg not rate)
- Enable YASHA_METRICS=true by default in Dockerfiles and metrics.py
- Expose port 8079 in devcontainer config
- Update monitoring.md to reflect all metric name prefixes
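One plausible reading of the "avg not rate" fix, sketched in Python rather than as the dashboard's actual query: a model load is observed once, so a rate-style query sees no increase in any window that starts after the event, while dividing the cumulative sum by the cumulative count still recovers the load time. The sample values and the `increase` helper are hypothetical, and the helper is a simplified stand-in for PromQL's windowed increase (no extrapolation, no counter-reset handling).

```python
def increase(samples):
    """Simplified windowed increase: last sample minus first sample."""
    if len(samples) < 2:
        return 0.0
    return samples[-1] - samples[0]


# Hypothetical scrapes of a histogram's cumulative sum and count,
# all taken after a single 42-second model load was recorded.
load_sum = [42.0, 42.0, 42.0]
load_count = [1.0, 1.0, 1.0]

# Rate-style view: nothing changed inside the window, so the panel shows 0.
rate_based = increase(load_sum)

# Average view: cumulative sum divided by cumulative count.
average = load_sum[-1] / load_count[-1]

print(rate_based, average)
```

The average stays meaningful for as long as the series exists, which is the property a panel for one-shot events needs.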
What
Add yasha:* custom metrics (request latency, errors, model load time, per-usecase timing, client disconnects, cleanup errors) via ray.serve.metrics, gated behind YASHA_METRICS env var with zero-overhead no-op stubs when disabled. Include pre-built Grafana dashboard, /health endpoint, and documentation.
Why
Add support for Prometheus metrics.
How to Test
Run tests
Checklist
- `ruff check .` passes
- `ruff format --check .` passes
- `pyright` passes