Description
Add dedicated metrics endpoints to the gateway and LLM services, exposing Prometheus-compatible metrics and separating observability concerns from health checks.
Acceptance Criteria
- Gateway exposes a metrics endpoint with request counts, latencies, error breakdowns, and provider-level stats
- LLM service exposes a metrics endpoint with request counts, latencies, error breakdowns, and provider-level stats
- Prometheus annotations point to the metrics endpoints instead of the health endpoints
- Scrape configuration targets the updated metrics paths/ports for gateway and LLM
- Documentation distinguishes health vs. metrics endpoints and shows example metrics for all three services
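As a rough illustration of what a metrics endpoint body could contain, the sketch below renders one labelled counter and one latency histogram in the Prometheus text exposition format, using only the Python standard library. The metric names, label names, and bucket bounds are assumptions made for the example; they are not taken from the gateway or LLM services.

```python
from collections import defaultdict

# Hypothetical bucket upper bounds in seconds, chosen only for illustration.
BUCKETS = (0.1, 0.5, 1.0, 5.0)


class Metrics:
    """Tiny in-memory registry: one labelled counter and one histogram."""

    def __init__(self):
        self.requests = defaultdict(int)  # (provider, status) -> count
        self.bucket_counts = [0] * len(BUCKETS)
        self.latency_sum = 0.0
        self.latency_count = 0

    def observe(self, provider, status, seconds):
        """Record one finished request."""
        self.requests[(provider, status)] += 1
        self.latency_sum += seconds
        self.latency_count += 1
        # Buckets are cumulative: an observation counts toward every bound >= it.
        for i, bound in enumerate(BUCKETS):
            if seconds <= bound:
                self.bucket_counts[i] += 1

    def render(self):
        """Return the metrics as Prometheus text exposition format."""
        lines = [
            "# HELP gateway_requests_total Requests handled, by provider and status.",
            "# TYPE gateway_requests_total counter",
        ]
        for (provider, status), count in sorted(self.requests.items()):
            lines.append(
                f'gateway_requests_total{{provider="{provider}",status="{status}"}} {count}'
            )
        lines += [
            "# HELP gateway_request_duration_seconds Request latency in seconds.",
            "# TYPE gateway_request_duration_seconds histogram",
        ]
        for bound, count in zip(BUCKETS, self.bucket_counts):
            lines.append(
                f'gateway_request_duration_seconds_bucket{{le="{bound}"}} {count}'
            )
        # The +Inf bucket always equals the total observation count.
        lines.append(
            f'gateway_request_duration_seconds_bucket{{le="+Inf"}} {self.latency_count}'
        )
        lines.append(f"gateway_request_duration_seconds_sum {self.latency_sum}")
        lines.append(f"gateway_request_duration_seconds_count {self.latency_count}")
        return "\n".join(lines) + "\n"
```

Error breakdowns fall out of the `status` label here, and provider-level stats from the `provider` label. A real service would more likely use an established Prometheus client library than hand-rolled rendering.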
Priority
Medium
Dependencies
- Existing gateway/LLM services
- Prometheus annotations and ServiceMonitor wiring from 8.3.1 (complete)
Risks & Mitigations
- Risk: Metrics endpoints increase CPU/latency under load
- Mitigation: Start with minimal, targeted metrics set and sample where possible
- Risk: Confusion between health and metrics endpoints
- Mitigation: Clearly document both surfaces and their intended use
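To make the health/metrics split concrete, here is a minimal sketch of a handler that keeps the two surfaces separate. The paths `/health` and `/metrics`, the JSON health body, and the placeholder metrics text are assumptions for illustration; the actual routes and payloads of the services may differ.

```python
from http.server import BaseHTTPRequestHandler

# Assumed paths; the actual routes used by the services may differ.
HEALTH_PATH = "/health"
METRICS_PATH = "/metrics"

# Content type of the Prometheus text exposition format.
METRICS_CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8"


def respond(path, metrics_text):
    """Return (status, content_type, body) for a GET request path."""
    if path == HEALTH_PATH:
        # Liveness/readiness only: small, cheap, no metric data.
        return 200, "application/json", '{"status": "ok"}'
    if path == METRICS_PATH:
        # Scrape surface: metric samples only, no health semantics.
        return 200, METRICS_CONTENT_TYPE, metrics_text
    return 404, "text/plain; charset=utf-8", "not found\n"


class Handler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around respond(); the metrics body is a placeholder."""

    metrics_text = "# no metrics registered yet\n"

    def do_GET(self):
        status, content_type, body = respond(self.path, self.metrics_text)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

Serving it locally would look like `HTTPServer(("", 8000), Handler).serve_forever()` with `HTTPServer` imported from `http.server`; the port is arbitrary. Keeping the routing in a pure function makes it straightforward to check that the health response never carries metric data.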
Next Steps
- Implement metrics endpoints for gateway and LLM with basic counters/histograms
- Update Prometheus annotation values to point to the new metrics endpoints
- Adjust the scrape configuration to use the new paths
- Extend the existing checks to validate gateway/LLM metrics endpoints (not just health)
- Update K8s deployment docs with example metric names and troubleshooting steps
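One way the metrics validation step could work is to scrape each endpoint and confirm that the expected metric names are present, instead of checking only for an HTTP 200. The sketch below does this for a response body already fetched as text; the function names and the metric names in the usage example are hypothetical.

```python
def metric_names(exposition_text):
    """Collect metric names from Prometheus text-format sample lines."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line is either "name{labels} value" or "name value".
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def missing_metrics(exposition_text, required):
    """Return the required metric names absent from the scraped text, sorted."""
    return sorted(set(required) - metric_names(exposition_text))


# Usage with a hand-written sample body (metric names are made up).
sample = (
    "# TYPE llm_requests_total counter\n"
    'llm_requests_total{provider="provider_a"} 3\n'
    "llm_request_duration_seconds_count 3\n"
)
missing = missing_metrics(sample, ["llm_requests_total", "llm_errors_total"])
```

In a real check the body would come from something like `urllib.request.urlopen(url).read().decode()`, and a non-empty `missing` list would fail the validation. Histogram series appear under their `_bucket`, `_sum`, and `_count` names, so the required list should use those suffixed names.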
Reference
See task 8.3.2 for full details.