Status: Open
Labels: priority/medium, size/M (Medium: 3-5 days), type/feature, type/infrastructure
Description
Summary
Implement Prometheus metrics for monitoring conversion throughput, error rates, and resource utilization.
Parent Epic
- [Epic] Distributed Roboflow with Alibaba Cloud (OSS + ACK) #9
Dependencies
- Can start once Phase 3 is complete
- Full integration is possible only after [Phase 9.1] Implement long-running Worker Deployment [READY TO START] #18 (Controller)
Tasks
7.1.1 Add Metrics Infrastructure
- Add `prometheus` crate to dependencies
- Create `src/metrics/mod.rs`
- Define static metric registry
- Create metrics initialization function
- Feature-gate under `metrics` flag (optional)
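The static registry and initialization function could take roughly the following shape. This is a dependency-free sketch using only `std`, not the `prometheus` crate the task calls for; the `Metrics` struct, `init_metrics`, `record_frames`, and `frames_processed` are hypothetical names, and in the real module the struct would presumably hold a `prometheus` registry and its collectors instead of raw atomics.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the metric registry. With the `prometheus`
/// crate this would wrap a registry plus the registered collectors.
pub struct Metrics {
    pub frames_processed_total: AtomicU64,
    pub bytes_read_total: AtomicU64,
}

/// Process-wide registry, created at most once.
static METRICS: OnceLock<Metrics> = OnceLock::new();

/// Initialization function. Safe to call repeatedly: only the first call
/// builds the registry, later calls return the same instance.
pub fn init_metrics() -> &'static Metrics {
    METRICS.get_or_init(|| Metrics {
        frames_processed_total: AtomicU64::new(0),
        bytes_read_total: AtomicU64::new(0),
    })
}

/// Recording helper for call sites. If the optional `metrics` feature is
/// adopted, a no-op twin of this function could be compiled in when the
/// feature is off, so call sites need no conditional code (assumption).
pub fn record_frames(n: u64) {
    init_metrics()
        .frames_processed_total
        .fetch_add(n, Ordering::Relaxed);
}

/// Read the current value, e.g. for tests or for rendering.
pub fn frames_processed() -> u64 {
    init_metrics().frames_processed_total.load(Ordering::Relaxed)
}
```

Call sites would then only need `record_frames(batch_len)`; the lazy initialization means nothing breaks if a worker records a metric before the explicit init call runs.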
7.1.2 Define Conversion Metrics
- `roboflow_episodes_processed_total` - Counter with labels: dataset_id, status
- `roboflow_episode_processing_duration_seconds` - Histogram
- `roboflow_frames_processed_total` - Counter
- `roboflow_bytes_read_total` - Counter with source label
- `roboflow_bytes_written_total` - Counter with destination, type labels
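To make the label scheme concrete, here is a small sketch of how `roboflow_episodes_processed_total` behaves as a labeled counter and what it looks like in the Prometheus text format. It is hand-rolled with `std` for illustration only (the `prometheus` crate would provide the real counter type); `EpisodeCounter` and the status values `success` and `error` are assumptions, not names from this issue.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Counter keyed by (dataset_id, status), mirroring the labels proposed
/// for roboflow_episodes_processed_total. Illustrative only.
#[derive(Default)]
pub struct EpisodeCounter {
    values: BTreeMap<(String, String), u64>,
}

impl EpisodeCounter {
    /// Increment the series for one (dataset_id, status) pair.
    pub fn inc(&mut self, dataset_id: &str, status: &str) {
        *self
            .values
            .entry((dataset_id.to_string(), status.to_string()))
            .or_insert(0) += 1;
    }

    /// Render every series in the Prometheus text exposition format.
    /// Label-value escaping is omitted to keep the sketch short.
    pub fn render(&self) -> String {
        let mut out = String::new();
        out.push_str("# HELP roboflow_episodes_processed_total Episodes processed.\n");
        out.push_str("# TYPE roboflow_episodes_processed_total counter\n");
        for ((dataset_id, status), value) in &self.values {
            writeln!(
                out,
                "roboflow_episodes_processed_total{{dataset_id=\"{dataset_id}\",status=\"{status}\"}} {value}"
            )
            .unwrap();
        }
        out
    }
}
```

Each distinct label combination becomes its own time series, so the number of `dataset_id` values directly drives series count; that is worth keeping in mind if dataset IDs are unbounded.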
7.1.3 Define Upload Metrics
- `roboflow_upload_duration_seconds` - Histogram with file_type label
- `roboflow_upload_retries_total` - Counter with reason label
- `roboflow_upload_bytes_total` - Counter
- `roboflow_pending_uploads` - Gauge
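For the duration histograms, the sketch below shows the cumulative-bucket model Prometheus uses: every observation increments all buckets whose upper bound it fits under, plus a running sum and count. It is a `std`-only illustration, not the `prometheus` crate's histogram; the `Histogram` type and the bucket bounds are assumptions, and the `file_type` label is left out for brevity.

```rust
/// Minimal cumulative histogram, the shape Prometheus expects for metrics
/// such as roboflow_upload_duration_seconds. Illustrative only.
pub struct Histogram {
    bounds: Vec<f64>,  // bucket upper bounds in seconds, ascending
    buckets: Vec<u64>, // buckets[i] counts observations <= bounds[i]
    count: u64,        // total number of observations
    sum: f64,          // sum of all observed values
}

impl Histogram {
    pub fn new(bounds: &[f64]) -> Self {
        Histogram {
            bounds: bounds.to_vec(),
            buckets: vec![0; bounds.len()],
            count: 0,
            sum: 0.0,
        }
    }

    /// Record one duration. Buckets are cumulative, so a fast upload is
    /// counted in its own bucket and in every larger one.
    pub fn observe(&mut self, seconds: f64) {
        for (i, bound) in self.bounds.iter().enumerate() {
            if seconds <= *bound {
                self.buckets[i] += 1;
            }
        }
        self.count += 1;
        self.sum += seconds;
    }

    /// Render as `<name>_bucket{le=...}`, `<name>_sum`, and `<name>_count`.
    pub fn render(&self, name: &str) -> String {
        let mut out = format!("# TYPE {name} histogram\n");
        for (bound, n) in self.bounds.iter().zip(&self.buckets) {
            out.push_str(&format!("{name}_bucket{{le=\"{bound}\"}} {n}\n"));
        }
        // The +Inf bucket always equals the total observation count.
        out.push_str(&format!("{name}_bucket{{le=\"+Inf\"}} {}\n", self.count));
        out.push_str(&format!("{name}_sum {}\n", self.sum));
        out.push_str(&format!("{name}_count {}\n", self.count));
        out
    }
}
```

Bucket bounds should be chosen around the upload durations actually expected; buckets that are too coarse make latency quantiles derived from the histogram imprecise.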
7.1.4 Define Resource Metrics
- `roboflow_buffer_usage_bytes` - Gauge
- `roboflow_cache_hit_total` - Counter
- `roboflow_cache_miss_total` - Counter
- `roboflow_active_conversions` - Gauge
7.1.5 Define Video Encoding Metrics
- `roboflow_video_encoding_duration_seconds` - Histogram
- `roboflow_video_frames_encoded_total` - Counter with camera label
7.1.6 Instrument Code
- Add metrics to conversion loop
- Add metrics to storage operations
- Add metrics to upload coordinator
- Add metrics to video encoder
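One low-friction way to instrument the conversion loop is an RAII guard that raises the active-conversions gauge on entry and lowers it on drop, so early returns and errors cannot leave the gauge stuck. The sketch below uses `std` atomics as stand-ins for the real gauge and counter; `ConversionGuard`, `convert_episode`, and the accessor functions are hypothetical names, not existing Roboflow code.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};
use std::time::Instant;

/// Stand-ins for roboflow_active_conversions (gauge) and
/// roboflow_episodes_processed_total (counter).
static ACTIVE_CONVERSIONS: AtomicI64 = AtomicI64::new(0);
static EPISODES_PROCESSED: AtomicU64 = AtomicU64::new(0);

pub fn active_conversions() -> i64 {
    ACTIVE_CONVERSIONS.load(Ordering::Relaxed)
}

pub fn episodes_processed() -> u64 {
    EPISODES_PROCESSED.load(Ordering::Relaxed)
}

/// Marks one conversion as active for as long as the guard is alive.
pub struct ConversionGuard {
    started: Instant,
}

impl ConversionGuard {
    pub fn start() -> Self {
        ACTIVE_CONVERSIONS.fetch_add(1, Ordering::Relaxed);
        ConversionGuard { started: Instant::now() }
    }

    /// Seconds elapsed since the conversion started; this is the value a
    /// duration histogram would observe.
    pub fn elapsed_seconds(&self) -> f64 {
        self.started.elapsed().as_secs_f64()
    }
}

impl Drop for ConversionGuard {
    fn drop(&mut self) {
        // Runs on success, on early return, and on error alike.
        ACTIVE_CONVERSIONS.fetch_sub(1, Ordering::Relaxed);
    }
}

/// Hypothetical conversion step wrapped with instrumentation. Returns the
/// processing duration in seconds on success.
pub fn convert_episode<F>(work: F) -> Result<f64, String>
where
    F: FnOnce() -> Result<(), String>,
{
    let guard = ConversionGuard::start();
    work()?; // on error the guard still drops, so the gauge stays balanced
    EPISODES_PROCESSED.fetch_add(1, Ordering::Relaxed);
    Ok(guard.elapsed_seconds())
}
```

The same guard pattern would apply to storage operations, the upload coordinator, and the video encoder, each with its own gauge or histogram.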
7.1.7 Create Metrics Endpoint
- Add `axum` dependency for HTTP server
- Create `/metrics` endpoint
- Create `/health` endpoint
- Start metrics server in background
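The endpoint behaviour can be sketched without any framework. The example below deliberately swaps `axum` for a bare `std::net::TcpListener` so that it stays dependency-free; it only illustrates the routing and the background thread, and the names `respond` and `spawn_metrics_server` are hypothetical. In the real implementation the two routes would presumably be `axum` handlers.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{SocketAddr, TcpListener};
use std::thread;

/// Build a raw HTTP/1.1 response for a request path. `metrics_text` is the
/// already-rendered Prometheus text exposition body.
pub fn respond(path: &str, metrics_text: &str) -> String {
    let (status, content_type, body) = match path {
        // Content type conventionally used for the Prometheus text format.
        "/metrics" => ("200 OK", "text/plain; version=0.0.4", metrics_text),
        "/health" => ("200 OK", "text/plain", "ok\n"),
        _ => ("404 Not Found", "text/plain", "not found\n"),
    };
    format!(
        "HTTP/1.1 {status}\r\nContent-Type: {content_type}\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

/// Start the server on a background thread and return the bound address.
/// `render` produces the current metrics text for each scrape.
pub fn spawn_metrics_server(addr: &str, render: fn() -> String) -> std::io::Result<SocketAddr> {
    let listener = TcpListener::bind(addr)?;
    let local = listener.local_addr()?;
    thread::spawn(move || {
        for mut stream in listener.incoming().flatten() {
            // Read only the request line, e.g. "GET /metrics HTTP/1.1".
            let mut request_line = String::new();
            if BufReader::new(&stream).read_line(&mut request_line).is_err() {
                continue;
            }
            let path = request_line.split_whitespace().nth(1).unwrap_or("/");
            // Ignore write errors: the scraper may have disconnected.
            let _ = stream.write_all(respond(path, &render()).as_bytes());
        }
    });
    Ok(local)
}
```

Binding before spawning means a port conflict surfaces as an error at startup rather than silently inside the background thread; passing port 0 lets the OS pick a free port, which is convenient in tests.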
Acceptance Criteria
- All metrics defined and registered
- Conversion code instrumented
- Storage code instrumented
- Metrics endpoint serves Prometheus format
- Health endpoint works
- All tests pass
Files to Create
- `src/metrics/mod.rs`
- `src/metrics/conversion.rs`
- `src/metrics/storage.rs`
- `src/metrics/server.rs`