refactor!: Restructure metrics response for clear abstraction separation #3
Summary
This PR addresses architectural issues discovered during testing: metrics from different abstraction layers were being mixed, which caused confusion and incorrect data representation. The most critical issue was that server CPU metrics showed the CPU usage of an individual worker process (6.39%) instead of the actual server CPU usage (40%).
Changes
🔴 Critical Bug Fixes
- Fixed jobs/minute calculation (5x multiplication error)
  - `WorkerMetricsQueryService.php:270-284`
- Fixed server CPU metrics confusion
  - `worker_processes` (process-level) and `server_resources` (system-level)

🏗️ Server Metrics Restructure
Separated server metrics into 4 clear abstraction tiers:
Application tier: `queue_workers`
- `count`: total, active, idle workers
- `utilization.current_busy_percent`: % of workers busy RIGHT NOW
- `utilization.lifetime_busy_percent`: % of TIME workers have been busy

Application tier: `job_processing`
- `lifetime`: total_processed, total_failed, failure_rate_percent
- `current`: jobs_per_minute (based on elapsed time), avg_duration_ms

Process tier: `worker_processes`
System tier: `server_resources`
Capacity tier: `capacity`

Breaking changes:
📊 Queue Metrics Separation
Separated queue metrics by time scope with explicit windows:
- `depth`: Instantaneous queue state (current snapshot)
- `performance_60s`: Windowed performance metrics
- `lifetime`: Lifetime metrics since first job
- `workers`: Worker state and efficiency

Breaking changes:
⏱️ Trend Analysis Enhancement
Added comprehensive time window context to all trend methods:
New `time_window` object in all trend responses:

Breaking changes:
- `time_window` wrapper object
- `period_seconds` moved into `time_window.window_seconds`

🎛️ Dashboard Filtering Updates
Updated all dashboard filter methods to:
Breaking Changes
API Response Changes
Server Metrics (`/api/metrics/workers` or `getOverview()`)
- `system_limits` → `server_resources`
- `utilization_rate` → `current_busy_percent` + `lifetime_busy_percent`

Queue Metrics (`/api/metrics/queues` or `getAllQueuesWithMetrics()`)
- `depth` object
- `throughput_per_minute` → `performance_60s.throughput_per_minute`
- `utilization_rate` → `workers.current_busy_percent` + `lifetime_busy_percent`

Trend Metrics (`/api/metrics/trends` or trend analysis methods)
- `time_window` wrapper object
- `period_seconds` → `time_window.window_seconds`

Test Plan
Test Results:
Files Modified
- `src/Services/WorkerMetricsQueryService.php` - Server metrics restructure + jobs/min fix
- `src/Services/QueueMetricsQueryService.php` - Queue metrics separation + worker utilization
- `src/Services/OverviewQueryService.php` - Dashboard filtering updates
- `src/Services/TrendAnalysisService.php` - Time window context additions

Migration Notes
Frontend consumers will need to update:
- `data.depth.total` instead of `data.depth`
- `performance_60s.throughput_per_minute` instead of `data.throughput_per_minute`
- `server_resources` instead of `system_limits`
- `current_busy_percent` and `lifetime_busy_percent` for different use cases
- `time_window` object
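
One way a frontend could absorb the queue-metric renames is to read the response through a single adapter. The following is a minimal TypeScript sketch, not code from this PR. The interface names and the `readQueueMetrics` helper are hypothetical, and only fields named in this description are modelled. It assumes that the old `depth` was a plain number, that `lifetime_busy_percent` sits next to `current_busy_percent` under `workers`, and that the legacy `utilization_rate` is best treated as the current value.

```typescript
// Hypothetical shapes: only fields named in this PR description are modelled.
interface LegacyQueueMetrics {
  depth: number; // assumption: a plain number before this PR
  throughput_per_minute: number;
  utilization_rate: number;
}

interface QueueMetrics {
  depth: { total: number };
  performance_60s: { throughput_per_minute: number };
  workers: {
    current_busy_percent: number;
    lifetime_busy_percent: number; // assumption: nested under `workers`
  };
}

// Flat view consumed by UI code, independent of the wire format.
interface QueueView {
  depthTotal: number;
  throughputPerMinute: number;
  busyNowPercent: number;
  busyLifetimePercent: number | null; // not available in the legacy shape
}

// The two shapes are told apart by the type of `depth`.
function isLegacy(
  data: LegacyQueueMetrics | QueueMetrics
): data is LegacyQueueMetrics {
  return typeof data.depth === "number";
}

function readQueueMetrics(data: LegacyQueueMetrics | QueueMetrics): QueueView {
  if (isLegacy(data)) {
    return {
      depthTotal: data.depth,
      throughputPerMinute: data.throughput_per_minute,
      // Assumption: the ambiguous legacy rate is shown as the current value.
      busyNowPercent: data.utilization_rate,
      busyLifetimePercent: null,
    };
  }
  return {
    depthTotal: data.depth.total,
    throughputPerMinute: data.performance_60s.throughput_per_minute,
    busyNowPercent: data.workers.current_busy_percent,
    busyLifetimePercent: data.workers.lifetime_busy_percent,
  };
}
```

Calling such a helper at the API boundary keeps the renames out of component code, and the legacy branch can be deleted once every backend serves the new shape.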