Conversation
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Reviewed by Cursor Bugbot for commit cf3d1e4. Configure here.
```python
global _stats
if _stats is None:
    _stats = PipielineStats(get_raw_metrics())
return _stats
```
_stats retains stale raw metrics after reconfigure
Medium Severity
The _stats singleton captures the Metrics instance from get_raw_metrics() on its first call. When configure_metrics(force=True) later updates the global metrics backend, _stats continues to use the stale Metrics object. This routes subsequent pipeline stats to the wrong backend and causes test isolation issues.
```python
FLUSH_TIME = 10


class PipielineStats:
```
Class name PipielineStats is misspelled
Low Severity
The newly introduced class is named PipielineStats (extra i) instead of PipelineStats, and the misspelling has propagated into the global type annotation, the get_stats() return type, the test imports, and the name-mangled attribute string _PipielineStats__last_flush_time. Renaming after wider adoption is harder than fixing it now while the symbol is still internal.
```python
self._metrics.increment(Metric.ERRORS, value, tags)
for step, fvalue in self._timing_buffer.items():
    tags = {"step": step}
    self._metrics.timing(Metric.DURATION, fvalue, tags)
```
step_timing reports max-of-window instead of distribution
Medium Severity
step_timing keeps only the largest value seen per step in a 10-second window and then emits it via Metrics.timing(Metric.DURATION, …). Downstream backends (Datadog dogstatsd timing, log backend) treat each timing sample as a histogram observation, so dashboards and percentiles built on duration will now see a single max-per-window sample instead of the previous per-call distribution. Effective p50/p95/p99 will collapse toward the max and the call rate carried by the timing metric disappears.
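If per-call emission is too expensive, a middle ground is to aggregate count, sum, and max per step inside the window, so the call rate and the mean survive even though exact percentiles still do not. This is a sketch of that idea, not the PR's code; the metric names and the `emit` callback signature are illustrative assumptions.

```python
from collections import defaultdict

class StepTimings:
    """Aggregate per-step timings within a flush window, preserving more of
    the distribution than a single max: count, sum, and max per step."""
    def __init__(self):
        self._agg = defaultdict(lambda: [0, 0.0, 0.0])  # [count, sum, max]

    def record(self, step, value):
        agg = self._agg[step]
        agg[0] += 1
        agg[1] += value
        agg[2] = max(agg[2], value)

    def flush(self, emit):
        for step, (count, total, maximum) in self._agg.items():
            tags = {"step": step}
            emit("duration.count", count, tags)        # call rate survives
            emit("duration.avg", total / count, tags)  # mean survives
            emit("duration.max", maximum, tags)        # worst case survives
        self._agg.clear()
```

Backends that need real percentiles would still want per-call timing samples (or a sketch structure such as DDSketch), but this avoids the p50/p95/p99 collapse toward the max described above.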


Recording metrics at every step is still having a major impact on throughput.
In this profile we ran the protobuf parsing of a message + metrics + message instantiation 1M times:
3/4 of the time is spent in recording metrics in the buffered backend.
This does not include sending the metrics, just the buffering logic.
This is not unexpected: the buffered backend does non-trivial work on every single call.
This PR makes the process lighter by introducing the PipelineStats abstraction, a very lightweight way to collect per-step stats: each call is passed a step name and simply increments a counter in a dictionary.
It also removes some metrics entirely.
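The abstraction described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the PR's implementation: the `flush_cb` parameter is hypothetical, and only the hot-path shape (one `Counter` increment per call, backend contact at most once per `FLUSH_TIME` window) is taken from the description.

```python
import time
from collections import Counter

FLUSH_TIME = 10  # seconds, matching the constant in the diff

class PipelineStats:
    """Per-step stats collector: the per-call cost is a single dict increment;
    the metrics backend is touched at most once per FLUSH_TIME window."""
    def __init__(self, flush_cb):
        self._counters = Counter()
        self._flush_cb = flush_cb
        self._last_flush_time = time.monotonic()

    def step(self, name):
        self._counters[name] += 1  # the only work done on the hot path
        now = time.monotonic()
        if now - self._last_flush_time >= FLUSH_TIME:
            self._flush(now)

    def _flush(self, now):
        # Hand the accumulated counts to the backend, then reset the window.
        for name, count in self._counters.items():
            self._flush_cb(name, count)
        self._counters.clear()
        self._last_flush_time = now
```

Compared with buffering a metric object per call, this keeps the per-step cost to a dictionary update, which is what recovers the throughput lost in the profile above.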