@ofalvai ofalvai commented Oct 24, 2025

Context

Rate-limited requests that get retried multiple times were skewing duration metrics. For example:

  • Actual API response: 500ms per attempt
  • Reported duration: 15 seconds (including 3 retry attempts and backoff times)

Summary

  • Switched to per-attempt tracking for appstoreconnect API requests
  • Each HTTP attempt (including retries) now generates a separate metric with an is_retry field
  • Each duration now covers a single attempt and excludes retry backoff time

Changes

  • Added trackingRoundTripper, which wraps a net/http transport and measures per-attempt duration
  • Integrated it with the retryablehttp library's RequestLogHook to mark retry attempts
  • Updated the tracker to include an isRetry field
  • Removed aggregate duration tracking from Client.Do()

Testing

  • All existing tests updated and passing
  • New tests in roundtripper_test.go verify:
    • Single request tracking
    • Multiple retry attempts with proper is_retry flagging
    • Per-attempt duration measurements are accurate

Follow-up work

Once this change has been released as part of the step updates, we'll need to:

  1. Update the step-analytics service to use is_retry as a metric tag.
  2. Update dashboards: the response time dashboard will automatically become more accurate, but we might want to create a separate dashboard and alerting for is_retry=true.
  3. Update the events schema with is_retry. Since the event schema already contains the build slug, this will let us analyze which projects/orgs have rate limit issues.

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

ofalvai and others added 2 commits October 24, 2025 10:33
Implemented per-attempt tracking for HTTP requests to eliminate retry inflation in duration metrics. Each HTTP attempt (including retries) now generates a separate metric event with an `is_retry` flag.

Changes:
- Added trackingRoundTripper wrapper that measures per-attempt duration
- Updated Tracker.TrackAPIRequest() to include isRetry boolean parameter
- Integrated with retryablehttp RequestLogHook to mark retry attempts
- Removed aggregate duration tracking from Client.Do()
- All attempts now report metrics independently

This allows accurate alerting based on individual response times rather than aggregate times that include retry backoff.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@ofalvai ofalvai changed the title Track per-attempt HTTP metrics without retry inflation Track per-attempt HTTP metrics Oct 24, 2025
@ofalvai ofalvai merged commit ff3adbc into master Oct 31, 2025
3 checks passed
@ofalvai ofalvai deleted the track-per-attempt-metrics branch October 31, 2025 12:44