Skip to content

Issue 15#16

Merged
stardothosting merged 6 commits intomainfrom
issue-15
Jun 30, 2025
Merged

Issue 15#16
stardothosting merged 6 commits intomainfrom
issue-15

Conversation

@stardothosting
Copy link
Copy Markdown
Owner

Fix Timeout Issues and Add Production Error Monitoring

Addresses recurring cURL error 28 timeout failures with the Unwrangle API and adds proactive error monitoring.

What's Fixed

  • Smart timeout handling: Two-phase fetch strategy (30s quick → 45s full) with intelligent retry logic
  • Better error messages: Clear user feedback instead of generic timeout errors
  • Production alerting: Real-time Pushover notifications for critical errors (API timeouts, session expires, quota limits)
  • Alert throttling: Prevents notification spam with 15-30 min windows

Key Changes

  • Progressive timeout values (60s → 90s → 120s) with retry logic
  • Centralized AlertService with PushoverChannel integration
  • Enhanced error categorization and logging
  • Environment-aware alerting (log-only in dev, full alerts in prod)

Impact

  • Users get actionable error messages instead of cryptic timeouts
  • Production issues are immediately visible via mobile notifications
  • Smart retry logic reduces false failures
  • 33 new tests, all 157 tests passing

Transforms timeout failures from user-facing errors into proactively monitored events.

- Implement multi-attempt strategy: quick fetch (30s, 5 pages) then full fetch (45s, 10 pages)
- Add smart retry logic that only retries on timeout errors
- Improve error messages with detailed explanations and actionable advice
- Enhanced logging with attempt details and duration tracking
- Better differentiation between timeout vs other API errors

This addresses the recurring 'cURL error 28' timeouts from Unwrangle API
and provides users with clearer feedback when requests fail.
- Set timeouts to 60s, 90s, 120s with progressive page limits
- Add connectivity test before API calls
- Better HTTP client configuration with proper headers
- Categorize errors for easier debugging
- Skip connectivity test during testing
- Add API_TIMEOUT and CONNECTIVITY_ISSUE alert types
- Integrate timeout alerts into AmazonFetchService
- Send alerts for connection timeouts and connectivity failures
- Add throttling (15-30 min) to prevent alert spam
- Create test command for timeout alerts
- Updated LoggingServiceTest to match new improved error messages
- Fixed API_TIMEOUT/UNWRANGLE_TIMEOUT error types in test
- All tests now pass successfully
- Fixed test expectations for new timeout error messages
- Updated error types array structure test for API_TIMEOUT
- All tests now pass with improved timeout handling
@stardothosting stardothosting added the enhancement New feature or request label Jun 30, 2025
@stardothosting stardothosting merged commit dc591d5 into main Jun 30, 2025
0 of 2 checks passed
@stardothosting stardothosting deleted the issue-15 branch July 9, 2025 03:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant