Conversation

@drazisil-codecov (Contributor) commented Jan 5, 2026

Problem

The bundle analysis processor was experiencing infinite retry loops due to two related bugs:

  1. Wrong retry counter check: The code was checking self.request.retries instead of self.attempts when determining if max retries were exceeded
  2. Task recreation resetting counter: The apply_async method was always resetting attempts to 1, even when recreating tasks that had already been retried

Root Cause

Bug #1: Wrong retry counter

  • self.request.retries only counts intentional retries via self.retry()
  • self.attempts includes visibility timeout re-deliveries via the attempts header

When a task times out and is redelivered due to the visibility timeout, self.request.retries stays the same while self.attempts increases, so the max-retry check never triggers and the task keeps retrying.
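As a minimal sketch, the divergence between the two counters can be reproduced with stand-in classes (these are hypothetical shapes, not the real Celery request object or the Codecov base task):

```python
# Sketch of the two counters diverging (stand-in classes, not the
# actual Celery/Codecov implementation).

class FakeRequest:
    def __init__(self):
        self.retries = 0  # only incremented by explicit self.retry() calls


class FakeTask:
    max_retries = 10

    def __init__(self):
        self.request = FakeRequest()
        self.headers = {"attempts": 1}

    @property
    def attempts(self):
        # mirrors reading the "attempts" message header
        return self.headers["attempts"]

    def redeliver(self):
        # visibility-timeout re-delivery: the broker resends the message,
        # bumping the attempts header but NOT request.retries
        self.headers["attempts"] += 1


task = FakeTask()
for _ in range(12):
    task.redeliver()

# The buggy check never fires, even though the task was delivered 13 times:
print(task.request.retries >= task.max_retries)  # False
# Checking attempts instead catches the runaway task:
print(task.attempts >= task.max_retries)  # True
```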

Bug #2: Task recreation resetting counter

  • apply_async was always setting attempts: 1 in headers, overwriting any existing attempts value
  • When tasks were recreated via apply_async (e.g., via chain or re-scheduling), the retry counter would reset to 1
  • This allowed tasks to retry indefinitely even after hitting max retries
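The reset is plain Python dict behavior: in a dict literal, later keys override earlier ones, so the hard-coded "attempts": 1 always wins over the spread-in value. A minimal reproduction (hypothetical values; the real headers carry more fields):

```python
# Sketch of the header-construction bug: the literal "attempts": 1
# overwrites whatever **opt_headers spread in.

opt_headers = {"attempts": 5}  # a task that has already been retried

buggy_headers = {
    **opt_headers,   # spreads in {"attempts": 5}
    "attempts": 1,   # later key wins: counter silently reset
}
print(buggy_headers["attempts"])  # 1
```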

Example

Task 61998622-ac1a-4861-a459-98bb4f51b8ed kept retrying after hitting max retries because:

  • self.request.retries was 10 (at max_retries)
  • self.attempts was 11+ (exceeded due to visibility timeout re-deliveries)
  • The code checked self.request.retries >= self.max_retries which didn't account for the additional attempts
  • If the task was recreated via apply_async, the attempts header would reset to 1, allowing infinite retries

Solution

Fix #1: Use correct retry counter

  1. Use self.attempts instead of self.request.retries when checking max retries
  2. Pass self.attempts to lock_manager.locked() as documented in the method signature
  3. Pass max_retries to lock_manager.locked() for proper retry limit checking
  4. Use self._has_exceeded_max_attempts() helper method which correctly uses self.attempts
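Assuming _has_exceeded_max_attempts reduces to a comparison against the attempts header (the actual method lives on the task class and may log additional context), the corrected check is a sketch like:

```python
# Sketch of a _has_exceeded_max_attempts-style check (assumed shape,
# not the real Codecov base-task method).

def has_exceeded_max_attempts(attempts: int, max_retries: int) -> bool:
    # attempts counts every delivery, including visibility-timeout
    # re-deliveries, so this catches what request.retries misses
    return attempts >= max_retries


# request.retries may still be below the limit while attempts is not:
print(has_exceeded_max_attempts(attempts=11, max_retries=10))  # True
print(has_exceeded_max_attempts(attempts=5, max_retries=10))   # False
```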

Fix #2: Preserve attempts counter

  1. Preserve existing attempts value from opt_headers if present when calling apply_async
  2. Only default to attempts: 1 for new task creations
  3. This ensures retry counters are maintained across task recreations
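The steps above can be sketched as a standalone helper (build_headers is a hypothetical name; the real logic sits inline in BaseCodecovTask.apply_async):

```python
# Sketch of the fix: preserve an existing attempts value from
# opt_headers, defaulting to 1 only for brand-new tasks.

def build_headers(opt_headers: dict, created_timestamp: str) -> dict:
    return {
        **opt_headers,
        "created_timestamp": created_timestamp,
        "attempts": opt_headers.get("attempts", 1),  # preserve, else default
    }


# Recreated task keeps its counter; new task starts at 1:
print(build_headers({"attempts": 5}, "2026-01-05T00:00:00")["attempts"])  # 5
print(build_headers({}, "2026-01-05T00:00:00")["attempts"])               # 1
```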

Changes

  • apps/worker/tasks/bundle_analysis_processor.py: Fixed retry check to use self.attempts instead of self.request.retries
  • apps/worker/tasks/base.py: Fixed apply_async to preserve existing attempts header instead of always resetting to 1
  • apps/worker/tasks/tests/unit/test_bundle_analysis_processor_task.py: Added test to catch visibility timeout re-delivery scenario

Testing

Added test test_bundle_analysis_processor_task_max_retries_exceeded_visibility_timeout that simulates:

  • self.request.retries = 5 (below max_retries of 10)
  • self.attempts = 11 (exceeds max_retries due to visibility timeout re-deliveries)
  • Verifies task stops retrying and returns previous_result instead of continuing to retry

This test would have caught both bugs before they were deployed.
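The scenario can be sketched with a stand-in task (StubTask is hypothetical; the real test patches the Celery task and its request object via the pytest fixtures in the worker test suite):

```python
# Sketch of the regression scenario: attempts exceeds the limit even
# though request.retries is still below it.

class StubTask:
    max_retries = 10

    def __init__(self, request_retries: int, attempts: int):
        self.request_retries = request_retries  # mirrors self.request.retries
        self.attempts = attempts                # mirrors the attempts header

    def process(self, previous_result):
        # fixed behavior: compare attempts (not request_retries) to the limit
        if self.attempts >= self.max_retries:
            return previous_result  # stop retrying, hand back the prior result
        raise RuntimeError("would schedule a retry")


task = StubTask(request_retries=5, attempts=11)
print(task.process(previous_result={"status": "partial"}))  # {'status': 'partial'}
```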


Note

Addresses infinite retry loops in bundle analysis processing caused by visibility-timeout re-deliveries being ignored.

  • Use self.attempts and pass retry_num=self.attempts, max_retries=self.max_retries to LockManager.locked(); check limits via _has_exceeded_max_attempts() and enhance error logging
  • Preserve attempts in BaseCodecovTask.apply_async headers instead of resetting to 1
  • Add unit test test_bundle_analysis_processor_task_max_retries_exceeded_visibility_timeout to ensure tasks stop retrying when attempts exceed max

Written by Cursor Bugbot for commit 21fdf2f.

…eout

The bundle analysis processor was checking self.request.retries instead of
self.attempts when determining if max retries were exceeded. This caused
tasks to continue retrying after hitting max retries when visibility timeout
caused re-deliveries, because:

- self.request.retries only counts intentional retries via self.retry()
- self.attempts includes visibility timeout re-deliveries via the attempts header

When a task times out and gets redelivered due to visibility timeout,
self.request.retries stays the same while self.attempts increases, so
the max-retry check never triggers.

Changes:
- Use self.attempts instead of self.request.retries when checking max retries
- Pass self.attempts to lock_manager.locked() as documented
- Pass max_retries to lock_manager.locked() for proper retry limit checking
- Add test to catch visibility timeout re-delivery scenario

Fixes infinite retry loop for task 61998622-ac1a-4861-a459-98bb4f51b8ed
@sentry bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.90%. Comparing base (367d3f1) to head (21fdf2f).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #633   +/-   ##
=======================================
  Coverage   93.90%   93.90%           
=======================================
  Files        1286     1286           
  Lines       46802    46803    +1     
  Branches     1517     1517           
=======================================
+ Hits        43951    43952    +1     
  Misses       2542     2542           
  Partials      309      309           
Flag               Coverage Δ
workerintegration  59.16% <100.00%> (+0.06%) ⬆️
workerunit         91.30% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@codecov-notifications bot commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.


The apply_async method was always setting attempts=1, even when opt_headers
already contained an attempts value. This caused tasks to reset their retry
counter when recreated via apply_async (e.g., via chain or re-scheduling).

Now we preserve the existing attempts value from opt_headers if present,
only defaulting to 1 for new task creations. This ensures retry counters
are properly maintained across task recreations.
log.error(
    "Bundle analysis processor exceeded max retries",
    extra={
        "attempts": attempts,

just use self.attempts here and remove the attempts = self.attempts assignment

  **opt_headers,
  "created_timestamp": current_time.isoformat(),
- "attempts": 1,
+ "attempts": attempts,

this can also be inlined as opt_headers.get("attempts", 1)


actually, is this needed? we call **opt_headers

@drazisil-codecov (Contributor, Author) commented:
Regarding the question "is this needed? we call **opt_headers":

Yes, the fix is needed. While **opt_headers spreads the existing headers (including any attempts value), the problem is in the original code:

headers = {
    **opt_headers,           # Spreads existing headers including any "attempts"
    "created_timestamp": ...,
    "attempts": 1,           # ⚠️ ALWAYS overwrites to 1!
}

In Python dicts, later keys override earlier ones. So even if opt_headers contains {"attempts": 5}, the final "attempts": 1 overwrites it back to 1.

This is exactly the bug causing the infinite retry loop: when tasks are recreated via apply_async, the retry counter gets reset to 1, allowing tasks to retry indefinitely.

The fix ensures we preserve the existing attempts value (or default to 1 for new tasks):

headers = {
    **opt_headers,
    "created_timestamp": ...,
    "attempts": opt_headers.get("attempts", 1),  # Preserves existing, defaults to 1
}

I'll inline this as you suggested in the earlier comment.

@drazisil-codecov drazisil-codecov added this pull request to the merge queue Jan 7, 2026
Merged via the queue into main with commit 0360937 Jan 7, 2026
40 checks passed
@drazisil-codecov drazisil-codecov deleted the fix/bundle-analysis-max-retries-visibility-timeout branch January 7, 2026 15:56
drazisil-codecov added a commit that referenced this pull request Jan 7, 2026
- Add debug logging for session creation/cleanup in apply_async and run
- Log session_id, task name, and whether transaction was open
- Preserve attempts header from PR #633 to fix visibility timeout tracking
- Logging helps verify session cleanup is working correctly in production
