chore(ws): add retry backoff and recovery grace to batch consumer #59597
Merged
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
posthog/temporal/data_imports/pipelines/pipeline_v3/postgres_queue/jobs_db.py:186-188
The docstring says the batch is eligible when `retry_backoff_base_seconds * s.attempt` seconds have elapsed, but the SQL multiplier is `GREATEST(COALESCE(s.attempt, 1), 1)` — which floors the factor at 1 so a `waiting_retry` row with `attempt=0` never becomes immediately eligible. The description understates this: it implies `attempt=0` would produce a zero wait, which the code deliberately avoids.
```suggestion
``retry_backoff_base_seconds`` gates the ``waiting_retry`` branch on
the age of the latest status row: a batch is only eligible when
``now() - s.created_at >= retry_backoff_base_seconds * GREATEST(s.attempt, 1)``
(attempt is floored at 1 so that a zero-attempt row still waits at least one
base period).
```
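To make the floored multiplier concrete, here is a minimal Python mirror of the predicate quoted in the comment above. It is a sketch for illustration only: the function name `retry_eligible` and its signature are hypothetical and not part of `jobs_db.py`, where the real check is evaluated in SQL.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def retry_eligible(
    created_at: datetime,
    attempt: Optional[int],
    retry_backoff_base_seconds: float,
    now: datetime,
) -> bool:
    """Hypothetical helper mirroring the SQL predicate from the review comment."""
    # GREATEST(COALESCE(attempt, 1), 1): a NULL or zero attempt becomes a factor of 1.
    factor = max(attempt if attempt is not None else 1, 1)
    required_wait = timedelta(seconds=retry_backoff_base_seconds * factor)
    # Eligible only once the latest status row is at least `required_wait` old.
    return now - created_at >= required_wait


if __name__ == "__main__":
    created = datetime(2026, 1, 1, tzinfo=timezone.utc)
    # attempt=0 still waits one full base period (30s here) before becoming eligible.
    print(retry_eligible(created, 0, 30, created + timedelta(seconds=29)))
    print(retry_eligible(created, 0, 30, created + timedelta(seconds=30)))
```

Note that the wait grows linearly with `attempt` (base times attempt), not exponentially, which matches the expression in the suggestion.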
Reviews (1): Last reviewed commit: "add backoff and grace period"
Gilbert09
approved these changes
May 22, 2026
MarconLP
approved these changes
May 26, 2026
danielcarletti
approved these changes
May 26, 2026
Migration SQL Changes
Hey 👋, we've detected some migrations on this PR. Here's the SQL output for each migration, make sure they make sense:
🔍 Migration Risk Analysis
We've analyzed your migrations for potential risks. Summary: 0 Safe | 1 Needs Review | 0 Blocked
Problem
- A batch in `waiting_retry` was eligible to be re-picked on the very next poll cycle, so transient failures (network blips, downstream pressure) could be retried in a tight loop until `max_attempts` was burned through.
- Recovery treated any `executing` row whose advisory lock wasn't held as stale, including rows that had only existed for milliseconds. A worker that briefly dropped its psycopg session, or one whose advisory lock was probed between status-write and lock-acquire, could have its batch reclaimed by another worker in a race.
- Batches for the same `(team_id, schema_id)` were processed independently, so a failure on `batch_index=N` did not stop `N+1` from running. For pipelines where ordering within a run matters, this could surface downstream state inconsistencies.

Changes
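The second and third problems can be illustrated with a rough Python sketch. Everything here is an assumption made for illustration: the helper names, the `recovery_grace_seconds` parameter, and the `"completed"` status are hypothetical, and the actual PR may implement these checks differently (for example directly in SQL).

```python
from datetime import datetime, timedelta
from typing import Dict


def is_stale_executing(
    status_created_at: datetime,
    lock_held: bool,
    recovery_grace_seconds: float,
    now: datetime,
) -> bool:
    """Hypothetical staleness check for an `executing` row with a grace period."""
    # A held advisory lock means a live worker owns the batch.
    if lock_held:
        return False
    # Without the lock, only reclaim rows older than the grace period, so a row
    # written milliseconds ago (before its lock was acquired) is left alone.
    return now - status_created_at >= timedelta(seconds=recovery_grace_seconds)


def blocked_by_earlier_batch(
    batch_index: int,
    statuses: Dict[int, str],
    done_status: str = "completed",  # assumed name for the terminal success state
) -> bool:
    """Hypothetical ordering gate within one (team_id, schema_id).

    `statuses` maps batch_index -> latest status. A batch is blocked while any
    lower-indexed batch has not reached `done_status`.
    """
    return any(
        idx < batch_index and status != done_status
        for idx, status in statuses.items()
    )
```

With a gate like this, a failure on `batch_index=N` leaves `N+1` blocked until `N` succeeds, which is the ordering property the third bullet says was missing.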
How did you test this code?
Local run
Tests pass
Publish to changelog?
NO
Docs update
NO