fix: audit 2026-05-02 P1 follow-ups #108

Merged
pratyush618 merged 5 commits into master from fix/audit-2026-05-02-p1
May 2, 2026

@pratyush618
Collaborator

Summary

Five P1 follow-up fixes from the 2026-05-02 pre-release audit, after the four P0s shipped (#104, #105, #106, #107). One commit per fix.

  • Redis status discriminants: `archive_old_jobs`, `purge_completed`, `purge_completed_with_ttl`, `reap_stale_jobs`, and `expire_pending_jobs` now cast `JobStatus::Foo as i32` instead of using magic numbers. Reordering the enum will fail the build instead of silently archiving the wrong buckets.
  • `result()` / `aresult()`: re-poll once inside the deadline branch so a terminal failure that landed during the previous poll surfaces as `TaskFailedError` / `MaxRetriesExceededError` / `TaskCancelledError`, not `TimeoutError`.
  • `result_handler.rs`: the Failure branch fetched `get_job` up to three times per call (queue context + two DLQ moves). Restructured to one fetch + a small DLQ closure.
  • `run_maybe_async`: explicit detection of a running event loop with a clear taskito-specific error, instead of the cryptic `asyncio.run() cannot be called from a running event loop`.
  • `ResourcePool._active_count`: increment moved to after the factory call returns successfully. The failure path no longer needs (or has) a decrement, so a wedged factory can't underflow `active`. Failed attempts also stop counting against `total_acquisitions`.

Test plan

  • `cargo test --workspace`: 89 Rust tests pass
  • `cargo check --workspace --features postgres`: clean
  • `cargo check --workspace --features redis`: clean
  • `cargo clippy --workspace -- -D warnings`: clean
  • `uv run python -m pytest tests/python/ -v`: 496 passed, 9 skipped (was 488; +8 new across `test_result_race.py`, `test_run_maybe_async.py`, `test_resource_system_full.py::test_pool_factory_failure_does_not_underflow_active_count`)
  • `uv run ruff check py_src/ tests/`: clean
  • `uv run mypy py_src/taskito/ --no-incremental`: clean

Sourcing status integers from the enum variant guarantees the build fails if
`JobStatus` is reordered, instead of silently archiving or purging the wrong
buckets. Same anti-pattern fixed across `archival.rs` and `jobs/maintenance.rs`.

`result()` / `aresult()` could raise `TimeoutError` even though the job had
already failed, died, or been cancelled, if the terminal state landed during
the final poll. A defensive re-poll inside the deadline branch lets the caller
see the real exception class.
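
The re-poll can be sketched roughly as follows. This is a minimal illustration, not taskito's implementation: `wait_for_result`, `fetch_status`, `_settle`, the status strings, and the mapping from status to exception class are all assumptions made for the example, and the three exception classes are local stand-ins for the ones named above.

```python
import time


class TaskFailedError(Exception):
    """Local stand-in for the exception class named in the PR."""


class MaxRetriesExceededError(Exception):
    """Local stand-in for the exception class named in the PR."""


class TaskCancelledError(Exception):
    """Local stand-in for the exception class named in the PR."""


# Assumed status strings and mapping; the real representation is not shown in the PR.
_TERMINAL_ERRORS = {
    "failed": TaskFailedError,
    "dead": MaxRetriesExceededError,
    "cancelled": TaskCancelledError,
}


def _settle(status, payload):
    """Return (True, value) on success, (False, None) if pending; raise on failure."""
    if status == "complete":
        return True, payload
    error_cls = _TERMINAL_ERRORS.get(status)
    if error_cls is not None:
        raise error_cls(payload)
    return False, None


def wait_for_result(fetch_status, timeout, poll_interval=0.05):
    """Poll fetch_status() until a terminal state, or until `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        done, value = _settle(*fetch_status())
        if done:
            return value
        time.sleep(poll_interval)
    # Deadline branch: the job may have reached a terminal state while we slept,
    # so poll one more time before reporting a timeout.
    done, value = _settle(*fetch_status())
    if done:
        return value
    raise TimeoutError(f"no result within {timeout} seconds")
```

Without the final `_settle` call, a job that failed during the last sleep would be reported as a timeout, which hides the real cause from the caller.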

In `result_handler.rs`, the Failure branch fetched the job up to three times:
once for the queue context, once for the `!should_retry` DLQ move, and once
for the retry-exhausted DLQ move. One fetch is enough; subsequent uses hand
the same `Option<&Job>` to a small DLQ closure.

`run_maybe_async` previously hit `asyncio.run() cannot be called from a running
event loop`, which doesn't tell a user calling the sync API from async code
what to do. It now detects the running-loop case explicitly and raises a
taskito-specific error that points at the async API and `await`.
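
One way to detect the running-loop case is sketched below. The signature of `run_maybe_async`, the error class name `SyncCallInRunningLoopError`, and the message text are assumptions for illustration; only `asyncio.get_running_loop()` and `asyncio.run()` are real standard-library calls.

```python
import asyncio
import inspect


class SyncCallInRunningLoopError(RuntimeError):
    """Hypothetical error type; the PR does not name the real one."""


def run_maybe_async(value):
    """Return plain values as-is; drive coroutines to completion when no loop is running."""
    if not inspect.iscoroutine(value):
        return value
    try:
        # Raises RuntimeError when no event loop is running in this thread.
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(value)
    # A loop is already running: close the coroutine so Python does not warn
    # that it was never awaited, then explain what the caller should do instead.
    value.close()
    raise SyncCallInRunningLoopError(
        "sync API called from inside a running event loop; "
        "use the async API and await it instead"
    )
```

Checking for the loop up front means the caller sees a message about which API to use, rather than the generic `asyncio.run()` failure.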

Previously the `ResourcePool._active_count` counter was incremented before the
factory ran and decremented in the failure branch. Restructured so the
increment runs only once we hold an instance; the failure path simply releases
the semaphore. This fixes a possible negative `active` in `stats()` and stops
counting failed attempts as acquisitions.
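
The ordering can be illustrated with a small stand-in pool. `ResourcePoolSketch` and its method signatures are hypothetical and much simpler than the real `ResourcePool`; the sketch only shows the counters being updated after the factory has returned.

```python
import threading


class ResourcePoolSketch:
    """Hypothetical, simplified pool; not taskito's ResourcePool."""

    def __init__(self, factory, max_size):
        self._factory = factory
        self._semaphore = threading.BoundedSemaphore(max_size)
        self._lock = threading.Lock()
        self._active_count = 0
        self._total_acquisitions = 0

    def acquire(self):
        self._semaphore.acquire()
        try:
            instance = self._factory()
        except BaseException:
            # Failure path: give the slot back. No counter was touched yet,
            # so there is nothing to decrement.
            self._semaphore.release()
            raise
        # Count the acquisition only once we actually hold an instance.
        with self._lock:
            self._active_count += 1
            self._total_acquisitions += 1
        return instance

    def release(self, instance):
        with self._lock:
            self._active_count -= 1
        self._semaphore.release()

    def stats(self):
        with self._lock:
            return {
                "active": self._active_count,
                "total_acquisitions": self._total_acquisitions,
            }
```

Because the failure path never touches the counters, no interleaving of failed factory calls can drive `active` below zero in this sketch.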
@pratyush618 pratyush618 merged commit d5d8bee into master May 2, 2026
19 checks passed
@pratyush618 pratyush618 deleted the fix/audit-2026-05-02-p1 branch May 2, 2026 18:17