fix: audit 2026-05-02 P1 follow-ups by pratyush618 · Pull Request #108 · ByteVeda/taskito

pratyush618 · 2026-05-02T17:54:22Z

Summary

Five P1 follow-up fixes from the 2026-05-02 pre-release audit, after the four P0s shipped (#104, #105, #106, #107). One commit per fix.

Redis status discriminants — archive_old_jobs, purge_completed, purge_completed_with_ttl, reap_stale_jobs, expire_pending_jobs now cast JobStatus::Foo as i32 instead of using magic numbers. Reordering the enum will fail the build instead of silently archiving the wrong buckets.
result() / aresult() — re-poll once inside the deadline branch so a terminal failure that landed during the previous poll surfaces as TaskFailedError / MaxRetriesExceededError / TaskCancelledError, not TimeoutError.
result_handler.rs — Failure branch fetched get_job up to three times per call (queue context + two DLQ moves). Restructured to one fetch + a small DLQ closure.
run_maybe_async — explicit detection of a running event loop with a clear taskito-specific error, instead of the cryptic asyncio.run() cannot be called from a running event loop.
ResourcePool._active_count — increment moved to after the factory call returns successfully. Failure path no longer needs (or has) a decrement, so a wedged factory can't underflow active. Failed attempts also stop counting against total_acquisitions.

Test plan

cargo test --workspace — 89 Rust tests pass
cargo check --workspace --features postgres — clean
cargo check --workspace --features redis — clean
cargo clippy --workspace -- -D warnings — clean
uv run python -m pytest tests/python/ -v — 496 passed, 9 skipped (was 488; +8 new across test_result_race.py, test_run_maybe_async.py, test_resource_system_full.py::test_pool_factory_failure_does_not_underflow_active_count)
uv run ruff check py_src/ tests/ — clean
uv run mypy py_src/taskito/ --no-incremental — clean

Sourcing status integers from the enum variant guarantees the build fails if `JobStatus` is reordered, instead of silently archiving or purging the wrong buckets. Same anti-pattern fixed across `archival.rs` and `jobs/maintenance.rs`.

`result()` / `aresult()` could raise `TimeoutError` even when the job had already failed, died, or been cancelled, if the terminal state landed during the final poll. A defensive re-poll inside the deadline branch lets the caller see the real exception class.
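A minimal sketch of the re-poll described above, not the taskito implementation. `wait_for_result`, the `fetch_status` callable, and the string state names are hypothetical; `TaskFailedError` here is a local stand-in for the real exception class, and returning a late success from the deadline branch is an assumption.

```python
import time


class TaskFailedError(Exception):
    """Local stand-in for taskito's exception of the same name."""


def wait_for_result(fetch_status, timeout, poll_interval=0.05):
    """Poll until the job is terminal or the deadline passes.

    fetch_status is a hypothetical callable returning (state, payload).
    """
    deadline = time.monotonic() + timeout
    while True:
        state, payload = fetch_status()
        if state == "complete":
            return payload
        if state == "failed":
            raise TaskFailedError(payload)
        if time.monotonic() >= deadline:
            # Deadline branch: re-poll once so a terminal state that landed
            # during the previous poll is reported as itself, not a timeout.
            state, payload = fetch_status()
            if state == "complete":
                return payload
            if state == "failed":
                raise TaskFailedError(payload)
            raise TimeoutError(f"job not finished after {timeout}s")
        time.sleep(poll_interval)
```

Without the second `fetch_status()` call, a job that fails between the last poll and the deadline check would be reported as a timeout.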

The Failure branch in `result_handler.rs` fetched the job up to three times: once for the queue context, once for the `!should_retry` DLQ move, and once for the retry-exhausted DLQ move. One fetch is enough; subsequent uses hand the same `Option<&Job>` to a small DLQ closure.

`run_maybe_async` previously surfaced `asyncio.run() cannot be called from a running event loop`, which doesn't tell a user calling the sync API from async code what to do. It now detects the running-loop case explicitly and raises a taskito-specific error that points at the async API and `await`.
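One way the explicit detection could look, as a sketch under assumptions: the signature of `run_maybe_async` shown here, the use of a plain `RuntimeError`, and the message wording are all illustrative and may differ from the real helper. `asyncio.get_running_loop()` raises `RuntimeError` when no loop is running in the current thread, which is what the check relies on.

```python
import asyncio
import inspect


def run_maybe_async(value):
    """Return plain values as-is; run coroutines to completion.

    Hypothetical reimplementation for illustration only.
    """
    if not inspect.iscoroutine(value):
        return value
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(value)
    # A loop is already running: close the coroutine so it does not trigger
    # a "never awaited" warning, then fail with an actionable message.
    value.close()
    raise RuntimeError(
        "sync API called from inside a running event loop; "
        "use the async API and await it instead"
    )
```

Checking for the loop first means the caller sees guidance about what to do rather than the generic `asyncio.run()` message.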

Previously the `ResourcePool._active_count` counter was incremented before the factory ran and decremented in the failure branch. The increment now runs only once we hold an instance, so the failure path simply releases the semaphore. This fixes a possible negative `active` in `stats()` and stops counting failed attempts as acquisitions.
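The ordering can be sketched as follows. `SketchPool` is a hypothetical, simplified stand-in for `ResourcePool`: its constructor, the lock, and the method names other than `stats()` are assumptions made for illustration.

```python
import threading


class SketchPool:
    """Simplified pool showing increment-after-success ordering."""

    def __init__(self, factory, max_size):
        self._factory = factory
        self._slots = threading.Semaphore(max_size)
        self._lock = threading.Lock()
        self._active_count = 0
        self._total_acquisitions = 0

    def acquire(self):
        self._slots.acquire()
        try:
            instance = self._factory()
        except BaseException:
            # Failure path: no counter was touched yet, so only the
            # semaphore slot has to be handed back.
            self._slots.release()
            raise
        # Count the acquisition only once an instance is actually held.
        with self._lock:
            self._active_count += 1
            self._total_acquisitions += 1
        return instance

    def release(self, instance):
        with self._lock:
            self._active_count -= 1
        self._slots.release()

    def stats(self):
        with self._lock:
            return {
                "active": self._active_count,
                "total_acquisitions": self._total_acquisitions,
            }
```

Because the failure path has no decrement, there is nothing that can push `active` below zero when the factory raises.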

pratyush618 added 5 commits May 2, 2026 23:08

github-actions Bot added the rust, python, storage, scheduler, and tests labels May 2, 2026

pratyush618 merged commit d5d8bee into master May 2, 2026
19 checks passed

pratyush618 deleted the fix/audit-2026-05-02-p1 branch May 2, 2026 18:17

pratyush618 merged 5 commits into master from fix/audit-2026-05-02-p1
