Skip to content

feat(sqs): rate-limited async message move tasks with cancellation#765

Merged
vieiralucas merged 2 commits intomainfrom
worktree-batch8-sqs-message-move
Apr 25, 2026
Merged

feat(sqs): rate-limited async message move tasks with cancellation#765
vieiralucas merged 2 commits intomainfrom
worktree-batch8-sqs-message-move

Conversation

@vieiralucas
Copy link
Copy Markdown
Member

@vieiralucas vieiralucas commented Apr 25, 2026

Summary

Replaces the synchronous-only message-move flow with a rate-aware async mover so callers can observe a moving task in RUNNING state, cancel it mid-flight, and rely on MaxNumberOfMessagesPerSecond as a real ceiling. Part of batch 8/10 of the gap-fill plan.

  • StartMessageMoveTask with a rate spawns a background mover that drains one message per tick.
  • CancelMessageMoveTask flips an AtomicBool cancellation flag and marks the task CANCELLING; the mover observes the flag on its next tick and finalizes CANCELLED.
  • The mover updates messages_moved per move so ListMessageMoveTasks reflects real progress.
  • Without a rate, the existing fast path still drains synchronously to COMPLETED (no behavior change for callers that didn't set a rate).
  • Cancellation flag lives in a #[serde(skip)] field on MessageMoveTask; snapshots unaffected.

Critical detail noted in the commit body: the mover acquires the state write lock for exactly one move per tick and drops it before sleeping or finalizing — finalize_move_task also takes the write lock and parking_lot's RwLock is non-reentrant.

Test plan

  • 129 SQS unit tests pass (cargo test -p fakecloud-sqs --lib)
  • 2 new E2E tests in crates/fakecloud-e2e/tests/sqs_message_move.rs (rate-limited drain + mid-flight cancel)
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • cargo fmt --all clean

Summary by cubic

Adds a rate-aware async worker for moving SQS messages with observable progress and mid-flight cancellation. Also fixes an edge case to prevent message loss if the destination queue disappears during a move.

  • New Features

    • StartMessageMoveTask with a rate spawns a background worker that moves one message per tick and updates messages_moved.
    • CancelMessageMoveTask sets a cancellation flag and marks CANCELLING; the worker finalizes CANCELLED on the next tick.
    • Progress is visible via ListMessageMoveTasks; tasks end as COMPLETED on drain or FAILED if queues/tasks disappear.
    • No-rate path keeps the previous synchronous drain to COMPLETED.
    • Locking: the worker holds the write lock for one move per tick and releases it before sleeping/finalizing to avoid deadlocks with the non-reentrant RwLock.
    • The cancellation flag is stored in #[serde(skip)] on MessageMoveTask, so snapshots remain unchanged.
  • Bug Fixes

    • Prevent message loss when the target queue disappears mid-flight by validating the destination before popping, pushing the message back if a race occurs, and advancing the round-robin index only after a successful commit.

Written for commit 911e906. Summary will update on new commits.

Replaces the synchronous-only message-move flow with a rate-aware async
mover so callers can observe a moving task in RUNNING state, cancel it
mid-flight, and rely on MaxNumberOfMessagesPerSecond as a real ceiling.

- StartMessageMoveTask: when MaxNumberOfMessagesPerSecond is set, mint a
  RUNNING task with a real UUID-derived TaskHandle and spawn a
  background mover that drains one message per tick at the requested
  rate. Without a rate, keep the existing fast-path that drains
  synchronously to COMPLETED.
- CancelMessageMoveTask: signal cancellation via an AtomicBool flag on
  the task; the mover observes it on its next tick and finalizes the
  task as CANCELLED. The request flips status to CANCELLING immediately
  so listings reflect the in-flight transition.
- Background mover updates messages_moved on each successful move so
  ListMessageMoveTasks reports real progress; finalizes COMPLETED on
  drain or FAILED if the source/destination queue disappears.
- Critical: the mover acquires the state write lock for exactly one
  move per tick and drops it before sleeping or finalizing.
  finalize_move_task also takes the write lock, so holding it across
  the finalize call would deadlock with parking_lot's non-reentrant
  RwLock.
- MessageMoveTask carries the cancel flag in a #[serde(skip)] field so
  on-disk snapshots are unaffected; restores get a fresh non-cancelled
  flag.

Adds 2 E2E tests (rate-limited drain + mid-flight cancel) and migrates
one unit test to tokio::test now that the rate path spawns.
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="crates/fakecloud-sqs/src/service.rs">

<violation number="1" location="crates/fakecloud-sqs/src/service.rs:852">
P1: Message loss when target queue disappears mid-flight: `pop_front()` removes the message from the source before confirming the destination exists. If the target queue was deleted (or the URL can't be resolved), `msg` is dropped and the message is silently lost. Push the message back onto the source queue on failure, or verify the target before popping.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread crates/fakecloud-sqs/src/service.rs Outdated
Cubic flagged P1: pop_front removed the message from source before
verifying the destination still existed, dropping the message if the
destination queue was deleted between ticks.

Refactored mover loop to:
- Verify both source and destination queues exist before popping.
- Pop the source message only after target validation.
- If get_mut on the resolved target races and returns None (queue
  deleted between resolve and commit), push the message back to the
  front of the source queue so it is retried or remains durable.
- Bump the round-robin index only after a successful commit so a failed
  attempt doesn't skip a target.
@vieiralucas vieiralucas merged commit 1f37812 into main Apr 25, 2026
37 checks passed
@vieiralucas vieiralucas deleted the worktree-batch8-sqs-message-move branch April 25, 2026 21:57
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 25, 2026

Codecov Report

❌ Patch coverage is 37.01923% with 131 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/fakecloud-sqs/src/service.rs 37.37% 129 Missing ⚠️
crates/fakecloud-sqs/src/state.rs 0.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant