Skip to content

Add queue-storage COPY producer#206

Merged
hardbyte merged 4 commits into
mainfrom
codex/queue-storage-copy-producer
May 1, 2026
Merged

Add queue-storage COPY producer#206
hardbyte merged 4 commits into
mainfrom
codex/queue-storage-copy-producer

Conversation

@hardbyte
Copy link
Copy Markdown
Owner

@hardbyte hardbyte commented May 1, 2026

Summary

  • add QueueStorage::enqueue_params_copy for direct COPY into ready_entries and deferred_jobs
  • expose first-class Python APIs: AsyncClient.enqueue_many_copy(...) / Client.enqueue_many_copy(...), plus raw _sync bindings and type stubs
  • preserve queue-storage enqueue semantics: striping, job IDs, lane seq allocation, uniqueness claims, lane counters, and notifications
  • add live Postgres coverage plus ADR, architecture, test-plan, Python README, and TLA mapping updates
  • stabilize related integration tests: make claim-ring rotation deterministic, reset queue-storage transition state in canonical fixtures, serialize dedicated DB bootstrap, and clear stale queue-storage test connections before schema reset
  • address review follow-ups: concurrent COPY lane sequencing, striped queues, CSV escaping/sentinel coverage, rollback lane counters, bounded COPY chunks, one job-id allocation for mixed ready/deferred batches, and executable test-plan command

Local throughput note

Local microbench from the ignored queue-storage COPY benchmark, 8,192 jobs on local Postgres:

  • batch_size=64: batch 12,770 jobs/sec, COPY 10,809 jobs/sec, COPY speedup 0.85x
  • batch_size=128: batch 19,850 jobs/sec, COPY 20,564 jobs/sec, COPY speedup 1.04x

These are hardware/local-DB numbers, not a release claim; they show the direct COPY path is not universally faster at smaller batches but is competitive/slightly ahead at 128 in this run.

Validation

  • cargo fmt --check
  • cargo test -p awa-model --lib
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa-model --test queue_storage_copy_test
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa-model --test queue_storage_copy_test queue_storage_copy_benchmark_batch_vs_copy -- --ignored --exact --nocapture (batch 64 and 128)
  • ./correctness/run-tlc.sh storage/AwaSegmentedStorage.tla storage/AwaSegmentedStorage.cfg
  • .venv/bin/maturin develop (from awa-python/)
  • PYO3_PYTHON=/Users/brian/dev/awa/awa-python/.venv/bin/python cargo test --manifest-path awa-python/Cargo.toml --lib
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa --test queue_storage_runtime_test test_lease_claim_rotation_isolation -- --exact --nocapture (10 repeated passes)
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa --test queue_storage_runtime_test
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa --test scale_test
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test -p awa --test validation_test
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test --workspace
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test .venv/bin/pytest tests/test_awa.py::test_enqueue_many_copy_queue_storage tests/test_awa.py::test_enqueue_many_copy_requires_queue_storage tests/test_sync.py::test_enqueue_many_copy_sync_queue_storage -v
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test .venv/bin/pytest tests/test_sync.py::test_insert_many_copy_sync tests/test_awa.py::test_transaction_insert_many -v
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test .venv/bin/pytest tests/test_start_config.py -v (20 passed)
  • DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test .venv/bin/pytest tests/test_awa.py tests/test_sync.py -v (53 passed, 1 optional pydantic skip)

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 1, 2026

📝 Walkthrough

Walkthrough

This PR introduces a new COPY-based batch ingestion path for queue storage. The public API QueueStorage::enqueue_params_copy streams jobs directly into PostgreSQL via COPY commands, handling job ID allocation, lane sequence reservation, uniqueness synchronization, and queue notifications within a single transaction—mirroring the semantics of the existing batch enqueue method but using PostgreSQL COPY for efficiency.

Changes

Cohort / File(s) Summary
Core Implementation
awa-model/src/queue_storage.rs
New public method enqueue_params_copy (402 lines) implementing COPY-based job ingestion with CSV serialization helpers for strings, JSON, RFC3339 datetimes, and optional types; groups ready insertions by queue/priority, synchronizes uniqueness claims, and manages lane counters within a single transaction.
Test Coverage
awa-model/tests/queue_storage_copy_test.rs
New test suite (167 lines) validating enqueue_params_copy behavior: verifies ready entries are created with correct ordering and payloads, scheduled jobs create deferred entries, lane availability updates, and uniqueness conflict handling with transactional rollback.
Documentation & Mapping
awa-model/README.md, correctness/storage/MAPPING.md, docs/adr/008-copy-batch-ingestion.md, docs/architecture.md
Updated documentation describing the new COPY ingestion path: clarifies that enqueue_params_copy enables direct COPY into queue-storage tables, revises ADR 008 to detail the direct COPY mechanism (rather than compat routing), adds enqueue_params_copy workflow description to architecture, and updates correctness mappings for EnqueueReady/EnqueueDeferred actions.
Test Plan
docs/test-plan.md
Added new scenario (QSC1-QSC2) for queue-storage COPY production coverage, including uniqueness-conflict rollback validation.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant QS as QueueStorage
    participant TX as Transaction
    participant DB as PostgreSQL
    participant LN as Lane State
    participant UQ as Uniqueness Claims

    Client->>QS: enqueue_params_copy(jobs)
    QS->>TX: begin transaction
    TX->>DB: allocate job IDs (sequence)
    DB-->>TX: job_ids, next_seq
    TX->>LN: reserve lane seq-ranges per queue
    LN-->>TX: lane_seq ranges
    TX->>UQ: synchronize uniqueness claims
    UQ-->>TX: conflict check result
    alt Conflict Detected
        TX->>DB: ROLLBACK
        DB-->>QS: error (UniqueConflict)
        QS-->>Client: Result::Err
    else No Conflict
        TX->>DB: COPY ready_entries (CSV format)
        TX->>DB: COPY deferred_jobs (CSV format)
        TX->>LN: update lane counters in batch
        TX->>DB: queue logical notifications
        TX->>DB: COMMIT
        DB-->>QS: rows_inserted
        QS-->>Client: Result::Ok(usize)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • PR #178: Introduces the core QueueStorage redesign and foundational COPY insert paths for ready_entries and deferred_jobs—this PR builds on that design by adding the enqueue_params_copy public producer API with full integration of uniqueness and lane state management.

Poem

🐰 A speedy warren path unfolds,
Where COPY streams cascade like gold,
No row-by-row delays to bear,
Just CSV through crystal air—
One transaction, swift and whole,
A batched enqueue from soul to soul! 📮

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add queue-storage COPY producer' directly and clearly describes the main change: adding a new COPY-based producer method to QueueStorage for batch job ingestion.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/queue-storage-copy-producer

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
docs/test-plan.md (1)

171-171: ⚡ Quick win

Add an explicit run command for the new QSC scenario.

Line 171 adds the matrix row, but the “Running Tests” section does not yet include the corresponding awa-model command. Adding it keeps the plan directly executable.

Proposed doc update
 # COPY integration tests
 DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test --package awa --test copy_test -- --nocapture

+# Queue-storage COPY producer tests
+DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test cargo test --package awa-model --test queue_storage_copy_test -- --nocapture
+
 # Python sync tests
 cd awa-python && DATABASE_URL=postgres://postgres:test@localhost:15432/awa_test .venv/bin/pytest tests/test_sync.py -v
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/test-plan.md` at line 171, The new matrix row QSC1-QSC2 (Queue storage
COPY producer: ready/deferred direct COPY, unique-conflict rollback) needs a
matching executable entry in the "Running Tests" section: add the corresponding
awa-model run command for QSC1-QSC2 (using the same invocation pattern as other
rows) so the test plan is directly executable and the scenario can be run
without manual mapping.
awa-model/tests/queue_storage_copy_test.rs (1)

158-167: ⚡ Quick win

Strengthen rollback assertion to include lane-counter side effects.

Line 166 verifies no ready rows persisted; consider also asserting queue lane availability remains zero so transaction rollback guarantees cover counters too.

Suggested assertion addition
     assert_eq!(ready_count, 0);
+
+    let lane_available: i64 = sqlx::query_scalar(&format!(
+        "SELECT COALESCE(sum(available_count), 0)::bigint FROM {}.queue_lanes WHERE queue = $1",
+        store.schema()
+    ))
+    .bind(queue)
+    .fetch_one(&pool)
+    .await
+    .expect("count lane availability");
+    assert_eq!(lane_available, 0);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@awa-model/tests/queue_storage_copy_test.rs` around lines 158 - 167, The test
currently asserts ready_entries count is zero but misses verifying lane-counter
side effects; add a second SQL scalar query (similar to the ready_count query)
against format!("SELECT count(*)::bigint FROM {}.lane_availability WHERE queue =
$1", store.schema()) (or the project's lane-counter table name) binding queue
and using pool, .await, .expect("count lane availability rows"), then
assert_eq!(lane_count, 0) to ensure transaction rollback also cleared lane
counters; place this directly after the ready_count assertion.
awa-model/src/queue_storage.rs (2)

3863-3954: 🏗️ Heavy lift

Factor the COPY enqueue planning out of the duplicated code.

insert_ready_rows_copy_tx, insert_deferred_rows_copy_tx, and enqueue_params_copy now repeat the batch path's lane allocation, uniqueness sync, state filtering, and notification prep almost verbatim. Because the contract here is “same semantics as enqueue_params_batch”, this copy/paste is likely to drift on the next behavior change. A shared planner plus pluggable sink (INSERT vs COPY) would keep the paths aligned.

Also applies to: 4097-4137, 4574-4677

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@awa-model/src/queue_storage.rs` around lines 3863 - 3954, Extract the common
enqueue planning logic (lane allocation, uniqueness sync, state filtering, and
notification preparation) into a shared planner function (e.g.,
prepare_enqueue_plan or build_enqueue_plan) that returns a neutral "plan" type
(e.g., Vec<RuntimeReadyInsert> or a Plan struct with ready_rows and
count_deltas) and accept a pluggable sink strategy (an enum or trait for
Sink::Insert vs Sink::Copy) so the caller can apply either execute_ready_copy_tx
or the INSERT path; replace the duplicate logic in insert_ready_rows_copy_tx,
insert_deferred_rows_copy_tx, and enqueue_params_copy to call the new planner
and then perform only the sink-specific actions (calling execute_ready_copy_tx
or the INSERT executor) and adjust_lane_counts_batch, ensuring you reuse
existing helpers like next_job_ids, ensure_lane, sync_unique_claim, and
adjust_lane_counts_batch to keep semantics identical to enqueue_params_batch.

410-539: ⚡ Quick win

Add a regression matrix around the handwritten CSV encoder.

This encoder now owns NULL-sentinel handling, quote/newline escaping, bytea hex, and JSON serialization. A small live-Postgres case matrix for values like ",", "\"", "\n", "\\", "", and literal "__AWA_NULL__" would make this much safer to evolve.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@awa-model/src/queue_storage.rs` around lines 410 - 539, The CSV encoder lacks
regression tests for NULL-sentinel, quoting/escaping, bytea hex and JSON
serialization; add a test matrix that exercises write_copy_field,
write_copy_json, write_copy_optional_bytes and the row writers
(write_ready_copy_row, write_deferred_copy_row) with cases: comma (","),
double-quote ("\""), newline ("\n"), backslash ("\\"), empty string (""), and
the literal COPY_NULL_SENTINEL ("__AWA_NULL__"); for each case produce the
encoded buffer, then either parse it back with Postgres COPY (preferred) or a
robust CSV parser to assert round-trip semantics and correct NULL handling/bytea
hex encoding for Optional<Vec<u8>> and correct JSON stringification for
serde_json::Value; implement tests named like
test_copy_encoder_regression_matrix and test_copy_rows_regression_matrix that
build sample RuntimeReadyInsert/DeferredJobRow structs with these values and
assert expected decoded columns (including NULL when sentinel is present).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@awa-model/src/queue_storage.rs`:
- Around line 3751-3765: The current COPY implementation builds a full in-memory
CSV (csv_buf) by calling write_ready_copy_row for every row and only then
calling copy_in.send, which retains upstream row vectors and spikes memory;
change the logic to stream rows into copy_in in bounded chunks: allocate a
reusable buffer, iterate rows and append via write_ready_copy_row, and whenever
the buffer exceeds a configured threshold (e.g., ~64KB-1MB) call
copy_in.send(buffer_slice) then clear/truncate the buffer (or replace with a new
Vec) so memory is bounded and upstream rows can be dropped; apply the same
streaming-chunk pattern to the other COPY sites that use
write_ready_copy_row/csv_buf/copy_in (the other copy_in_raw usages) and ensure
tx.as_mut().copy_in_raw(...).await is used before sending chunks and
copy_in.finish() is called after the final chunk.

---

Nitpick comments:
In `@awa-model/src/queue_storage.rs`:
- Around line 3863-3954: Extract the common enqueue planning logic (lane
allocation, uniqueness sync, state filtering, and notification preparation) into
a shared planner function (e.g., prepare_enqueue_plan or build_enqueue_plan)
that returns a neutral "plan" type (e.g., Vec<RuntimeReadyInsert> or a Plan
struct with ready_rows and count_deltas) and accept a pluggable sink strategy
(an enum or trait for Sink::Insert vs Sink::Copy) so the caller can apply either
execute_ready_copy_tx or the INSERT path; replace the duplicate logic in
insert_ready_rows_copy_tx, insert_deferred_rows_copy_tx, and enqueue_params_copy
to call the new planner and then perform only the sink-specific actions (calling
execute_ready_copy_tx or the INSERT executor) and adjust_lane_counts_batch,
ensuring you reuse existing helpers like next_job_ids, ensure_lane,
sync_unique_claim, and adjust_lane_counts_batch to keep semantics identical to
enqueue_params_batch.
- Around line 410-539: The CSV encoder lacks regression tests for NULL-sentinel,
quoting/escaping, bytea hex and JSON serialization; add a test matrix that
exercises write_copy_field, write_copy_json, write_copy_optional_bytes and the
row writers (write_ready_copy_row, write_deferred_copy_row) with cases: comma
(","), double-quote ("\""), newline ("\n"), backslash ("\\"), empty string (""),
and the literal COPY_NULL_SENTINEL ("__AWA_NULL__"); for each case produce the
encoded buffer, then either parse it back with Postgres COPY (preferred) or a
robust CSV parser to assert round-trip semantics and correct NULL handling/bytea
hex encoding for Optional<Vec<u8>> and correct JSON stringification for
serde_json::Value; implement tests named like
test_copy_encoder_regression_matrix and test_copy_rows_regression_matrix that
build sample RuntimeReadyInsert/DeferredJobRow structs with these values and
assert expected decoded columns (including NULL when sentinel is present).

In `@awa-model/tests/queue_storage_copy_test.rs`:
- Around line 158-167: The test currently asserts ready_entries count is zero
but misses verifying lane-counter side effects; add a second SQL scalar query
(similar to the ready_count query) against format!("SELECT count(*)::bigint FROM
{}.lane_availability WHERE queue = $1", store.schema()) (or the project's
lane-counter table name) binding queue and using pool, .await, .expect("count
lane availability rows"), then assert_eq!(lane_count, 0) to ensure transaction
rollback also cleared lane counters; place this directly after the ready_count
assertion.

In `@docs/test-plan.md`:
- Line 171: The new matrix row QSC1-QSC2 (Queue storage COPY producer:
ready/deferred direct COPY, unique-conflict rollback) needs a matching
executable entry in the "Running Tests" section: add the corresponding awa-model
run command for QSC1-QSC2 (using the same invocation pattern as other rows) so
the test plan is directly executable and the scenario can be run without manual
mapping.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ee3b655b-2dda-4fae-81a3-bdd1f060744f

📥 Commits

Reviewing files that changed from the base of the PR and between b8cba93 and 30c078d.

📒 Files selected for processing (7)
  • awa-model/README.md
  • awa-model/src/queue_storage.rs
  • awa-model/tests/queue_storage_copy_test.rs
  • correctness/storage/MAPPING.md
  • docs/adr/008-copy-batch-ingestion.md
  • docs/architecture.md
  • docs/test-plan.md

Comment thread awa-model/src/queue_storage.rs Outdated
Copy link
Copy Markdown
Owner Author

@hardbyte hardbyte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still todo TLA/python

@hardbyte hardbyte merged commit 7cdcd90 into main May 1, 2026
12 of 13 checks passed
hardbyte added a commit to hardbyte/postgresql-job-queue-benchmarking that referenced this pull request May 1, 2026
…pha.1)

The COPY-based producer landed in awa 0.6.0-alpha.1
(hardbyte/awa#206 + #208). On the bulk-everywhere bench shape it's
~9% faster than the previous enqueue_params_batch (multi-row INSERT)
path at 128 workers.

Default this adapter to the COPY path. Keep AWA_QS_PRODUCER_PATH=batch
as a diagnostic A/B knob in case anyone wants to compare.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant