
Ensure release workflow uploads architecture-specific assets #1884

Merged
oferchen merged 1 commit into master from fix-artifact-upload-for-all-platforms on Nov 1, 2025
Conversation

@oferchen oferchen commented Nov 1, 2025

Summary

  • expand the release asset glob so nested per-target outputs are included when publishing

Testing

  • not run

https://chatgpt.com/codex/tasks/task_e_690627a22fb08323928eb59dce24e2a6

@oferchen oferchen merged commit c8b9089 into master Nov 1, 2025
@oferchen oferchen deleted the fix-artifact-upload-for-all-platforms branch November 1, 2025 15:34
oferchen added a commit that referenced this pull request May 7, 2026
…1564) (#3886)

Document the BTreeMap and ring-buffer reorder buffer impls, estimate
worst-case memory at 100K queued tails (~21-33 MB), outline a dhat-rs
profile plan, and propose per-window cap plus spill-to-tempfile bound
design with cross-reference to #1884.
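
The ordered-map variant of the reorder buffer can be pictured with a minimal sketch. This is not the project's implementation: `SketchReorderBuffer` and its methods are hypothetical names, and only the idea of a `BTreeMap` keyed by sequence number comes from the commit message. Taking 1 MB as 10^6 bytes, the quoted 21-33 MB at 100K queued tails works out to roughly 210-330 bytes per queued entry.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal reorder buffer: items arrive tagged with a sequence
/// number and are released strictly in ascending order.
pub struct SketchReorderBuffer<T> {
    next_expected: u64,
    pending: BTreeMap<u64, T>,
    max_depth: usize,
}

impl<T> SketchReorderBuffer<T> {
    pub fn new() -> Self {
        SketchReorderBuffer {
            next_expected: 0,
            pending: BTreeMap::new(),
            max_depth: 0,
        }
    }

    /// Queue one item, then return every item that is now deliverable in order.
    pub fn insert(&mut self, seq: u64, item: T) -> Vec<T> {
        self.pending.insert(seq, item);
        // Peak occupancy is sampled after the insert and before the drain.
        self.max_depth = self.max_depth.max(self.pending.len());
        let mut ready = Vec::new();
        while let Some(item) = self.pending.remove(&self.next_expected) {
            ready.push(item);
            self.next_expected += 1;
        }
        ready
    }

    /// Largest number of items that were queued at the same time.
    pub fn max_depth(&self) -> usize {
        self.max_depth
    }
}
```

Inserting sequence numbers 2, 1, 0 in that order returns nothing for the first two calls and all three items on the third, which is the head-of-line behaviour that makes an unbounded queue a memory risk.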
oferchen added a commit that referenced this pull request May 7, 2026
Concise design for a receiver-side multi-file delta-apply pipeline
that overlaps per-file apply across in-flight files while preserving
strict wire-NDX order at acknowledgement and disk-commit boundaries.
Composes the existing ReceiverDeltaPipeline trait (#1543), bounded
work queue, and BoundedReorderBuffer (#1407). Threshold gate at
PARALLEL_STAT_THRESHOLD=64 (#1547). Risks documented: head-of-line
stall (#1883) and spill-to-tempfile pending (#1884).
oferchen added a commit that referenced this pull request May 13, 2026
…3982)

When the in-memory reorder buffer exceeds a configurable threshold
(default 64 MB), excess items are serialized to a temporary file and
reloaded transparently on delivery. This bounds memory for 100K+ file
transfers where head-of-line stalls cause successor accumulation.
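
A rough sketch of the threshold behaviour described above, under several assumptions: `SketchSpillBuffer` is a hypothetical type, payloads are plain byte vectors, an in-memory `Vec<u8>` stands in for the temporary file, and items farthest from `next_expected` are spilled first. The real buffer may differ on all of these points.

```rust
use std::collections::BTreeMap;

/// Where a queued item currently lives.
enum Slot {
    Memory(Vec<u8>),
    /// Byte range inside the spill store.
    Spilled { offset: usize, len: usize },
}

pub struct SketchSpillBuffer {
    next_expected: u64,
    slots: BTreeMap<u64, Slot>,
    in_memory_bytes: usize,
    threshold_bytes: usize,
    /// Stand-in for the temporary file (assumption). It only grows in this sketch.
    spill_store: Vec<u8>,
    spill_events: u64,
}

impl SketchSpillBuffer {
    pub fn new(threshold_bytes: usize) -> Self {
        SketchSpillBuffer {
            next_expected: 0,
            slots: BTreeMap::new(),
            in_memory_bytes: 0,
            threshold_bytes,
            spill_store: Vec::new(),
            spill_events: 0,
        }
    }

    pub fn spill_events(&self) -> u64 {
        self.spill_events
    }

    /// Queue one payload, spill if over budget, then return what is deliverable.
    pub fn insert(&mut self, seq: u64, payload: Vec<u8>) -> Vec<Vec<u8>> {
        self.in_memory_bytes += payload.len();
        self.slots.insert(seq, Slot::Memory(payload));
        self.spill_excess();
        self.drain_ready()
    }

    /// Spill the highest sequence numbers first so items near the head stay in memory.
    fn spill_excess(&mut self) {
        for slot in self.slots.values_mut().rev() {
            if self.in_memory_bytes <= self.threshold_bytes {
                break;
            }
            if let Slot::Memory(bytes) = slot {
                let (offset, len) = (self.spill_store.len(), bytes.len());
                self.spill_store.extend_from_slice(bytes);
                self.in_memory_bytes -= len;
                self.spill_events += 1;
                *slot = Slot::Spilled { offset, len };
            }
        }
    }

    /// Deliver consecutive items starting at `next_expected`, reloading spilled ones.
    fn drain_ready(&mut self) -> Vec<Vec<u8>> {
        let mut ready = Vec::new();
        while let Some(slot) = self.slots.remove(&self.next_expected) {
            ready.push(match slot {
                Slot::Memory(bytes) => {
                    self.in_memory_bytes -= bytes.len();
                    bytes
                }
                Slot::Spilled { offset, len } => self.spill_store[offset..offset + len].to_vec(),
            });
            self.next_expected += 1;
        }
        ready
    }
}
```

With a 4-byte budget, queueing a 4-byte item at sequence 3 and then a 2-byte item at sequence 2 pushes the buffer over budget, so the sequence-3 item is spilled; delivery order is unaffected once sequence 0 arrives.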

Key components:
- SpillCodec trait for item serialization (length-prefixed binary)
- SpillableReorderBuffer<T> wrapper with same API as ReorderBuffer
- SpillCodec implementation for DeltaResult
- ReorderBuffer::take() for non-advancing slot extraction
- SpooledTempFile (in-memory up to 1 MB, then disk) via tempfile crate
- Hot-zone protection keeps items near next_expected in memory
- RAII cleanup: temp files removed automatically on drop
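
To illustrate the length-prefixed serialization named in the list above, here is a minimal sketch. The trait shape, the 4-byte little-endian prefix, and the names `SketchSpillCodec`, `write_record`, and `read_record` are assumptions made for illustration; the commit message does not show the real `SpillCodec` signature or wire layout.

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical codec trait: turn an item into bytes and back.
pub trait SketchSpillCodec: Sized {
    fn encode(&self, out: &mut Vec<u8>);
    fn decode(bytes: &[u8]) -> io::Result<Self>;
}

impl SketchSpillCodec for String {
    fn encode(&self, out: &mut Vec<u8>) {
        out.extend_from_slice(self.as_bytes());
    }

    fn decode(bytes: &[u8]) -> io::Result<Self> {
        String::from_utf8(bytes.to_vec())
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
    }
}

/// Write one record as a 4-byte little-endian length followed by the payload.
pub fn write_record<W: Write, T: SketchSpillCodec>(w: &mut W, item: &T) -> io::Result<()> {
    let mut payload = Vec::new();
    item.encode(&mut payload);
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
    }
    w.write_all(&(payload.len() as u32).to_le_bytes())?;
    w.write_all(&payload)
}

/// Read one record written by `write_record`.
pub fn read_record<R: Read, T: SketchSpillCodec>(r: &mut R) -> io::Result<T> {
    let mut len_bytes = [0u8; 4];
    r.read_exact(&mut len_bytes)?;
    let mut payload = vec![0u8; u32::from_le_bytes(len_bytes) as usize];
    r.read_exact(&mut payload)?;
    T::decode(&payload)
}

/// Round-trip two records through a cursor standing in for the spill file.
pub fn roundtrip_demo() -> io::Result<Vec<String>> {
    let mut spill = Cursor::new(Vec::new());
    write_record(&mut spill, &"alpha".to_string())?;
    write_record(&mut spill, &"be".to_string())?;
    spill.set_position(0);
    Ok(vec![read_record(&mut spill)?, read_record(&mut spill)?])
}
```

The prefix lets the reader recover record boundaries without any delimiter, so payloads can contain arbitrary bytes.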
oferchen added a commit that referenced this pull request May 13, 2026
…#4015)

Audit confirms that ThresholdDeltaPipeline::new_bypass routes through
ParallelDeltaPipeline::new_bypass -> DeltaConsumer::spawn_bypass ->
ReorderBuffer::passthrough end-to-end, statically avoiding the
spill-aware ordered path designed for task #1884. The constructor
choice at the dispatcher boundary is the single decision point, and
selecting it based on ServerConfig::write::delay_updates is the
correct caller-side signal.
oferchen added a commit that referenced this pull request May 17, 2026
…4204)

Add a Criterion benchmark that synthesises 100K, 500K, and 1M out-of-order
inserts across drift windows of 32, 256, 2048, and 16K, then reports
insert+drain throughput together with the peak occupancy via the
`metrics().max_depth` accessor.

The benchmark pre-allocates the drifted permutation outside the timed
section and prints `max_depth` once per (count, drift) pair so operators
can compare against in-flight dispatch capacity and decide whether the
spill (#1884) or adaptive-sizing (#1834) paths are warranted. The 1M case
is gated behind `BENCH_REORDER_MEMORY_1M=1` to keep default runs fast.
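
The relationship between drift window and peak occupancy can be sketched without Criterion. Everything below is illustrative: `drifted_permutation` and `peak_depth` are hypothetical helpers, the shuffle is confined to consecutive windows of `drift` items, and a small LCG replaces a real RNG crate. The actual benchmark may build its permutation differently.

```rust
use std::collections::BTreeSet;

/// Shuffle `0..count` inside consecutive windows of `drift` items, so no
/// sequence number lands more than `drift - 1` slots from its in-order position.
pub fn drifted_permutation(count: u64, drift: usize, mut seed: u64) -> Vec<u64> {
    let mut order: Vec<u64> = (0..count).collect();
    for window in order.chunks_mut(drift.max(1)) {
        // Fisher-Yates shuffle driven by an LCG, used only to stay dependency-free.
        for i in (1..window.len()).rev() {
            seed = seed
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let j = ((seed >> 33) % (i as u64 + 1)) as usize;
            window.swap(i, j);
        }
    }
    order
}

/// Replay the arrivals through an in-order drain and return the peak number
/// of sequence numbers that were pending at once.
pub fn peak_depth(arrivals: &[u64]) -> usize {
    let mut pending = BTreeSet::new();
    let mut next_expected = 0u64;
    let mut max_depth = 0usize;
    for &seq in arrivals {
        pending.insert(seq);
        max_depth = max_depth.max(pending.len());
        while pending.remove(&next_expected) {
            next_expected += 1;
        }
    }
    max_depth
}
```

In this construction the peak can never exceed the drift window, because every window is fully drained before the next one starts; that bound is a property of the sketch, not a claim about the real benchmark's numbers.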
oferchen added a commit that referenced this pull request May 17, 2026
…1884) (#4228)

Design note covering the bounded-memory reorder-buffer spill path.
Documents that SpillableReorderBuffer already exists and is hardened,
but is not wired into the consumer; recommends extending the
reorderbuffer_memory bench with a synthetic huge-drift scenario before
enabling the spill on the default path.
oferchen added a commit that referenced this pull request May 17, 2026
…apply (#4319)

Wires the existing SpillableReorderBuffer into DeltaConsumer behind a new
opt-in ConcurrentDeltaConfig, and lands the parallel receive-side delta
apply scaffold behind the parallel-receive-delta cargo feature.

SpillableReorderBuffer wiring (#1884)
- New ConcurrentDeltaConfig { spill_threshold_bytes, spill_dir } selects
  between the bare ReorderBuffer (default, behaviour unchanged) and the
  bounded-memory SpillableReorderBuffer when a threshold is supplied.
- DeltaConsumer::spawn_with_config dispatches via a ReorderMode enum so
  spawn / spawn_bypass / spawn_with_config share one inner loop entry.
- DeltaConsumerStats surfaces the cumulative spill_events counter via a
  lock-free AtomicU64 published by the reorder thread.
- Spill backend construction or I/O failures map to DeltaResult::failed
  for the offending sequence so the receiver maps to upstream exit code
  11 (FileIo) and aborts. Existing histogram/metrics machinery on the
  bare path is preserved verbatim.
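
The opt-in selection described in the bullets above might look roughly like this. The field names `spill_threshold_bytes` and `spill_dir` come from the commit message; their types, the `Sketch`-prefixed type names, the enum variants, and the fallback to the system temp directory are all assumptions.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical mirror of the config struct; `None` means spilling is off.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SketchConfig {
    pub spill_threshold_bytes: Option<u64>,
    pub spill_dir: Option<PathBuf>,
}

/// Hypothetical variants; the source names the enum but not its members.
#[derive(Debug, PartialEq)]
pub enum SketchReorderMode {
    /// Plain in-memory reorder buffer (the default path).
    Bare,
    /// Bounded-memory buffer that spills above `threshold_bytes`.
    Spillable { threshold_bytes: u64, dir: PathBuf },
}

impl SketchReorderMode {
    /// Pick the spillable path only when a threshold is supplied.
    pub fn from_config(cfg: &SketchConfig) -> Self {
        match cfg.spill_threshold_bytes {
            Some(threshold_bytes) => SketchReorderMode::Spillable {
                threshold_bytes,
                // Assumption: fall back to the OS temp dir when none is given.
                dir: cfg.spill_dir.clone().unwrap_or_else(env::temp_dir),
            },
            None => SketchReorderMode::Bare,
        }
    }
}
```

Keeping the decision in one constructor means a default-constructed config always yields the bare path, which matches the "behaviour unchanged" guarantee stated above.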

Parallel receive delta apply (#1368)
- New parallel-receive-delta feature on engine (forwarded from transfer).
  Default off so production receivers continue to drive the sequential
  apply loop in receiver/transfer.rs.
- engine::concurrent_delta::parallel_apply adds DeltaChunk and
  ParallelDeltaApplier. Per-file Mutex serialises destination writes,
  per-file ReorderBuffer replays chunks in submission order, and
  rayon::join / par_iter fans the verify step across the rayon pool
  while keeping per-file byte order exact.
- ReceiverContext::enable_parallel_receive_delta installs the existing
  ParallelDeltaPipeline only when the feature is compiled in, leaving
  the default receiver loop untouched.
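
The combination of a parallel verify step with a per-file lock and in-order replay can be sketched as follows. This sketch swaps rayon for `std::thread::scope` (Rust 1.63 or later) to stay dependency-free, and `FileState`, `verify`, and `apply_parallel` are hypothetical stand-ins rather than the `ParallelDeltaApplier` API.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical per-file state: written bytes plus chunks waiting for their turn.
struct FileState {
    next_index: usize,
    pending: BTreeMap<usize, Vec<u8>>,
    output: Vec<u8>,
}

/// Stand-in for the real verify step (assumption): uppercase the bytes.
fn verify(chunk: &[u8]) -> Vec<u8> {
    chunk.to_ascii_uppercase()
}

/// Verify chunks concurrently, but append them to the output in submission order.
pub fn apply_parallel(chunks: Vec<Vec<u8>>) -> Vec<u8> {
    let state = Mutex::new(FileState {
        next_index: 0,
        pending: BTreeMap::new(),
        output: Vec::new(),
    });
    thread::scope(|scope| {
        for (index, chunk) in chunks.iter().enumerate() {
            let state = &state;
            scope.spawn(move || {
                // The expensive work runs outside the lock.
                let verified = verify(chunk);
                let mut guard = state.lock().unwrap();
                let st = &mut *guard;
                st.pending.insert(index, verified);
                // Flush every chunk that is now contiguous with the written prefix.
                while let Some(bytes) = st.pending.remove(&st.next_index) {
                    st.output.extend_from_slice(&bytes);
                    st.next_index += 1;
                }
            });
        }
    });
    state.into_inner().unwrap().output
}
```

Whatever order the threads finish in, the mutex serialises writes and the pending map holds back any chunk that arrives early, so the output bytes match a sequential apply.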

Re-exports the union of ConcurrentDeltaConfig, DeltaConsumerStats, and
(feature-gated) DeltaChunk / ParallelDeltaApplier from
crates/engine/src/concurrent_delta/mod.rs alongside the existing
HistogramStats, ReorderMetrics, and ReorderBuffer surface.

Tests
- spillable_consumer_preserves_order_under_pressure drives 1000 items
  through a 1 KiB budget with a deliberately delayed head-of-line item
  and asserts both in-order delivery and spill_events > 0.
- spillable_consumer_matches_bare_output_byte_for_byte compares spill
  vs bare paths via SpillCodec encoding.
- spawn_with_config_off_matches_spawn and stats_zero_when_spill_disabled
  pin the default-off invariants.
- parallel_apply: in-order, shuffled, and batched byte-equality tests
  plus a proptest over random chunk sizes / deterministic permutations.

Replaces the conflict-stalled PRs #4299 and #4300 with a single
combined change on top of current master.

Closes #1884
Closes #1368