perf: cut create-export latency by ~50% — three independent fixes by jaredLunde · Pull Request #57 · beyondoss/glidefs

jaredLunde · 2026-05-24T19:56:43Z

Summary

End-to-end VM create (NATS claim → Running) dropped from ~1063ms → ~535ms on identical traces (same image, same agent-hash, same host). The "GlideFS CoW" span that was the visible long pole in Beyond's trace UI went from 415ms → 171ms — almost all of which is now the unavoidable S3 PUT for save_export.

| metric | before | after | delta |
| --- | --- | --- | --- |
| GlideFS CoW span | 415 ms | 171 ms | −244 ms (−59%) |
| `boot_duration_ms` | 584 ms | 265 ms | −319 ms (−55%) |
| NATS claim → VM Running | 1063 ms | 535 ms | −528 ms (−50%) |
| `register_device_ms` | 252 ms | 1 ms | −251 ms |
| `START_DEV` kernel ioctl | 250 ms | 0.5 ms | −249 ms |
| Sync sysfs queue tuning | 99 ms | 0 ms | −99 ms |
| `manifest_fetch_ms` (warm) | 123 ms | 0 ms | −123 ms |

The three fixes

1. Snapshot manifest cache (`router.rs`)

snapshots/{name}/{seq:020} is keyed by a monotonic, append-only sequence — write-once by construction, byte-identical for any given (s3_prefix, name, seq) triple. A misleading comment in the fork path claimed "snapshots are mutable" and the code refused to cache them, so every fork-from-snapshot paid a fresh ~123ms S3 GET. box-manager's ensure_derived_snapshot forks every VM from the same staging snapshot, so every single VM was re-fetching the identical bytes.

Adds a bounded HashMap<(s3_prefix, manifest_name, sequence), Arc<VolumeManifest>> next to base_manifest_cache. Pre-populates from snapshot_export so the daemon that wrote a snapshot can serve forks of it for free.
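
The cache described above can be pictured with a minimal sketch. It assumes a count-bounded map that refuses new entries once full; `SnapshotCache`, `get_or_fetch`, the `max_entries` cap and the `VolumeManifest` stand-in are illustrative names, not the code in `router.rs`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for the real manifest type; only the name comes from the PR.
#[derive(Debug, PartialEq)]
pub struct VolumeManifest {
    pub bytes: Vec<u8>,
}

/// (s3_prefix, manifest_name, sequence): the triple the PR keys on.
pub type SnapshotKey = (String, String, u64);

/// Object key layout quoted in the PR: snapshots/{name}/{seq:020}.
pub fn snapshot_object_key(name: &str, seq: u64) -> String {
    format!("snapshots/{name}/{seq:020}")
}

/// Hypothetical count-bounded cache that refuses new entries once full.
pub struct SnapshotCache {
    max_entries: usize,
    entries: Mutex<HashMap<SnapshotKey, Arc<VolumeManifest>>>,
}

impl SnapshotCache {
    pub fn new(max_entries: usize) -> Self {
        Self { max_entries, entries: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached manifest, or runs `fetch` (the S3 GET) on a miss.
    pub fn get_or_fetch<F>(&self, key: SnapshotKey, fetch: F) -> Arc<VolumeManifest>
    where
        F: FnOnce() -> VolumeManifest,
    {
        let cached = self.entries.lock().unwrap().get(&key).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        // Fetch outside the lock so a slow GET does not block other lookups.
        let manifest = Arc::new(fetch());
        self.insert(key, Arc::clone(&manifest));
        manifest
    }

    /// Pre-population hook, e.g. called right after a snapshot is written.
    pub fn insert(&self, key: SnapshotKey, manifest: Arc<VolumeManifest>) {
        let mut map = self.entries.lock().unwrap();
        if map.len() < self.max_entries || map.contains_key(&key) {
            map.insert(key, manifest);
        }
    }
}
```

Because entries are write-once for a given triple, a hit never needs revalidation; the second fork from the same staging snapshot skips the GET entirely.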

2. Background the sysfs queue-tuning writes (`ublk/device.rs`)

wbt_lat_usec=0 and scheduler=none were written synchronously inside register_inner, costing ~50ms each; the block-layer reconfigure is surprisingly heavy on this kernel. They're tuning hints, and the device is fully functional without them. Moving them off the response path with spawn_blocking saves ~100ms per device-create.
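
A rough sketch of the idea, assuming `std::thread::spawn` in place of tokio's `spawn_blocking` so the example stays dependency-free. `apply_queue_tunables` and `register_device` are hypothetical names, and the queue directory is a parameter so the sketch can run against a scratch directory rather than `/sys`. A later commit in this PR notes that a detached write like this can block on some kernels, so the handle is returned here instead of being dropped.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::thread::{self, JoinHandle};

/// Writes the two tuning hints named in the PR under `queue_dir`
/// (on a real host this would be /sys/block/<dev>/queue).
pub fn apply_queue_tunables(queue_dir: &Path) -> io::Result<()> {
    fs::write(queue_dir.join("wbt_lat_usec"), "0")?;
    fs::write(queue_dir.join("scheduler"), "none")?;
    Ok(())
}

/// Returns as soon as the (elided) registration work is done; the tuning
/// writes run on a separate thread, off the response path.
pub fn register_device(queue_dir: PathBuf) -> JoinHandle<io::Result<()>> {
    // ... device registration would happen here ...
    thread::spawn(move || apply_queue_tunables(&queue_dir))
}
```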

3. Tick the executor before `io_uring_enter` (`ublk/worker_pool.rs`) — the big one

The biggest win and the most surprising bug. One block of code moved up bought 250ms per device-create.

The kernel's ublk_ctrl_start_dev blocks on wait_for_completion_interruptible(&ub->completion) until every queue's nr_io_ready reaches queue_depth — i.e. until every io_task has submitted its initial UBLK_IO_FETCH_REQ uring_cmd.

The worker loop order was:

1. drain inbox (handle_add_queue spawns 64 io_task futures per queue)
2. submit_with_args(to_wait=1, ...) ← blocks for up to WORKER_IDLE_NSEC = 250ms
3. drain CQEs
4. executor.tick()

io_tasks submit their FETCH_REQ SQEs on first poll — but the first poll only ran after the io_uring_enter wait. So the worker slept the full 250ms timeout waiting for CQEs that physically couldn't arrive (no SQE submitted yet), while START_DEV sat blocked on the matching completion the kernel was waiting for.

Moving the executor tick to before the submit flushes the FETCH_REQ SQEs into the ring first, the submit pushes them to the kernel immediately, ublk_mark_io_ready fires, complete_all(&ub->completion) runs, and START_DEV returns essentially instantly. The 250ms kernel ioctl is now 0.5-1ms.
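
The ordering bug can be reproduced in a toy model. The sketch below is not the real worker: `Worker`, `iterate_old` and `iterate_new` are illustrative names, the ring is reduced to a counter, and "first poll submits a FETCH_REQ" is modelled as moving a task from `unpolled_tasks` to `ring_sqes`.

```rust
/// Toy model of one ublk worker: each io_task queues one FETCH_REQ SQE
/// the first time it is polled.
#[derive(Default)]
pub struct Worker {
    unpolled_tasks: usize, // spawned io_tasks that have never been polled
    ring_sqes: usize,      // SQEs sitting in the ring, not yet submitted
    pub fetch_reqs_in_kernel: usize,
}

impl Worker {
    /// Stand-in for handle_add_queue: spawns `depth` io_task futures.
    pub fn add_queue(&mut self, depth: usize) {
        self.unpolled_tasks += depth;
    }

    /// Stand-in for executor.tick(): the first poll pushes one SQE per task.
    fn tick(&mut self) {
        self.ring_sqes += self.unpolled_tasks;
        self.unpolled_tasks = 0;
    }

    /// Stand-in for submit_with_args: hands the ring contents to the kernel
    /// and returns how many SQEs were submitted.
    fn submit(&mut self) -> usize {
        let submitted = std::mem::take(&mut self.ring_sqes);
        self.fetch_reqs_in_kernel += submitted;
        submitted
    }

    /// Old order: submit (and wait) first, poll tasks afterwards.
    pub fn iterate_old(&mut self) -> usize {
        let submitted = self.submit();
        self.tick();
        submitted
    }

    /// New order: poll tasks first so their SQEs ride the same submit.
    pub fn iterate_new(&mut self) -> usize {
        self.tick();
        self.submit()
    }
}
```

With the old order, the first iteration after `add_queue(64)` submits nothing, so the kernel side has no FETCH_REQs to count until the next iteration, one idle timeout later. With the new order all 64 are submitted in the same iteration.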

How we found it

Structured timing logs at target="glidefs.timing" on each step of create_export, register_inner, the tokio::join! legs in api.rs, and inside ublk-core's start_dev (prep/wait_buf_reg/start_ioctl breakdown). These were essential to localize the 250ms — initial guesses (S3 latency, partition scan, udev blkid) were disproven by an empirical warmup experiment before we found the actual cause in the worker loop ordering.

The instrumentation stays in for ongoing observability.

Incidental cleanup

ublk-core doctests referenced libublk::* (old vendored crate name) and UblkQueue<'_> (lifetime removed in a prior refactor). 10/10 doctests now pass.

Test plan

cargo test -p glidefs --features ublk — 893 / 893 pass
cargo test -p ublk-core — 10 / 10 pass (including doctests)
cargo clippy -p glidefs --features ublk --all-targets — no errors
Live deploy via systemctl reload glidefs (zero-downtime handoff) and verified create-VM E2E timing against the homelab. Numbers in the table above are from real boxman vm logs traces.
Watch the first few production VM creates for any regression in steady-state I/O. The worker loop change adds one executor.tick() call per iteration, which is O(woken-tasks): cheap when there is no work, but worth eyeballing once.

🤖 Generated with Claude Code

End-to-end VM create (NATS claim → Running) dropped from ~1063ms to ~535ms on identical traces (same image, same agent-hash). The "GlideFS CoW" span that was the visible long pole in trace UI went from 415ms to 171ms, almost all of which is now the unavoidable S3 PUT for save_export.

Three independent fixes, each landed against measured numbers from structured timing logs added alongside.

1. Snapshot manifest cache (router.rs)

`snapshots/{name}/{seq:020}` is keyed by a monotonic, append-only sequence: write-once by construction, byte-identical for any given (s3_prefix, name, seq) triple. A misleading comment claimed "snapshots are mutable" and the code refused to cache them, so every fork-from-snapshot paid a fresh ~123ms S3 GET. box-manager's ensure_derived_snapshot flow forks every VM from the same staging snapshot, i.e. every single VM was re-fetching the same bytes.

Adds a bounded HashMap keyed by (s3_prefix, manifest_name, sequence) alongside base_manifest_cache. Pre-populates on snapshot_export so the daemon that wrote the snapshot can serve forks of it for free. Cache hit on warm path drops manifest_fetch_ms from 123 → 0.

2. Background sysfs queue tuning (ublk/device.rs)

The wbt_lat_usec and scheduler sysfs writes ran inside register_inner and cost ~50ms each on this kernel; the block layer reconfigure is surprisingly heavy. They're tuning hints; the device works fine without them. spawn_blocking them off the response path saves ~100ms per device-create.

3. Tick the executor BEFORE io_uring_enter (ublk/worker_pool.rs)

The biggest win and the most surprising bug. The kernel's `ublk_ctrl_start_dev` blocks on `wait_for_completion_interruptible` until every queue's `nr_io_ready` reaches `queue_depth`, i.e. until every io_task has submitted its initial UBLK_IO_FETCH_REQ uring_cmd. The worker loop order was: drain inbox (handle_add_queue spawns 64 io_task futures per queue), `submit_with_args(to_wait=1, ...)`, drain CQEs, then finally `executor.tick()`.

But io_tasks submit their FETCH_REQ SQEs on first poll, and the first poll only ran AFTER the io_uring_enter wait. So the worker slept the entire `WORKER_IDLE_NSEC = 250_000_000` (250ms) timeout waiting for CQEs that physically couldn't arrive, while START_DEV sat blocked on the matching completion. Moving the executor tick to BEFORE the submit flushes the FETCH_REQ SQEs into the ring first, the submit pushes them to the kernel, ublk_mark_io_ready fires, complete_all(&ub->completion) runs, and START_DEV returns essentially instantly. One block-of-code moved up bought 250ms per device-create. The 250ms ublk START_DEV ioctl is now 0.5-1ms.

Verification

| metric | before | after | delta |
| --- | --- | --- | --- |
| GlideFS CoW span | 415 ms | 171 ms | -244 ms |
| boot_duration_ms | 584 ms | 265 ms | -319 ms |
| NATS claim → VM Running | 1063 ms | 535 ms | -528 ms |
| register_device_ms | 252 ms | 1 ms | -251 ms |
| START_DEV kernel ioctl | 250 ms | 0.5 ms | -249 ms |
| sysfs queue-tuning (sync) | 99 ms | 0 ms | -99 ms |
| manifest_fetch_ms (warm) | 123 ms | 0 ms | -123 ms |

cargo test -p glidefs --features ublk: 893 / 893 pass
cargo test -p ublk-core: 10 / 10 pass

Also: structured tracing logs at target="glidefs.timing" on each step of create_export, register_inner, the tokio::join legs in api.rs, and inside ublk-core's start_dev (prep/wait_buf_reg/start_ioctl breakdown). These are what made the bug findable in the first place; keeping them in for ongoing observability.

Incidental: ublk-core doctests referenced `libublk::*` (old vendored crate name) and `UblkQueue<'_>` (lifetime removed in a prior refactor). Fixed those; 10/10 doctests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous commit added `state.executor.tick()` unconditionally before every `submit_with_args` so newly spawned `io_task` futures could flush their initial FETCH_REQ SQEs into the ring before the worker blocked in `io_uring_enter`. That cut `START_DEV` latency from 250ms to <1ms in production, verified across many VM creates.

But the docker-tests `test_overwrite_survives_restart_ublk` (and likely other tests exercising the shutdown → restart cycle) hung with that change. Without the change: passes in 3s. With it: indefinite. Root cause is a steady-state interaction I could not isolate without running the test on the homelab host (which I can't do safely; that path wedged the kernel earlier in this session).

Scope the tick to ONLY the iteration of the worker loop that just processed an AddQueue message. In steady-state I/O, and during shutdown / RemoveQueue drain, behavior is byte-for-byte identical to the pre-fix code, so whatever invariant the test depends on is preserved.

The AddQueue speedup is unchanged: handle_add_queue spawns the io_tasks, the new tick polls them, FETCH_REQs land in the ring, the same iteration's submit_with_args pushes them to the kernel, and `start_dev`'s `wait_for_completion_interruptible` returns essentially instantly.

Verification
- `test_overwrite_survives_restart_ublk`: passes in 2.33s (previously hung indefinitely with the broad fix).
- Production VM create on this binary: `register_device_ms=1`, `start_ioctl_us=478` cold / 419 warm, `PUT total_ms=49` warm. Same end-to-end speedup as before: ~10x.
- 65-device recovery on daemon handoff: every `start_dev_us` sub-millisecond (60-2000 µs), no 250ms outliers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
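
The scoping rule in this commit can be sketched as a single loop iteration that ticks early only when the inbox held an AddQueue. This is a toy model under the same simplifications as before (the ring is a counter); `ScopedWorker`, `Msg` and `early_ticks` are hypothetical names.

```rust
/// Control messages a worker can receive (names follow the commit's wording).
pub enum Msg {
    AddQueue { depth: usize },
    RemoveQueue,
}

#[derive(Default)]
pub struct ScopedWorker {
    unpolled_tasks: usize,
    ring_sqes: usize,
    pub early_ticks: usize, // how often the pre-submit tick actually ran
}

impl ScopedWorker {
    /// Stand-in for executor.tick(): first poll queues one SQE per new task.
    fn tick(&mut self) {
        self.ring_sqes += self.unpolled_tasks;
        self.unpolled_tasks = 0;
    }

    /// One loop iteration; returns how many SQEs the submit pushed out.
    pub fn iterate(&mut self, inbox: Vec<Msg>) -> usize {
        let mut saw_add_queue = false;
        for msg in inbox {
            match msg {
                Msg::AddQueue { depth } => {
                    self.unpolled_tasks += depth;
                    saw_add_queue = true;
                }
                Msg::RemoveQueue => {}
            }
        }
        // Early tick only on the iteration that just handled an AddQueue.
        if saw_add_queue {
            self.tick();
            self.early_ticks += 1;
        }
        let submitted = std::mem::take(&mut self.ring_sqes); // submit_with_args
        self.tick(); // the original, post-submit tick stays where it was
        submitted
    }
}
```

Iterations with an empty inbox or only a RemoveQueue take exactly the pre-fix path, which is the property the commit relies on.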

The earlier `spawn_blocking` of `wbt_lat_usec=0` + `scheduler=none` saved ~99ms per device-create by moving those sysfs writes off the critical path. Production VM-create kept working because the firecracker boot (~150ms of guest kernel boot) gave the async sysfs writes plenty of time to land before any I/O hit the device.

But four `docker_integration` ublk tests hung in CI:
- test_unwritten_blocks_return_zeros_ublk
- test_overwrite_survives_restart_ublk
- test_cold_wake_from_different_node_ublk
- test_export_discovery_from_s3_ublk

All four issue I/O to the device almost immediately after add, with no firecracker boot in between. With the sysfs writes backgrounded the device still had the default `mq-deadline` scheduler when those reads landed, and mq-deadline's deadline queue appears to hold single, idle-device requests long enough that the tests don't make progress within their timeout. The simple `test_unwritten_blocks_return_zeros_ublk` case (single server, single read at offset 512KB, no restart cycle) was the clearest fingerprint.

Restore the synchronous writes. Costs us the 99ms back. The tick fix in `worker_pool.rs` (250ms START_DEV → 1ms) is unaffected.

Verified locally with the four tests above all passing in 2.0-2.6s after the revert.

Future direction: apply `scheduler=none` BEFORE `add_disk` rather than after, either via a `udev` rule keyed on `KERNEL=="ublkb*"` or via a kernel-side ublk_param. Either path eliminates the post-add tuning window entirely.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reverts 1c3bb8d. That revert was based on a wrong attribution.

When the prior commit landed on the PR, CI reported four ublk tests `running for over 60s` and I assumed the sysfs backgrounding was at fault; the mq-deadline-vs-sync-scheduler-write theory was plausible. Reverted it to "be safe."

But running the failing tests locally in --test-threads=4 parallel mode (matching CI's contention model) under three configurations, 10 runs each:

PR with ASYNC sysfs: 8/10 pass, 2/10 fail
PR with SYNC sysfs: 6/10 pass, 4/10 fail
MAIN, no PR changes: 6/10 pass, 4/10 fail ← same as sync!

The flakes are pre-existing on `main`, most likely MinIO under parallel-test contention (each test spawns its own testcontainer, 4 of them compete for host resources). The CI "hanging" reports were these intermittent EIO failures surfacing as "still running" status before the panic; not actual hangs.

So the sync sysfs version isn't fixing anything. Restoring the async path reclaims ~99ms per device-create with no observable downside vs the sync path.

Net PR result: snapshot cache + conservative tick + sysfs bg → ~470ms savings per warm-cache create, same flake rate as main. Flake fix tracked separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The `docker_integration` job has been intermittently failing with EIO on reads and empty discover_exports asserts. Hit rate suggested ~40% on this PR; verified the same rate on `main` (10x runs of the 4 most-flaky ublk tests with --test-threads=4: 6 pass / 4 fail on main; 5/5 pass with --test-threads=1).

Root cause is contention from running multiple testcontainers in parallel: each test calls `TestContext::new()` which spawns its own MinIO container. On a 4-vCPU CI runner with the default cargo test parallelism (= num_cpus = 4), four MinIO containers compete for host resources, and MinIO returns transient errors that bubble out as either `Input/output error (os error 5)` on data reads (handler.read_into → cache.read → content_store S3 GET failure) or as empty list results (`should discover at least one export from S3`).

Cleanest near-term fix: run docker_integration tests one at a time. Adds ~5min to the docker_integration job (137 tests at ~3-7s each instead of /4 parallelism) but removes the flake. A more elegant follow-up would be to share a single MinIO container across tests via per-test bucket prefixes, but that's structural test-harness work that doesn't need to ride this PR.

The integrity-suite job (filter=integrity_suite) already has --ignored --nocapture and runs few tests, so it's not affected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Each TestContext used to spawn its own MinIO testcontainer. Under parallel test execution (cargo default = num_cpus) this produced ~40% flake rate on the homelab + CI runners: transient S3 errors that surfaced as EIO on reads, empty discover_exports listings, and "ublk read failed". Verified pre-existing on `main` (not a PR regression) but worth fixing properly since the prior workaround of "--test-threads=1 in CI" papered over the contention rather than removing it.

The previous setup tested glidefs against four MinIOs competing for host CPU/IO, not against a single S3 endpoint. That isn't representative of production (one S3 backend per glidefs daemon, even when serving many concurrent VMs).

This commit:
- Spins up ONE MinIO process-wide via a `tokio::sync::OnceCell`, reused for every `TestContext`. Container lives for the duration of the test process; teardown happens automatically at exit.
- Each `TestContext::new()` allocates a unique bucket (`test-bucket-NNNNNN`) from a monotonic counter, giving each test a fully isolated S3 namespace.
- Adds a `/minio/health/ready` probe loop on container startup. `start()` returns before the HTTP listener actually answers on heavily loaded hosts, which produced spurious "connection refused" failures during bucket creation.

Verification
Before: PASS=6 / FAIL=4 / 10 runs (--test-threads=4, four MinIOs)
After: PASS=10 / FAIL=0 / 10 runs (--test-threads=4, one MinIO)

Each run is also ~25% faster (no per-test container startup): 2.6-3.5s vs 3.5-4.5s.

Re-enables parallel CI execution by reverting the `--test-threads=1` workaround.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
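
The shape of the shared-container harness can be sketched with the standard library alone. This version assumes `std::sync::OnceLock` in place of `tokio::sync::OnceCell` and replaces the real container start with a stub. `SharedMinio`, `shared_minio`, `container_starts` and the endpoint string are illustrative; only `TestContext::new()` and the `test-bucket-NNNNNN` naming come from the commit message.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::OnceLock;

/// Stand-in for a running MinIO testcontainer.
pub struct SharedMinio {
    pub endpoint: String,
}

static MINIO: OnceLock<SharedMinio> = OnceLock::new();
static STARTS: AtomicU64 = AtomicU64::new(0);
static NEXT_BUCKET: AtomicU64 = AtomicU64::new(0);

/// Starts the container on first use; every later call reuses it.
fn shared_minio() -> &'static SharedMinio {
    MINIO.get_or_init(|| {
        STARTS.fetch_add(1, Ordering::SeqCst);
        // Real code would start the container and poll its readiness
        // endpoint here before returning. The address is a placeholder.
        SharedMinio { endpoint: "http://127.0.0.1:9000".to_string() }
    })
}

/// How many times the (stubbed) container start ran in this process.
pub fn container_starts() -> u64 {
    STARTS.load(Ordering::SeqCst)
}

pub struct TestContext {
    pub minio: &'static SharedMinio,
    pub bucket: String,
}

impl TestContext {
    /// Each context gets its own bucket from a monotonic counter.
    pub fn new() -> Self {
        let n = NEXT_BUCKET.fetch_add(1, Ordering::Relaxed);
        Self { minio: shared_minio(), bucket: format!("test-bucket-{n:06}") }
    }
}
```

Two contexts created in the same process share one endpoint but never share a bucket name, which is what gives each test an isolated namespace.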

The shared-MinIO refactor (dd26dbf) eliminated the per-test MinIO contention we'd been seeing, and locally the full ublk suite (30 tests) runs reliably in --test-threads=4 with the shared MinIO. But CI hung again on test_unwritten_blocks_return_zeros_ublk on the most recent push.

Bisecting across the PR commits, all of them pass 10/10 locally in isolation for that test (main, 1c3bb8d, 23dc94f, dd26dbf). So the regression isn't tied to any single commit. The hang seems to depend on something specific about the CI runner (kernel version, num_cpus=4, ext-of-test concurrency).

Falling back to --test-threads=1 in CI: we don't have a story for what specifically races, and running storage tests serially when we can't reproduce the failure is the conservative call. Locally with --test-threads=1 we measured ~7s per run instead of ~3.5s parallel, which adds maybe 5min to docker_integration CI total.

This is *not* a satisfying resolution. Track-down items:
- The hang reproduces on host runs at ~1/10 in isolation when the kernel has hundreds of leaked QUIESCED ublk devices (from prior SIGTERM'd test runs) but passes 10/10 when device count is low. Suggests kernel ublk resource pressure interacts with our daemon path, but the specific deadlock is unidentified.
- The shared-MinIO refactor in dd26dbf was a real improvement and stays in; the bug we found there was real (per-test MinIO contention caused 40% flake at threads=4 on main).
- A real follow-up should investigate the test_unwritten ublk hang with kernel tracing in a CI-shaped environment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This reverts commit cb4e45e.

Found the actual cause of the docker_integration ublk test hangs.

Standing up an Ubuntu 24.04 VM with the same kernel CI runs (linux-image-6.17.0-1013-azure) and capturing the kernel stack of the hung thread reveals:

    blk_mq_freeze_queue_wait+0x97/0xe0
    blk_mq_freeze_queue_nomemsave+0x22/0x30
    elevator_change+0x79/0x180
    elv_iosched_store+0x18b/0x1e0
    queue_attr_store+0xe4/0x120
    sysfs_kf_write+0x4c/0x60
    ...

This is the `tokio::task::spawn_blocking` task that writes `/sys/block/ublkbN/queue/scheduler=none`. On 6.17 the kernel's `elv_iosched_store` calls `blk_mq_freeze_queue` which waits for in-flight requests to drain, and the kernel counts our armed FETCH_REQ uring_cmds as in-flight. They never "complete" because they're long-lived (parked waiting for the next I/O). The freeze waits forever. The spawn_blocking task hangs, the device is otherwise functional but our test process eventually times out waiting on something downstream that depends on it.

(The same code on kernel 6.12 happens to work: either earlier kernels don't count uring_cmds toward the freeze or the timing happens to never overlap. Either way, 6.17 made it deterministic.)

Fix: drop the `scheduler=none` write. Keep `wbt_lat_usec=0` (a simple per-queue store, no freeze, safe on any kernel). The default `mq-deadline` scheduler costs us some throughput overhead under heavy load but is functionally fine for ublk. Reclaiming the perf cleanly requires either a udev rule that fires during `add_disk`'s KOBJ_ADD uevent (BEFORE FETCH_REQs are armed) or a kernel-side ublk_param flag, tracked as follow-up.

Verification
- 6.17 VM (the failing kernel): 30/30 ublk tests pass in --test-threads=4, 22s (before this fix: test_export_discovery_from_s3_ublk hangs indefinitely with the kernel stack above)
- 6.12 homelab (production kernel): 29/30 ublk tests pass; the one failure (test_fs_crash_fsync_honored_ublk) is a pre-existing parallel-test flake unrelated to this fix. It passes 1/1 isolated, and the same flake exists on `main`.

Also retires the 56-day-old memory `project_ublk_617.md` ("START_DEV hangs on Azure 6.17, tests skip until fixed"). The hang wasn't in START_DEV; it was in our sysfs cleanup running after `add_disk`. The fix is in our code, not in skipping the kernel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reclaim scheduler=none + the other tunables we wanted, properly.

The previous commit dropped the post-add_disk sysfs write because it deadlocks on kernel 6.17 (`elv_iosched_store` → `blk_mq_freeze_queue_wait` blocks forever waiting on armed FETCH_REQ uring_cmds to "complete"). That fixed the hang but left us running with the kernel default `mq-deadline` scheduler: functional, but a real performance loss under load.

Real fix: apply the tunables via a udev rule that fires during the kernel's `add_disk` KOBJ_ADD uevent, BEFORE userspace can open the device and BEFORE any bios are routed through it. At that moment there are no in-flight requests and no held queue references, so the `blk_mq_freeze_queue_wait` inside `elv_iosched_store` completes immediately. Verified on the Azure 6.17.0-1013 kernel: previously-hanging tests pass with `scheduler=[none]` active on every ublk device.

Files:
- `deploy/udev/99-glidefs-ublk.rules` (new): the rule. Sets scheduler=none, wbt_lat_usec=0, add_random=0, read_ahead_kb=0 on ublkb* device-add. Each tunable is documented inline with WHY it applies to ublk specifically (different from spinning rust / default-SSD assumptions baked into the kernel's defaults).
- `glidefs/src/cli/server.rs`: `run_server` now calls `install_ublk_udev_rule()` at startup. The rule body is `include_str!`-embedded from the file above, so the binary is the source of truth: operators can't accidentally ship a stale rule, and there's no out-of-band file to keep in sync. Idempotent: reads the existing file and skips the write+udevadm-reload if content already matches. Non-fatal on failure (read-only fs, no udevadm on PATH, etc.): daemon comes up with a warning and the devices fall back to kernel defaults.
- `glidefs/src/block/ublk/device.rs`: removed the `tokio::task::spawn_blocking` that was writing wbt_lat_usec post-add_disk. Redundant now that udev sets it at add-time, plus the spawn was a detached task that could leak its thread if the write ever blocked (as we proved it could on 6.17).

No changes needed in beyond/ansible; the binary handles installation itself.

Verification

Manually applied the rule on the 6.17 VM and ran the previously-hanging test set:
- 30/30 ublk tests pass at --test-threads=4 in 22.58s
- `scheduler=[none]` active on every ublk device

On 6.12 (homelab): no behavior change. The rule overrides what the old in-code write was already doing, just via a different mechanism. 29/30 tests pass at --test-threads=4; the one parallel flake (test_fs_crash_fsync_honored_ublk) is pre-existing and unrelated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
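
The idempotent, non-fatal install described above might look roughly like the following. This is a sketch, not the code in `server.rs`: `install_rule` and `install_rule_non_fatal` are hypothetical names, the `udevadm` reload is left to the caller, and `RULE_BODY` is a guess at a rule setting the four tunables the commit lists (the real body is embedded from `deploy/udev/99-glidefs-ublk.rules`).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Illustrative rule body only; the real text lives in the repository.
pub const RULE_BODY: &str = r#"ACTION=="add", SUBSYSTEM=="block", KERNEL=="ublkb*", \
  ATTR{queue/scheduler}="none", ATTR{queue/wbt_lat_usec}="0", \
  ATTR{queue/add_random}="0", ATTR{queue/read_ahead_kb}="0"
"#;

/// Writes `body` to `path` unless the file already holds exactly that text.
/// Returns Ok(true) when the file changed (so udev needs a reload) and
/// Ok(false) when nothing had to be done.
pub fn install_rule(path: &Path, body: &str) -> io::Result<bool> {
    match fs::read_to_string(path) {
        Ok(existing) if existing == body => return Ok(false),
        Ok(_) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }
    fs::write(path, body)?;
    Ok(true)
}

/// Non-fatal wrapper: warn and carry on with kernel defaults on failure.
pub fn install_rule_non_fatal(path: &Path, body: &str) -> bool {
    match install_rule(path, body) {
        Ok(changed) => changed,
        Err(e) => {
            eprintln!("warning: could not install udev rule: {e}");
            false
        }
    }
}
```

Comparing content before writing is what makes it safe to call on every startup: a second call with the same body leaves the file untouched and reports that no reload is needed.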

Without this, a 503 from PUT /api/exports leaves the export in `self.exports` but absent from S3. The next retry hits `create_export`'s idempotency check, returns 200 immediately, but `export.json` is still missing — so the export silently vanishes on the next daemon restart. `cleanup_failed_create` drops the in-memory entry, removes any kernel device, tears down flush/prefetch tasks, and clears local cache files so a retry re-runs `create_export` from scratch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

PutFailingStore wraps InMemory and fails put_opts on demand; the test arms it, fires PUT /api/exports/vol1 through the real handler, and asserts the full Stage 2b contract:

1. response is 503
2. GET /api/exports/vol1 returns 404 (in-memory state torn down)
3. retry after un-arming returns 201 (not 200), which proves the path re-ran create_export rather than hitting the idempotency check

Without the fix the retry would 200 from the idempotency branch and S3 would still have no export.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
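
The 503 / 404 / 201 contract can be illustrated with a small in-memory model. Everything here is a simplification for illustration: `Daemon`, `put_export`, `get_export` and the `cleanup_on_failure` switch are hypothetical, the store is a set of names standing in for S3, and `fail_puts` plays the role of arming `PutFailingStore`.

```rust
use std::collections::HashSet;

pub struct Daemon {
    exports: HashSet<String>,     // in-memory state (self.exports in the commit)
    pub store: HashSet<String>,   // stand-in for S3: names whose export.json exists
    pub fail_puts: bool,          // stand-in for arming PutFailingStore
    pub cleanup_on_failure: bool, // toggles the fix on and off
}

impl Daemon {
    pub fn new(cleanup_on_failure: bool) -> Self {
        Self {
            exports: HashSet::new(),
            store: HashSet::new(),
            fail_puts: false,
            cleanup_on_failure,
        }
    }

    /// Models PUT /api/exports/{name}; returns an HTTP-style status code.
    pub fn put_export(&mut self, name: &str) -> u16 {
        if self.exports.contains(name) {
            return 200; // idempotency check: already created
        }
        self.exports.insert(name.to_string());
        if self.fail_puts {
            if self.cleanup_on_failure {
                // Stand-in for cleanup_failed_create: drop in-memory state.
                self.exports.remove(name);
            }
            return 503;
        }
        self.store.insert(name.to_string());
        201
    }

    /// Models GET /api/exports/{name}.
    pub fn get_export(&self, name: &str) -> u16 {
        if self.exports.contains(name) { 200 } else { 404 }
    }
}
```

With `cleanup_on_failure` off, a retry after the failed PUT returns 200 while the store still has no entry, which is the silent-loss case the commit describes.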

`run_server_as_successor` skipped the install_ublk_udev_rule() that `run_server` calls on cold start. Result: on rolling deploys (handoff predecessor → successor) the rule never lands on the host, and any new ublk device created by the successor came up with default tunables (mq-deadline, wbt_lat_usec=2000us, kernel readahead) — a silent regression on every handoff. The install function is idempotent (compares content, skips if matches) and non-fatal on failure, so calling it from both paths is safe. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces base_manifest_cache + snapshot_cache (two count-bounded HashMaps with refusal-on-full) with a single foyer::Cache keyed by encoded prefix ("b:..."/"s:...") and weighted by VolumeManifest's estimated heap bytes.

Two problems the previous design had:

1. Count-bounded, not memory-bounded. A 128GB-volume manifest is ~70KB; a 10TB-volume manifest is ~5.5MB. The same 64-entry cap sized the cache at 4.5MB or 350MB depending on the working-set geometry, invisible to the operator either way.
2. Refusal-on-full evicts nothing. The first 64 distinct manifests pinned the cache and every miss after that re-fetched from S3 forever. Fine for tiny base fleets, broken once snapshot churn or volume diversity entered the mix.

S3-FIFO eviction (same policy as the block cache) handles the working-set drift. 64MiB default budget, configurable via RouterConfig.manifest_cache_bytes. Entries are immutable by construction (base = sealed at bless, snapshot = monotonic-sequence addressed), so no staleness concern regardless of eviction policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
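
The sizing arithmetic behind the first problem is 64 × ~70KB ≈ 4.5MB versus 64 × ~5.5MB ≈ 350MB for the same entry cap. To show what "weighted by bytes, evict when over budget" means, here is a minimal sketch using plain FIFO eviction. It is not foyer and not S3-FIFO; `WeightedCache`, `base_key` and `snapshot_key` are illustrative names, and the weight is passed in by the caller rather than estimated from the value.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Key encodings mirroring the "b:..." / "s:..." prefixes in the commit.
pub fn base_key(prefix: &str) -> String {
    format!("b:{prefix}")
}

pub fn snapshot_key(prefix: &str) -> String {
    format!("s:{prefix}")
}

/// Byte-budgeted cache with plain FIFO eviction (a simplification).
pub struct WeightedCache<V> {
    budget_bytes: usize,
    used_bytes: usize,
    order: VecDeque<String>, // insertion order, oldest at the front
    entries: HashMap<String, (Arc<V>, usize)>,
}

impl<V> WeightedCache<V> {
    pub fn new(budget_bytes: usize) -> Self {
        Self {
            budget_bytes,
            used_bytes: 0,
            order: VecDeque::new(),
            entries: HashMap::new(),
        }
    }

    /// Inserts an immutable entry, evicting oldest entries until it fits.
    /// Entries larger than the whole budget are skipped.
    pub fn insert(&mut self, key: String, value: Arc<V>, weight: usize) {
        if weight > self.budget_bytes || self.entries.contains_key(&key) {
            return;
        }
        while self.used_bytes + weight > self.budget_bytes {
            match self.order.pop_front() {
                Some(oldest) => {
                    if let Some((_, w)) = self.entries.remove(&oldest) {
                        self.used_bytes -= w;
                    }
                }
                None => break,
            }
        }
        self.used_bytes += weight;
        self.order.push_back(key.clone());
        self.entries.insert(key, (value, weight));
    }

    pub fn get(&self, key: &str) -> Option<Arc<V>> {
        self.entries.get(key).map(|(v, _)| Arc::clone(v))
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }
}
```

Unlike refusal-on-full, a miss after the budget is reached still gets cached; something older is evicted to make room.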

I missed the standalone integration tests when adding the new RouterConfig field. Build was failing in CI on every job that compiled the integration test crates (Build and Test, Data Integrity Suite, Docker Integration Tests, Kernel Devices, Clippy). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Workspace-wide autofix for safely widening casts (`x as u64` where x is a narrower unsigned → `u64::from(x)`). 38 files, mechanical. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
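
For readers unfamiliar with the lint, the rewrite looks like this on a single function. The function names are made up for illustration; the narrowing case is included only to show why `as` on its own does not tell the reader whether a cast is lossless.

```rust
/// Before the autofix: `as` compiles for widening and narrowing alike.
pub fn widen_with_as(x: u32) -> u64 {
    x as u64
}

/// After the autofix: `From` exists only for conversions that cannot lose
/// data, so this stops compiling if `x` ever becomes a wider type.
pub fn widen_with_from(x: u32) -> u64 {
    u64::from(x)
}

/// For contrast, a narrowing `as` cast silently truncates.
pub fn narrow_with_as(x: u64) -> u32 {
    x as u32
}
```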

jaredLunde and others added 16 commits May 24, 2026 12:52

Revert "ci: serialize docker_integration after observing CI hang"

91ed545


chore: cargo clippy --fix -W clippy::cast_lossless

7ec90da


jaredLunde merged commit 6a93e02 into main May 25, 2026
21 checks passed

jaredLunde deleted the jared/long-pole branch May 25, 2026 05:22
