Releases: jamesgober/mmap-io
v1.0.0 - Stable
Release Notes for v1.0.0 - Stable
Version 1.0.0 - 2026-05-18
The first stable release. API surface is locked under SemVer. Breaking changes from here require a major-version bump; additive features ship as minor bumps; bug fixes as patch bumps. CI enforces this via cargo-semver-checks and a new cargo public-api diff workflow that fails any PR which changes the public surface without an explicit snapshot update.
API-compatible with 0.9.11. Callers on 0.9.11 upgrade by bumping the version string in Cargo.toml and nothing else. Callers still on 0.9.6 or earlier should read docs/MIGRATION_0.9_TO_1.0.md for the recovery path.
What ships in 1.0.0
- Anonymous mappings. AnonymousMmap::new(size) returns a process-local, RW memory region with no backing file. Pages are zero-initialized on first touch; memory is released when the value drops. Useful for shared scratch memory between threads, large temporary allocations that bypass the heap, or as the kernel-side substrate for fd-passing IPC patterns. Closes audit F1, the longest-standing gap in the inventory.
- Hugepage runtime introspection. mmap.is_hugepage_backed() -> Option<bool> answers the question "did the kernel actually back my mapping with huge pages?" On Linux, parses /proc/self/smaps and inspects the AnonHugePages / Private_Hugetlb / Shared_Hugetlb fields. On other platforms returns None. Pairs with the existing hugepages feature to close the loop on a previously best-effort capability. Closes audit F4.
- Multi-process IPC integration test. tests/ipc_cross_process.rs verifies bidirectional cross-process byte visibility through a single mmap'd file: parent writes a magic value, spawns child via std::process::Command::new(std::env::current_exe()), child reads parent's writes and writes its own bytes, parent verifies the child's writes. The crate has always supported this pattern; now it has a dedicated regression test. Closes audit T6.
- Sparse-file behavior documented. create_rw and open_or_create now explicitly document that set_len(size) produces a sparse file on every supported platform: a 1 TB default_size does not consume 1 TB of free disk until pages are written. Closes audit F6.
- # Errors / # Panics rustdoc completeness pass. Every Result-returning public method now documents its error conditions; every method that calls .expect() documents its panic conditions. The crate already enforced #![deny(missing_docs)]; now clippy::missing_errors_doc and clippy::missing_panics_doc also pass clean.
- public-api.txt snapshot committed. A cargo public-api --simplified --all-features dump of the locked surface lives in the repo root. A new CI workflow regenerates it on every PR and fails the build if the diff is non-empty without an accompanying snapshot update. This catches accidental API changes at PR review time, not at release time. The 0.9.7 incident (a patch release that silently broke as_slice) cannot happen again.
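The sparse-file point can be checked with the standard library alone. The sketch below is not the crate's code: create_sparse is a hypothetical helper that extends a fresh file with set_len and reports the logical length next to the bytes actually allocated (Unix only, via the 512-byte block count). Whether holes are really left unallocated depends on the filesystem.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Hypothetical helper (not part of mmap-io): create or truncate `path`,
/// extend it to `size` bytes with `set_len`, and return
/// `(logical_len, allocated_bytes)`. The allocated figure is only
/// available on Unix, where `blocks()` counts 512-byte units.
pub fn create_sparse(path: &Path, size: u64) -> io::Result<(u64, Option<u64>)> {
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;

    // Extending with set_len writes no data; on sparse-capable
    // filesystems the new range is a hole.
    file.set_len(size)?;
    let meta = file.metadata()?;

    #[cfg(unix)]
    let allocated = {
        use std::os::unix::fs::MetadataExt;
        Some(meta.blocks() * 512)
    };
    #[cfg(not(unix))]
    let allocated = None;

    Ok((meta.len(), allocated))
}
```

Calling create_sparse with a large size and comparing the two numbers shows how much disk the file consumes before any page is written.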
Stability commitment
From this release forward:
- Major version bumps (2.0.0+) are reserved for genuine breaking changes. They will be telegraphed in advance through deprecation warnings and migration notes.
- Minor bumps (1.1.0+) ship additive features. New methods, new types, new feature flags. Existing code keeps compiling.
- Patch bumps (1.0.1+) ship bug fixes, perf improvements, internal refactors, and documentation polish. No public-surface changes.
cargo-semver-checks runs on every PR against the latest published version. The public-api workflow runs on every PR against the committed snapshot. Either check failing is a blocking gate, not advisory.
The path from 0.9.x
mmap-io was first published at 0.1.0 on 2025-08-05. The 0.9.x series ran from 0.9.0 through 0.9.11 over the course of the prior year, with the bulk of the load-bearing work in the final months:
- 0.9.5: Three critical correctness bugs closed. flush_range accumulator math, EveryMillis flush-policy lifetime, atomic-view UAF across resize.
- 0.9.6: Unsafe audit. Every unsafe block re-justified with platform-spec citations; property-test suite added.
- 0.9.7: Performance milestone. Zero-copy iterators (13.2x at 4 KiB, 475x at 64 KiB), zero-copy as_slice on RW mappings, touch_pages tight-loop rewrite (~50-100x). This release also accidentally shipped breaking signature changes as a patch bump, a SemVer violation that carried for four releases.
- 0.9.8: Ergonomic surface (audit E1-E7, F2, F5, F9). open_or_create, from_file, unmap, raw pointer access, prefetch, builder polish.
- 0.9.9: Native file-change watchers via inotify / FSEvents / ReadDirectoryChangesW.
- 0.9.10: Pre-1.0 stabilization. Fuzz scaffold, ten focused examples, docs/PERFORMANCE.md with measured numbers, cargo-audit and cargo-semver-checks CI workflows, bench-regression hard gate.
- 0.9.11: Field-issue response. Compat shims for the 0.9.7 violation (as_slice_bytes, for_each_mut_legacy), runtime-agnostic async (smol works), bytes::Bytes integration, io::Read + io::Seek cursor, AsFd / AsRawFd / AsHandle / AsRawHandle trait impls.
1.0.0 is the result of that work. The code base is the same one that has been running through 0.9.11; this release adds the four 1.0-bracket items (F1, F4, T6, F6), the doc-completeness pass, and the SemVer enforcement infrastructure.
What 1.0.0 does NOT include
A handful of items from the audit are explicitly deferred to post-1.0:
- sync_file_range (Linux). A niche fast path for some DB workloads. No external request to date.
- Custom allocator hook. Speculative; will land when a real consumer asks.
- Additional platform targets (FreeBSD, illumos). Mostly CI plumbing; the POSIX path already covers them in source.
- MmapInventory::all_open() diagnostic. Probably belongs in a separate diagnostics crate.
- Memory-pressure callbacks. Speculative; pattern unclear without a concrete consumer.
These are tracked in .dev/ROADMAP.md under "post-1.0 long-term work" and will ship as additive 1.x minor bumps when there is a real consumer.
Tests
- 158 tests pass under --all-features (up from 140 in 0.9.11), 1 ignored (the unrelated hugepages-fallback test), 0 failed.
- 17 new tests in tests/v1_0_0_additions.rs cover the AnonymousMmap surface and is_hugepage_backed.
- 1 new test in tests/ipc_cross_process.rs covers cross-process byte visibility.
- cargo build (default / no-default / all-features), cargo +1.75 build --all-features (MSRV), cargo fmt --check, cargo clippy --all-targets --all-features -D warnings, cargo doc --no-deps --all-features, cargo audit --deny warnings, cargo semver-checks check-release, and cargo public-api snapshot diff all clean.
Internals
- New module src/anonymous.rs exporting AnonymousMmap.
- is_hugepage_backed and the smaps parser live in src/mmap.rs behind #[cfg(target_os = "linux")] (returns None on other platforms).
- MappedSliceMut gains a pub(crate) fn guarded constructor so the anonymous module can build write-guarded slices without duplicating the field-access boilerplate.
- Bench-only fix: Arc import in benches/mmap_bench.rs gated behind #[cfg(feature = "atomic")] to match its sole call site.
- New CI workflow .github/workflows/public-api.yml runs the diff check.
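For readers curious what an smaps-based check involves, here is a minimal sketch. It is an assumption about the general approach, not the parser in src/mmap.rs: hugepage_backed_from_smaps is a hypothetical function that takes the text of /proc/self/smaps plus an address, finds the mapping containing that address, and sums the three hugepage fields named in these notes.

```rust
/// Parse a `start-end` header token such as `7f0000000000-7f0000400000`.
fn parse_range(token: &str) -> Option<(u64, u64)> {
    let (start, end) = token.split_once('-')?;
    Some((
        u64::from_str_radix(start, 16).ok()?,
        u64::from_str_radix(end, 16).ok()?,
    ))
}

/// Hypothetical sketch: `Some(true)` if the mapping containing `addr`
/// reports any hugepage-backed kilobytes, `Some(false)` if it reports
/// none, `None` if no mapping contains `addr`.
pub fn hugepage_backed_from_smaps(smaps: &str, addr: u64) -> Option<bool> {
    let mut in_target = false;
    let mut found = false;
    let mut huge_kb: u64 = 0;

    for line in smaps.lines() {
        let mut parts = line.split_whitespace();
        let Some(first) = parts.next() else { continue };

        // A header line starts a new mapping record.
        if let Some((start, end)) = parse_range(first) {
            if found {
                break; // the target record is complete
            }
            in_target = start <= addr && addr < end;
            found = in_target;
            continue;
        }

        // Field lines look like `AnonHugePages:      2048 kB`.
        if in_target
            && matches!(first, "AnonHugePages:" | "Private_Hugetlb:" | "Shared_Hugetlb:")
        {
            if let Some(kb) = parts.next().and_then(|v| v.parse::<u64>().ok()) {
                huge_kb += kb;
            }
        }
    }

    found.then_some(huge_kb > 0)
}
```

On Linux the input would come from std::fs::read_to_string("/proc/self/smaps"); taking the text as a parameter keeps the sketch testable on any platform.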
Notes
- MSRV unchanged at Rust 1.75.
- No new dependencies in the production tree. tempfile remains dev-only.
- The 0.9.7-introduced API surface (MappedSlice, the zero-copy iterators, the flattened for_each_mut) is the recommended path for new code; the compat shims from 0.9.11 (as_slice_bytes, for_each_mut_legacy, chunks_owned) are preserved in 1.0.0 and will not be deprecated.
Full Changelog: v0.9.11...v1.0.0
v0.9.11 — Plays well with others
Release Notes for v0.9.11 - Plays well with others
Version 0.9.11 - 2026-05-14
smol works now. The async feature is runtime-agnostic since 0.9.11: drop tokio from your tree and drive update_region_async / flush_async straight from smol, async-std, or any custom executor. See "Smol support, in detail" below for the implementation story; jump to the "Runtime-agnostic async" highlight for the quick version.
Patch release responding to two issues filed against 0.9.10. bbqsrc in #6 flagged that the as_slice signature change in 0.9.7 was a semver violation against Cargo's pre-1.0 compatibility rules (^0.9.6 resolves to <0.10.0, so 0.9.7 should not have shipped a breaking change). ararog asked for smol runtime support, which the tokio-locked async feature did not provide. Both fixed here, plus an opportunistic ecosystem polish pass: bytes::Bytes integration, io::Read + io::Seek cursor, and AsFd / AsRawFd / AsHandle / AsRawHandle trait impls. Everything additive; no API breaks.
Highlights
- Runtime-agnostic async (smol unlocked). update_region_async, flush_async, flush_range_async, and the manager::async::* helpers now route through blocking::unblock instead of tokio::task::spawn_blocking. Existing tokio users see no API change. The futures drive to completion on any executor (tokio, smol, async-std, embassy on hosted, custom executors). The tokio direct dep is gone under --features async; the much smaller blocking crate replaces it. The transitive dep tree shrinks notably. smol-based projects can finally use the async surface without dragging tokio in.
- Compat shims for the 0.9.7 semver violation. MemoryMappedFile::as_slice_bytes(offset, len) -> Result<&[u8]> mirrors the 0.9.6 as_slice signature exactly: returns &[u8] directly on ReadOnly and CopyOnWrite mappings, returns MmapIoError::InvalidMode on ReadWrite matching the 0.9.6 behavior. Code that was broken by the 0.9.7 return-type change recovers with a one-method-name rename. ChunkIteratorMut::for_each_mut_legacy<F, E>(F) -> Result<Result<(), E>> mirrors the 0.9.6 nested-Result signature. Internally uses the same single-held-write-guard loop as the flattened for_each_mut, so the H2 perf win from 0.9.7 is preserved. chunks_owned() (already shipped in 0.9.7) covers the iterator-item migration path.
- bytes::Bytes integration (feature = "bytes"). New MemoryMappedFile::read_bytes(offset, len) -> Result<bytes::Bytes> plus From<MappedSlice<'_>> / From<&MappedSlice<'_>> for bytes::Bytes. One allocation + memcpy at the conversion boundary; the resulting Bytes is mapping-lifetime-independent and travels freely through hyper / tower / tonic / axum / reqwest. Opt-in feature, zero cost when disabled.
- io::Read + io::Seek cursor. mmap.reader() returns an MmapReader<'_> that plugs the mapping into every parser / decoder that takes a generic R: Read: serde_json::from_reader, flate2::read::GzDecoder, tar::Archive::new, image::ImageReader::new, the standard library's BufReader, etc. Bounds-checked under the hood. position() and set_position() for direct cursor control.
- AsFd + AsRawFd (Unix) and AsHandle + AsRawHandle (Windows) trait impls on MemoryMappedFile. The std-blessed way to hand the underlying OS handle to FFI code, nix, rustix, polling, etc., without going through unmap.
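To illustrate what a Read + Seek cursor over borrowed bytes buys a caller, the sketch below uses std::io::Cursor as a stand-in. MmapReader itself is not reproduced here; reader_over and read_u32_le_at are hypothetical names, and the only assumption is that the mapped bytes are reachable as a &[u8].

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Stand-in for a mapping-backed reader: a cursor over borrowed bytes.
/// (Hypothetical helper; the crate's own type is `MmapReader`.)
pub fn reader_over(bytes: &[u8]) -> Cursor<&[u8]> {
    Cursor::new(bytes)
}

/// Any code written against `R: Read + Seek` works unchanged with such
/// a reader. This one seeks to `offset` and decodes a little-endian u32.
pub fn read_u32_le_at<R: Read + Seek>(r: &mut R, offset: u64) -> io::Result<u32> {
    r.seek(SeekFrom::Start(offset))?;
    let mut buf = [0u8; 4];
    // read_exact fails with UnexpectedEof instead of reading past the end.
    r.read_exact(&mut buf)?;
    Ok(u32::from_le_bytes(buf))
}
```

The same generic function would accept a File, a BufReader, or any other seekable reader, which is the point of exposing the mapping through the standard traits.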
The semver story
The 0.9.7 release changed as_slice from Result<&[u8]> to Result<MappedSlice<'_>>, changed iterator Item from Result<Vec<u8>> to MappedSlice<'a>, and flattened for_each_mut's closure return shape. Those were real wins (the iterator change alone is 13-475x faster than the old allocating form per the 0.9.10 measurements). The mistake was the version number: 0.9.6 → 0.9.7 is treated as a minor / patch bump by Cargo's resolver (Rust pre-1.0 semver rule: the leftmost non-zero digit acts as "major"), so ^0.9.6 users were silently upgraded into broken code.
The break carried for four releases (0.9.7 through 0.9.10) without acknowledgement. The cargo-semver-checks workflow added in 0.9.10 would have caught this exact case at PR time but did not exist yet in 0.9.7. The compat shims in this release give downstream callers a one-line recovery path. Apologies to bbqsrc and to anyone else whose 0.9.6 code stopped compiling on 0.9.7. The CHANGELOG and the README's new "Migrating from 0.9.6" section document the breaks and the recovery path explicitly so the issue is visible in the public record.
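The resolver rule behind the violation is easy to state in code. The following is a simplified sketch of Cargo's caret matching (pre-release and build metadata are ignored); caret_matches and the tuple-based Version alias are illustrative names, not part of any crate.

```rust
/// (major, minor, patch); pre-release tags are ignored in this sketch.
pub type Version = (u64, u64, u64);

/// True when `candidate` satisfies the caret requirement `^req`.
/// The leftmost non-zero component of `req` acts as the "major" version.
pub fn caret_matches(req: Version, candidate: Version) -> bool {
    // Never resolve below the requested version.
    if candidate < req {
        return false;
    }
    match req {
        // ^0.0.z admits only 0.0.z itself.
        (0, 0, _) => candidate == req,
        // ^0.y.z admits any 0.y.* at or above z.
        (0, minor, _) => candidate.0 == 0 && candidate.1 == minor,
        // ^x.y.z admits any x.*.* at or above y.z.
        (major, _, _) => candidate.0 == major,
    }
}
```

Under this rule caret_matches((0, 9, 6), (0, 9, 7)) is true, which is why a ^0.9.6 requirement picked up 0.9.7 automatically, while (0, 10, 0) would have been excluded.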
Smol support, in detail
The async feature implementation through 0.9.10 wrapped synchronous calls in tokio::task::spawn_blocking and required a tokio runtime to drive the returned future. smol users calling mmap.update_region_async(...).await from a smol-based application got a future that never executed because spawn_blocking depends on tokio's runtime being installed.
Since 0.9.11 we use blocking::unblock instead. The blocking crate is the runtime-agnostic primitive used by smol, async-fs, async-net, async-process, and other "small async" crates. tokio's own runtime drives futures from blocking just fine because they're standard Futures. The cost of the swap is one extra dep (blocking plus its small set of transitive deps), more than offset by dropping tokio's much larger tree.
The integration test async_surface_runs_under_a_non_tokio_executor proves the case directly: it constructs a block_on from std::thread::park (the same primitive smol and pollster use), drives the async methods through it, and verifies the bytes round-trip. No tokio runtime, no panic, full async functionality.
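A block_on built from std::thread::park fits in a few lines of std-only Rust. The version below is a generic sketch of the technique, not the code from the integration test; ThreadWaker is an illustrative name.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread which is blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drive `fut` to completion on the current thread with no runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    // Pin the future on the stack so it can be polled.
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the waker unparks us, then poll again.
            // Spurious wakeups only cost an extra poll.
            Poll::Pending => thread::park(),
        }
    }
}
```

Because futures produced by blocking::unblock are ordinary Futures that signal completion through the waker, an executor this small is enough to drive them.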
Tests
- 140 tests pass under --all-features (up from 127 in 0.9.10), 1 ignored (the unrelated hugepages-fallback test), 0 failed.
- 13 new tests in tests/v0_9_11_additions.rs cover every new method and the smol-compat validation.
- cargo build (default / no-default / all-features), cargo +1.75 build --all-features (MSRV), cargo fmt --check, cargo clippy --all-targets --all-features -D warnings, cargo doc --no-deps --all-features, and cargo audit --deny warnings all clean.
- 124 transitive crates in the lockfile (was ~130 with tokio in [dependencies]); the dep tree is smaller despite the new additions.
- Banned-words scan zero hits.
Internals
- tokio removed from [dependencies]; added to [dev-dependencies] purely so the existing #[tokio::test] test suite continues to drive the runtime-agnostic async surface. Downstream consumers no longer pull tokio via --features async.
- blocking 1.6 added under --features async.
- bytes 1 added under --features bytes.
Notes
- MSRV unchanged at Rust 1.75.
- All 0.9.7-introduced API surface (MappedSlice, the new iterator items, the flattened for_each_mut) remains the recommended path. The compat shims are explicitly migration aids; the rustdoc on each shim points back at the modern equivalent.
- bytes::Bytes::from(MappedSlice<'_>) is one allocation + memcpy. For zero-copy networking, callers should use as_slice directly and pass the borrowed &[u8] through the chain; the Bytes conversion is the right call when ownership has to cross a thread or process boundary.
Full Changelog: v0.9.10...v0.9.11
v0.9.9 — Feature complete
Release Notes for v0.9.9 - Native eyes
Version 0.9.9 - 2026-05-12
The watch feature gets its real backend. Through 0.9.8 the implementation was a polling thread that re-stat'd the file at 100 ms intervals, which was reasonable on Linux/macOS but unreliable on Windows where the mtime granularity is too coarse for sub-second changes (three watch tests were #[cfg_attr(windows, ignore)]'d as a consequence). This release replaces all of that with the OS-native event source on every platform: inotify on Linux, FSEvents on macOS, and ReadDirectoryChangesW on Windows. Backed by the notify 6 crate gated on the watch feature, with default-features off and only the macos_fsevent selector enabled to keep the transitive set tight.
Public API is unchanged: MemoryMappedFile::watch(callback) still returns a Result<WatchHandle>, the callback still receives ChangeEvent { offset, len, kind } with the same 3-variant ChangeKind. The breaking aspect is implementation-side timing. Polling delivered changes after a ~100 ms detection window; native backends deliver in <1 ms on Linux, <10 ms on Windows, <50 ms on macOS (FSEvents coalesces by design). Callers who used the previous 100 ms delivery as a de-facto debounce floor will see events sooner. The three previously-ignored Windows watch tests run live on every platform now, plus a new tests/watch_native.rs file with five integration tests covering modify, truncate, extend, rapid-sequence, and removed.
This release also formalizes the 1.0.0 hold: the codebase is structurally ready for 1.0 after 0.9.10 closes (fuzz, examples, performance docs, migration guide), but 1.0.0 is on indefinite hold pending cross-repo presentation work (consistent headers/branding/SECURITY.md across the project family). The previously-planned 1.0.0-rc.1 candidate phase is dropped: hyphenated tags caused tooling issues in prior cycles, soak happens on the last 0.9.x in real-world deployment, and 1.0.0 ships directly when it ships. ROADMAP updated accordingly.
Highlights
- Native backends on every platform. inotify on Linux delivers events as the kernel sees them (typical <1 ms from file change to callback). FSEvents on macOS batches at ~50 ms granularity by design. ReadDirectoryChangesW on Windows delivers in <10 ms. The platform-behavior table is in the rustdoc on MemoryMappedFile::watch and in docs/API.md.
- notify 6 as the abstraction layer. notify::recommended_watcher picks the best backend at compile time; we wrap it with our WatchHandle type that owns the underlying watcher plus a dispatcher thread that drains the event channel and translates each notify::EventKind into our coarser ChangeKind. The translation is in map_notify_kind and is intentionally conservative: anything that's not clearly metadata or removal surfaces as Modified so callers refresh rather than miss a real change.
- Three previously-ignored Windows watch tests run live. watch::tests::test_watch_file_changes, watch::tests::test_multiple_watchers, and tests/feature_integration.rs::test_all_features_integration all lose their #[cfg_attr(windows, ignore = "...")] markers and pass on every platform.
- Five new integration tests in tests/watch_native.rs: watch_modify_detected, watch_truncate_detected, watch_extend_detected, watch_rapid_sequence_coalesces_or_reports_each, watch_removed_event_terminates_dispatcher. Each uses std::fs API writes from a separate file handle to simulate the real-world "another process modified the file" scenario.
- WatchHandle::Drop tears down cleanly. Drops the underlying notify::RecommendedWatcher first (which closes the OS subscription synchronously and the internal channel), then detaches the dispatcher thread via a join wrapper so the dropping thread is never blocked past the OS's own teardown time.
- WatchHandle::is_active() is now part of the surface (was previously #[allow(dead_code)]); useful for tests and diagnostics.
Bug-class changes
- The Windows polling-watch flakiness is gone. The root cause was Windows mtime granularity: std::fs::metadata().modified() returned timestamps quantized to the FS-level resolution, so two writes within the same tick looked identical and the watch loop missed the second. ReadDirectoryChangesW operates on raw filesystem change notifications and does not depend on mtime, so the granularity issue disappears entirely.
- Per-watcher thread leak is gone. The old polling implementation spawned a thread per watch call that ran a loop { sleep; poll } until the file was deleted; WatchHandle::Drop signaled it via an AtomicBool (audit H5 fix in 0.9.5). The new implementation's thread exits as soon as the channel closes, which happens synchronously when the RecommendedWatcher drops. No AtomicBool, no polling interval to wait for.
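The "thread exits when the channel closes" shape can be shown with std::sync::mpsc alone. In this sketch the Sender stands in for the watcher that owns the sending half; spawn_dispatcher and the u32 event type are illustrative, not the crate's internals.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Spawn a dispatcher that forwards every event to `on_event` and
/// returns the number delivered once the channel closes.
/// (Hypothetical stand-in: the returned `Sender` plays the watcher.)
pub fn spawn_dispatcher<F>(mut on_event: F) -> (Sender<u32>, JoinHandle<usize>)
where
    F: FnMut(u32) + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<u32>();
    let handle = thread::spawn(move || {
        let mut delivered = 0;
        // Iterating a Receiver blocks for the next event and ends as
        // soon as every Sender is dropped: no shutdown flag, no sleep
        // interval to wait out.
        for event in rx {
            on_event(event);
            delivered += 1;
        }
        delivered
    });
    (tx, handle)
}
```

Dropping the sender is the whole shutdown protocol: the loop ends on its own and the thread can be joined or detached.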
Important note: mmap-write detection
mmap-side writes (mmap.update_region(...) followed by mmap.flush()) are not a reliable trigger for any platform's native FS watcher. The writes go through the page cache and only reach the watcher at OS-decided writeback time, which is platform-dependent and never well-bounded. This is the same story as on every other native filesystem watcher in existence (notify-rs's docs note this; inotify's man page documents the limitation).
For reliable cross-platform detection, modify the file through the std::fs API from another process or another file handle. This matches the actual real-world use case for watch: detect changes made by something other than the current mapping holder. The rustdoc on MemoryMappedFile::watch calls this out, the README has a one-line note in the watch example, and the integration tests are written this way (write via std::fs::OpenOptions + write_all + sync_all).
Breaking changes
No public API breaks. The full surface of MemoryMappedFile::watch, WatchHandle, ChangeEvent, and ChangeKind is unchanged.
The implementation-side change in event timing (faster delivery) is the only behavioral difference visible to callers. If you were relying on the old ~100 ms polling delay as an implicit debounce floor, add explicit debouncing on top of the callback (wait N ms after the last event before reacting). Most callers will simply observe their changed flag flip faster.
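One way to add the explicit debounce is to forward events from the watch callback into a channel and react only after a quiet period. The sketch below assumes that arrangement; next_debounced_burst is a hypothetical helper, not part of the crate.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Block for the first event, then keep absorbing events until `quiet`
/// passes with none arriving. Returns how many events were coalesced,
/// or `None` if the sending side is gone and nothing is pending.
pub fn next_debounced_burst<T>(rx: &Receiver<T>, quiet: Duration) -> Option<usize> {
    // Wait (indefinitely) for the start of a burst.
    rx.recv().ok()?;
    let mut count = 1;
    loop {
        match rx.recv_timeout(quiet) {
            // Another event inside the quiet window: restart the wait.
            Ok(_) => count += 1,
            // Quiet period elapsed, or the sender went away: react now.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return Some(count)
            }
        }
    }
}
```

A consumer thread would call this in a loop and refresh its view once per returned burst; a quiet value around the old 100 ms roughly restores the previous pacing.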
Tests
- 127 tests pass under --all-features (up from 121 in 0.9.8), 1 ignored (the unrelated hugepages-fallback test), 0 failed. The 3 Windows-ignored watch tests are now live, and 5 new integration tests landed in tests/watch_native.rs.
- CI matrix combos still clean locally for --no-default-features and --no-default-features --features "cow locking advise".
- cargo fmt --check clean.
- cargo clippy --all-targets --all-features -D warnings clean.
- Banned-words scan zero hits.
- MSRV unchanged at Rust 1.75. cargo +1.75 build --all-features clean. notify 6.1.x advertises MSRV 1.60; the transitive set adds crossbeam-channel, mio (Linux), filetime, and windows-sys-derived shims (Windows).
Documentation
- docs/API.md watch section fully rewritten. Platform-behavior table (Linux <1 ms / macOS <50 ms / Windows <10 ms typical latencies), coalescing notes, error contract, the mmap-write caveat. Install snippets bumped to 0.9.9. Version history entry added.
- README.md feature-table entry rewritten ("Native file-change notifications: inotify (Linux), FSEvents (macOS), ReadDirectoryChangesW (Windows)"). Watch example block updated. mmap-write caveat note added.
- REPS.md watch surface marked Since 0.9.9: backed by notify.
- .dev/ROADMAP.md restructured: 1.0.0 placed on indefinite hold with the rationale documented (cross-repo presentation cleanup), 1.0.0-rc.1 removed entirely (hyphenated tags caused prior tooling issues), versioning strategy through hold documented as "continue with 0.9.x; jump to 0.10.0 if necessary, but prefer 0.9 to 1.0 trajectory."
Roadmap status
| Version | Goal | Status |
|---|---|---|
| 0.9.5 | Correctness bugfix release | shipped |
| 0.9.6 | Unsafe audit + property tests | shipped |
| 0.9.7 | Performance milestone (zero-copy iteration, touch_pages tight loop) | shipped |
| 0.9.8 | Ergonomic API expansion | shipped |
| 0.9.9 | Native watch backends | this release |
| 0.9.10 | Pre-1.0 stabilization (fuzz, examples, performance docs, migration guide) | next |
| 1.0.0 | Stable | on hold pending cross-repo cleanup |
Notes
- No source-code change to the watch public API. Callers compile unchanged. The dispatcher thread is named mmap-io-watch:<path> so it's easy to identify in top / Task Manager / debugger views.
- notify was chosen over hand-rolled per-platform code because it has been stress-tested by thousands of downstream crates over multiple years across all three platforms; the per-platform quirks (inotify mask handling, FSEvents event types, ReadDirectoryChangesW overlapped-I/O semantics) are deep enough to make hand-rolling a multi-week investment with no functional payoff. The dep cost (one direct dep, ~5 transitive items with default-features off) is bounded and stable.
- F1 (anonymous shared-memory mapping) remains open for the same structural reason it was held in 0.9.8: it requires Inner.file: Option<File> plus sentinel-path handling threaded through resize / prefetch / async-flush. Tractable but its own focused milestone.
Full Changelog: v0.9.8...v0.9.9
v0.9.8 — Surface Finish
Release Notes for v0.9.8 - Surface Finish
Version 0.9.8 - 2026-05-12
The ergonomic pass. Eight new methods land on the public surface, all additive, all carrying their full safety contract and a working code example. open_or_create answers the most common question every user asks first ("how do I open if it's there and create if it's not?") with one call. from_file is the escape hatch for callers who already opened the file with their own OpenOptions (O_DIRECT, O_NOATIME, an inherited fd from a parent process) and want to mmap it without re-opening. unmap is the inverse: consume the mapping, drop the underlying memory in safe order, hand back the bare File so the caller can keep using it. flush_policy and pending_bytes expose the durability accumulator for diagnostics and dashboards. as_ptr / as_mut_ptr are the FFI escape hatches for handing the mapping to a C library by base pointer + length. prefetch_range issues posix_fadvise(POSIX_FADV_WILLNEED) on Linux to warm the page cache from the file side, complementary to the existing advise(WillNeed) that warms it from the VM side.
Under the surface, a real bug closed in the time-based flusher: a Duration subtraction that could underflow and panic if thread::sleep overshot under heavy scheduler contention. Fixed with saturating_sub. Bounds-check helpers (ensure_in_bounds, slice_range) and the small-and-hot accessors (len, is_empty, mode, flush_policy, pending_bytes) gained #[inline] so the optimiser can fold them into the call site every time. The two-branch bounds check merged into a single saturating_add comparison. Audit items E1, E2, E6, E7, F2, F5, and F9 all close in this release.
Highlights
- open_or_create(path, default_size) opens the file if it exists, creates it at default_size bytes if it does not. The existing-file path ignores default_size and keeps the file's current length. The builder gets a matching terminal method: MemoryMappedFile::builder(path).mode(...).size(...).flush_policy(...).open_or_create().
- from_file(file, mode, path) wraps a pre-opened std::fs::File in a MemoryMappedFile. Callers needing custom OpenOptions (Direct I/O, no-atime, security descriptors, inherited file descriptors) construct the File themselves and hand it off. The path argument is informational for path() / error messages; no syscall happens on it.
- unmap(self) -> Result<File, Self> consumes the mapping and returns the underlying File. Drop order is enforced: the background flusher stops first, then the mapping releases its virtual address space, then the file handle is yielded to the caller. Returns Err(self) unchanged if other MemoryMappedFile clones are alive (the File is shared via Arc<Inner> and cannot be extracted while other handles hold references).
- flush_policy() / pending_bytes() are #[inline] O(1) accessors over the durability accumulator. Useful for observability dashboards on long-running writers: poll pending_bytes() to see how close you are to the next auto-flush under EveryBytes / EveryWrites.
- unsafe fn as_ptr(&self) -> *const u8 and unsafe fn as_mut_ptr(&self) -> Result<*mut u8> expose raw base pointers for FFI use cases that need a const void * / void * plus length. The rustdoc spells out the safety contract: do not dereference past len(), do not hold the pointer across resize(), do not alias the mutable pointer with any live Rust & reference to the same bytes.
- prefetch_range(offset, len) issues posix_fadvise(POSIX_FADV_WILLNEED) against the file descriptor on Linux (and Android), kicking off kernel-side readahead. Documented as a no-op on other platforms. Bounds-checked. Complementary to advise(MmapAdvice::WillNeed): file-side readahead via posix_fadvise versus VM-side via madvise. Issuing both helps cold reads of huge files.
- 17 new integration tests in tests/ergonomic_api.rs cover every new method, both happy and error paths: open_or_create on both create / open paths, builder open_or_create on both, from_file across RO / RW / zero-length, unmap unique vs shared, flush_policy default vs explicit, pending_bytes through a threshold-crossing write, as_ptr / as_mut_ptr roundtrips against read_into, and prefetch_range in-bounds / OOB / zero-length.
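The Result<File, Self> shape of unmap follows naturally from Arc::try_unwrap. The sketch below models only that ownership rule: Handle and Inner are simplified stand-ins that hold nothing but a File, with no mapping or flusher involved.

```rust
use std::fs::File;
use std::sync::Arc;

/// Simplified stand-in for the shared state behind a mapping handle.
struct Inner {
    file: File,
}

/// Cloneable handle; every clone shares the same `Inner`.
#[derive(Clone)]
pub struct Handle {
    inner: Arc<Inner>,
}

impl Handle {
    pub fn new(file: File) -> Self {
        Handle { inner: Arc::new(Inner { file }) }
    }

    /// Give the `File` back if this is the last handle; otherwise
    /// return the handle unchanged so the caller can retry later.
    pub fn unmap(self) -> Result<File, Self> {
        match Arc::try_unwrap(self.inner) {
            // Sole owner: take the state apart and yield the file.
            Ok(inner) => Ok(inner.file),
            // Other clones still hold references: hand `self` back.
            Err(shared) => Err(Handle { inner: shared }),
        }
    }
}
```

Returning Err(self) rather than an error enum means a failed call loses nothing: the caller still owns a working handle and can retry after the other clones are dropped.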
Performance
- Bounds-check helpers #[inline]-ed. ensure_in_bounds and slice_range are called from every bounds-checked public method (as_slice, as_slice_mut, read_into, update_region, flush_range, touch_pages_range, prefetch_range, advise, lock, segment access). Inlining removes the function-call boundary on every read/write. The two-branch bounds check also collapsed into a single saturating_add comparison: the previous form did if offset > total then if offset + len > total; the new form checks offset + len > total || offset > total in one expression (saturating-add catches the overflow case the first branch was redundantly guarding against).
- Hot-path accessors marked #[inline]: len(), is_empty(), mode(), flush_policy(), pending_bytes(). All trivial (one field read or one lock read of a u64), so inlining is a clear win.
- align_up marked #[inline]: called from flush_range (microflush page alignment), touch_pages_range, and prefetch alignment paths. Trivial bit math.
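The bounds-check collapse can be sketched as two small functions. The names and the bool return type are illustrative (the crate's ensure_in_bounds signature is not shown in these notes), and the two-branch version uses checked_add as a stand-in for the older overflow handling.

```rust
/// Two-branch form: reject an out-of-range offset first, then check
/// the end of the range with explicit overflow handling.
#[inline]
pub fn in_bounds_two_branch(offset: u64, len: u64, total: u64) -> bool {
    if offset > total {
        return false;
    }
    match offset.checked_add(len) {
        Some(end) => end <= total,
        None => false,
    }
}

/// Single-comparison form. If `offset + len` overflows, the sum
/// saturates to u64::MAX, which exceeds any realistic mapping length,
/// and `offset > total` already implies `offset + len > total`.
/// (The two forms differ only in the unreachable `total == u64::MAX` case.)
#[inline]
pub fn in_bounds_single(offset: u64, len: u64, total: u64) -> bool {
    offset.saturating_add(len) <= total
}
```

Both functions agree on every input a real mapping can produce, so the single comparison is a drop-in replacement with one branch fewer.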
Bug fix
- flush::TimeBasedFlusher Duration underflow. The thread loop computed the next sleep slice as shutdown_poll.min(interval - elapsed). If thread::sleep overshot (heavy scheduler contention, system suspend/resume, oversubscribed runtime) elapsed could exceed interval and the subtraction would panic on Duration underflow, killing the flusher thread. Fixed by using interval.saturating_sub(elapsed): when overshoot occurs the remaining slice clamps to zero and the loop yields immediately, re-checks the shutdown flag, and either fires the callback or exits. No behavioural change in the common case.
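The fix is small enough to show in isolation. This sketch lifts the sleep-slice computation into a free function; next_sleep_slice is a hypothetical name and the surrounding flusher loop is omitted.

```rust
use std::time::Duration;

/// Length of the next sleep: never longer than the shutdown poll
/// period, never longer than what remains of the flush interval.
pub fn next_sleep_slice(shutdown_poll: Duration, interval: Duration, elapsed: Duration) -> Duration {
    // `interval - elapsed` panics when elapsed > interval;
    // saturating_sub clamps the remainder to zero instead.
    shutdown_poll.min(interval.saturating_sub(elapsed))
}
```

With a 1 s interval and a 50 ms poll period, an elapsed time of 1.5 s yields a zero-length slice, so the loop falls straight through to its shutdown check instead of panicking.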
Tests
- 121 tests pass under --all-features (up from 101 in 0.9.7), 4 ignored (3 polling-watch tests gated on Windows mtime granularity, 1 hugepages fallback), 0 failed.
- CI matrix combos: --no-default-features and --no-default-features --features "cow locking advise" both clean locally; doctest counts grew from 13 to 16 with the new method examples.
- Banned-words scan zero hits.
- cargo fmt --check clean.
- cargo clippy --all-targets --all-features -D warnings clean on default lints.
- MSRV unchanged at Rust 1.75. cargo +1.75 build --all-features clean.
Documentation
- Cargo.toml SEO sweep. Description leads with the unique selling point ("Zero-copy memory-mapped file I/O for Rust"), names the supported platforms, and lists the concrete use cases (databases, log structures, caches, game runtimes, IPC). Keywords tightened to the five highest-volume search terms: mmap, memory-mapped, zero-copy, filesystem, io. Categories: filesystem, data-structures, concurrency, database-implementations.
- README opening rewritten around the actual differentiators: zero-copy on every mode, zero-allocation iteration, lock-free atomic views, configurable durability, the C1/C2/C3 audit closures. The "Quick start" snippet now shows open_or_create as the everyday pattern alongside open_ro.
- docs/API.md has full sections for all eight new methods, TOC updated, install snippets bumped to 0.9.8, Version History entry added.
- REPS.md section 4 now lists every public method, including the new ones with // Since 0.9.8 markers and the builder addition.
Notes
- No new runtime dependencies. Linux posix_fadvise uses the already-required libc crate.
- MSRV unchanged at Rust 1.75.
- MappedSlice and MappedSliceMut are re-exported from the crate root since 0.9.7; that has not changed.
- The unsafe exposure of raw base pointers via as_ptr / as_mut_ptr is a deliberate FFI escape hatch. The Rust API surface remains safe; the unsafe marker on those two methods forces callers to acknowledge the documented contract.
Deferred (with documented reason)
- F1 (anonymous shared-memory mapping). new_anonymous(size) is the one item from the audit's ergonomic / functionality cluster that did not land in this release. The reason is structural: anonymous mappings have no backing File and no meaningful Path, so adopting them requires changing Inner.file: File to Option<File> and threading sentinel-path handling through the resize / prefetch / async-flush paths. That refactor is sized for its own focused milestone rather than rolled into the ergonomic pass.
- H3 (lock-free RW reads via arc-swap or UnsafeCell design). The current RwLock<MmapMut> design is sound and bounded; reads on RW mappings are concurrent across readers thanks to parking_lot's RwLock. Replacing the lock entirely is a memory-model question (do we accept torn reads from concurrent intra-process writers?), not a tuning question, and is rescoped to a 1.0 design conversation.
- Native watch backends (inotify / FSEvents / ReadDirectoryChangesW) and fuzz / MIRI runs remain on the roadmap for 0.9.9 and 0.9.10 respectively.
Full Changelog: v0.9.7...v0.9.8
v0.9.7 — Performance Unleashed
Release Notes for v0.9.7 - Performance Unleashed
Version 0.9.7 - 2026-05-12
The performance milestone. Every read path that previously allocated or memcpy'd is now zero-copy, and the hot loops that grabbed locks per chunk now grab them once. as_slice works uniformly on all three mapping modes (ReadOnly, CopyOnWrite, and ReadWrite); it returns a new MappedSlice<'_> wrapper that derefs to [u8] and, on RW, holds a read guard for its lifetime so a concurrent resize() blocks until the slice is dropped. The chunks() / pages() iterators no longer allocate a Vec<u8> per item: they yield MappedSlice<'a> borrowed directly from the mapped region. touch_pages was rewritten as a tight ptr::read_volatile loop that takes the lock once. chunks_mut().for_each_mut(...) likewise holds the write guard once for the entire iteration, and its nested Result<Result<(), E>> return type is flattened to Result<()>.
The headline numbers come from the H1 redesign: a 1 GiB scan at 4 KiB chunks goes from 262,144 heap allocations and 2x memory bandwidth (mmap → buffer → clone → caller) to zero allocations and 1x bandwidth (direct slice into the mapping). Audit items H1, H2, H4, and E4 all close in this release. H3 (lock-free RW reads via arc-swap) intentionally stays open and is rescoped to a 1.0 design conversation rather than a tactical fix; the current RwLock<MmapMut> is sound and bounded, and replacing it is a memory-model question, not a tuning question.
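To make the H1 change concrete, here is a minimal std-only sketch of an iterator that records a base pointer and length once and then hands out borrowed sub-slices. BorrowedChunks is a hypothetical stand-in, not the crate's ChunkIterator, and it borrows from a plain byte slice rather than a real mapping, so no lock guard is involved.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a zero-copy chunk iterator: it records the base
/// pointer and total length once, then yields borrowed sub-slices.
struct BorrowedChunks<'a> {
    base: *const u8,
    len: usize,
    chunk: usize,
    pos: usize,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> BorrowedChunks<'a> {
    fn new(region: &'a [u8], chunk: usize) -> Self {
        assert!(chunk > 0, "chunk size must be non-zero");
        Self {
            base: region.as_ptr(),
            len: region.len(),
            chunk,
            pos: 0,
            _borrow: PhantomData,
        }
    }
}

impl<'a> Iterator for BorrowedChunks<'a> {
    type Item = &'a [u8];

    fn next(&mut self) -> Option<Self::Item> {
        if self.pos >= self.len {
            return None;
        }
        let n = self.chunk.min(self.len - self.pos);
        // SAFETY: `base..base + len` is the slice borrowed for `'a` in `new`,
        // and `pos + n <= len`, so the sub-range is in bounds and still borrowed.
        let item = unsafe { std::slice::from_raw_parts(self.base.add(self.pos), n) };
        self.pos += n;
        Some(item)
    }
}

fn main() {
    // 1 GiB / 4 KiB = 262,144 chunks: one allocation each in the old design.
    assert_eq!((1u64 << 30) / (4 << 10), 262_144);

    let region = vec![7u8; 10_000]; // stands in for the mapped bytes
    let total: usize = BorrowedChunks::new(&region, 4096).map(|c| c.len()).sum();
    println!("visited {total} bytes without copying");
}
```

Each next() call is an offset computation and a bounds-limited slice construction, so nothing is allocated per item.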
Highlights
- MappedSlice<'a> wrapper: the unifying read-side type. Implements Deref<Target = [u8]>, AsRef<[u8]>, Debug, and PartialEq against [u8] / &[u8] / [u8; N] / &[u8; N] so call sites work as if it were a byte slice. On RO and COW the wrapper is the Owned(&'a [u8]) variant (lock-free, the underlying mapping is immutable). On RW the wrapper is the Guarded { guard, range } variant that holds the RwLock read guard for its lifetime. Re-exported from the crate root.
- as_slice works on RW (BREAKING): previously returned MmapIoError::InvalidMode on RW, forcing callers to read_into (which copies). Now returns Result<MappedSlice<'_>> uniformly across all three modes. Callers that previously caught InvalidMode should remove the branch.
- Iterator zero-copy (BREAKING): ChunkIterator::Item and PageIterator::Item are now MappedSlice<'a> (was Result<Vec<u8>>). The iterator captures the mapping's base pointer and total length once at construction; each next() is pointer arithmetic plus slice::from_raw_parts with no heap traffic. The iterator holds the RW read guard for its full lifetime, blocking concurrent resize() until iteration is done. Migration aids chunks_owned() and pages_owned() preserve the Vec<u8> ergonomics for the (rare) case where callers genuinely need owned buffers.
- for_each_mut flattened (BREAKING): ChunkIteratorMut::for_each_mut(F) -> Result<()> where F: FnMut(u64, &mut [u8]) -> Result<()>. The old nested Result<Result<(), E>> is gone, and the write guard is acquired ONCE for the entire iteration instead of per-chunk. Callers that returned Ok::<(), std::io::Error>(()) should return Ok(()) and map foreign errors into MmapIoError::Io(...) before returning.
- touch_pages / touch_pages_range rewritten (H2): previously called read_into(offset, &mut [0u8; 1]) per page, which acquired the lock, validated bounds, and memcpy'd one byte 262,144 times for a 1 GiB file. The new implementation acquires the lock ONCE, walks the mapping with std::ptr::read_volatile::<u8> wrapped in std::hint::black_box, and steps by page_size(). Expected speedup on multi-GiB files: 50-100x.
- Workload-pattern benches: sequential_read (1 / 16 / 256 MiB, as_slice vs read_into), random_read (xorshift64 PRNG, no new dep), sequential_write under three flush policies including EveryMillis(10) now that C2 made it actually work, iterator_throughput comparing zero-copy chunks() to chunks_owned() to show the H1 win directly, touch_pages_large on 1 GiB, and atomic_contention across 1 / 2 / 4 / 8 threads.
- CI workflow bench-regression.yml: runs the full bench suite on every push and PR, uploads target/criterion/ as an artifact for diff against the checked-in baseline. The hard >10% regression gate is rescoped to 0.9.10 alongside the rest of the pre-1.0 stabilization pass.
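The two-variant shape of the read-side wrapper can be sketched with the standard library alone. This is an illustration under assumptions, not the crate's definition: SliceView and view_rw are hypothetical names, std::sync::RwLock<Vec<u8>> stands in for the crate's parking_lot-based RwLock<MmapMut>, and the real MappedSlice may differ in detail.

```rust
use std::ops::{Deref, Range};
use std::sync::{RwLock, RwLockReadGuard};

/// Hypothetical read-side wrapper with the two shapes described above.
enum SliceView<'a> {
    /// Immutable backing store (RO / COW): a plain borrow, no lock involved.
    Owned(&'a [u8]),
    /// Lock-protected backing store (RW): keeps the read guard alive.
    Guarded {
        guard: RwLockReadGuard<'a, Vec<u8>>,
        range: Range<usize>,
    },
}

impl Deref for SliceView<'_> {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        match self {
            SliceView::Owned(bytes) => bytes,
            SliceView::Guarded { guard, range } => &guard[range.clone()],
        }
    }
}

/// Borrows `offset..offset + len` from a lock-protected buffer, or returns
/// `None` when the range is out of bounds (or the lock is poisoned).
fn view_rw(store: &RwLock<Vec<u8>>, offset: usize, len: usize) -> Option<SliceView<'_>> {
    let guard = store.read().ok()?;
    let end = offset.checked_add(len)?;
    if end > guard.len() {
        return None;
    }
    Some(SliceView::Guarded { guard, range: offset..end })
}

fn main() {
    let store = RwLock::new(b"hello mapped world".to_vec());
    let view = view_rw(&store, 6, 6).expect("range is in bounds");
    // Deref makes the wrapper usable like a byte slice.
    assert_eq!(&*view, b"mapped");
    // While `view` is alive, a writer (think `resize()`) cannot take the lock.
    assert!(store.try_write().is_err());
    drop(view);
    assert!(store.try_write().is_ok());

    let fixed = SliceView::Owned(b"read-only");
    println!("{} bytes, no lock held", fixed.len());
}
```

Holding the guard inside the value is what ties the lock's lifetime to the slice: dropping the wrapper is the only way to release it.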
Breaking changes
Three API breaks land in this release. All have mechanical migration paths.
- as_slice return type. Old: Result<&[u8]> (errors with InvalidMode on RW). New: Result<MappedSlice<'_>> for all three modes. MappedSlice derefs to [u8], so most call sites compile unchanged because indexing, iteration, and .len() go through Deref. Sites that bound the result as let s: &[u8] = mmap.as_slice(...)? need to change to let s = mmap.as_slice(...)?; (let the type be inferred), or keep the wrapper in its own binding and borrow from it (let guard = mmap.as_slice(...)?; let s: &[u8] = &guard;). The wrapper has to stay bound for as long as the borrowed slice is used: a one-statement let s: &[u8] = &*mmap.as_slice(...)? borrows from a temporary that is dropped at the end of the statement. Sites that caught InvalidMode on the RW path should remove that branch.
- Iterator items. Old: Iterator<Item = Result<Vec<u8>>>. New: Iterator<Item = MappedSlice<'a>>. Patterns like if let Some(Ok(chunk)) = iter.next() become if let Some(chunk) = iter.next(); for chunk in mmap.chunks(N) { let chunk = chunk?; ... } becomes for chunk in mmap.chunks(N) { ... }. Sites that genuinely need owned buffers (e.g., handing data to a thread that outlives the mapping borrow) should switch to chunks_owned() / pages_owned(), which preserve the old Result<Vec<u8>> shape.
- ChunkIteratorMut::for_each_mut signature. Old: fn for_each_mut<F, E>(F) -> Result<Result<(), E>>. New: fn for_each_mut<F>(F) -> Result<()> where F: FnMut(u64, &mut [u8]) -> Result<()> (the crate's Result, not std::result::Result<(), E>). Migration: return Ok(()) instead of Ok::<(), io::Error>(()) and map foreign errors with .map_err(|e| MmapIoError::Io(...)) before returning. The double ?? unwrap pattern at the call site collapses to a single ?.
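The for_each_mut migration is easiest to see side by side with a foreign error type. The sketch below is self-contained and uses hypothetical stand-ins (MapError, a free for_each_mut function over a byte buffer, checked_fill); it mirrors the flattened signature described above but is not the crate's code.

```rust
use std::io;

/// Hypothetical stand-in for the crate's error type.
#[derive(Debug)]
enum MapError {
    Io(io::Error),
}

/// Stand-in for the crate-level `Result` alias.
type Result<T> = std::result::Result<T, MapError>;

/// Hypothetical mutable chunk walker with the flattened signature: the
/// closure returns the same `Result` as the method itself.
fn for_each_mut<F>(buf: &mut [u8], chunk: usize, mut f: F) -> Result<()>
where
    F: FnMut(u64, &mut [u8]) -> Result<()>,
{
    let mut offset = 0u64;
    for piece in buf.chunks_mut(chunk) {
        let len = piece.len() as u64;
        f(offset, piece)?; // the first error stops the walk
        offset += len;
    }
    Ok(())
}

/// A fallible helper that still speaks `io::Error`, like foreign code would.
fn checked_fill(piece: &mut [u8], value: u8) -> io::Result<()> {
    if piece.is_empty() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "empty chunk"));
    }
    piece.fill(value);
    Ok(())
}

fn zero_all(buf: &mut [u8]) -> Result<()> {
    // One `?` at the call site; the foreign error is mapped inside the closure.
    for_each_mut(buf, 4, |_offset, piece| checked_fill(piece, 0).map_err(MapError::Io))?;
    Ok(())
}

fn main() {
    let mut buf = vec![0xFFu8; 10];
    zero_all(&mut buf).expect("fill cannot fail on non-empty chunks");
    println!("{buf:?}");
}
```

Because the closure and the method share one error type, the caller no longer has to unwrap an outer and an inner Result separately.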
Performance
- Iterator zero-copy (H1): 1 GiB scan at 4 KiB chunks. Before: 262,144 heap allocations + 2x memory bandwidth (read into iterator buffer, clone buffer to return ownership). After: 0 allocations, 1x bandwidth, pointer arithmetic only. The iterator_throughput bench compares chunks() to chunks_owned() directly so the delta is on the bench output sheet.
- touch_pages tight loop (H2): 1 GiB file with 4 KiB pages = 262,144 page touches. Before: per-page read_into(offset, &mut buf[..1]) = lock acquisition + bounds check + memcpy of 1 byte, 262,144 times. After: one lock acquisition + ptr::read_volatile step loop. Expected ~50-100x speedup on the touch_pages_large bench.
- for_each_mut single guard (E4 follow-on): total time-window the write lock is held is unchanged (the iteration was always exclusive), but the per-chunk lock-acquire / lock-release overhead is gone. Tight RMW loops over many small chunks (e.g., zeroing a 1 GiB file at 4 KiB) save the per-iteration uncontended parking_lot overhead.
- MappedSlice overhead: zero on RO and COW (the Owned(&[u8]) variant is a thin wrapper). On RW the wrapper holds an RwLockReadGuard whose destructor releases the lock. No allocation, no virtual dispatch. Deref lowers to a direct pointer access at the use site.
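The H2 loop is small enough to sketch in full. The version below works on a plain byte slice instead of a locked mapping, and touch_pages here is a hypothetical free function written for illustration; the crate's own implementation and signature may differ.

```rust
use std::hint::black_box;
use std::ptr;

/// Hypothetical pre-fault helper: reads one byte per page so each page has to
/// be resident. Returns the number of pages touched.
fn touch_pages(region: &[u8], page_size: usize) -> usize {
    assert!(page_size > 0, "page size must be non-zero");
    let mut touched = 0;
    let mut offset = 0;
    while offset < region.len() {
        // SAFETY: `offset < region.len()`, so the pointer is inside the slice
        // and valid for a one-byte read.
        let byte = unsafe { ptr::read_volatile(region.as_ptr().add(offset)) };
        black_box(byte); // keep the optimizer from discarding the read
        touched += 1;
        offset += page_size;
    }
    touched
}

fn main() {
    // 1 GiB at 4 KiB pages is the 262,144 touches quoted above.
    assert_eq!((1usize << 30) / 4096, 262_144);

    let region = vec![0u8; 64 * 4096]; // small stand-in for a mapping
    println!("touched {} pages", touch_pages(&region, 4096));
}
```

The loop body has no bounds-checked indexing, no copy into a buffer, and no per-page lock traffic, which is where the expected speedup comes from.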
Tests
- 15 property tests (256-1024 cases per property) continue to run; one obsolete property (as_slice_rw_invalid_mode) was rewritten as as_slice_rw_returns_mapped_slice to verify the new RW path. Plus new iterator tests for the zero-copy shape, chunks_owned migration aid, and for_each_mut single-guard behavior.
- 101 tests total under --all-features (up from 99 in 0.9.6), 4 ignored (3 polling-watch tests gated on Windows mtime granularity, 1 hugepages fallback), 0 failed.
- CI matrix combos green locally for --no-default-features (60 + doctests) and --no-default-features --features "cow locking advise" (75 + doctests). Banned-words scan zero hits.
- MSRV unchanged at 1.75. cargo +1.75 build --all-features clean. The iterator's Send + Sync impls and the MappedSlice Deref / PartialEq stack work on 1.75 without GAT use or other recent features.
Notes
- No new runtime dependencies. proptest (from 0.9.6) remains the only [dev-dependencies] addition for this milestone. The random-offset benches use a hand-rolled xorshift64 PRNG rather than pulling in rand.
- MappedSlice and MappedSliceMut are now part of the stable-through-0.9.x public surface. Both are re-exported from the crate root.
- as_slice_mut was already returning MappedSliceMut<'_> before this release. This release adds Deref<Target = [u8]> and DerefMut impls to it so it can be used as a &mut [u8] directly. Plus len() and is_empty() accessors.
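For reference, a hand-rolled xorshift64 generator of the kind mentioned above fits in a few lines. This sketch uses Marsaglia's 13/7/17 shift triple; the XorShift64 type, the zero-seed replacement constant, and the next_offset helper are assumptions made for illustration, and the bench's actual generator may differ.

```rust
/// Minimal xorshift64 generator (Marsaglia's 13/7/17 shift triple).
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The all-zero state is a fixed point, so swap it for a non-zero constant.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Offset in `0..=file_len - read_len`, so a read of `read_len` stays in
    /// bounds. Plain modulo is slightly biased, which is fine for a benchmark.
    fn next_offset(&mut self, file_len: u64, read_len: u64) -> u64 {
        assert!(read_len <= file_len, "read must fit in the file");
        self.next_u64() % (file_len - read_len + 1)
    }
}

fn main() {
    let mut rng = XorShift64::new(42);
    let file_len = 1u64 << 20;
    for _ in 0..4 {
        println!("read 4096 bytes at offset {}", rng.next_offset(file_len, 4096));
    }
}
```

A fixed seed makes every bench run visit the same offsets, which keeps run-to-run comparisons meaningful.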
Deferred (with documented reason)
- H3 (lock-free RW reads via arc-swap) stays open. The current RwLock<MmapMut> design is sound and bounded; replacing it with arc-swap or an UnsafeCell design is a memory-model question (do we accept torn reads from concurrent writers on the same mapping?), not a tuning question. Rescoped from "tactical 0.9.x fix" to a 1.0 design conversation.
- Bench-regression hard gate: the new bench-regression.yml runs the suite and uploads artifacts on every push. The >10% regression threshold check is part of the 0.9.10 pre-1.0 stabilization milestone alongside cargo-semver-checks and cargo-fuzz.
- docs/PERFORMANCE.md: now that the workload-pattern benches exist, the next step is running them on the maintainer's reference machine and publishing measured P50 / P99 numbers per workload. That doc is part of the same 0.9.10 sweep.
- 0.9.8 is the async-polish release: cancellation-safety review per async method, possible `read_...
v0.9.6 - Safety contract sealed
Release Notes for v0.9.6 - Unsafe audit and property tests
Version 0.9.6 - 2026-05-12
The safety-hardening pass that locks in the foundation. Every unsafe block in the crate (29 blocks across 6 files) now carries a // SAFETY: comment that names the syscall contract it relies on, demonstrates how local context establishes the preconditions, and cites the relevant man page or MSDN reference. docs/SAFETY.md joins the repo as the authoritative catalog of those invariants, grouped by category (mapping construction, advise, locking, atomic views, flush, platform shims, test helpers). Audit findings S1, S2, and S3 are now closed.
Alongside the comment overhaul, proptest 1.5 joins the dev-dependency set and three new integration test files (tests/proptest_bounds.rs, tests/proptest_atomic.rs, tests/proptest_flush.rs) exercise bounds-checking, atomic alignment, and FlushPolicy state transitions across roughly ten thousand randomized cases per cargo test run. The C1 regression scenario from the 0.9.5 audit is now under continuous property coverage. Setting PROPTEST_CASES=10000 runs the deep sweep, which the maintainer uses before tagging.
Highlights
- 29 unsafe blocks audited. Every block in src/advise.rs, src/lock.rs, src/mmap.rs, src/utils.rs, src/atomic.rs, and src/watch.rs now has a SAFETY comment that states the syscall preconditions explicitly. Where the contract is platform-defined (POSIX madvise, mlock, msync, utime; Windows VirtualLock, VirtualUnlock, PrefetchVirtualMemory, GetSystemInfo), the comment cites the man page or MSDN page directly. Closes audit findings S1 (Windows windows_page_size), S2 (shallow comments throughout advise/lock/mmap), and S3 (lock-then-release-then-use-pointer pattern documented).
- docs/SAFETY.md: the authoritative inventory grouped by category. Each section names the kernel surface, lists the invariants the crate establishes, links to the man page or MSDN reference, and explains how cross-process aliasing is handled (intra-process via parking_lot::RwLock, cross-process documented out-of-scope per REPS.md section 5.1).
- Property test framework: proptest 1.5 pinned for MSRV 1.75 (default-features off; std, bit-set, fork, timeout opted in). 15 properties × 256-1024 cases per property cover (a) bounds-checking on as_slice / as_slice_mut / read_into / update_region / flush_range, (b) alignment on every atomic method including the slice variants, (c) FlushPolicy::EveryBytes mixed-write/flush_range invariants (the C1 regression scenario), EveryWrites triggering at N, and Manual never auto-flushing.
- CI matrix fix: tests/atomic_view_resize_safety.rs (added in 0.9.5) now carries #![cfg(feature = "atomic")]. The matrix runs cargo test --no-default-features --features "<combo>" across 7 feature combos × 3 platforms; none of those combos include the atomic feature, so 21 shards failed compile on the 0.9.5 push. The full-build job uses --all-features and never caught it. One-line gate, full matrix green.
- Node 24 migration: actions/checkout@v4 (Node 20, deprecated 2025-09-19) bumped to actions/checkout@v5 (Node 24) across all four CI workflow occurrences. Node 20 is forced to Node 24 starting 2026-06-02 and removed 2026-09-16; v5 is the recommended migration target.
- Documentation reconciliation: docs/API.md and REPS.md walked end-to-end against the actual code. Stale claims corrected: MSRV 1.76 → 1.75 in API.md, version examples bumped to 0.9.6, atomic return types updated to the C3 wrapper forms (AtomicView<'_, T> / AtomicSliceView<'_, T>), stray character removed from the SegmentMut section, async section completed with update_region_async / flush_async / flush_range_async. REPS section 4 now reflects the actual surface: TouchHint { Never, Eager, Lazy }, as_slice_mut returning MappedSliceMut<'_>, the full FlushPolicy variant set, atomic_u32_slice added, lock API names corrected (lock / unlock / lock_all / unlock_all), ChangeKind { Modified, Metadata, Removed } aligned with the polling backend.
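The bounds-checking property in (a) boils down to one invariant: offset + len must neither overflow nor exceed the mapping length. The sketch below states that invariant with a hypothetical in_bounds helper and checks it against a u128 oracle over a small grid of edge values. It is a dependency-free illustration of the idea, not the crate's proptest suite, which generates randomized cases instead.

```rust
/// Hypothetical bounds check: a request is valid only if `offset + len`
/// neither overflows nor runs past `total`.
fn in_bounds(offset: u64, len: u64, total: u64) -> bool {
    match offset.checked_add(len) {
        Some(end) => end <= total,
        None => false,
    }
}

fn main() {
    // Edge values where off-by-one and overflow bugs tend to hide.
    let edges = [0u64, 1, 4095, 4096, u64::MAX - 1, u64::MAX];
    let mut cases = 0;
    for &offset in &edges {
        for &len in &edges {
            for &total in &edges {
                // The u128 sum cannot overflow, so it serves as the oracle.
                let oracle = (offset as u128) + (len as u128) <= (total as u128);
                assert_eq!(in_bounds(offset, len, total), oracle);
                cases += 1;
            }
        }
    }
    println!("{cases} edge cases agree with the oracle");
}
```

A naive offset + len <= total wraps around in release builds for the u64::MAX rows, which is exactly the class of bug this kind of property is meant to catch.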
Bug fixes discovered by the new CI
The matrix CI on the 0.9.5 push surfaced one latent issue from that release that the all-features build had masked:
- tests/atomic_view_resize_safety.rs compile failure. The test was authored as the C3 regression test in the 0.9.5 cycle. It uses mmap.atomic_u64(...) and mmap.atomic_u64_slice(...), both gated behind feature = "atomic". The test file itself was missing the crate-level #![cfg(feature = "atomic")] attribute. The 0.9.5 verification ran --all-features (which compiles the test fine) and --no-default-features (which builds the crate without compiling integration tests under partial feature flags), so the gap was invisible locally. The matrix CI runs cargo test --no-default-features --features "<combo>" for seven combinations ("", cow, locking, advise, cow locking, locking advise, cow locking advise) on Linux/macOS/Windows, and none of those combinations include atomic. Every one of the 21 shards therefore failed at compile time. The fix is a single line at the top of the test file; the deeper takeaway is that any new test depending on a feature-gated API must carry the matching crate gate.
Breaking changes
None in 0.9.6. The C3 atomic-view wrapper change shipped in 0.9.5 and is reflected in this release's documentation. Public API surface for 0.9.6 is purely additive at the implementation level (SAFETY comments are non-functional; property tests are dev-only; proptest is a [dev-dependencies] addition that does not affect downstream consumers).
Tests
- 15 new property tests, minimum 256 cases per proptest! and 1024 cases on most (PROPTEST_CASES=10000 for the deep sweep before releases).
- 99 unit + integration tests total under --all-features (84 pre-existing + 15 new property tests), 4 ignored (3 polling-watch tests gated on Windows mtime granularity, 1 hugepages fallback test).
- CI matrix: 21 build-test shards (7 feature combos × 3 OS) all green on the fix-CI push. Code Quality (cargo fmt --all -- --check, cargo clippy --all-targets --all-features -- -D warnings) green. full-build (all-features check) and benchmarks (criterion under no-default and all-features) green. Local MSRV verification on cargo +1.75 build --all-features also clean.
- Banned-words scan (per .dev/DIRECTIVES.md section 9): zero hits for comprehensive, robust, seamless, leverage across src/, tests/, examples/, benches/, docs/, README.md, CHANGELOG.md.
Notes
- MSRV unchanged at Rust 1.75. proptest 1.5 was specifically chosen because the 1.x line holds 1.66+, well under our floor.
- No new runtime dependencies. proptest is a [dev-dependencies] addition; downstream crates that depend on mmap-io see no change to their dependency graph or compile times.
- No public API changes. The only Cargo.toml change is the version bump and the dev-dep addition. All breaking changes for this milestone are documentation reconciliation, not code-level.
- The directives policy itself caught the Node 24 migration: .dev/DIRECTIVES.md section 5 had "actions/checkout@v5 (Node 24-compatible). Older @v4 is acceptable through 2026-Q3 but should be migrated." as a standing item. The GitHub deprecation warning on the 0.9.5 push triggered the move.
Deferred (with documented reason)
- Iterator zero-copy redesign (audit H1) is the headline item for 0.9.7. ChunkIterator / PageIterator currently allocate a fresh Vec<u8> per chunk; the redesign returns a MappedSlice<'_> wrapper that holds the read guard for the iterator's lifetime. This is a breaking change to the iterator API and is scoped with MappedSlice for RW reads (audit H4) alongside.
- Lock-free reads for RW mappings (audit H3): the current RwLock<MmapMut> serializes all RW access through a single lock. Moving to arc-swap or equivalent is a substantive design change; tentatively scoped to 0.10.0 or 1.0.0.
- Native watch backends (inotify / FSEvents / ReadDirectoryChangesW) ship in 0.9.9 and unignore the three Windows watch tests that polling cannot serve reliably.
- Fuzz suite (cargo-fuzz) and MIRI runs on the atomic module land in 0.9.10 as part of the pre-1.0 stabilization.
- 0.9.8 is the async-polish release: cancellation-safety documentation per async method, possible read_into_async / read_slice_async additions, and the for_each_mut return-type flatten (audit E4).
Full Changelog: v0.9.4...v0.9.6
v0.9.4-b - Production-Ready Performance
mmap-io: Production-Ready Performance
This release marks the evolution of mmap-io from a powerful library to a production-grade storage toolkit. With a suite of new features engineered specifically for high-performance systems, this version delivers unparalleled control over memory behavior, I/O latency, and durability.
Internal operations have been heavily optimized and benchmarked to eliminate performance cliffs and ensure predictable, nanosecond-level access for the most demanding workloads.
Engineered for Predictability and Speed
This release moves beyond raw speed to focus on performance consistency—eliminating the guesswork and variability that kill performance in real-world, heavy-load scenarios.
- Page Pre-warming (touch_pages): Eliminate first-access latency and prevent performance jitter from page faults. The new touch_pages() method and TouchHint::Eager builder option pre-fault memory pages into RAM, ensuring your critical operations always run at memory speed.
- Intelligent Microflush Optimization: Small, frequent writes, critical for workloads like a Write-Ahead Log (WAL), are now significantly more efficient. The flush_range() method automatically detects sub-page-size flushes and expands them to page boundaries, reducing syscall overhead and aligning with how the underlying hardware operates.
- Multi-Tier Huge Page Handling: The hugepages feature is now smarter than ever. It uses a multi-tier fallback system on Linux, attempting to secure optimized huge pages first before gracefully falling back to Transparent Huge Pages (THP) and then to standard pages, ensuring you get the best possible performance without sacrificing reliability.
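The microflush expansion is plain alignment arithmetic. The sketch below rounds a byte range outward to page boundaries and clamps it to the mapping length; page_align_range is a hypothetical helper, and both the power-of-two page size and the clamping behavior are assumptions rather than documented details of flush_range().

```rust
/// Expands `offset..offset + len` outward to page boundaries and clamps the
/// end to `total`. Returns `(aligned_offset, aligned_len)`.
/// Assumes `page_size` is a power of two and the input range is in bounds.
fn page_align_range(offset: u64, len: u64, total: u64, page_size: u64) -> (u64, u64) {
    assert!(page_size.is_power_of_two(), "page size must be a power of two");
    assert!(offset <= total && len <= total - offset, "range must be in bounds");
    let mask = page_size - 1;
    let start = offset & !mask; // round the start down
    let end = offset + len;
    // Round the end up, then clamp so the flush never extends past the mapping.
    let aligned_end = end
        .checked_add(mask)
        .map_or(total, |e| (e & !mask).min(total));
    (start, aligned_end - start)
}

fn main() {
    // A 10-byte write in the middle of the second 4 KiB page.
    let (offset, len) = page_align_range(5000, 10, 1 << 20, 4096);
    println!("flush {len} bytes starting at {offset}");
}
```

With 4 KiB pages, a 10-byte flush at offset 5000 becomes a single-page flush at offset 4096, and a range that straddles a page boundary grows to cover both pages.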
Automated Durability & Developer Experience
- Time-Based Automatic Flushing: The FlushPolicy::EveryMillis policy is now fully implemented, spawning a background thread to handle periodic flushes automatically. This provides a simple, "set-it-and-forget-it" mechanism for balancing performance and durability without manual intervention.
- Comprehensive Benchmarking Suite: We've added dedicated benchmarks to measure page fault costs, microflush overhead, and the real-world impact of page pre-warming, ensuring all performance claims are validated.
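A background flusher of this kind can be sketched with a thread and a channel. Everything here is hypothetical (the PeriodicFlusher type, the final flush on shutdown, the use of std::sync::mpsc for the stop signal); it shows the general pattern, not how the crate implements EveryMillis.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical periodic flusher: runs `flush` every `interval` until stopped.
struct PeriodicFlusher {
    stop: Sender<()>,
    worker: JoinHandle<()>,
}

impl PeriodicFlusher {
    fn spawn<F>(interval: Duration, mut flush: F) -> Self
    where
        F: FnMut() + Send + 'static,
    {
        let (stop, stopped) = mpsc::channel::<()>();
        let worker = thread::spawn(move || loop {
            match stopped.recv_timeout(interval) {
                // Interval elapsed with no stop signal: time to flush.
                Err(RecvTimeoutError::Timeout) => flush(),
                // Stop requested or handle dropped: flush once more and exit.
                Ok(()) | Err(RecvTimeoutError::Disconnected) => {
                    flush();
                    break;
                }
            }
        });
        Self { stop, worker }
    }

    /// Signals the worker and waits for its final flush.
    fn shutdown(self) {
        let _ = self.stop.send(());
        let _ = self.worker.join();
    }
}

fn main() {
    let flushes = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&flushes);
    let flusher = PeriodicFlusher::spawn(Duration::from_millis(10), move || {
        counter.fetch_add(1, Ordering::Relaxed); // a real flush would sync the mapping here
    });
    thread::sleep(Duration::from_millis(55));
    flusher.shutdown();
    println!("flushed {} times", flushes.load(Ordering::Relaxed));
}
```

Using recv_timeout as the sleep means a stop request wakes the worker immediately instead of waiting out the rest of the interval.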
Known Issues Resolved
The previously noted performance dip in small-op (<4KB) cases has been resolved. The new microflush optimization directly targets and fixes this issue, ensuring high throughput across all write sizes.
Ready for Production
The API surface is stable and the internals are robust. mmap-io is ready for your most serious workloads, from database engines like HiveDB to high-frequency trading systems, cache engines, and low-latency data pipelines.
Built for developers who demand nanosecond precision, zero-overhead memory I/O, and full platform correctness.
Feedback Wanted
This is a Beta release — API surface is stable, but we're still tuning the guts for edge-case performance. If you're using this in serious workloads (like HiveDB, cache engines, or low-latency pipelines), drop issues or suggestions and we’ll make it better.
This beta is stable, cross-platform, async-ready, and poised for production-level memory IO workloads.
v0.9.3-Beta (Stable)
mmap-io: Production-Ready Performance
This release marks the evolution of mmap-io from a powerful library to a production-grade storage toolkit. With a suite of new features engineered specifically for high-performance systems, this version delivers unparalleled control over memory behavior, I/O latency, and durability.
Internal operations have been heavily optimized and benchmarked to eliminate performance cliffs and ensure predictable, nanosecond-level access for the most demanding workloads.
Engineered for Predictability and Speed
This release moves beyond raw speed to focus on performance consistency—eliminating the guesswork and variability that kill performance in real-world, heavy-load scenarios.
- Page Pre-warming (touch_pages): Eliminate first-access latency and prevent performance jitter from page faults. The new touch_pages() method and TouchHint::Eager builder option pre-fault memory pages into RAM, ensuring your critical operations always run at memory speed.
- Intelligent Microflush Optimization: Small, frequent writes, critical for workloads like a Write-Ahead Log (WAL), are now significantly more efficient. The flush_range() method automatically detects sub-page-size flushes and expands them to page boundaries, reducing syscall overhead and aligning with how the underlying hardware operates.
- Multi-Tier Huge Page Handling: The hugepages feature is now smarter than ever. It uses a multi-tier fallback system on Linux, attempting to secure optimized huge pages first before gracefully falling back to Transparent Huge Pages (THP) and then to standard pages, ensuring you get the best possible performance without sacrificing reliability.
Automated Durability & Developer Experience
- Time-Based Automatic Flushing: The FlushPolicy::EveryMillis policy is now fully implemented, spawning a background thread to handle periodic flushes automatically. This provides a simple, "set-it-and-forget-it" mechanism for balancing performance and durability without manual intervention.
- Comprehensive Benchmarking Suite: We've added dedicated benchmarks to measure page fault costs, microflush overhead, and the real-world impact of page pre-warming, ensuring all performance claims are validated.
Known Issues Resolved
The previously noted performance dip in small-op (<4KB) cases has been resolved. The new microflush optimization directly targets and fixes this issue, ensuring high throughput across all write sizes.
Ready for Production
The API surface is stable and the internals are robust. mmap-io is ready for your most serious workloads, from database engines like HiveDB to high-frequency trading systems, cache engines, and low-latency data pipelines.
Built for developers who demand nanosecond precision, zero-overhead memory I/O, and full platform correctness.
Feedback Wanted
This is a Beta release — API surface is stable, but we're still tuning the guts for edge-case performance. If you're using this in serious workloads (like HiveDB, cache engines, or low-latency pipelines), drop issues or suggestions and we’ll make it better.
This beta is stable, cross-platform, async-ready, and poised for production-level memory IO workloads.
Full Changelog: v0.9.0...v0.9.3
v0.9.0 Beta — stable
mmap-io: stable beta release
This release marks a major leap forward in stability, cross-platform reliability, and raw performance. Flush behavior is smarter, visibility is guaranteed across OS boundaries, and internal operations have been benchmarked and optimized for modern workloads like HiveDB.
Flying Fast, Flush Smarter
- New FlushPolicy Engine: Select from Manual, Always, EveryBytes, and EveryWrites for optimized write persistence
- Improved Huge Page Handling (soft fallback enabled)
- Massive speedups in:
  - Chunked iteration
  - File resizing
  - COW open performance
  - Region-based flushing
- Memory Advice Support: Smarter access patterns via .advise()
- Benchmarked across 4096B to 1MB ops with live regression tracking
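How the four policies differ is easiest to see as a small state machine. The sketch below mirrors the variant names listed above, but the payload types, the FlushTracker type, and the reset-on-trigger behavior are assumptions made for illustration; the crate's real FlushPolicy may carry different data.

```rust
/// Hypothetical mirror of the four policies named above; the payload types
/// are assumptions.
#[derive(Clone, Copy, Debug)]
enum FlushPolicy {
    Manual,
    Always,
    EveryBytes(usize),
    EveryWrites(usize),
}

/// Tracks unflushed work and decides when a write should trigger a flush.
struct FlushTracker {
    policy: FlushPolicy,
    pending_bytes: usize,
    pending_writes: usize,
}

impl FlushTracker {
    fn new(policy: FlushPolicy) -> Self {
        Self { policy, pending_bytes: 0, pending_writes: 0 }
    }

    /// Records a write of `len` bytes; returns `true` if the caller should flush now.
    fn record_write(&mut self, len: usize) -> bool {
        self.pending_bytes = self.pending_bytes.saturating_add(len);
        self.pending_writes += 1;
        let due = match self.policy {
            FlushPolicy::Manual => false,
            FlushPolicy::Always => true,
            FlushPolicy::EveryBytes(n) => self.pending_bytes >= n,
            FlushPolicy::EveryWrites(n) => self.pending_writes >= n,
        };
        if due {
            self.reset();
        }
        due
    }

    /// Call after any explicit flush so the counters start over.
    fn reset(&mut self) {
        self.pending_bytes = 0;
        self.pending_writes = 0;
    }
}

fn main() {
    let mut tracker = FlushTracker::new(FlushPolicy::EveryBytes(4096));
    let fired: Vec<bool> = [1000, 2000, 2000, 100]
        .iter()
        .map(|&n| tracker.record_write(n))
        .collect();
    println!("{fired:?}");
}
```

Manual leaves durability entirely to explicit flush calls, while the counting policies trade a bounded amount of unflushed data for fewer syscalls.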
Known Regression Notes:
Some small-op (<4KB) cases saw slight performance dips (~5–15%) — being monitored. Real-world impact: minimal.
Feedback Wanted
This is a Beta release — API surface is stable, but we're still tuning the guts for edge-case performance. If you're using this in serious workloads (like HiveDB, cache engines, or low-latency pipelines), drop issues or suggestions and we’ll make it better.
Built for devs who care about nanosecond precision, zero-overhead memory I/O, and full platform correctness.
This beta is stable, cross-platform, async-ready, and poised for production-level memory IO workloads.
Full Changelog: v0.8.0...v0.9.0
v0.8.0 Beta — stable
Smarter Flushes, Platform Parity, and Performance Wins
This release marks a major leap forward in stability, cross-platform reliability, and raw performance. Flush behavior is smarter, visibility is guaranteed across OS boundaries, and internal operations have been benchmarked and optimized for modern workloads like HiveDB.
Major Changes
- Smart Flush Logic
  Flush operations now intelligently determine what needs to be persisted, reducing unnecessary I/O without sacrificing data safety. flush() and flush_range() honor platform-level durability semantics.
- Platform Parity: Flush Visibility
  Full parity ensured across Linux, macOS, and Windows. After calling flush() or flush_range(), a fresh read-only mapping on all OSes will reflect the persisted data. This is explicitly tested and contractually guaranteed.
- Performance Enhancements
  - Up to +122% throughput gains on large flush+update workloads.
  - Faster region updates (update_only, update_plus_flush, etc.).
  - Chunked iterators and resize ops now significantly faster.
  - Flush thresholds improved (tunable for advanced use cases).
  - Memory advice (MADV_WILLNEED, MADV_SEQUENTIAL) now shows measurable wins.
New Tests & Benchmarks
- ✅ tests/platform_parity.rs: Verifies post-flush visibility across platforms.
- ✅ tests/async_flush.rs: Confirms async and threshold flushing work as expected.
- ✅ Full benchmark suite overhauled with criterion.rs, capturing:
  - Region updates
  - Full vs range flushes
  - Read/write throughput
  - Resize and COW performance
Fixes & Internals
- Flush thresholds now configurable and actually used during real workloads.
- Reduced dead_code warnings; huge_pages now properly surfaced in Debug.
- Clean fallbacks and recoveries from failed mappings.
- General code cleanup and traceability improvements for contributors.
⚠️ Notes
- watch-based features now rely on explicit flushes for guaranteed change detection. If you're using watch in tests or production, be sure to call .flush() after writes, or manually force a flush in the watcher path.
- huge_pages support exists but is not yet wired to retain successful mappings (planned for v0.9.X).
Feedback Wanted
This is a Beta release — API surface is stable, but we're still tuning the guts for edge-case performance. If you're using this in serious workloads (like HiveDB, cache engines, or low-latency pipelines), drop issues or suggestions and we’ll make it better.
Built for devs who care about nanosecond precision, zero-overhead memory I/O, and full platform correctness.
Full Changelog: v0.7.5...v0.8.0