Skip to content

[diskann-disk] Decouple buffer memory alignment from disk block_size#984

Merged
doliawu merged 2 commits intomicrosoft:mainfrom
doliawu:decouple-alignments
Apr 29, 2026
Merged

[diskann-disk] Decouple buffer memory alignment from disk block_size#984
doliawu merged 2 commits intomicrosoft:mainfrom
doliawu:decouple-alignments

Conversation

@doliawu
Copy link
Copy Markdown
Contributor

@doliawu doliawu commented Apr 28, 2026

DiskSectorGraph previously allocated sectors_data aligned to block_size (typically 4096), conflating the on-disk stride with the reader's hardware/OS memory-placement requirement. Move the requirement where it belongs: declare it on the reader as
AlignedFileReader::MEMORY_ALIGNMENT (default 1; direct-I/O readers override to 512) and have the buffer allocator honor that.

Drop the per-request alignment check from AlignedRead::new — that constraint is the caller's responsibility (DiskSectorGraph's offsets and lengths are already block_size-aligned by construction). Making construction infallible cleans up call sites.

Refresh VertexProvider / VertexProviderFactory docs to drop stale references to ANNWrapper and JetVertexProvider.

Comment thread diskann-disk/src/search/provider/disk_provider.rs
@wuw92
Copy link
Copy Markdown
Contributor

wuw92 commented Apr 28, 2026

For awareness, this PR overlaps with @hildebrandmw's review direction on #965: #965 (review)

His two concrete asks there both touch this PR:

  1. Add PowerOfTwo / AlignedAllocator const aliases → the four PowerOfTwo::new_or_panic(...) callsites added here would collapse to those aliases, and the helper itself becomes unnecessary.
  2. Delete aligned_slice.rs, use Poly directly → the two aligned_slice(...) callsites in disk_sector_graph.rs would migrate accordingly.

Comment thread diskann-quantization/src/num.rs Outdated
@doliawu doliawu force-pushed the decouple-alignments branch from c2da97c to ab9e3fb Compare April 28, 2026 19:54
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.63%. Comparing base (f458cf6) to head (37fdf00).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #984      +/-   ##
==========================================
+ Coverage   89.48%   90.63%   +1.14%     
==========================================
  Files         448      448              
  Lines       84081    84095      +14     
==========================================
+ Hits        75239    76216     +977     
+ Misses       8842     7879     -963     
Flag Coverage Δ
miri 90.63% <100.00%> (+1.14%) ⬆️
unittests 90.59% <100.00%> (+1.26%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
diskann-disk/src/search/provider/disk_provider.rs 90.89% <ø> (ø)
...kann-disk/src/search/provider/disk_sector_graph.rs 96.95% <100.00%> (-0.02%) ⬇️
...disk/src/utils/aligned_file_reader/aligned_read.rs 100.00% <100.00%> (ø)
...s/aligned_file_reader/linux_aligned_file_reader.rs 82.65% <100.00%> (ø)
...ile_reader/storage_provider_aligned_file_reader.rs 98.36% <100.00%> (ø)

... and 37 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread diskann-disk/src/search/traits/vertex_provider.rs Outdated
Comment thread diskann-disk/src/utils/aligned_file_reader/linux_aligned_file_reader.rs Outdated
Comment thread diskann-disk/src/utils/aligned_file_reader/traits/aligned_file_reader.rs Outdated
@doliawu doliawu force-pushed the decouple-alignments branch 3 times, most recently from 25bd2e2 to b730cda Compare April 28, 2026 23:06
@doliawu doliawu marked this pull request as ready for review April 29, 2026 00:15
@doliawu doliawu requested review from a team and Copilot April 29, 2026 00:15
@doliawu doliawu force-pushed the decouple-alignments branch from b730cda to 3ee1e17 Compare April 29, 2026 00:15
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR decouples in-memory buffer alignment requirements from the on-disk block_size by moving the memory-alignment contract onto the AlignedFileReader itself, and updates the aligned-read request type and documentation accordingly.

Changes:

  • Introduces AlignedFileReader::MEMORY_ALIGNMENT (default 1; direct-I/O readers override to 512) plus a shared validate_alignment() helper.
  • Replaces the fallible AlignedRead type with an infallible ReadRequest data carrier and updates all call sites, tests, and benchmarks.
  • Updates DiskSectorGraph buffer allocation to align to the reader’s MEMORY_ALIGNMENT rather than block_size, and refreshes VertexProvider docs.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
diskann-disk/src/utils/aligned_file_reader/windows_aligned_file_reader.rs Switches to ReadRequest, defines MEMORY_ALIGNMENT = 512, and validates alignment before issuing unbuffered reads.
diskann-disk/src/utils/aligned_file_reader/linux_aligned_file_reader.rs Switches to ReadRequest, defines MEMORY_ALIGNMENT = 512, and validates alignment before O_DIRECT io_uring reads.
diskann-disk/src/utils/aligned_file_reader/storage_provider_aligned_file_reader.rs Switches to ReadRequest for the non-direct storage provider reader.
diskann-disk/src/utils/aligned_file_reader/traits/aligned_file_reader.rs Adds MEMORY_ALIGNMENT contract + shared validate_alignment() and unit tests.
diskann-disk/src/utils/aligned_file_reader/read_request.rs Adds new infallible request carrier type used by readers/callers.
diskann-disk/src/utils/aligned_file_reader/mod.rs Stops exporting AlignedRead and exports ReadRequest instead.
diskann-disk/src/utils/aligned_file_reader/aligned_read.rs Removes the old AlignedRead type and its constructor-time alignment checks.
diskann-disk/src/search/provider/disk_sector_graph.rs Allocates sector buffers aligned to AlignedReaderType::MEMORY_ALIGNMENT instead of block_size; uses ReadRequest.
diskann-disk/src/search/provider/disk_vertex_provider_factory.rs Updates header read path to use ReadRequest.
diskann-disk/src/search/traits/vertex_provider_factory.rs Refreshes trait documentation to remove stale references and clarify parameters/functions.
diskann-disk/src/search/traits/vertex_provider.rs Refreshes trait documentation (and converts several comments to doc comments).
diskann-disk/src/search/provider/disk_provider.rs Minor formatting-only change.
diskann-disk/benches/benchmarks/aligned_file_reader_bench.rs Updates benchmark to use ReadRequest.
diskann-disk/benches/benchmarks_iai/aligned_file_reader_bench_iai.rs Updates IAI benchmark to use ReadRequest.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread diskann-disk/src/utils/aligned_file_reader/mod.rs
Comment thread diskann-disk/src/utils/aligned_file_reader/read_request.rs Outdated
Copy link
Copy Markdown
Contributor

@arkrishn94 arkrishn94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on Dongliang, just have one comment. I know it's a little more work than might be needed but I think it it'll pay off to keep this API clean.

Comment thread diskann-disk/src/utils/aligned_file_reader/traits/aligned_file_reader.rs Outdated
Drop references to ANNWrapper and JetVertexProvider (neither exist),
fix typo GraphProvider -> VertexProvider, fix wrong type parameter
(Data, not GraphMetadata) and wrong create_vertex_provider signature
description, and convert four `//` comments that should have been
`///` doc comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@doliawu doliawu force-pushed the decouple-alignments branch from 71bd0db to 3954f7a Compare April 29, 2026 21:30
Copy link
Copy Markdown
Contributor

@arkrishn94 arkrishn94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks Dongliang.

Decouple buffer-placement memory alignment from the disk-side stride.
DiskSectorGraph previously allocated `sectors_data` aligned to
`block_size` (typically 4096), conflating the on-disk stride with the
reader's hardware/OS memory-placement requirement.

Introduce a sealed `Alignment` trait with `const VALUE: PowerOfTwo` and
two unit markers (`A1`, `A512`), and parameterize `AlignedRead` by an
`A: Alignment` witness (defaulting to `A1`). `AlignedRead::new` checks
buffer pointer, byte-length, and disk-offset alignment against
`A::VALUE` at construction and returns `ANNResult<Self>`, so a typed
`AlignedRead<u8, A512>` is itself a witness that the request is
well-formed -- the file reader's `read` impl no longer needs to
re-validate.

Replace `AlignedFileReader::DISK_IO_ALIGNMENT` (and the hardcoded 512
in `AlignedRead::new`) with `AlignedFileReader::Alignment`. Linux and
Windows direct-I/O readers set `type Alignment = A512;`; the buffered
`StorageProviderAlignedFileReader` uses `A1`.
`DiskSectorGraph::sectors_data` is now allocated using
`<AlignedReaderType::Alignment as Alignment>::VALUE` -- the reader's
contract -- rather than `block_size`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@doliawu doliawu force-pushed the decouple-alignments branch from 3954f7a to 37fdf00 Compare April 29, 2026 21:43
@doliawu doliawu enabled auto-merge (squash) April 29, 2026 22:08
@doliawu doliawu merged commit d990e33 into microsoft:main Apr 29, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants