[MMAP] Add in-memory (mmap) stream index mapping. #310
Open
1b35ca7 to
9e5588f
Compare
rfsaliev
commented
Apr 14, 2026
Comment on lines +98 to +113

    // Load from a memory-mapped file.
    // The file is expected to be in the format produced by save().
    static Status map_to_file(
        VamanaIndex** index, const char* path, MetricType metric, StorageKind storage_kind
    ) noexcept;

    // Load from a memory buffer.
    // The buffer is expected to be in the format produced by save().
    static Status map_to_memory(
        VamanaIndex** index,
        const void* data,
        size_t size,
        MetricType metric,
        StorageKind storage_kind
    ) noexcept;
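For readers unfamiliar with what a `map_to_file`-style entry point has to do underneath, the following is a minimal sketch of read-only file mapping, assuming a POSIX system. `MappedFileSketch` and `write_file_sketch` are hypothetical names invented for this illustration; they are not part of SVS, and the PR's actual implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII wrapper (not SVS code): maps a whole file read-only.
class MappedFileSketch {
  public:
    explicit MappedFileSketch(const char* path) {
        assert(path != nullptr);
        const int fd = ::open(path, O_RDONLY);
        if (fd < 0) {
            throw std::runtime_error("open failed");
        }
        struct stat st {};
        // mmap rejects a zero length, so empty files are treated as errors here.
        if (::fstat(fd, &st) != 0 || st.st_size <= 0) {
            ::close(fd);
            throw std::runtime_error("cannot map an empty or unreadable file");
        }
        size_ = static_cast<std::size_t>(st.st_size);
        void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
        ::close(fd); // the mapping remains valid after the descriptor is closed
        if (p == MAP_FAILED) {
            throw std::runtime_error("mmap failed");
        }
        data_ = p;
    }

    ~MappedFileSketch() {
        if (data_ != nullptr) {
            ::munmap(data_, size_);
        }
    }

    MappedFileSketch(const MappedFileSketch&) = delete;
    MappedFileSketch& operator=(const MappedFileSketch&) = delete;

    const void* data() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }

  private:
    void* data_ = nullptr;
    std::size_t size_ = 0;
};

// Hypothetical helper used only to produce a file to map in examples.
inline void write_file_sketch(const char* path, const void* bytes, std::size_t size) {
    std::FILE* f = std::fopen(path, "wb");
    if (f == nullptr) {
        throw std::runtime_error("fopen failed");
    }
    const std::size_t written = std::fwrite(bytes, 1, size, f);
    std::fclose(f);
    if (written != size) {
        throw std::runtime_error("short write");
    }
}
```

Once mapped, the `(data(), size())` pair is the same shape of input that the `map_to_memory` declaration accepts, which suggests the file variant could be layered on the buffer variant; whether the PR does so is not stated here.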
…ests
- Introduced `memstream.h` and `memstream.cpp` for memory-mapped stream functionality.
- Updated `io.h` and `simple.h` to include memory stream support.
- Enhanced `simple.cpp` and `flat.cpp` tests to validate loading from memory streams.
…emory and memory-mapped stream support
9e5588f to
a5ed391
Compare
Contributor
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds mmap/in-memory stream support for loading indices and datasets (zero-copy “view” loading) across the core library and C++ runtime API, plus tests to validate behavior.
Changes:
- Introduces `mmstream`/`mmstreambuf`, `is_memory_stream`, and pointer helpers (`current_ptr`/`begin_ptr`/`end_ptr`) plus a `MemoryStreamAllocator` to enable view-based, zero-copy loading.
- Updates loading paths (e.g., `SimpleData`, `SimpleGraph`, Vamana graph selection) to support view allocators and to enforce memory-stream requirements.
- Extends the C++ runtime API with `map_to_file`/`map_to_memory` for Flat and Vamana indices and adds new test coverage.
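The idea behind pointer helpers on a memory stream can be sketched as follows: small headers are read through the normal `std::istream` interface, while the bulk payload is handed out as a pointer into the underlying buffer instead of being copied. `ViewBufSketch`, `ByteViewSketch`, `load_view_sketch`, and the 8-byte length header are all assumptions made for this illustration; only the helper names `begin_ptr`, `current_ptr`, and `end_ptr` echo the review summary, and their real signatures in the PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <stdexcept>
#include <streambuf>

// Hypothetical read-only streambuf over caller-owned memory (not SVS code).
class ViewBufSketch : public std::streambuf {
  public:
    ViewBufSketch(const char* data, std::size_t size) {
        // setg requires char*; the get area is only ever read from.
        char* p = const_cast<char*>(data);
        setg(p, p, p + size);
    }

    const char* begin_ptr() const { return eback(); }
    const char* current_ptr() const { return gptr(); }
    const char* end_ptr() const { return egptr(); }

    // Advance the read position by n bytes without copying anything.
    void skip(std::size_t n) {
        assert(n <= static_cast<std::size_t>(egptr() - gptr()));
        gbump(static_cast<int>(n)); // gbump takes int; fine for this small sketch
    }
};

// A non-owning view of bytes inside the buffer.
struct ByteViewSketch {
    const char* data = nullptr;
    std::size_t size = 0;
};

// Assumed layout for this sketch: a uint64 payload length, then the payload.
inline ByteViewSketch load_view_sketch(ViewBufSketch& buf) {
    std::istream in(&buf);
    std::uint64_t payload_bytes = 0;
    in.read(reinterpret_cast<char*>(&payload_bytes), sizeof(payload_bytes));
    if (!in) {
        throw std::runtime_error("truncated header");
    }
    const auto remaining = static_cast<std::uint64_t>(buf.end_ptr() - buf.current_ptr());
    if (payload_bytes > remaining) {
        throw std::runtime_error("truncated payload");
    }
    // Zero-copy: the view points straight into the caller's buffer.
    ByteViewSketch view{buf.current_ptr(), static_cast<std::size_t>(payload_bytes)};
    buf.skip(view.size);
    return view;
}
```

The returned view is only valid while the backing buffer is alive, which is why a loader built this way has to reject ordinary file streams and tie the buffer's lifetime to the loaded object.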
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| include/svs/core/io/memstream.h | Adds mmapped streambuf/istream + memory-stream utilities and allocator for zero-copy loads |
| tests/svs/core/io/memstream.cpp | Adds unit tests for mmstream, pointer helpers, and memory-stream detection |
| include/svs/core/data/simple.h | Adds view-only load path from ContextFreeLoadTable + memory-stream enforcement |
| include/svs/core/data/io.h | Skips populate() for view allocators; enforces memory-stream constraint |
| include/svs/core/graph/graph.h | Allows forwarding allocator args during stream load to support view allocators |
| include/svs/orchestrators/vamana.h | Adjusts graph type selection for view-allocated data (mmapped/index views) |
| include/svs/lib/array.h | Generalizes “view allocator” DenseArray specialization to support MemoryStreamAllocator |
| include/svs/quantization/scalar/scalar.h | Fixes min/max init and adjusts SQDataset load APIs + adds mutable get_datum |
| bindings/cpp/include/svs/runtime/flat_index.h | Adds map_to_file / map_to_memory for FlatIndex |
| bindings/cpp/src/flat_index.cpp | Implements FlatIndex mapping APIs |
| bindings/cpp/src/flat_index_impl.h | Holds mapped stream lifetime and adds map_to_stream implementation |
| bindings/cpp/include/svs/runtime/vamana_index.h | Adds map_to_file / map_to_memory for VamanaIndex |
| bindings/cpp/src/vamana_index.cpp | Implements VamanaIndex mapping APIs |
| bindings/cpp/src/vamana_index_impl.h | Holds mapped stream lifetime and adds map_to_stream implementation |
| bindings/cpp/tests/runtime_test.cpp | Adds runtime tests for mapping APIs across storage kinds |
| tests/svs/core/data/simple.cpp | Adds tests for loading SimpleDataView from stringstream + error-path test |
| tests/svs/quantization/scalar/scalar.cpp | Adds SQDataset view-load test from stringstream |
| tests/svs/index/flat/flat.cpp | Adds FlatIndex view-load tests from stringstream and mmapped file |
| tests/svs/index/vamana/index.cpp | Adds Vamana view-load tests from stringstream/mmapped file + SQ view-load test |
| tests/CMakeLists.txt | Registers new memstream test source |
748365f to
f04b1c2
Compare
…stream&) to accept custom accessors
…views Co-authored-by: Copilot <copilot@github.com>
Member
Please address the CI failure before merging.
Member
If this is expected to require updates from the private source to work correctly, then we'll need to update https://github.com/intel/ScalableVectorSearch/blob/main/bindings/cpp/CMakeLists.txt#L126 with the LTO shared lib build from the private PR (can publish as nightly here).
Co-authored-by: Copilot <copilot@github.com>
…ed allocator handling Co-authored-by: Copilot <copilot@github.com>
…get_allocator to 'view' DenseArray Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
This pull request adds support to the C++ Runtime API for loading vector search indices (`FlatIndex` and `VamanaIndex`) from memory-mapped files and memory buffers, in addition to the existing stream-based loading. It introduces new API methods, updates internal implementations to manage mapped resources, and adds tests that check loading for all supported storage types.

Major new features:
- Added `map_to_file` and `map_to_memory` static methods to both the `FlatIndex` and `VamanaIndex` classes, allowing indices to be loaded directly from memory-mapped files or memory buffers, in addition to streams.

Internal implementation changes:
- Updated `FlatIndexImpl` and `VamanaIndexImpl` to support mapping from streams, and to manage the lifetime of mapped streams via `unique_ptr` members.

Testing improvements:
- Added a helper (`write_and_map_index`) and test cases that verify correct loading and querying of indices from memory-mapped files for all supported storage kinds and index types.
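The lifetime concern behind the `unique_ptr` members can be illustrated with a small sketch: an object that exposes views into mapped bytes has to own whatever keeps those bytes alive. All names below (`BackingSketch`, `ViewSketch`, `IndexImplSketch`, `map_to_backing`) are hypothetical and stand in for the mapped stream and index implementation; this is not the code from `flat_index_impl.h` or `vamana_index_impl.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a mapped stream: it owns the bytes views point into.
struct BackingSketch {
    std::string bytes;
};

// A non-owning view into the backing bytes.
struct ViewSketch {
    const char* data = nullptr;
    std::size_t size = 0;
};

// Hypothetical impl class: owns the backing storage for as long as the view exists.
class IndexImplSketch {
  public:
    static std::unique_ptr<IndexImplSketch> map_to_backing(
        std::unique_ptr<BackingSketch> backing
    ) {
        assert(backing != nullptr);
        std::unique_ptr<IndexImplSketch> impl(new IndexImplSketch());
        impl->backing_ = std::move(backing);
        impl->view_ = ViewSketch{impl->backing_->bytes.data(), impl->backing_->bytes.size()};
        return impl;
    }

    const ViewSketch& view() const noexcept { return view_; }

  private:
    IndexImplSketch() = default;

    // Declared before view_ so that it is destroyed after view_
    // (members are destroyed in reverse declaration order).
    std::unique_ptr<BackingSketch> backing_;
    ViewSketch view_;
};
```

Because the backing object lives on the heap and only the owning pointer changes hands, moving the impl handle around does not invalidate the view.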