Skip to content

Validate IndexHNSW2Level storage type during deserialization and search (#5113)#5113

Closed
scsiguy wants to merge 5 commits into
facebookresearch:mainfrom
scsiguy:export-D101243603
Closed

Validate IndexHNSW2Level storage type during deserialization and search (#5113)#5113
scsiguy wants to merge 5 commits into
facebookresearch:mainfrom
scsiguy:export-D101243603

Conversation

@scsiguy
Copy link
Copy Markdown
Contributor

@scsiguy scsiguy commented Apr 17, 2026

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603

@meta-cla meta-cla Bot added the CLA Signed label Apr 17, 2026
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented Apr 17, 2026

@scsiguy has exported this pull request. If you are a Meta employee, you can view the originating Diff in D101243603.

scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 17, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 17, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 17, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 17, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added 4 commits April 18, 2026 14:52
Summary:
Add utility functions to FaissException.h for capturing and rethrowing exceptions across OpenMP parallel region boundaries, and migrate IndexIVF to use them.

New helpers in FaissException.h:

- omp_capture_exception(ex): Captures the current exception into a shared exception_ptr using #pragma omp critical. Only records the first exception.
- omp_capture_exception(ex, cleanup): Overload with a per-thread cleanup callable (e.g. setting an interrupt flag) that runs inside the critical section.
- omp_rethrow_if_exception(ex): Rethrows the captured exception on the main thread after the parallel region completes.

The exception_ptr approach preserves the full exception during capture and preserves the original exception type on rethrow.

IndexIVF migration (search, search_preassigned, range_search_preassigned):

- Replace std::mutex + std::string exception_string with std::exception_ptr + omp_capture_exception/omp_rethrow_if_exception.
- Replace bool interrupt with std::atomic<bool> (the previous non-atomic bool was read without synchronization in the omp for loop while being written inside a mutex-guarded catch block — undefined behavior). Use relaxed atomic loads for the interrupt short-circuit check since there is no way to fully close the race between interrupt being set and a thread being in the middle of computation.
- Replace catch (const std::exception&) with catch (...) to capture all exception types.
- Original exception type is now preserved on rethrow instead of being reformatted into a FaissException wrapping demangle_cpp_symbol + what().
- Replace InterruptCallback::is_interrupted() + manual flag set with InterruptCallback::check(). check() throws on interrupt, which is captured by omp_capture_exception and sets the interrupt flag via the cleanup lambda. This eliminates the separate post-region interrupt check — all error paths now flow through the unified exception capture/rethrow mechanism.
- range_search_preassigned has no interrupt short-circuit check in its loops, so it uses omp_capture_exception(ex) without a cleanup lambda and has no interrupt variable.
- Wrap all #pragma omp for loop bodies in try/catch blocks so exceptions are caught within the worksharing construct (required by the OpenMP specification). The outer try/catch at the #pragma omp parallel level remains as a safety net for code outside worksharing constructs.
- Remove redundant try/catch from scan_one_list and scan_list_func lambdas since all call sites are now protected by the loop body try/catch.

Differential Revision: D101233059
)

Summary:
Pull Request resolved: facebookresearch#5105

Exceptions thrown inside OpenMP worksharing constructs in IndexFlatCodes::search call std::terminate because the OpenMP specification does not allow exceptions to escape parallel regions. This is triggered by corrupt serialized index data that causes allocation failures or assertion errors in GenericFlatCodesDistanceComputer, but can occur with any exception thrown during the search loop.

Wrap the OpenMP parallel body in IndexFlatCodes::Run_search_with_decompress with per-iteration try/catch blocks that capture exceptions via std::exception_ptr and re-throw them on the main thread after the parallel region completes. The DC constructor is also wrapped separately since it runs before the worksharing loop.

SingleResultHandler is moved from stack to heap allocation (std::unique_ptr) because the constructor is now inside a try/catch block. If a stack-allocated object's constructor throws, the object never comes into existence, but OpenMP requires all threads to participate in the worksharing loop that follows — any reference to the uninitialized object would be undefined behavior. With std::unique_ptr, the variable starts as nullptr before the try block, remains nullptr if construction fails, and the interrupt flag ensures all loop iterations skip via continue without dereferencing it.

Differential Revision: D101008838
Summary:
Apply the same OpenMP exception-safety fix from IndexFlatCodes::search to IndexNNDescent::search and IndexNSG::search. Both methods have identical OMP structure: a parallel region constructs per-thread VisitedTable and DistanceComputer objects, then a worksharing loop calls the graph search function. Exceptions thrown inside either the constructor or the loop body (e.g. from corrupt deserialized graph state) call std::terminate because OpenMP does not allow exceptions to escape worksharing constructs.

Wrap the per-thread setup and the per-iteration loop body in try/catch blocks that capture exceptions via std::exception_ptr and re-throw on the main thread after the parallel region completes.

Potential exception sources in the OMP region:
- VisitedTable(ntotal): std::bad_alloc if ntotal is corrupt/huge. ntotal is validated as >= 0 during deserialization but has no upper bound.
- storage_distance_computer(storage): may throw from get_distance_computer() if the storage index has corrupt state.
- nndescent.search() / nsg.search(): throws FaissException if the graph has not been built (has_built check), or std::bad_alloc from internal vector allocations with corrupt search_L.

Differential Revision: D101031002
Summary:
Add null-quantizer guards to IndexIVF entry points (search, range_search, add_with_ids, train_q1) and to initialize_IVFPQ_precomputed_table. These prevent the null-deref crashes found by the fuzzer when a serialized IVF index contains a "null" fourcc for the quantizer sub-index.

The guards are placed at the usage sites rather than in read_ivf_header because some callers intentionally deserialize IVF indexes with a null quantizer (serializing the coarse quantizer separately to avoid double-serialization, then reattaching it after reading).

Differential Revision: D101236489
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 18, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
@meta-codesync meta-codesync Bot changed the title Validate IndexHNSW2Level storage type during deserialization and search Validate IndexHNSW2Level storage type during deserialization and search (#5113) Apr 18, 2026
@scsiguy scsiguy force-pushed the export-D101243603 branch from 282253e to 684ddf7 Compare April 18, 2026 21:55
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 18, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 18, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
scsiguy added a commit to scsiguy/faiss that referenced this pull request Apr 18, 2026
…ch (facebookresearch#5113)

Summary:

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
…ch (facebookresearch#5113)

Summary:
Pull Request resolved: facebookresearch#5113

Add validation that IndexHNSW2Level (fourcc "IHN2") has storage of an appropriate type, both at deserialization time and at search time.

IndexHNSW2Level::search() uses dynamic_cast to dispatch between Index2Layer and IndexIVFPQ storage types. When storage is null or a different type (e.g. IndexFlat from corrupt serialized data, or a programmatically misconfigured index), the dynamic_cast returns nullptr which is then unconditionally dereferenced, causing a segfault.

Deserialization-time fix: After reading the HNSW storage sub-index for IHN2, validate that storage is non-null and is either Index2Layer or IndexIVFPQ.

Search-time defense-in-depth: Add a FAISS_THROW_IF_NOT check on the dynamic_cast result in IndexHNSW2Level::search() before dereferencing. This protects against programmatically constructed indexes that bypass deserialization validation.

Reviewed By: mnorris11

Differential Revision: D101243603
@scsiguy scsiguy force-pushed the export-D101243603 branch from 684ddf7 to dc03e2e Compare April 18, 2026 21:59
@meta-codesync meta-codesync Bot closed this in 349df70 Apr 19, 2026
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented Apr 19, 2026

This pull request has been merged in 349df70.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant