Validate id_map size matches ntotal in IndexBinaryIDMap deserialization (#4917) #4917

Closed
scsiguy wants to merge 2 commits into facebookresearch:main from scsiguy:export-D96333421

Conversation

scsiguy (Contributor) commented Mar 12, 2026

Summary:

Add a check in the `IBMp`/`IBM2` deserialization path that rejects
input where `id_map.size() != ntotal`. Without this validation,
a crafted index with a mismatched `id_map` would silently deserialize,
and subsequent `search()` or `reconstruct()` calls would use an
inconsistent ID mapping. For `IndexBinaryIDMap2`, the downstream
`construct_rev_map()` call would also build a corrupted reverse map.
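
For illustration only, the check described above could look roughly like the following. This is a minimal sketch, not the code from the diff: `LoadedIDMap` and `validate_id_map` are hypothetical names, the struct is a stand-in for the fields read during deserialization, and `std::runtime_error` is used in place of whatever exception type the actual change throws.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the fields read while deserializing an ID-map index.
struct LoadedIDMap {
    int64_t ntotal = 0;            // number of vectors the index claims to hold
    std::vector<int64_t> id_map;   // external ID for each internal position
};

// Reject input whose id_map length disagrees with ntotal, so later lookups
// never index id_map with a position it does not cover.
void validate_id_map(const LoadedIDMap& idx) {
    if (idx.ntotal < 0 ||
        idx.id_map.size() != static_cast<std::size_t>(idx.ntotal)) {
        throw std::runtime_error(
                "id_map size " + std::to_string(idx.id_map.size()) +
                " does not match ntotal " + std::to_string(idx.ntotal));
    }
}
```

Running the check immediately after the fields are read, and before any reverse map is built, means a mismatched input is rejected before derived structures can inherit the inconsistency.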

Reviewed By: mnorris11

Differential Revision: D96333421

@meta-cla meta-cla Bot added the CLA Signed label Mar 12, 2026
meta-codesync Bot (Contributor) commented Mar 12, 2026

@scsiguy has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96333421.

scsiguy added a commit to scsiguy/faiss that referenced this pull request Mar 12, 2026
…on (facebookresearch#4917)

Summary:

Add a check in the `IBMp`/`IBM2` deserialization path that rejects
input where `id_map.size() != ntotal`.  Without this validation,
a crafted index with a mismatched `id_map` would silently deserialize,
and subsequent `search()` or `reconstruct()` calls would use an
inconsistent ID mapping.  For `IndexBinaryIDMap2`, the downstream
`construct_rev_map()` call would also build a corrupted reverse map.

Reviewed By: mnorris11

Differential Revision: D96333421
scsiguy added a commit to scsiguy/faiss that referenced this pull request Mar 12, 2026
…on (facebookresearch#4917)

Summary:
Pull Request resolved: facebookresearch#4917

Add a check in the `IBMp`/`IBM2` deserialization path that rejects
input where `id_map.size() != ntotal`.  Without this validation,
a crafted index with a mismatched `id_map` would silently deserialize,
and subsequent `search()` or `reconstruct()` calls would use an
inconsistent ID mapping.  For `IndexBinaryIDMap2`, the downstream
`construct_rev_map()` call would also build a corrupted reverse map.

Reviewed By: mnorris11

Differential Revision: D96333421
meta-codesync Bot changed the title from "Validate id_map size matches ntotal in IndexBinaryIDMap deserialization" to "Validate id_map size matches ntotal in IndexBinaryIDMap deserialization (#4917)" Mar 12, 2026
Summary:
In `read_binary_multi_hash_map()`, individual `ilsz` values are read
from the untrusted bitstring buffer and used to control an inner loop
that calls `BitstringReader::read()`. The existing buffer-size check
validates total expected bits assuming `sum(ilsz) == ntotal`, but
crafted `ilsz` values can violate this invariant and cause the reader
to consume bits past the end of the buffer.

Add a running `total_ids` counter and reject any `ilsz` that would
push the cumulative ID count past `ntotal`, throwing a descriptive
`FaissException` before the overrun occurs.
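
The running-counter idea can be sketched as below. This is an assumption-laden illustration rather than the actual patch: `check_list_sizes` is a hypothetical helper, the `list_sizes` vector stands in for the `ilsz` values that the real code decodes one at a time from the bitstring buffer, and `std::runtime_error` replaces `FaissException` so the snippet builds without Faiss.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: list_sizes plays the role of the untrusted ilsz values.
// Throws before any list would push the cumulative ID count past ntotal.
void check_list_sizes(const std::vector<uint64_t>& list_sizes, uint64_t ntotal) {
    uint64_t total_ids = 0;  // IDs accounted for so far; always <= ntotal
    for (std::size_t i = 0; i < list_sizes.size(); ++i) {
        const uint64_t ilsz = list_sizes[i];
        // Compare against the remaining budget instead of computing
        // total_ids + ilsz, which could wrap around for huge ilsz values.
        if (ilsz > ntotal - total_ids) {
            throw std::runtime_error(
                    "list " + std::to_string(i) + " has size " +
                    std::to_string(ilsz) + " but only " +
                    std::to_string(ntotal - total_ids) + " ids remain");
        }
        total_ids += ilsz;
    }
}
```

In a real reader the comparison would sit inside the decoding loop, ahead of the inner loop that consumes `ilsz` entries, so the rejection happens before any out-of-bounds read.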

Differential Revision: D96329613
scsiguy added a commit to scsiguy/faiss that referenced this pull request Mar 13, 2026
…on (facebookresearch#4917)

Summary:

Add a check in the `IBMp`/`IBM2` deserialization path that rejects
input where `id_map.size() != ntotal`.  Without this validation,
a crafted index with a mismatched `id_map` would silently deserialize,
and subsequent `search()` or `reconstruct()` calls would use an
inconsistent ID mapping.  For `IndexBinaryIDMap2`, the downstream
`construct_rev_map()` call would also build a corrupted reverse map.

Reviewed By: mnorris11

Differential Revision: D96333421
@meta-codesync meta-codesync Bot closed this in cdb7254 Mar 13, 2026
meta-codesync Bot (Contributor) commented Mar 13, 2026

This pull request has been merged in cdb7254.
