I ran a test on Lucene 9.4 where I tried to force merge 2 million vectors with
dimension 768. It failed with
java.lang.IllegalStateException: Vector data length 3070061568 not matching
size=999369 * dim=768 * byteSize=4 = -1224905728
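The problem is that the expected byte length (size * dim * byteSize) is computed
as a 32-bit int. The true value, 3070061568, exceeds Integer.MAX_VALUE, so the
product wraps around to -1224905728 and the validation check fails. The bug snuck
in during the work to enable int8 values, which switched a long to an int: #1054.
The error does not occur in versions before 9.4.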
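The overflow can be reproduced outside Lucene with the numbers from the error message. The following is a minimal sketch, not the actual Lucene code: the class and method names (`VectorSizeOverflow`, `expectedBytesAsInt`, `expectedBytesAsLong`) are hypothetical and only illustrate the difference between doing the multiplication in `int` versus `long`.

```java
// Hypothetical standalone illustration; not taken from Lucene94HnswVectorsReader.
class VectorSizeOverflow {

    // Multiplies entirely in 32-bit int arithmetic, so the result
    // silently wraps around once it exceeds Integer.MAX_VALUE.
    public static int expectedBytesAsInt(int size, int dim, int byteSize) {
        return size * dim * byteSize;
    }

    // Widens the first operand to long so the whole product is
    // evaluated in 64-bit arithmetic and cannot wrap for these inputs.
    public static long expectedBytesAsLong(int size, int dim, int byteSize) {
        return (long) size * dim * byteSize;
    }

    public static void main(String[] args) {
        int size = 999369;   // vectors in the segment, from the error message
        int dim = 768;       // vector dimension
        int byteSize = 4;    // bytes per float32 component

        System.out.println(expectedBytesAsInt(size, dim, byteSize));  // -1224905728
        System.out.println(expectedBytesAsLong(size, dim, byteSize)); // 3070061568
    }
}
```

The `int` result matches the negative value reported in the exception, while the `long` result matches the actual vector data length.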
java.lang.IllegalStateException: Vector data length 3070061568 not matching size=999369 * dim=768 * byteSize=4 = -1224905728
at org.apache.lucene.core@9.4.0/org.apache.lucene.codecs.lucene94.Lucene94HnswVectorsReader.validateFieldEntry(Lucene94HnswVectorsReader.java:185)
at org.apache.lucene.core@9.4.0/org.apache.lucene.codecs.lucene94.Lucene94HnswVectorsReader.readFields(Lucene94HnswVectorsReader.java:156)
at org.apache.lucene.core@9.4.0/org.apache.lucene.codecs.lucene94.Lucene94HnswVectorsReader.readMetadata(Lucene94HnswVectorsReader.java:103)
at org.apache.lucene.core@9.4.0/org.apache.lucene.codecs.lucene94.Lucene94HnswVectorsReader.<init>(Lucene94HnswVectorsReader.java:64)
at org.apache.lucene.core@9.4.0/org.apache.lucene.codecs.lucene94.Lucene94HnswVectorsFormat.fieldsReader(Lucene94HnswVectorsFormat.java:157)
Thanks to @ebadyano and @benwtrent for helping me track this down so quickly.
jtibshirani changed the title from "Lucene94HnswVectorsFormat validation fails with large datasets" to "Lucene94HnswVectorsFormat validation fails with large segments" on Nov 2, 2022