DataOutput.writeGroupVInts throws IntegerOverflow exception during merging #13373

Closed
iamsanjay opened this issue May 15, 2024 · 5 comments · Fixed by #13376

@iamsanjay (Contributor)

Description

As discussed on the email list, DataOutput.writeGroupVInts throws an integer-overflow ArithmeticException during merging. The goal is to find the root cause and also to improve the exception message.

Exception in thread "Lucene Merge Thread #202" org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArithmeticException: integer overflow
    at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:735)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:727)
Caused by: java.lang.ArithmeticException: integer overflow
    at java.base/java.lang.Math.toIntExact(Math.java:1135)
    at org.apache.lucene.store.DataOutput.writeGroupVInts(DataOutput.java:354)
    at org.apache.lucene.codecs.lucene99.Lucene99PostingsWriter.finishTerm(Lucene99PostingsWriter.java:379)
    at org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:173)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.write(Lucene90BlockTreeTermsWriter.java:1097)
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter.write(Lucene90BlockTreeTermsWriter.java:398)
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:95)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:205)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:209)
    at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:298)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5252)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4740)
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6541)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:639)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:700)

More context from the reporter

Looking deeper into this, I think we overflowed a term frequency field. Looking at some statistics, in a previous release we had 1,288,526,281 instances of a certain field, and this would be larger now. Each of these would have had a limited set of values, but crucially nearly all of them would have had the term "positional" or "non-positional" added to the document.

There is no good reason to do this today; we should just turn this into a boolean field and update the UI. I will do this and report back.

Do you think a patch adding a try/catch with a more informative log message would be appreciated by the community? E.g. mentioning the field name in the exception?
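For illustration, the kind of wrapping being suggested might look like the sketch below; the call site and the fieldName variable are hypothetical, not actual Lucene code:

// Hypothetical sketch: re-throw with the field name, so the user can
// tell which field overflowed instead of seeing a bare "integer overflow".
try {
  writePostingsForTerm(term); // hypothetical call that ends up in writeGroupVInts
} catch (ArithmeticException e) {
  throw new IllegalStateException(
      "integer overflow while writing postings for field '" + fieldName + "'", e);
}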

The index that had an issue when merging into one segment definitely had more than 1 billion occurrences of the word "positional" in it. I hope to be able to give a closer number once re-indexing has finished with a "work-around".

Of course the "work-around" is to just fix this correctly, by not having that word so often in the index and definitely not as docs, freqs and postings.

For background information:

The use case was to find the set of documents that were either "positional" or "non-positional". This was present in the first check-in of our code 18 years ago! Since then our data has grown a bit ;) The code was using Lucene 1.4.3 at that time. Users would search using this as what would now be a facet, type:positional. I changed this to a field with only IndexOptions.DOCS, which is called 'positional' and searched as positional:yes, rewriting the previous query syntax behind the scenes so as not to break any user tools.

Version and environment details

No response

iamsanjay (Contributor, Author) commented May 15, 2024

The code snippet below is from the 9_10 branch, where this issue has been observed. As per the latest changes for 10, a few of these lines have been moved out of this method into a new method in another class. The relevant frames are:

    at java.base/java.lang.Math.toIntExact(Math.java:1135)
    at org.apache.lucene.store.DataOutput.writeGroupVInts(DataOutput.java:354)

public void writeGroupVInts(long[] values, int limit) throws IOException {
  int off = 0;
  // encode each group
  while ((limit - off) >= 4) {
    byte flag = 0;
    groupVIntBytes.setLength(1);
    flag |= (encodeGroupValue(Math.toIntExact(values[off++])) - 1) << 6;
    flag |= (encodeGroupValue(Math.toIntExact(values[off++])) - 1) << 4;
    flag |= (encodeGroupValue(Math.toIntExact(values[off++])) - 1) << 2;
    flag |= (encodeGroupValue(Math.toIntExact(values[off++])) - 1);
    groupVIntBytes.setByteAt(0, flag);
    writeBytes(groupVIntBytes.bytes(), groupVIntBytes.length());
  }
  // tail vints
  for (; off < limit; off++) {
    writeVInt(Math.toIntExact(values[off]));
  }
}
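Every value passes through Math.toIntExact, which is exactly where the reported ArithmeticException originates: it throws as soon as one of the buffered long values no longer fits in an int. A minimal standalone demonstration (not Lucene code):

long fits = 42L;
long tooBig = Integer.MAX_VALUE + 1L; // e.g. an accumulated count past 2^31 - 1

int ok = Math.toIntExact(fits);      // returns 42
int boom = Math.toIntExact(tooBig);  // throws java.lang.ArithmeticException: integer overflow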

easyice (Contributor) commented May 16, 2024

Sorry for missing the email list. It seems the docDeltaBuffer should not overflow, just from reading the code, so I will try to reproduce this issue. Could you show me your source code for indexing, and some sample data? @iamsanjay

@JervenBolleman

Hi @easyice, I am the original reporter on the mailing list.

As the code around indexing is a bit abstracted, it might be hard to follow. What I do have is the index that failed merging; it is, however, 173 GB xz-compressed. I could use Luke or a similar tool to extract more information for the Lucene team.

The field type that we are indexing into is:

FieldType UNSTORED_POSITIONAL = new FieldType();
UNSTORED_POSITIONAL.setOmitNorms(true);
UNSTORED_POSITIONAL.setIndexOptions(IndexOptions.DOCS_AND_FREQS_AND_POSITIONS);
UNSTORED_POSITIONAL.setStored(false);
UNSTORED_POSITIONAL.setTokenized(false);
UNSTORED_POSITIONAL.freeze();

Then we add fields like so:

doc.add(new Field("type", value.toLowerCase(Locale.US), UNSTORED_POSITIONAL));

There are over 1,177,800,000 documents in this index, all with the term "positional" at least once, and on average three fields of this type in each document. At that scale the per-term occurrence count can easily run past Integer.MAX_VALUE (2,147,483,647).

So to create local sample data I would just do something like this ;)

for (int i = 0; i < 2_000_000_000; i++) {
    Document doc = new Document();
    doc.add(new Field("type", "number", UNSTORED_POSITIONAL));
    if (i % 2 == 0) {
        doc.add(new Field("type", "even", UNSTORED_POSITIONAL));
    } else {
        doc.add(new Field("type", "un-even", UNSTORED_POSITIONAL));
    }
    writer.addDocument(doc);
}
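For completeness, a minimal harness around that loop might look like the sketch below; the directory path and analyzer choice are assumptions, not from the original report:

import java.nio.file.Path;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Illustrative harness for the repro loop above; path and analyzer are
// assumptions. The forceMerge(1) call drives the single-segment merge
// that failed in the report.
try (Directory dir = FSDirectory.open(Path.of("/tmp/overflow-repro"));
     IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new KeywordAnalyzer()))) {
  // ... run the loop above ...
  writer.forceMerge(1);
}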

easyice (Contributor) commented May 16, 2024

Thank you @JervenBolleman. I have found the cause of the issue with @gf2121; I will raise a PR later.

@mikemccand (Member)

Here is the java-user discussion that led to this issue.

Thank you for reporting this @iamsanjay! It looks like it was a real bug, phew, and somewhat serious (not sure).

And thank you @easyice and @gf2121 for the quick repro/fix.
