MultiLevelSkipListReader doesn't handle large offsets #29
Comments
Done.
Hi! Even with this fix, I still have a problem (segfault) with a big index: #0 0x00007ff0ea44d725 in operator at /home/albert/albert/LucenePlusPlus/include/Array.h:116 It is something like 200 GB of data for 70,000,000 docs. -rw-rw-rw- 1 alian users 148641417159 22 janv. 16:11 _h75q.fdt Do you have any idea, even a small one? Thanks!
I don't have an idea off-hand, but generally I'd look for other offsets of type int32; they could be in both class members and local variables. Sorry, nothing more concrete at the moment. We've been using Lucene++ for some time now and haven't noticed other crashes we could attribute to it (our indices do not grow past 10-15 GB, though, and phrase queries are not typical for our applications). That said, I would consider the possibility of memory corruption happening somewhere else in the program.
Thanks a lot for your answer; it's good to see that I'm not alone at sea :-) I need to find a test case where I can reproduce it easily, and the test must create the index itself so that I can avoid working on a production server :-/ I'll give you feedback if I find something...
It could be really tricky to reproduce the bug. I was only able to reproduce the original one by carefully picking queries, and queries that reproduced the crash on one version of the index wouldn't reproduce it on the next version. So, if you ask me: make a snapshot of whatever index it crashed on and try exactly the same query you have in the core dump. If you're lucky, it will repro. If not, examining the core more carefully is your best bet. Some bugs are just impossible to design a test for :-(
Just noticed that src/core/include/_MMapDirectory.h has offsets as int32 as well. It's easy to fix (look through the C++ file too, there are casts there). Hope that helps.
Also, MiscUtils::arrayCopy uses int32_t as an offset, and this method is used inside the memmapped index input. Changing the argument types to int64 (as well as MMapIndexInput _length and bufferPosition) has sorted things out for us. Yeah, we just recently switched to memmapped index input :-)
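As a rough illustration of the change described in the comment above, here is a minimal sketch of a copy helper that takes 64-bit offsets. The name `arrayCopy64` and its signature are hypothetical and are not the actual `MiscUtils::arrayCopy` declaration; the sketch only shows the idea of carrying offsets and lengths as `int64_t` so that positions beyond 2 GiB in a memory-mapped file are not narrowed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (assumption, not the Lucene++ API): copies `length`
// bytes from src + srcOffset to dest + destOffset. Offsets and length are
// int64_t, so values above INT32_MAX are passed through unchanged.
inline void arrayCopy64(const uint8_t* src, int64_t srcOffset,
                        uint8_t* dest, int64_t destOffset,
                        int64_t length) {
    // Negative values would indicate an overflow or a caller bug upstream.
    assert(srcOffset >= 0 && destOffset >= 0 && length >= 0);
    // The source and destination ranges must not overlap (memcpy contract).
    std::memcpy(dest + destOffset, src + srcOffset,
                static_cast<std::size_t>(length));
}
```

With `int32_t` parameters, an offset such as 148641417159 (the size of the .fdt file mentioned earlier in the thread) could not be passed without narrowing.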
Thanks for highlighting this.
Sure :-) That looks like a carbon copy from Java, but arrayCopy is not used in Java Lucene.
MultiLevelSkipListReader.cpp:79 should have the (int32_t) cast removed, since it truncates file pointers larger than what int32 can hold:
if (level > 0 && lastChildPointer > (int32_t)skipStream[level - 1]->getFilePointer())
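To see why the cast matters, here is a minimal sketch of the two comparisons. The function names are hypothetical stand-ins and are not Lucene++ code; plain `int64_t` values take the place of `lastChildPointer` and the result of `getFilePointer()`. Note that before C++20 the result of narrowing an out-of-range value to `int32_t` is implementation-defined; on common two's-complement platforms it wraps modulo 2^32, which is what the sketch assumes.

```cpp
#include <cassert>
#include <cstdint>

// Comparison in the shape quoted in the issue: the file pointer is narrowed
// to 32 bits before comparing (hypothetical stand-in, not Lucene++ code).
inline bool pointerAheadTruncated(int64_t lastChildPointer, int64_t filePointer) {
    return lastChildPointer > static_cast<int32_t>(filePointer);
}

// Same comparison with the cast removed: both sides stay 64-bit.
inline bool pointerAheadFull(int64_t lastChildPointer, int64_t filePointer) {
    return lastChildPointer > filePointer;
}
```

For small offsets the two functions agree. For a file pointer of 5000000000 and a child pointer of 4000000000, the full comparison is false, while the truncated one sees 5000000000 wrapped to 705032704 and returns true, so the reader would take the wrong branch.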