Crash in intrusive_list.hpp line 174 #3205
Comments
After restarting RethinkDB it appears to be working fine without any data corruption. |
Hi @timjrobinson, sorry you ran into this, and thanks for the bug report. I reformatted your backtrace for legibility:
@mlucy, this looks changefeed-related. |
A few additional questions @timjrobinson
Thank you for your help! Please also let us know in case this happens a second time. |
@AtnNn -- is there a way to go from that backtrace to line numbers? It would really help to know which line of |
The debug symbols are available in the buildbot assets folder, in this case rethinkdb-dbg_1.15.0-1-0ubuntu1~trusty_amd64.deb. Deb files can be extracted with |
@AtnNn -- thanks! That line is the end of the second
There are no obvious race conditions that we might be falling afoul of. @danielmewes -- is it plausible for destroying an @timjrobinson -- thanks for reporting this! Is there any chance you have a core dump from the crash? |
In posting this I just realized there's a bug in this code: if connections attempt to listen to the same user / lobby / queuer ID, it will close the existing connection -_-. I use this function like so:
It closes the connection like so:
|
@mlucy How do I get a core dump? Also db-2 crashed today with what looks to be the same error:
I upgraded to RethinkDB 1.15.1 yesterday hoping it might fix the issue, but it looks like it didn't. |
@timjrobinson: if you have core dumps enabled there will be a file |
Sorry to hear it crashed again! At least it's reproducible, which will help a lot with tracking it down. If you could get a core file for us, that would be a big help. In the meantime I'm going to take another look at the surrounding code. |
What directory should this core file be in on Ubuntu 14.04? I checked /var/lib/rethinkdb/instances.d/db-2, /etc/rethinkdb/instances.d and /usr/bin, and it's not in any of those folders. I guess it doesn't work until I run ulimit -c unlimited. To get core dumps working, I logged in as root and did the following:
Should this be all I need to do? |
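For readers following along, one quick way to confirm whether core dumps are enabled is to inspect the shell's core size limit and the kernel's core pattern. The sketch below is not taken from the thread: the function names `core_limit` and `core_status` are hypothetical, and it assumes a Linux system with /proc mounted. Keep in mind that a limit set with `ulimit` applies to the current shell and the processes started from it, so a server launched by an init script may need the limit raised in its own environment.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for checking core dump settings.

# Print the soft core file size limit for this shell ("0", a number, or "unlimited").
core_limit() {
    ulimit -c
}

# Print "disabled" when the limit is 0, otherwise "enabled".
core_status() {
    limit=$(core_limit)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

echo "core size limit: $(core_limit)"
echo "core dumps: $(core_status)"

# On Linux, this file controls the name and location of core files.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi
```

If the pattern is a plain relative name, the file is written to the crashing process's working directory, which would be consistent with a core file showing up in / for a daemon.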
db-4 crashed today with the same error:
A core file was created in / but it's 5.3GB. What's the best way of getting it to you guys? |
@timjrobinson can you write an email to mike at rethinkdb.com ? He will send you directions on how to upload the core file. I also recommend gzipping it to reduce its size a bit. (@mglukhovsky ) |
(@timjrobinson Also, note that the core file will contain data from your database. If you have sensitive data in there, we are happy to sign an NDA.) |
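The gzip suggestion above can be sketched as a small helper. This is a minimal illustration, not the maintainers' procedure: `compress_core` is a hypothetical name, and the path passed to it is whatever core file the crash produced. It writes a .gz copy next to the original, leaves the original in place, and verifies the archive before anything is uploaded.

```shell
#!/bin/sh
# compress_core (hypothetical helper): gzip a core file to <file>.gz,
# keep the original, and verify the archive's integrity.
compress_core() {
    src=$1
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    # -c writes to stdout, so the original core file is left untouched.
    gzip -c "$src" > "$src.gz" || return 1
    # -t checks that the compressed file is intact.
    gzip -t "$src.gz"
}

# Compress the file given on the command line, if any.
if [ "$#" -gt 0 ]; then
    compress_core "$1"
fi
```

Keeping the original until the upload has succeeded costs disk space, but it avoids losing the only copy of a crash that may take a day to reproduce.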
@danielmewes, I've worked with @timjrobinson to transfer the core file to our internal servers, so you should be all set. |
Thanks @timjrobinson, @mglukhovsky . Looking at this now. |
Alright, @danielmewes and I looked at this together and we think we have a fix for this. @timjrobinson -- if we put together a custom binary for you, could you try running it and see if this problem goes away? @AtnNn -- could we put together a custom binary based on https://github.com/rethinkdb/rethinkdb/tree/mlucy_3205 ? (It's branched off |
(That branch fixes the problem and also turns on some guarantees, so we'll have more information if we misdiagnosed it and the server crashes again.) |
Yes. I will build a package for Trusty amd64 from the mlucy_3205 branch. |
Here is the package: http://atnnn.github.io/rethinkdb/rethinkdb_1.15.1+1+g505d7d~0trusty_amd64.deb It can be installed using dpkg, replacing an already installed rethinkdb_1.15.1 package:
|
@timjrobinson -- not to be a bother, but did you ever get a chance to test this? |
Sorry I haven't had a chance yet. Will try it tonight. |
@timjrobinson -- Thanks! Here's hoping it fixes the problem. |
@mlucy I would like to ship the likely fix you've implemented with 1.15.2. I'm happy to do the code review so we can merge it today. |
I've installed it and haven't had any crashes yet. It usually crashes once every 24 hours, so I'll let you guys know in a day or two. |
Thank you for your feedback @timjrobinson . |
Alright, this is in next and v1.15.x . @timjrobinson -- if it turns out this does crash in production even after the fix, please re-open this and we'll look into it again. (Also, thanks again for all the help.) |
One of my servers in my 6 node cluster just crashed with the following error:
RethinkDB version:
Ubuntu version 14.04