
Database corruption during heavy load #2928

Closed
JosephHewitt opened this issue Aug 17, 2014 · 12 comments

@JosephHewitt

I've experienced this problem three times now, so it's not just a one-off. I'm running RethinkDB on a rather low-spec server (1 GB RAM with 3 GB swap, 2.0 GHz single-core processor).

When a large amount of data is written to RethinkDB, it crashes and refuses to start up again with the following error:
Guarantee failed: [state_ != state_in_use || extent_use_refcount == 0]
and
backtrace_t::backtrace_t() at 0x884cf7d (/usr/bin/rethinkdb)
2: format_backtrace(bool) at 0x884d2f8 (/usr/bin/rethinkdb)
14: coro_t::run() at 0x886decd (/usr/bin/rethinkdb)

This can be reproduced by taking a table with 3 million entries and telling RethinkDB to create an index on one of the fields.

The only way I've found to fix the problem is to delete the entire database and instance, then start again.

@danielmewes
Member

Hi @JOE95443,
thank you for the issue report. I have a few more questions about your setup:

  • Which version of RethinkDB are you running?
  • Which version of Linux (or is this on OS X?)? What's the output of uname -a?
  • Are you using SSD or rotational storage?
  • Which file system is the RethinkDB data on? Are you using any special mount options?
  • Is this running in a virtual machine or some other virtualized environment?

We are about to ship a couple of fixes with RethinkDB 1.13.4 in the next few days, though at this point I can't say whether those will also fix this problem or whether it's something different.

@danielmewes danielmewes added this to the 1.14.x milestone Aug 18, 2014
@JosephHewitt
Author

Hi, I don't have access to my server for a while, so I'll answer what I can and fill in the rest later:

I only installed RethinkDB two weeks ago straight from your website, so I think I'm running the latest version, but I'll confirm later.

I'm running Ubuntu 14.04 Server with all updates installed and very little custom software.

I'm using rotational storage on a VPS. I'm not sure what kind of storage they're using in terms of RAID or disk speeds, but I know I'm not the only person using this disk or array of disks.

Sorry I can't provide more details right now.

@danielmewes
Member

@JOE95443 thank you for the information. We've just released RethinkDB 1.13.4 and I recommend updating to that version as a first step.

Can you send us a copy of the table files (rethinkdb_data directory) when this happens the next time? Just tar/gzip the whole directory. It should compress pretty well. Once you have the file, let us know and @mglukhovsky can set up a secure upload page for it.
I would like to take a look at the file to see how exactly it is corrupted.
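The archiving step above is a one-liner, assuming the data directory is named rethinkdb_data and sits in the current working directory (adjust the path for your setup):

```shell
# Create a compressed archive of the whole RethinkDB data directory.
# Assumes the directory is ./rethinkdb_data; adjust the path as needed.
tar -czf rethinkdb_data.tar.gz rethinkdb_data
```

Stop the rethinkdb process first so the files aren't being written to while you archive them.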

@JosephHewitt
Author

Hi, I haven't deleted the files from the last time it happened. Would you like me to compress and send these files or attempt to reproduce the problem on version 1.13.4 and send those files instead?

@danielmewes
Member

@JOE95443 the old files will do.
@mglukhovsky can you arrange an upload page for @JOE95443 ?

@mglukhovsky
Member

@JOE95443: thanks for offering to send a copy of your data directory, so we can track down this issue.

Send me an email at mike@rethinkdb.com, and I'll go ahead and set up a secure server that you can scp your data directory to -- thanks!

@mglukhovsky
Member

@danielmewes, @JOE95443 sent over a copy of his data files, and they're available on our internal servers.

@danielmewes
Member

This appears to be a 32-bit-only issue. I can reproduce it now, and I'm looking into what's causing it.

@danielmewes
Member

We had a bug that led to file sizes being represented as a 32-bit unsigned integer on 32-bit systems. That caused these crashes for tables bigger than 4 GB.
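For context, this kind of truncation is easy to demonstrate. The sketch below (in Python, not RethinkDB's actual C++ code) shows what storing a file size in an unsigned 32-bit field does to a table file larger than 4 GB: the size wraps around modulo 2^32.

```python
# Illustration only: what happens when a 64-bit file size is stored in
# an unsigned 32-bit integer, as in the bug described above. This is a
# standalone sketch, not RethinkDB's actual code.
MASK_32 = (1 << 32) - 1  # a uint32 keeps only the low 32 bits

def as_uint32(size_bytes):
    """Simulate storing a file size in a 32-bit unsigned field."""
    return size_bytes & MASK_32

five_gb = 5 * 1024 ** 3
print(as_uint32(five_gb))  # 1073741824: a 5 GB file "becomes" 1 GB
```

Any code that trusts the truncated value (e.g. to compute extent offsets) then reads or writes the wrong part of the file, which matches the corruption seen here.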

A fix is up in code review 2024 by @Tryneus .

@larkost / @AtnNn we should prepare a 1.14.1 point release once the fix is in v1.14.x.

@danielmewes
Member

Fixed in v1.14.x 11a01a4 and next.

@larkost / @AtnNn / @coffeemug Can we do a point release?

@AtnNn
Member

AtnNn commented Sep 5, 2014

Sorry, I only just saw your message. I'll get the gears in motion for a release of 1.14.1 early next week.

@AtnNn AtnNn modified the milestones: 1.14.1, 1.14.x Sep 5, 2014
@AtnNn
Member

AtnNn commented Sep 12, 2014

@JOE95443 The fix for this issue has been released in RethinkDB 1.14.1.
