
Database corruption during heavy load #2928

Closed
JosephHewitt opened this issue Aug 17, 2014 · 12 comments

@JosephHewitt

I've experienced this problem three times now, so it's not just a one-off. I'm running RethinkDB on a rather low-spec server (1 GB RAM with 3 GB swap, 2.0 GHz single-core processor).

When a large amount of data is written to RethinkDB, it crashes and refuses to start up again with the following error:
Guarantee failed: [state_ != state_in_use || extent_use_refcount == 0]
and
backtrace_t::backtrace_t() at 0x884cf7d (/usr/bin/rethinkdb)
2: format_backtrace(bool) at 0x884d2f8 (/usr/bin/rethinkdb)
14: coro_t::run() at 0x886decd (/usr/bin/rethinkdb)

This can be reproduced by taking a table with 3 million entries and telling RethinkDB to create an index on one of the fields.

The only way I've found to fix the problem is to delete the entire database and instance, then start again.

@danielmewes
Member

Hi @JOE95443,
thank you for the issue report. I have a few more questions about your setup:

  • Which version of RethinkDB are you running?
  • Which version of Linux (or is this on OS X?)? What's the output of uname -a?
  • Are you using SSD or rotational storage?
  • Which file system is the RethinkDB data on? Are you using any special mount options?
  • Is this running in a virtual machine or some other virtualized environment?

We are about to ship a couple of fixes with RethinkDB 1.13.4 in the next few days, though at this point I can't say whether those will also fix this problem or whether it's something different.

@danielmewes danielmewes added this to the 1.14.x milestone Aug 18, 2014
@JosephHewitt
Author

Hi, I don't have access to my server for a while, so I'll answer what I can and fill in the rest later:

I only installed RethinkDB two weeks ago straight from your website, so I think I'm running the latest version, but I'll confirm later.

I'm running Ubuntu 14.04 Server with all updates installed and very little custom software.

I'm using rotational storage on a VPS. I'm not sure what kind of storage they're using in terms of RAID or disk speeds, but I know I'm not the only person using this disk or array of disks.

Sorry I can't provide more details right now.

@danielmewes
Member

@JOE95443 thank you for the information. We've just released RethinkDB 1.13.4 and I recommend updating to that version as a first step.

Can you send us a copy of the table files (rethinkdb_data directory) when this happens the next time? Just tar/gzip the whole directory. It should compress pretty well. Once you have the file, let us know and @mglukhovsky can set up a secure upload page for it.
I would like to take a look at the file to see how exactly it is corrupted.
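The archiving step above is a one-liner, assuming the data directory is named rethinkdb_data and sits in the current working directory (adjust the path for your setup):

```shell
# Create a compressed archive of the whole RethinkDB data directory.
# Assumes the directory is ./rethinkdb_data; adjust the path as needed.
tar -czf rethinkdb_data.tar.gz rethinkdb_data
```

Stop the rethinkdb process first so the files aren't being written to while you archive them.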

@JosephHewitt
Author

Hi, I haven't deleted the files from the last time it happened. Would you like me to compress and send these files or attempt to reproduce the problem on version 1.13.4 and send those files instead?

@danielmewes
Member

@JOE95443 the old files will do.
@mglukhovsky can you arrange an upload page for @JOE95443 ?

@mglukhovsky
Member

@JOE95443: thanks for offering to send a copy of your data directory, so we can track down this issue.

Send me an email at mike@rethinkdb.com, and I'll go ahead and set up a secure server that you can scp your data directory to -- thanks!

@mglukhovsky
Member

@danielmewes, @JOE95443 sent over a copy of his data files, and they're available on our internal servers.

@danielmewes
Member

This appears to be a 32-bit-only issue. I can reproduce it now, and I'm looking into what's causing it.

@danielmewes
Member

We had a bug that led to file sizes being represented as a 32-bit unsigned integer on 32-bit systems. That caused these crashes for tables bigger than 4 GB.
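For context, this kind of truncation is easy to demonstrate. The sketch below (in Python, not RethinkDB's actual C++ code) shows what storing a file size in an unsigned 32-bit field does to a table file larger than 4 GB: the size wraps around modulo 2^32.

```python
# Illustration only: what happens when a 64-bit file size is stored in
# an unsigned 32-bit integer, as in the bug described above. This is a
# standalone sketch, not RethinkDB's actual code.
MASK_32 = (1 << 32) - 1  # a uint32 keeps only the low 32 bits

def as_uint32(size_bytes):
    """Simulate storing a file size in a 32-bit unsigned field."""
    return size_bytes & MASK_32

five_gb = 5 * 1024 ** 3
print(as_uint32(five_gb))  # 1073741824: a 5 GB file "becomes" 1 GB
```

Any code that trusts the truncated value (e.g. to compute extent offsets) then reads or writes the wrong part of the file, which matches the corruption seen here.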

A fix is up in code review 2024 by @Tryneus .

@larkost / @AtnNn we should prepare a 1.14.1 point release once the fix is in v1.14.x.

@danielmewes
Member

Fixed in v1.14.x 11a01a4 and next.

@larkost / @AtnNn / @coffeemug Can we do a point release?

@AtnNn
Member

AtnNn commented Sep 5, 2014

Sorry, I only just saw your message. I'll get the gears in motion for a release of 1.14.1 early next week.

@AtnNn AtnNn modified the milestones: 1.14.1, 1.14.x Sep 5, 2014
@AtnNn
Member

AtnNn commented Sep 12, 2014

@JOE95443 The fix for this issue has been released in RethinkDB 1.14.1.
