
Empty tables consume huge memory without any querying #3463

Closed
AlexLuya opened this issue Dec 19, 2014 · 13 comments
@AlexLuya

On my dev machine:

Ubuntu 14.04 64-bit, 8 GB memory
RethinkDB 1.15.2 64-bit
single node
no sharding
cache size = 512
(log output: "Using cache size of 512 MB")

I have 70 tables, and almost all of them are empty (some may contain 1-5 records for testing). After Ubuntu starts up, RethinkDB constantly consumes 5 GB of memory (RES, as you can see) even without any querying.
(screenshots: workspace 1_038, workspace 1_039)

My machine froze several times and had to be restarted by pushing the power button. I suspect that file system errors caused the current problem, but all reading/writing/querying works fine, so I can't see any way to check and confirm this suspicion.

Does the metadata record memory allocation info, i.e. which table or operation is consuming how much memory? If it does, and some utility could read and print this information, debugging would be easier.

@danielmewes
Member

Hi @AlexLuya, thanks for opening an issue here. We'll investigate this.

Can you send us a copy of your RethinkDB data directory? It is probably going to compress very well if you gzip/tar it, but in case it's too big for an email, please let us know and we can set up an upload server for you.
(my email: daniel at rethinkdb.com)

@danielmewes danielmewes added this to the 1.15.x milestone Dec 19, 2014
@danielmewes
Member

For reference to others reading this: I had tried to reproduce the issue, but after creating 70 tables and inserting a document into each, I only got a memory usage of ~700 MB. After restarting RethinkDB, it was even smaller in my case (~300 MB). So something odd is going on here.

@AlexLuya
Author

@danielmewes I found that the metadata file is 1.5 GB, and still >500 MB after compressing, so an upload server may be needed.

@danielmewes
Member

Wow! 1.5 GB is extremely big for the metadata file. Maybe that's where something went wrong...
In any case, we can investigate this once we have the data.

@mglukhovsky can you send @AlexLuya an email with the upload instructions?

@mglukhovsky
Member

@AlexLuya, just sent you an email, thanks for helping us track this down!

@AlexLuya
Author

@danielmewes Data has been uploaded, and I think the problem may be caused by data migration.

@mglukhovsky
Member

@danielmewes, a data file for this issue is available on our internal servers.

@danielmewes
Member

Thank you for the data @AlexLuya and @mglukhovsky. I'm going to take a look now.

@danielmewes danielmewes self-assigned this Dec 23, 2014
@danielmewes
Member

The problem is that
a) the branch history grows bigger and bigger over time. In this case it reached a serialized size of 10 MB.
b) we didn't estimate the expected change count correctly when re-writing the branch history. This stopped the cache from throttling the writes to remain approximately within its memory limit (1 MB in the case of the metadata cache). I still have to check the details of this, but I think that since a lot of shards were trying to rewrite the branch history at the same time, the unwritten dirty pages in the cache went way above the cache limit. Passing in a reasonable expected change count solves the memory consumption issue for me.

Part a) is probably tricky to fix and needs more thought, but I'm preparing a fix for part b) now.
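
To make part b) concrete, here is a minimal illustrative sketch of the kind of throttling involved (this is not the actual cache code; the class name and the byte-based accounting are simplified assumptions). If writers report an unrealistically low expected change count, the throttle admits too much work at once and unwritten dirty data can grow far past the cache limit:

```cpp
// Illustrative sketch only -- not RethinkDB's cache API.
#include <condition_variable>
#include <cstdint>
#include <mutex>

class write_throttle_t {
public:
    explicit write_throttle_t(uint64_t dirty_limit_bytes)
        : limit_(dirty_limit_bytes) {}

    // Block until the expected amount of dirty data fits under the limit,
    // then reserve it. An under-reported expected_change_bytes defeats the
    // throttle, which is essentially what happened here.
    void begin_write(uint64_t expected_change_bytes) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [&] {
            // Always admit a write when nothing is outstanding, so a single
            // oversized write can't deadlock the throttle.
            return outstanding_ == 0 ||
                   outstanding_ + expected_change_bytes <= limit_;
        });
        outstanding_ += expected_change_bytes;
    }

    // Release the reservation once the changes have been flushed to disk.
    void end_write(uint64_t expected_change_bytes) {
        std::lock_guard<std::mutex> lock(mutex_);
        outstanding_ -= expected_change_bytes;
        cond_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    const uint64_t limit_;
    uint64_t outstanding_ = 0;
};
```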

Generally the tables take a very long time to become available, which might also be caused by the big branch history that each shard is rewriting to disk over and over. We should at least combine those writes when possible.

@AlexLuya as a work-around, I recommend deleting the data directory and re-creating the tables. That should make the metadata small again.

@danielmewes
Member

I ended up implementing logic that limits the number of concurrent branch history flushes to disk. This allows multiple changes to the branch history to be combined into a single disk write.
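
Roughly, the idea looks like this (a conceptual sketch only, not the actual implementation; all names here are made up): at most one flush of the shared branch history runs at a time, and flush requests that arrive while a flush is in progress are folded into the next write instead of each triggering their own.

```cpp
// Conceptual sketch only -- not RethinkDB's implementation.
#include <functional>
#include <mutex>

class coalescing_flusher_t {
public:
    explicit coalescing_flusher_t(std::function<void()> write_to_disk)
        : write_to_disk_(std::move(write_to_disk)) {}

    // Called whenever the branch history changes and should be persisted.
    void request_flush() {
        std::unique_lock<std::mutex> lock(mutex_);
        dirty_ = true;
        if (flush_in_progress_) {
            // A flush is already running; it (or the one right after it)
            // will pick up this change. No additional disk write is issued.
            return;
        }
        flush_in_progress_ = true;
        while (dirty_) {
            dirty_ = false;
            lock.unlock();
            write_to_disk_();  // one write covers all changes seen so far
            lock.lock();
        }
        flush_in_progress_ = false;
    }

private:
    std::mutex mutex_;
    bool dirty_ = false;
    bool flush_in_progress_ = false;
    std::function<void()> write_to_disk_;
};
```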

The result is that not only does the memory issue go away, but the tables also become available virtually instantly after starting the server. Without this, they take about a minute even on an SSD while the branch history is rewritten over and over.
As another neat side effect, the on-disk size of the metadata file goes down to less than 300 MB, compared to up to ~1.4 GB previously. I expect it to shrink even more over time as garbage collection keeps moving data to the beginning of the file.

Unfortunately this is a slightly bigger change, so I'm not going to merge the fix into v1.15.x. Instead we are going to release it with RethinkDB 1.16.

Again, as a work-around for now, I recommend starting over with a fresh database (i.e. delete the RethinkDB data directory).

@danielmewes
Member

The fix is in branch daniel_3463_116 and in CR 2420 by @timmaxw.

@danielmewes danielmewes modified the milestones: 1.16, 1.15.x Dec 23, 2014
@AlexLuya
Author

Thanks, I've already done as you suggested.

@danielmewes
Member

Fixed in next as of 4087906. Going to ship with 1.16.0.
