Empty tables consume huge memory without any querying #3463
Hi @AlexLuya, thanks for opening an issue here. We'll investigate this. Can you send us a copy of your RethinkDB data directory? It is probably going to compress very well if you gzip/tar it, but in case it's too big for an email, please let us know and we can set up an upload server for you.
For reference to others reading this: I had tried to reproduce the issue, but after creating 70 tables and inserting a document into each, I only got a memory usage of ~700 MB. After restarting RethinkDB, it was even smaller in my case (~300 MB). So something odd is going on here.
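For anyone who wants to retry this, the reproduction attempt above amounts to roughly the following script, written against the RethinkDB Python driver as it existed in the 1.15/1.16 era (host, port, database, and table names are illustrative assumptions, not taken from the report):

```python
# Hypothetical reproduction sketch -- host/port/db/table names are assumptions.
import rethinkdb as r  # pre-2.4 driver API, matching the 1.15/1.16 era

conn = r.connect(host="localhost", port=28015, db="test")

# Create 70 tables and insert a single small document into each,
# mirroring the reproduction attempt described above.
for i in range(70):
    name = "table_{:02d}".format(i)
    r.table_create(name).run(conn)
    r.table(name).insert({"id": 1, "note": "single test document"}).run(conn)

conn.close()
```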
@danielmewes I found that the metadata is 1.5 GB in size (>500 MB after compressing), so an upload server may be needed.
Wow! 1.5 GB is extremely big for the metadata file. Maybe that's where something went wrong... @mglukhovsky can you send @AlexLuya an email with the upload instructions?
@AlexLuya, just sent you an email, thanks for helping us track this down!
@danielmewes The data has been uploaded, and I think the problem may be caused by data migration.
@danielmewes, a data file for this issue is available on our internal servers.
Thank you for the data @AlexLuya and @mglukhovsky. I'm going to take a look now.
The problem is twofold. Part a) is probably tricky to fix and needs more thought, but I'm preparing a fix for part b) now. Generally the tables take a very long time to become available, which might also be caused by the big branch history that each shard is rewriting to disk over and over. We should at least combine those writes when possible. @AlexLuya as a work-around, I recommend deleting the data directory and re-creating the tables. That should make the metadata small again.
I ended up implementing logic that limits the number of concurrent branch history flushes to disk. This allows multiple changes to the branch history to be combined into a single disk write. As a result, not only does the memory issue go away, but the tables also become available virtually instantly after starting the server. Without this, they take about a minute even on an SSD, while the branch history is rewritten over and over. Unfortunately this is a slightly bigger change, so I'm not going to merge the fix into v1.15.x. Instead we are going to release it with RethinkDB 1.16. Again, as a work-around for now, I recommend starting over with a fresh database (i.e. delete the RethinkDB data directory).
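For illustration only, the write-coalescing idea reads roughly like the sketch below. This is not RethinkDB's actual implementation (which is C++); it only shows the general pattern of letting an in-flight flush absorb changes that arrive while it runs:

```python
# Minimal, hypothetical sketch of coalescing concurrent flushes: while one
# disk write is in flight, later changes only mark the state dirty, and a
# single follow-up write picks all of them up at once.
import threading

class CoalescingFlusher:
    def __init__(self, write_to_disk):
        self._write_to_disk = write_to_disk  # callable doing the real I/O
        self._lock = threading.Lock()
        self._state = None
        self._dirty = False
        self._flushing = False

    def update(self, state):
        """Record a new version of the state and flush it, coalescing
        with any flush that is already in progress."""
        with self._lock:
            self._state = state
            self._dirty = True
            if self._flushing:
                return  # the in-flight flush loop will pick this up
            self._flushing = True
        self._flush_loop()

    def _flush_loop(self):
        while True:
            with self._lock:
                if not self._dirty:
                    self._flushing = False
                    return
                self._dirty = False
                snapshot = self._state
            # Every update() that arrives during this write is combined
            # into the single write of the next loop iteration.
            self._write_to_disk(snapshot)
```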
The fix is in branch daniel_3463_116 and in CR 2420 by @timmaxw.
Thanks, I have already done as you suggested.
Fixed in next as of 4087906. Going to ship with 1.16.0.
On my dev machine:
I have 70 tables, almost all of them empty (some may contain 1–5 records for testing). After Ubuntu starts up, RethinkDB constantly consumes 5 GB of resident memory (RES, as reported by top), even without any querying.
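As an aside, the RES figure from top can also be read programmatically, which makes it easier to attach exact numbers to a report. A small sketch, assuming the third-party psutil package is installed and the server process is named rethinkdb:

```python
# Print the resident memory (RES) of every running rethinkdb process.
# Assumes: pip install psutil; process name "rethinkdb" on Linux.
import psutil

for proc in psutil.process_iter(["name", "memory_info"]):
    if proc.info["name"] == "rethinkdb":
        rss_gb = proc.info["memory_info"].rss / (1024 ** 3)
        print("pid {}: RES = {:.2f} GB".format(proc.pid, rss_gb))
```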
I remember that my machine froze several times and I had to restart it by pushing the power button. I suspect that file system errors may have caused the current problem, but all reading/writing/querying works fine, so I can't figure out a way to check and confirm this suspicion.
Does the metadata record memory allocation info? I mean: which table or operation is consuming how much memory. If it does, and some utility can read and print out this information, debugging will be easier.
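There is no per-operation memory breakdown, but RethinkDB 1.16 (the release that ships this fix) introduced system tables, and the stats system table reports per-table cache usage. A sketch of reading it with the Python driver, assuming a local 1.16+ server and the field layout described in the 1.16 system-table docs:

```python
# Print in-memory cache usage per table, per server, from the "stats"
# system table (available in RethinkDB 1.16+). Host/port are assumptions.
import rethinkdb as r

conn = r.connect(host="localhost", port=28015)

# One "table_server" stats document exists per table per server.
cursor = r.db("rethinkdb").table("stats").filter(
    lambda row: row["id"][0] == "table_server"
).run(conn)

for doc in cursor:
    in_use = doc["storage_engine"]["cache"]["in_use_bytes"]
    print("{}.{}: {} bytes in cache".format(doc["db"], doc["table"], in_use))

conn.close()
```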