Cache size does not have any effect #6275

Closed

Andarius opened this issue Feb 20, 2017 · 10 comments

Comments
@Andarius

I'm running RethinkDB in Docker (alpine 2.3.5) with --cache-size 15000 set (one node here).
However, on some heavy queries RethinkDB uses a lot more than the allowed memory (up to 64 GB), then runs out of memory and dies.

Here is the full stack trace of the error:

warn: Some RethinkDB data on this server has been placed into swap memory. This may impact performance.
rethinkdb: Memory allocation failed. This usually means that we have run out of RAM. Aborting.
Version: rethinkdb 2.3.5~0jessie (GCC 4.9.2)
error: Error in src/arch/runtime/thread_pool.cc at line 367:
error: Segmentation fault from reading the address (nil).
error: Backtrace:
error: Mon Feb 20 12:20:32 2017

   1 [0xae7500]: backtrace_t::backtrace_t() at 0xae7500 (rethinkdb)
   2 [0xae7879]: format_backtrace(bool) at 0xae7879 (rethinkdb)
   3 [0xd9f6c3]: report_fatal_error(char const*, int, char const*, ...) at 0xd9f6c3 (rethinkdb)
   4 [0x9f0254]: linux_thread_pool_t::fatal_signal_handler(int, siginfo_t*, void*) at 0x9f0254 (rethinkdb)
   5 [0x7f46af6db8d0]: /lib/x86_64-linux-gnu/libpthread.so.0(+0xf8d0) [0x7f46af6db8d0] at 0x7f46af6db8d0 (/lib/x86_64-linux-gnu/libpthread.so.0)
   6 [0x7f46af357532]: abort+0x232 at 0x7f46af357532 (/lib/x86_64-linux-gnu/libc.so.6)
   7 [0xd9f424]: rethinkdb() [0xd9f424] at 0xd9f424 ()
   8 [0xd9f439]: rethinkdb() [0xd9f439] at 0xd9f439 ()
   9 [0x7f46afe5f2fc]: operator new(unsigned long) at 0x7f46afe5f2fc (/usr/lib/x86_64-linux-gnu/libstdc++.so.6)
   10 [0xa4f932]: void std::vector<char, std::allocator<char> >::_M_range_insert<char const*>(__gnu_cxx::__normal_iterator<char*, std::vector<char, std::allocator<char> > >, char const*, char const*, std::forward_iterator_tag) at 0xa4f932 (rethinkdb)
   11 [0xa4f7b1]: vector_stream_t::write(void const*, long) at 0xa4f7b1 (rethinkdb)
   12 [0xa5161e]: send_write_message(write_stream_t*, write_message_t const*) at 0xa5161e (rethinkdb)
   13 [0xa147da]: raw_mailbox_writer_t::write(write_stream_t*) at 0xa147da (rethinkdb)
   14 [0x9fd35a]: connectivity_cluster_t::send_message(connectivity_cluster_t::connection_t*, auto_drainer_t::lock_t, unsigned char, cluster_send_message_write_callback_t*) at 0x9fd35a (rethinkdb)
   15 [0xa133dd]: send_write(mailbox_manager_t*, raw_mailbox_t::address_t, mailbox_write_callback_t*) at 0xa133dd (rethinkdb)
   16 [0xbaafd2]: primary_query_server_t::client_t::perform_request(boost::variant<primary_query_bcard_t::read_request_t, primary_query_bcard_t::write_request_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, signal_t*) at 0xbaafd2 (rethinkdb)
   17 [0xbb3536]: multi_client_server_t<boost::variant<primary_query_bcard_t::read_request_t, primary_query_bcard_t::write_request_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>, primary_query_server_t*, primary_query_server_t::client_t>::client_t::on_request(signal_t*, boost::variant<primary_query_bcard_t::read_request_t, primary_query_bcard_t::write_request_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) at 0xbb3536 (rethinkdb)
   18 [0xbb32bf]: mailbox_t<void (boost::variant<primary_query_bcard_t::read_request_t, primary_query_bcard_t::write_request_t, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_>)>::read_impl_t::read(read_stream_t*, signal_t*) at 0xbb32bf (rethinkdb)
   19 [0xa14442]: mailbox_manager_t::mailbox_read_coroutine(threadnum_t, unsigned long, std::vector<char, std::allocator<char> >*, long, mailbox_manager_t::force_yield_t) at 0xa14442 (rethinkdb)
   20 [0xa14542]: rethinkdb() [0xa14542] at 0xa14542 ()
   21 [0x9f2c47]: coro_t::run() at 0x9f2c47 (rethinkdb)

error: Exiting.

@srh
Contributor

srh commented Feb 20, 2017

The cache size parameter doesn't affect intermediate query computation sizes. But there should be some way to deal with that, without the server dying. In my opinion this is a known defect of the product, and I'd like it to be fixed someday. But you have 49 GB to burn through! What sort of query are you running?

@Andarius
Author

The cache size parameter doesn't affect intermediate query computation sizes

I agree. But then, when the computation ends, why doesn't it free the cache memory used for the query? Also, I don't understand the X% cache used shown in the interface. Right now, no requests are running and RethinkDB is using 25 GB, but I can see 93% cache used with a --cache-size of 15000 and a max memory for the Docker container of 35 GB.
The query I ran was an aggregation on a 14-million-row table.

@srh
Contributor

srh commented May 24, 2017

What was the actual query?

@Andarius
Author

Here is one:

req = (rdb.table(Price.table)
    .group(rdb.row['store_id'], index='field_id')
    .count()
    .ungroup()
)

@marshall007
Contributor

@Andarius I don't think you can group by a secondary index and a field like that. You have to group by either a secondary index or one or more fields/functions.
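
For reference, the two valid forms would look roughly like this (a sketch reusing the table, field, and index names from the query above, which are not verified here):

# Group by the secondary index only:
req = (rdb.table(Price.table)
    .group(index='field_id')
    .count()
    .ungroup()
)

# Or group by one or more fields/functions, without an index:
req = (rdb.table(Price.table)
    .group('store_id')
    .count()
    .ungroup()
)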

@AtnNn
Member

AtnNn commented Jul 13, 2017

no requests are running and rethink is using 25G but I can see 93% cache used with a --cache-size of 15000

That would mean it is using 14.2 GB for the cache and 10.8 GB for other data.

Does the memory usage go down if you query it with r.expr(1) a few times?

Do you have a large number of tables? I believe the metadata for those can consume a lot of memory in some cases.

It is also possible there is a memory leak somewhere.
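
A minimal sketch of the r.expr(1) probe suggested above, using the Python driver (host, port, and iteration count are assumptions):

import rethinkdb as r

conn = r.connect(host='localhost', port=28015)
for _ in range(10):
    r.expr(1).run(conn)  # trivial queries; check whether resident memory drops afterwards
conn.close()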

@Andarius
Author

Andarius commented Jul 16, 2017

Does the memory usage go down if you query it with r.expr(1) a few times?

No it does not.

Do you have a large amount of tables?

In total I have around 10 tables, but the request I run only touches 1 table, and it fills the cache.

@lciummo

lciummo commented Dec 11, 2017

Are you confusing cache usage and disk space? The GB number is disk space, I believe. I have a 4 GB RAM VM and 100 GB of disk space.

@GeoffreyPlitt

This exact thing happens to me a lot on my cluster serving a production environment. I have a sense that lowering my cache by a certain amount will free up the headroom needed for table metadata and per-query memory, but I have no idea how to figure out what that amount is. And if I'm even slightly off and the instance swaps more than a little, everything comes to a halt.
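
One way to reason about it is a back-of-the-envelope budget like the following (every number here is an assumption for illustration, not a measurement from this thread):

instance_ram_mb = 32 * 1024  # total RAM available to the server (assumed)
metadata_mb = 2 * 1024       # guess at table/shard metadata overhead
per_query_mb = 8 * 1024      # guess at peak working memory of the heaviest query
cache_size_mb = instance_ram_mb - metadata_mb - per_query_mb  # candidate --cache-size value
print(cache_size_mb)  # 22528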

@lciummo

lciummo commented Jan 5, 2018

The original error in this thread (a segfault) is different from the error we see with RethinkDB on memory issues: we saw an "out of memory" error in RethinkDB that caused the process to halt.

We added --restart always to the Docker container to get around it. Adding RAM just made it fail less often (two days vs a few hours).

It doesn't seem like large databases of hundreds of megabytes are handled well.

If you believe swapping is an issue, you might look into Linux hugetlb/hugepage processing. That helped with a similar MySQL issue a few years back.

Andarius closed this as completed Jan 3, 2021