Slow memory leak in v22.0? #24542
Comments
I think I've seen the same thing; but mostly for me it just looks like "bitcoind has a bunch of memory in swap and it takes a while to deallocate when shutting down" so I haven't been able to get any insight into what's actually going on. I think the same memory leak may show up in liquid/elements, but with higher severity (due to the higher block rate) EDIT: maybe worth noting that the "takes a while to deallocate" occurs after |
Maybe |
I went to check on my node today expecting higher RAM usage than before, but it was actually down to 1.9 GB. Puzzled, I checked, and it looks like the node actually shut down 2 days ago (and systemd restarted it automatically). Looking at the logs, there are tons of these
That's the last log message before it switches to startup logs. Looks like peers started dropping my node, I'm guessing because it was unresponsive, and so it was making a lot of outbound connections, and looks like block syncing crawled to a stop. I'll try to keep tabs on things and keep this issue updated if I see anything else interesting. |
These are likely caused by multiple parallel |
Thanks, iirc when I tried making rpc calls via |
Hi, I'm facing a similar issue. I'm running an RPC node, and my memory usage grows beyond 36GB. I only see this problem when setting Did you figure out the cause of the leak? |
Also 22.0? Which value of |
FYI, I am now using v24.0.1 with I increased the |
Thanks @sangaman. Additionally, do you have any estimates on how many rpcs you are making per time to the node? |
Hmm I'm not exactly sure. This bitcoind node is primarily used to serve an lnd node as well as an electrum server and mempool.space block explorer, although the latter two get very light usage aside from staying in sync with the chain. If you think it'd be helpful I can try to turn on debug logs and see how many rpc calls are made over a certain period of time. |
I'm running I could solve the issue by setting the below env var. By default, since glibc 2.10, the C library will create up to two heap
without this the usage grows beyond 36GB, with this it stays under 2GB ref: https://github.com/bitcoin/bitcoin/blob/master/doc/reduce- |
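For readers unfamiliar with malloc arenas: glibc exposes the same limit programmatically through `mallopt(M_ARENA_MAX, n)`, which its documentation describes as equivalent to setting the `MALLOC_ARENA_MAX` environment variable at startup. The sketch below is only an illustration of that knob. The helper name `limit_malloc_arenas` and the constant `kHaveMallopt` are made up for this example, and nothing here suggests that bitcoind calls `mallopt` itself; the thread and the linked document only use the environment variable.

```cpp
#include <cstdio>   // any libc header; on glibc this also defines __GLIBC__
#if defined(__GLIBC__)
#include <malloc.h> // mallopt, M_ARENA_MAX (glibc-specific)
#endif

// True when the glibc arena tunable is available at compile time.
#if defined(__GLIBC__) && defined(M_ARENA_MAX)
constexpr bool kHaveMallopt = true;
#else
constexpr bool kHaveMallopt = false;
#endif

// Hypothetical helper: cap the number of malloc arenas for this process.
// Returns true if the request was accepted, false if unsupported.
// Should run early, before worker threads start allocating.
bool limit_malloc_arenas(int max_arenas) {
#if defined(__GLIBC__) && defined(M_ARENA_MAX)
    // mallopt returns 1 on success and 0 on error.
    return mallopt(M_ARENA_MAX, max_arenas) == 1;
#else
    (void)max_arenas; // no equivalent knob on this C library
    return false;
#endif
}
```

Calling `limit_malloc_arenas(1)` at the top of `main` would have roughly the effect of launching the process with `MALLOC_ARENA_MAX=1`. A single arena means all threads contend for one heap, which is the performance trade-off raised later in the thread.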
Thank you @cshintov ! I'll try this env var out and see if it solves my issue without having to change my config. |
So far my results with |
This adds the `MALLOC_ARENA_MAX=1` environment variable as suggested in https://github.com/bitcoin/bitcoin/blob/master/doc/reduce-memory.md#linux-specific to the systemd service file definition. Without this env var, memory usage can grow significantly especially when `rpcthreads` is increased above its default value. Closes bitcoin#24542.
I think setting this to 1 will give significantly worse performance for most users though? Seems like it would be better to add it to the documentation if anything... |
It is in the documentation already at https://github.com/bitcoin/bitcoin/blob/master/doc/reduce-memory.md.
Per the documentation above that wouldn't be the case although I can't say for sure. I certainly didn't notice any decreased performance when setting this env var.
|
Is this still a problem with v26.0 (or later)? |
Not sure what can be done here. It would be good to use a tool to debug this, e.g. #24542 (comment), if it still happens. |
OK, I think I have pretty much identified the root cause here. Calling There is then an additional 15-20MB allocated for After these allocations are made they then seem to be re-used when possible in the future, so long as subsequent blocks are small enough to fit. This totals up to the ~100MB RSS increase observed sometimes when getting blocks via REST. In the event that blocks are fetched in series in increasing sizes, repeated allocations (of increasing sizes) may be made, causing even higher apparent resource usage. The cause seems to be two-fold:
It appears that this has been investigated previously (see also here) in search of better speed, but having move semantics for UniValue would also shave off half of this function's allocation requirement. Thanks @stickies-v for helping me investigate |
And if I understand correctly, it is then copied again here since |
Discovered another attempt to avoid copies from @maflcko #25429 while researching further. Ref my previous test in #30052, I have a branch with some improvements (heavily "inspired" by @martinus's commit jgarzik/univalue@e9109e2) which limit allocations to about 60-70MB: This is about what I'd expect to be the minimum size based on the allocation for the JSON string, and the UniValue Object. |
Can you explain this? It has a move constructor, at least the following compiles for me:
diff --git a/src/univalue/lib/univalue.cpp b/src/univalue/lib/univalue.cpp
index 656d2e8203..b3a33f36ac 100644
--- a/src/univalue/lib/univalue.cpp
+++ b/src/univalue/lib/univalue.cpp
@@ -125,6 +125,7 @@ void UniValue::pushKVEnd(std::string key, UniValue val)
void UniValue::pushKV(std::string key, UniValue val)
{
+static_assert(std::is_move_constructible_v<UniValue>);
checkType(VOBJ);
size_t idx;
I presume this can be fixed by adding a |
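To illustrate why a by-value signature like `pushKV(std::string key, UniValue val)` can still end up copying even though the type is move-constructible, here is a minimal sketch. `Value`, `Object`, `pushCopy`, `pushMove`, `Counts` and `measure` are hypothetical stand-ins invented for this example, not the real UniValue API; the global counters simply tally constructor calls.

```cpp
#include <string>
#include <utility>
#include <vector>

// Global tallies of constructor calls (illustration only).
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for a JSON value; NOT the real UniValue class.
struct Value {
    std::string payload;
    explicit Value(std::string p) : payload(std::move(p)) {}
    Value(const Value& other) : payload(other.payload) { ++g_copies; }
    Value(Value&& other) noexcept : payload(std::move(other.payload)) { ++g_moves; }
};

// Hypothetical container with two by-value "sink" methods.
struct Object {
    std::vector<std::string> keys;
    std::vector<Value> values;

    // Stores the arguments without std::move: each one is copied again.
    void pushCopy(std::string key, Value val) {
        keys.push_back(key);
        values.push_back(val);
    }

    // Same signature, but ownership is handed on with std::move.
    void pushMove(std::string key, Value val) {
        keys.push_back(std::move(key));
        values.push_back(std::move(val));
    }
};

struct Counts {
    int copies;
    int moves;
};

// Push one large value into an empty Object and report the tallies.
Counts measure(bool use_move) {
    g_copies = 0;
    g_moves = 0;
    Value block{std::string(1000, 'x')};
    Object obj;
    if (use_move) {
        obj.pushMove("block", std::move(block)); // move into parameter, move into vector
    } else {
        obj.pushCopy("block", block);            // copy into parameter, copy into vector
    }
    return Counts{g_copies, g_moves};
}
```

In this sketch `measure(false)` performs two deep copies of the payload, while `measure(true)` performs none and only two cheap moves. The existence of a move constructor is not enough on its own: both the caller and the sink have to use `std::move` for it to be selected.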
I had a PR somewhere where I removed the copy constructor and made it mandatory to call a That's nice because the compiler finds all problems automatically, but requires to touch quite a lot of places. |
I think you are correct here, the compiler implicitly generates move constructors which are enough to fix this issue, if combined with appropriate
The test script requests every block from (0 - 844,000) % 5,000, and measures the RSS after each call. |
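The actual test script is not shown in the thread. As a rough sketch of the two pieces it describes, the helpers below generate the sampled heights and read a process's resident set size on Linux. All names (`sample_heights`, `parse_vmrss_kib`, `read_rss_kib`) are invented for this example, reading "(0 - 844,000) % 5,000" as "every 5,000th height from 0 up to 844,000" is an assumption, and the RSS reader relies on the Linux-specific `VmRSS:` line in `/proc/<pid>/status`.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Heights 0, step, 2*step, ... not exceeding max_height.
std::vector<int> sample_heights(int max_height, int step) {
    assert(step > 0);
    std::vector<int> heights;
    for (int h = 0; h <= max_height; h += step) {
        heights.push_back(h);
    }
    return heights;
}

// Extract the value (in KiB) from a "VmRSS:   123456 kB" line.
// Returns -1 if no such line is found or it cannot be parsed.
long parse_vmrss_kib(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kib = -1;
            if (fields >> kib) {
                return kib;
            }
            return -1;
        }
    }
    return -1;
}

// Read the RSS of a process by pid (Linux only); -1 on failure.
long read_rss_kib(int pid) {
    std::ifstream status("/proc/" + std::to_string(pid) + "/status");
    if (!status) {
        return -1;
    }
    return parse_vmrss_kib(status);
}
```

A measurement loop would request each height from `sample_heights(844000, 5000)` and then call `read_rss_kib` with the node's pid. RSS read this way includes memory the allocator has retained but not returned to the OS, which is why re-used buffers show up as a plateau rather than a drop.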
Style-wise I found it a bit unfortunate, because it forces code to either specify |
Don't do that. You'll break guaranteed copy elision, which requires the compiler to skip the construction of a temporary object in certain expressions and statements but can only happen if an accessible copy constructor exists (even though it is not called). By deleting the copy constructor, you are denying all opportunities for copy elision and are forcing the compiler always to construct temporaries and to move from them subsequently. |
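The claim above is easy to check with a small sketch. `MoveOnly`, `make` and `moves_for_prvalue_init` are hypothetical names for this example: the type has its copy constructor deleted and a move constructor that counts how often it runs. Under C++17, initializing an object from a function that returns a prvalue constructs it in place, so the move constructor is not called even though the copy constructor is deleted. C++17 also guarantees this elision for types with no accessible copy or move constructor at all.

```cpp
// Counts calls to MoveOnly's move constructor (illustration only).
int g_move_count = 0;

// Hypothetical type with its copy constructor deleted.
struct MoveOnly {
    int id;
    explicit MoveOnly(int i) : id(i) {}
    MoveOnly(const MoveOnly&) = delete;
    MoveOnly(MoveOnly&& other) noexcept : id(other.id) { ++g_move_count; }
};

// Returns a prvalue; C++17 constructs it directly in the caller's object.
MoveOnly make(int id) {
    return MoveOnly(id);
}

// Returns how many moves happened while initializing from make(),
// or -1 if the value did not arrive intact.
int moves_for_prvalue_init() {
    g_move_count = 0;
    MoveOnly v = make(7);
    return v.id == 7 ? g_move_count : -1;
}
```

Here `moves_for_prvalue_init()` reports zero moves, so deleting the copy constructor does not by itself force temporaries to be materialized and moved. The remaining cost of that design is the stylistic one raised earlier in the thread: every intentional copy has to be spelled out explicitly.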
I've been noticing gradually increasing memory usage in Bitcoin Core v22.0. I've been running this node for a couple of years with the same hardware and general setup and have only noticed this in the past few months, around the time I upgraded to v22, which makes me suspect it may be related.
I'm seeing bitcoind memory usage slowly climb over weeks/months of uptime. Right now it's at 3.4 GB memory usage and as I recall I've seen it over 5 GB before in an incident where it caused the machine to run out of RAM, which is when I first noticed this. If I restart bitcoind, RAM usage goes back to a normal level and stays there for a while.
I do recall seeing a lot of
BlockUntilSyncedToCurrentChain: txindex is catching up on block notifications
messages in the logs when RAM usage was over 5GB and my node was barely responsive to RPC calls, but I'm not sure if this has anything to do with the issue.
I'm using `dbcache=1024`, which would explain some of the RAM usage but not all of it.
Other config settings that may be relevant:
Otherwise, I'm using this node to run an lnd node (with zmq block/tx notifications) and electrumx server. There's a medium-sized watch-only wallet that's loaded. There's also a small script I run with `blocknotify`.
This is all running on an RPI4 with 8GB RAM, using raspbian buster w/ the OS and block indexes on an SSD.
I'd be interested if others have seen anything similar out of their nodes or if perhaps this RAM usage is normal given how I'm using it, but I am pretty sure this wasn't happening a year ago with similar usage patterns.