Out of memory in long term runs #412

Closed
wzrdtales opened this issue Feb 17, 2016 · 32 comments

Comments

@wzrdtales

There still seem to be memory problems: after about a month, bitcore-node crashes with the following message:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

Details about the setup:
Bitcore Node now runs on a machine with 64GiB of dedicated RAM and has about 100 to 200 connections open at the same time. It is currently configured to allow up to 500 simultaneous connections.

Additional config settings are:
minrelaytxfee=0.00002
limitfreerelay=70
maxconnections=500

Unfortunately I do not have anything else for you yet. If you want something specific, let me know and I will post it after the next crash.

@kangasbros

Same issue here.

@gabegattis
Contributor

What version were you running when you got the out-of-memory error?

@kangasbros

0.31.0

@braydonf
Contributor

There isn't a bitcore-node version 0.31.0

@kangasbros

Sorry, I looked at the wrong number. I'm new to this and am currently trying to work out how to upgrade the software. How can I look up the version number?

@kangasbros

bitcore-node --version prints out undefined for me

@braydonf
Contributor

We should add an --version option: bitpay/bitcore#1368

@wzrdtales
Author

@braydonf In my case you can assume it is always the latest version.

@wzrdtales
Author

Btw, bitcore runs for roughly 20 to 30 days before crashing. The memory in use grows to only about 6 to 8GiB before the crash.

@braydonf
Contributor

It may be worth checking the size of the mempool near that point with the bitcoin-cli utility:

$ bitcoin-cli getmempoolinfo
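
To see how the mempool size develops in the weeks before a crash, the command above could be sampled periodically. The following is only a minimal sketch, assuming `bitcoin-cli` is on the PATH and that the command prints JSON with `size` (transaction count) and `bytes` fields; the function names `parseMempoolInfo` and `sampleMempool` are hypothetical and not part of bitcore-node.

```javascript
// Hypothetical monitoring sketch; not part of bitcore-node.
const { execFile } = require('child_process');

// Extract the fields of interest from the JSON text printed by
// `bitcoin-cli getmempoolinfo` (assumed fields: size, bytes).
function parseMempoolInfo(jsonText) {
  const info = JSON.parse(jsonText);
  return { txCount: info.size, bytes: info.bytes };
}

// Run bitcoin-cli once and pass the parsed result to the callback.
function sampleMempool(callback) {
  execFile('bitcoin-cli', ['getmempoolinfo'], (err, stdout) => {
    if (err) return callback(err);
    callback(null, parseMempoolInfo(stdout));
  });
}

// Example usage (commented out so the sketch has no side effects):
// log one timestamped sample every ten minutes.
// setInterval(() => {
//   sampleMempool((err, sample) => {
//     console.log(new Date().toISOString(), err || sample);
//   });
// }, 10 * 60 * 1000);
```

Logging the samples with a timestamp makes it possible to compare mempool growth against the memory growth of the node process afterwards.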

@braydonf
Contributor

@kangasbros You can find the version in the installed package.json currently.

@kangasbros

My versions:

{
  "description": "A full Bitcoin node build with Bitcore",
  "repository": "https://github.com/user/project",
  "license": "MIT",
  "readme": "README.md",
  "dependencies": {
    "bitcore-lib": "^v0.13.13",
    "bitcore-node": "^2.1.0",
    "insight-api": "^0.3.2",
    "insight-ui": "^0.3.1"
  }
}

@wzrdtales
Author

And here you go, some logs from the crash:

[2016-04-21T18:07:03.391Z] info: Bitcoin Height: 408333 Percentage: 100.00001525878906
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

<--- Last few GCs --->

611321551 ms: Scavenge 1395.1 (1441.0) -> 1395.1 (1441.0) MB, 6.5 / 0 ms (+ 2.2 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
611323174 ms: Mark-sweep 1395.1 (1441.0) -> 1393.5 (1440.0) MB, 1622.5 / 0 ms (+ 49.0 ms in 431 steps since start of marking, biggest step 2.3 ms) [last resort gc].
611324774 ms: Mark-sweep 1393.5 (1440.0) -> 1390.1 (1441.0) MB, 1600.3 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x357e638b4629 <JS Object>
    1: blockHandler [/home/bt2/bitcore-node/lib/services/address/index.js:480] [pc=0x1f268d6b00f] (this=0x1119405fe811 <JS Object>,block=0x1a9d1088ab19 <a Block with map 0x273b7ec5a689>,addOutput=0x357e63804211 <true>,callback=0x1a9d10804711 <JS Function (SharedFunctionInfo 0x4159dcd1981)>)
    2: /* anonymous */(aka /* anonymous */) [/home/bt2/bitcore-node/lib/services/db.js:566] [pc=0x1f26991...

@levino

levino commented Apr 28, 2016

SatoshiPay has the same issues. I ran some analysis of the heap and see that a lot of RAM is allocated in this function:

https://github.com/bitpay/bitcore-node/blob/master/lib/services/address/index.js#L304

and used for the "mempoolSpentIndex" Array.

I kinda suspect that indices are added to the "mempoolSpentIndex" Array here https://github.com/bitpay/bitcore-node/blob/master/lib/services/address/index.js#L372 but the "delete" part is never triggered or just plainly does not work.

Another reason could be "transactions stuck in mempool". They will trigger an item to be added to the array which will never be removed, right?
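
The suspected pattern can be illustrated with a toy model. This is not the bitcore-node code: the class `MempoolSpentIndex`, its methods, and the transaction shape are hypothetical, and the index is modelled as a plain object keyed by the spent output. It only shows how an entry that is added for a mempool transaction stays forever if the matching removal never runs.

```javascript
// Illustrative model only; all names are hypothetical and this is
// not the actual bitcore-node implementation.
class MempoolSpentIndex {
  constructor() {
    // key: "<prevTxId>:<outputIndex>", value: id of the spending transaction
    this.index = {};
  }

  // Called when an unconfirmed transaction enters the mempool.
  addTransaction(tx) {
    for (const input of tx.inputs) {
      this.index[input.prevTxId + ':' + input.outputIndex] = tx.id;
    }
  }

  // Called when a transaction is confirmed in a block.
  removeTransaction(tx) {
    for (const input of tx.inputs) {
      delete this.index[input.prevTxId + ':' + input.outputIndex];
    }
  }

  size() {
    return Object.keys(this.index).length;
  }
}

const idx = new MempoolSpentIndex();
const confirmed = { id: 'a', inputs: [{ prevTxId: 'p1', outputIndex: 0 }] };
const stuck = { id: 'b', inputs: [{ prevTxId: 'p2', outputIndex: 1 }] };

idx.addTransaction(confirmed);
idx.addTransaction(stuck);
idx.removeTransaction(confirmed); // only the confirmed transaction is cleaned up

console.log(idx.size()); // 1: the entry of the stuck transaction remains
```

If something like this is happening, the number of entries would grow with every transaction that never confirms, which would match slow growth over weeks.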

@braydonf Should be able to answer this as these are his changes from Oct 2015 89ef28f

@braydonf
Contributor

BTW, we're moving to keep the address index in bitcoind, and these indexes here won't be needed. See: https://github.com/bitpay/bitcoin/pull/6/files#diff-ca74c4b28865382863b8fe7633a85cd6R420 and #422

@levino

levino commented Apr 28, 2016

This is a big PR. When will it be merged? Why don't you split it into more, smaller PRs? Who can proofread all this?

@braydonf
Contributor

It may seem like that, but the code base is quite small, all things considered. Most of the new code is at: https://github.com/braydonf/bitcore-node/blob/bitcoind/lib/services/bitcoind.js with corresponding docs here: https://github.com/braydonf/bitcore-node/blob/bitcoind/docs/services/bitcoind.md

@wzrdtales
Author

And another one

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

<--- Last few GCs --->

635316153 ms: Scavenge 1387.5 (1439.0) -> 1387.5 (1440.0) MB, 1.2 / 0 ms (+ 5.7 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
635317676 ms: Mark-sweep 1387.5 (1440.0) -> 1356.2 (1440.0) MB, 1523.3 / 0 ms (+ 386.9 ms in 1284 steps since start of marking, biggest step 6.0 ms) [last resort gc].
635319202 ms: Mark-sweep 1356.2 (1440.0) -> 1362.7 (1415.0) MB, 1525.2 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x2056d6cb4629 <JS Object>
    1: DELETE [native runtime.js:350] [pc=0x2360d4c5dd71] (this=0xf3f2c9140b1 <an Object with map 0x3a55150f31e9>,q=0x283659fd38d9 <String[36]\: \xb2\xbec*;\xf15\x86+\xe8\xb3\xea\xbe\xe0g\x98\x98\xed\x8du\xd7\xfb\xb0\xd8\x0f\xc7c\xa0\xf5\xdd\x06x\x00\x00\x00\x01>,r=1)
    2: updateMempoolIndex [/home/bt2/bitcore-node/lib/services/address/index.js:~283] [pc=0x2360db5311d6] (this=0x26b08e7ec951 <...

@STRML
Contributor

STRML commented May 4, 2016

I'm seeing what appears to be a leak in mempoolSpentIndex as well, but it does not appear to be nearly bad enough to crash the process. I heapdumped a bitcored process using 968MB of resident memory and the heap was only 40MB; mempoolSpentIndex was only 3.

Perhaps it's actually a bitcoind leak.
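
The comparison above (large resident memory, small JS heap) can be repeated without a heap dump. Here is a minimal sketch using Node's `process.memoryUsage()` and `v8.getHeapStatistics()`; the helper names `toMB` and `memorySnapshot` are hypothetical. A large gap between resident memory and the V8 heap would suggest the memory is held outside the JS heap (native allocations, for example), which a heap dump does not show.

```javascript
const v8 = require('v8');

// Convert bytes to whole megabytes.
function toMB(bytes) {
  return Math.round(bytes / (1024 * 1024));
}

// Summarise where the process memory sits: inside the V8 heap or outside it.
// `usage` has the shape returned by process.memoryUsage().
function memorySnapshot(usage, heapLimitBytes) {
  return {
    rssMB: toMB(usage.rss),                          // total resident memory
    heapUsedMB: toMB(usage.heapUsed),                // live JS objects
    outsideHeapMB: toMB(usage.rss - usage.heapTotal), // resident memory not in the V8 heap
    heapLimitMB: toMB(heapLimitBytes)                // V8's configured heap size limit
  };
}

const snapshot = memorySnapshot(
  process.memoryUsage(),
  v8.getHeapStatistics().heap_size_limit
);
console.log(snapshot);
```

If `heapUsedMB` approaches `heapLimitMB`, the growth is in JS objects; if `outsideHeapMB` dominates, the JS heap is probably not where the memory goes.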

@kleetus
Contributor

kleetus commented May 4, 2016

If you have the means, try @braydonf's #422 branch. What we are planning for bitcore (bitcore-node) is to remove the static library patch and spawn bitcoind as a subprocess of bitcore. This way we can optimize bitcoind with respect to memory use and performance and send those fixes upstream without introducing controversy over making bitcoind a proper shared library. We could not get 100% consensus on making bitcoind a shared library (libbitcoind.a/libbitcoind.so), even though the patch was very basic (removing main() and patching Makefile.am). So now, if you STILL get leaks, then it is guaranteed to be bitcoind and can't be related to bitcore-node. I've done extensive testing using valgrind and friends and detected leaks in bitcoind, but nothing that I was too worried about fixing.

@wzrdtales
Author

Another crash:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

<--- Last few GCs --->

941576635 ms: Scavenge 1400.3 (1452.5) -> 1400.3 (1452.5) MB, 0.3 / 0 ms (+ 6.0 ms in 1 steps since last GC) [allocation failure] [incremental marking delaying mark-sweep].
941578294 ms: Mark-sweep 1400.3 (1452.5) -> 1399.9 (1450.5) MB, 1659.9 / 0 ms (+ 10.3 ms in 4 steps since start of marking, biggest step 6.0 ms) [last resort gc].
941579938 ms: Mark-sweep 1399.9 (1450.5) -> 1382.8 (1450.5) MB, 1643.6 / 0 ms [last resort gc].


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x2dc212b4629 <JS Object>
    1: constructor(aka LazyTransform) [internal/streams/lazy_transform.js:~11] [pc=0x37e130b56023] (this=0x5863ec2e189 <a Hash with map 0x3273377f2c11>,options=0x2dc212041b9 <undefined>)
    2: new constructor(aka Hash) [crypto.js:~47] [pc=0x37e130b58592] (this=0x5863ec2e189 <a Hash with map 0x3273377f2c11>,algorithm=0x2a5fe19bffe9 <String[6]: sha256>,options=0x2dc212041b9 <undefined>)
    4: sh...

I will also take a look at @braydonf branch.

@levino

levino commented May 27, 2016

Any news on this? We need a fix @satoshipay.

@braydonf
Contributor

braydonf commented Jun 8, 2016

@levino, #422 has been merged and released. Though we should still do deeper memory leak inspection and checking (if there still are any issues).

FYI, it looks like @chjj recently fixed a couple of memory leaks in leveldown (Level/leveldown#264 and Level/leveldown#267), which could have been related. It's not relevant to the latest release, as neither levelup nor leveldown is a dependency any longer, since leveldb in bitcoind is used instead. This is still good news, since leveldown is still useful for other purposes.

@wzrdtales
Author

I'm currently on 3.0.0 and it crashed again. Unfortunately there is nothing in the bitcoind logs, nor in the bitcore-node.log file.

I am updating to 3.0.2 now.
@braydonf Btw, I can't start bitcore node in daemon mode anymore. Is this intended? I have to send it to the background manually. --daemon or -d seems to be completely ignored.

@braydonf
Contributor

Regarding running as a daemon: since upstart/systemd handles log rotation and process management (keeping track of the pid, restarting, etc.), it's the recommended way to run as a daemon, see https://bitcore.io/guides/upstart-daemon.

@wzrdtales
Author

OK, good to know. You should remove the -d/--daemon description from the help output if the option was removed.

@braydonf
Contributor

Thanks, didn't realize it was still there, updated in #461

@levino

levino commented Jun 27, 2016

We got around to merging in the changes, and as of today we use bitcore-node@3.0.2 @satoshipay. So far it is looking good. I will post an update in a week, hopefully to confirm that everything has remained stable.

@levino

levino commented Jun 27, 2016

Ah, and btw: Thank you for all the good work @braydonf @bitpay.

@braydonf
Contributor

braydonf commented Jul 7, 2016

No problem!

FYI, had a 3.0.0 node running from 2016-06-14 to 2016-07-07 without any issues.

@levino

levino commented Jul 7, 2016

Something else: sync is really I/O heavy now. I got the impression this is due to bitcoind creating many more indices, but it is hefty. It takes a week to sync and index on a dedicated machine (with somewhat slow disks, though).

@braydonf
Contributor

I don't think there are any issues remaining here. Please let us know if there are any other problems.


7 participants