gaiad: opens many files while synching and can exceed standard Linux limits #1394

Closed
adrianbrink opened this issue Jun 26, 2018 · 7 comments

Comments

@adrianbrink
Contributor

adrianbrink commented Jun 26, 2018

I set up a new node (4 GB RAM, 4 CPUs, 80 GB disk, on DigitalOcean) that syncs with gaia-6002. I ran it from block 0 until around block 410,000, where it crashed with "too many open files". Ports 46657 and 46658 are blocked, so it's not an RPC issue.

It seems that the database opens too many files.

I[06-26|22:21:11.469] Executed block                               module=state height=417269 validTxs=0 invalidTxs=0
I[06-26|22:21:11.683] Committed state                              module=state height=417269 txs=0 appHash=C1A2404F6318651FD9C4452B473EC2DDC8613DFD
I[06-26|22:21:11.683] Recheck txs                                  module=mempool numtxs=1 height=417269
I[06-26|22:21:11.684] Done rechecking txs                          module=mempool
I[06-26|22:21:11.692] Indexed block                                module=txindex height=417269
I[06-26|22:21:11.701] Stopping MConnection                         module=p2p peer=91.126.244.116:52306 impl=MConn{91.126.244.116:52306}
E[06-26|22:21:11.702] Stopping peer for error                      module=p2p peer="Peer{MConn{91.126.244.116:52306} 26d255c6901d1ba6dd230865c18b250d3a6d4984 in}" err="Error{`recovered panic in MConnection` (cause: open /root/.gaiad/data/blockstore.db/048960.ldb: too many open files)}"
I[06-26|22:21:11.702] Stopping Peer                                module=p2p peer=91.126.244.116:52306 impl="Peer{MConn{91.126.244.116:52306} 26d255c6901d1ba6dd230865c18b250d3a6d4984 in}"
panic: open /root/.gaiad/data/gaia.db/971996.ldb: too many open files

goroutine 58 [running]:
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db.(*GoLevelDB).Get(0xc42000e390, 0xc469ec0800, 0x37, 0x40, 0x37, 0xc469ec0800, 0xd)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db/go_level_db.go:50 +0x138
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db.(*prefixDB).Get(0xc420c9ffb0, 0xc45b31ba10, 0x2a, 0x30, 0x0, 0x0, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tmlibs/db/prefix_db.go:60 +0x190
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*nodeDB).GetNode(0xc4209401e0, 0xc462281d00, 0x14, 0x14, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/nodedb.go:77 +0x19d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).getRightNode(0xc43bbaa4d0, 0xc471ad86e0, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:441 +0x73
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc43bbaa4d0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x14)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:163 +0x20f
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc427e45ad0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x14)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc4398962c0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439896210, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0x1, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc447104160, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:164 +0x24d
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc447104000, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73ef0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73ce0, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0x14, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73c30, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0x1d, 0xffffffffffffffff, 0xc4397d0188, 0x5070f8)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Node).get(0xc439f73b80, 0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc28520, 0x411e49, 0x411e49, 0xc475613f40)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/node.go:161 +0x1b9
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Tree).Get64(0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc42f52d080, 0x0, 0xc42005c800, 0xc42005c800)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/tree.go:131 +0x5a
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl.(*Tree).Get(0xc471ad86e0, 0xc427e66bc0, 0x1d, 0x20, 0xc400000008, 0xc441f90680, 0xc427e66bdd, 0xc4397d0278)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/iavl/tree.go:123 +0x49
github.com/cosmos/cosmos-sdk/store.(*iavlStore).Get(0xc42ce811b0, 0xc427e66bc0, 0x1d, 0x20, 0x139afe0, 0xc475613d00, 0xc427e66bc0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/iavlstore.go:104 +0x52
github.com/cosmos/cosmos-sdk/store.(*cacheKVStore).Get(0xc475613d80, 0xc427e66bc0, 0x1d, 0x20, 0x0, 0x0, 0x0)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/cachekvstore.go:50 +0x14c
github.com/cosmos/cosmos-sdk/store.(*gasKVStore).Get(0xc475613f40, 0xc427e66bc0, 0x1d, 0x20, 0xc427e66bc0, 0x1d, 0x20)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/store/gaskvstore.go:42 +0x8a
github.com/cosmos/cosmos-sdk/x/slashing.Keeper.getValidatorSigningBitArray(0xed86a0, 0xc42109fee0, 0xc420088900, 0xee28e0, 0xc420940050, 0xa, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/signing_info.go:33 +0x105
github.com/cosmos/cosmos-sdk/x/slashing.Keeper.handleValidatorSignature(0xed86a0, 0xc42109fee0, 0xc420088900, 0xee28e0, 0xc420940050, 0xa, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/keeper.go:70 +0x322
github.com/cosmos/cosmos-sdk/x/slashing.BeginBlocker(0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/x/slashing/tick.go:40 +0x5a2
github.com/cosmos/cosmos-sdk/cmd/gaia/app.(*GaiaApp).BeginBlocker(0xc4209137c0, 0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/cmd/gaia/app/app.go:117 +0xc3
github.com/cosmos/cosmos-sdk/cmd/gaia/app.(*GaiaApp).BeginBlocker-fm(0xedee80, 0xc42865ec00, 0xc4364ca040, 0x9, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/cmd/gaia/app/app.go:90 +0xa0
github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(0xc4200fc0e0, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/baseapp/baseapp.go:386 +0x157
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/abci/client.(*localClient).BeginBlockSync(0xc436c449c0, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/abci/client/local_client.go:206 +0xab
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(0xc460803950, 0xc427e669c0, 0x14, 0x20, 0xc440423a60, 0x9, 0x65df6, 0x5b2fa2d4, 0x0, 0x68f4, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/proxy/app_conn.go:69 +0x78
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state.execBlockOnProxyApp(0xedf580, 0xc47507bd20, 0xee23a0, 0xc460803950, 0xc4858c2370, 0xc470e45680, 0xee5a00, 0xc462f2d990, 0x1, 0xc45fe50860, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state/execution.go:190 +0x547
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc436c45200, 0xc45430ec20, 0x9, 0x65df5, 0x68f4, 0xc45fe51d20, 0x14, 0x20, 0x1, 0xc45fe50860, ...)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/state/execution.go:76 +0x12f
github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain.(*BlockchainReactor).poolRoutine(0xc43fe8b800)
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain/reactor.go:300 +0x426
created by github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain.(*BlockchainReactor).OnStart
        /root/.gvm/pkgsets/go1.10.3/global/src/github.com/cosmos/cosmos-sdk/vendor/github.com/tendermint/tendermint/blockchain/reactor.go:117 +0x86

I think we need to find a way to limit the number of files gaiad opens.
Currently, any user would have to raise the system's open-file limit by following this guide: https://www.tecmint.com/increase-set-open-file-limits-in-linux/
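
Before raising the limit, it can help to see what the process is actually allowed. The following is a minimal sketch, not part of gaiad, that reads the soft and hard `RLIMIT_NOFILE` values through Go's standard `syscall` package. It assumes a Linux or other Unix host, and the helper name `openFileLimits` is made up for illustration.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimits returns the soft and hard RLIMIT_NOFILE values for the
// current process. The name is hypothetical; only syscall.Getrlimit is a
// real standard-library call.
func openFileLimits() (soft, hard uint64, err error) {
	var lim syscall.Rlimit
	err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim)
	if err != nil {
		return 0, 0, err
	}
	// The casts keep this portable to platforms where the fields are signed.
	return uint64(lim.Cur), uint64(lim.Max), nil
}

func main() {
	soft, hard, err := openFileLimits()
	if err != nil {
		fmt.Println("could not read RLIMIT_NOFILE:", err)
		return
	}
	fmt.Printf("open-file limit: soft=%d hard=%d\n", soft, hard)
}
```

The soft value is the one the "too many open files" error is enforced against; the hard value is the ceiling an unprivileged process may raise the soft value to.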

@zmanian zmanian changed the title gaiad crashses if synced from scratch gaiad: opens many files while synching and can exceed standard Linux limits Jun 27, 2018
@zmanian
Member

zmanian commented Jun 27, 2018

More descriptive issue title

@wimel
Contributor

wimel commented Jun 27, 2018

I fixed this problem by editing /etc/security/limits.conf and adding these lines at the end of the file:
*user* hard nofile 500000
*user* soft nofile 450000
That may be more than I need, but it solves the problem.
This link explains the problem:
https://access.redhat.com/solutions/61334

@ebuchman ebuchman added this to To do in Stability via automation Jun 30, 2018
@zmanian
Member

zmanian commented Jul 5, 2018

I was looking at https://godoc.org/github.com/syndtr/goleveldb/leveldb/opt#Options and it seems like each LevelDB instance we open might keep as many as 500 files open in its cache. This could explain why we are seeing so many open files.
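
The arithmetic behind that suspicion can be sketched as follows. The figure of 500 per instance comes from the comment above; the number of databases (5) and the soft limit (1024) are assumptions chosen only for illustration, and the function names are hypothetical. Sockets and other descriptors also count against the limit, so the real headroom is smaller than this estimate suggests.

```go
package main

import "fmt"

// assumedCacheCapacity is the per-instance figure quoted in the thread:
// up to 500 files held open per LevelDB instance.
const assumedCacheCapacity = 500

// worstCaseOpenFiles estimates how many files numDBs LevelDB instances
// could hold open at once if each one fills its open-files cache.
func worstCaseOpenFiles(numDBs, perDBCapacity int) int {
	return numDBs * perDBCapacity
}

// exceedsLimit reports whether the estimate is above the given soft limit.
func exceedsLimit(estimate, softLimit int) bool {
	return estimate > softLimit
}

func main() {
	// Hypothetical inputs: 5 databases and a soft limit of 1024.
	est := worstCaseOpenFiles(5, assumedCacheCapacity)
	fmt.Printf("worst case: %d files, exceeds limit: %v\n", est, exceedsLimit(est, 1024))
}
```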

@ebuchman
Member

ebuchman commented Jul 6, 2018

I think we might be able to limit this by passing in options here: https://github.com/tendermint/tendermint/blob/07747de305ba80144ff0d4ad9109068bc486dedd/libs/db/go_level_db.go#L32

@zmanian
Member

zmanian commented Jul 6, 2018

People may want to be able to specify higher levels of caching for validators etc. and less caching for standard full nodes.
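
One way to picture that suggestion is a per-role setting, sketched below. Everything here is hypothetical: the `NodeRole` type, the role names, and the numbers 500 and 100 are illustrative only and are not taken from gaiad or Tendermint. The returned value is the kind of number one would pass as `OpenFilesCacheCapacity` in goleveldb's `opt.Options`.

```go
package main

import "fmt"

// NodeRole is a hypothetical label for how a node is used.
type NodeRole string

const (
	RoleValidator NodeRole = "validator"
	RoleFullNode  NodeRole = "full"
)

// openFilesCacheCapacity picks a per-database open-files cache size for a
// role. The numbers are illustrative assumptions, not recommended values.
func openFilesCacheCapacity(role NodeRole) int {
	switch role {
	case RoleValidator:
		return 500 // more caching for validators
	default:
		return 100 // less caching for standard full nodes
	}
}

func main() {
	for _, role := range []NodeRole{RoleValidator, RoleFullNode} {
		fmt.Printf("%s: open-files cache capacity %d\n", role, openFilesCacheCapacity(role))
	}
}
```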

@ebuchman ebuchman added this to To do in Current Iteration via automation Jul 11, 2018
@cwgoes
Contributor

cwgoes commented Jul 18, 2018

This should go in a Gaia config file (#1662).

@cwgoes
Contributor

cwgoes commented Jul 25, 2018

Closing; this is believed to have been fixed upstream. Please reopen if it can be reproduced.
