electrs crashes while indexing mainnet with "Too many open files" #133

Open
prashb94 opened this issue Aug 10, 2019 · 9 comments

@prashb94

electrs (new-index) works fine on testnet, but errors out while syncing mainnet:

(Truncated log)

TRACE - skipping block 0000000000000000001a871a0c81fe392e9d90562e702eddd2835e27da815f1d
TRACE - skipping block 0000000000000000001198ed4b9090ef67acebc8ca517bdcd67efc930e554b6c
TRACE - skipping block 0000000000000000001c02b01cb173dc33cd901d0842be6f331037c03b1b1afa
TRACE - skipping block 000000000000000000131227a7c21c0c247b5ee30aeffbd1f9ccba6038d071d5
TRACE - skipping block 0000000000000000000c99cf30cb7609a3d3e1bc6b65c6360b03130e34b2f150
TRACE - fetched 9 blocks
DEBUG - writing 98889 rows to RocksDB { path: "./db/mainnet/newindex/txstore" }, flush=Disable
DEBUG - starting full compaction on RocksDB { path: "./db/mainnet/newindex/txstore" }
DEBUG - finished full compaction on RocksDB { path: "./db/mainnet/newindex/txstore" }
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { message: "IO error: While open a file for random read: ./db/mainnet/newindex/txstore/000938.sst: Too many open files" }', src/libcore/result.rs:997:5
Aborted (core dumped)

Also, the size of ./db is ~325GB. Is this normal?

@greenaddress
Collaborator

@prashb94 The size of the db is normal. IIRC it can grow to 700+GB before it compacts back down to about 495GB. That figure excludes bitcoind; including it, total disk requirements once compacted are around 765GB, but you will need more for the initial run.

How much RAM do you have on the machine you run this on? What OS/distro?

@prashb94
Author

prashb94 commented Aug 19, 2019

Thanks!
It's on a t3.xlarge EC2 instance (4vCPU/16GB RAM) and attached 2TB of block storage. So memory isn't the issue. Any idea what could be causing it to crash?

Edit: OS - Ubuntu 16.04.5 LTS (GNU/Linux 4.4.0-1088-aws x86_64)

@greenaddress
Collaborator

@prashb94 I am not too sure, but I think this depends on the OS configuration. You may be able to solve the issue by changing /etc/security/limits.conf; see romanz/electrs#28 for a similar issue. However, romanz/electrs#11 suggests it could also be related to a corrupted bitcoind block file.

Are you using any ad-hoc configuration for bitcoind? Is the storage SSD?

Thanks

@setpill

setpill commented Oct 18, 2019

Running into the same issue; it does not seem to be the OS config (or maybe I'm missing something?)

Trace (with RUST_BACKTRACE=full). NB: This occurred on the first run of the service after a reboot.

Only non-comment line in /etc/sysctl.conf: fs.file-max = 500000

Only non-comment lines in /etc/security/limits.conf:

*		soft	nofile		100000
*		hard	nofile		100000

@clarkmoody

Seeing the same problem here. I got an interesting result when I deleted the cache directory and re-ran electrs: it tried opening a socket to listen from the server and produced a "too many open files" error, but the output message had fd: 1023 in the socket error. This hints at a 1024 fd limit for the process somehow. I upped the hard and soft limits on the machine to 500k and double-checked across all users. Somehow electrs did not have access to that limit.

Upstream electrs is setting the open files limit manually.
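
One way to confirm which limit a running process actually received, as opposed to what is configured for login sessions, is to read it from /proc. The following is a minimal sketch assuming a Linux host; the helper names `nofile_limits` and `open_fd_count` are illustrative and not part of electrs.

```shell
# Print the soft and hard "Max open files" limits of a process (Linux only).
# Usage: nofile_limits PID
nofile_limits() {
    awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Count the file descriptors the process currently holds open.
# Usage: open_fd_count PID   (needs permission to read /proc/PID/fd)
open_fd_count() {
    ls "/proc/$1/fd" | wc -l
}

# Example: inspect the current shell itself.
nofile_limits "$$"
open_fd_count "$$"
```

Pointing the same helpers at the PID of the running electrs process would show whether it is stuck at a soft limit of 1024, which would be consistent with the `fd: 1023` seen in the socket error.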

@clarkmoody

Relevant lines in my logs. I guess the error happened when trying to connect to bitcoind (port 8332).

Dec 11 06:59:50 - esplora-electrs[28090]: 2019-12-11T06:59:50.344+00:00 - ERROR - server failed: Error: failed to clone TcpStream { addr: V4(127.0.0.1:3000), peer: V4(127.0.0.1:8332), fd: 1023 }
Dec 11 06:59:50 - esplora-electrs[28090]: Caused by: Too many open files (os error 24)

@setpill

setpill commented Dec 12, 2019

The issue on my system turned out to be caused by systemd overriding system wide limits with a "sane" default. Was resolved by setting LimitNOFILE with a higher value in the electrs service file.
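
A sketch of what that change could look like, written as a systemd drop-in rather than an edit to the unit file itself. The unit name `electrs.service`, the value 65536, and the helper name `write_nofile_dropin` are assumptions for illustration; `LimitNOFILE=` is the standard systemd directive.

```shell
# Write a systemd drop-in that raises the open-files limit for one service.
# Usage: write_nofile_dropin DROPIN_DIR LIMIT
# DROPIN_DIR is typically /etc/systemd/system/<unit>.service.d (needs root).
write_nofile_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/override.conf" <<EOF
[Service]
LimitNOFILE=$2
EOF
}

# Example (hypothetical unit name; run as root, then reload and restart):
#   write_nofile_dropin /etc/systemd/system/electrs.service.d 65536
#   systemctl daemon-reload
#   systemctl restart electrs.service
```

After the restart, `systemctl show electrs.service -p LimitNOFILE` should report the new value if the drop-in was picked up.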

@dongcarl

dongcarl commented Dec 12, 2019

Here's how I got around this on the command line:

sudo prlimit --nofile=65536 sudo -u "$(id -un)" -g "$(id -gn)" cargo blah blah wtv

The first sudo makes us root and gives us access to modify file limits; the second sudo brings us back to our original user to execute cargo properly.
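
If the hard limit is already high enough and only the soft limit is low, root is not needed: an unprivileged process may raise its own soft limit up to the hard limit. A minimal sketch under that assumption, where `raise_soft_nofile` is an illustrative helper name and the cargo invocation is a placeholder:

```shell
# Raise this shell's soft open-files limit to the hard limit (no root needed).
# Processes started from this shell afterwards inherit the raised limit.
raise_soft_nofile() {
    ulimit -Sn "$(ulimit -Hn)"
}

raise_soft_nofile
echo "soft nofile limit is now: $(ulimit -Sn)"
# Then start the indexer from this same shell, e.g.:
#   cargo run --release -- <your usual electrs arguments>
```

This only helps when `ulimit -Hn` already reports a sufficiently large value; otherwise the hard limit itself has to be raised, as with the prlimit approach above.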

@clarkmoody

> The issue on my system turned out to be caused by systemd overriding system wide limits with a "sane" default. Was resolved by setting LimitNOFILE with a higher value in the electrs service file.

@setpill Excellent, thanks! Running via systemd here.

Might be nice to make a note of this in the docs 😉
