Fail starting up if ulimit is too low #86
Moved to a different Project.
@Gautam Gopinadhan
Awesome! All that sounds great. 😀
@imeyer yes, that was the approach I was going to take at startup. Makes sense that providing precise guidance is tricky. Right now the signature we see is "I/O Error: Too many open files", which happens hours into readyset being up. So even a slightly earlier error (e.g. "System misconfiguration detected: ulimit -n is set too low ($current_value). ReadySet may need up to $recommended_value file descriptors.") that lets operators figure out how to fix the error themselves would probably prevent most of the pain in this case.
Was your plan to run a …? The tricky part is informing the user how to set it appropriately based on how they're running readyset, as limits across distros etc. are not uniform. For example, the defaults for the following distros…

Are quite different!
@imeyer From my reading, for 1, 2, and 4 we should still be able to detect whether readyset will have enough file descriptors by running …
Some things to consider for limits precedence (at least on Linux, unclear on the BSDs)
Let's avoid getting fancy and just request that ulimit be set to a large number (32K); if it's not set, we will fail to start up. cc: @emrysal - we should set the ulimit in the docker scripts if this is not already happening. cc: @coredb-service-userstanton - a similar setting may be needed in the helm installer as well.
When we finish snapshotting and run compaction, we open one file descriptor per sst file, which can require roughly $DB_SIZE / 66M file descriptors. So a 2 TB database could potentially open ~32k file descriptors. I am not sure whether rocksdb actually opens them all at once or processes things in smaller chunks, but the default of 1024 is too low for even 100 GiB.
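The arithmetic above can be sketched as follows. This is a rough back-of-the-envelope estimate, assuming (as described here) roughly one descriptor per ~66 MiB sst file; the function and constant names are illustrative, not ReadySet's actual code:

```rust
/// Approximate sst file size, per the ~66M figure in the discussion above.
const SST_FILE_SIZE_BYTES: u64 = 66 * 1024 * 1024;

/// Rough estimate of the file descriptors compaction may need for a
/// database of the given size: one per sst file, rounded up.
fn estimated_fds(db_size_bytes: u64) -> u64 {
    (db_size_bytes + SST_FILE_SIZE_BYTES - 1) / SST_FILE_SIZE_BYTES
}

fn main() {
    let two_tib: u64 = 2 * 1024 * 1024 * 1024 * 1024;
    let hundred_gib: u64 = 100 * 1024 * 1024 * 1024;
    // 2 TiB lands near the ~32k figure cited above; 100 GiB already
    // exceeds the common default soft limit of 1024.
    println!("2 TiB   -> ~{} fds", estimated_fds(two_tib));
    println!("100 GiB -> ~{} fds", estimated_fds(hundred_gib));
}
```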
If ulimit -n is less than the number of sst files, compaction fails. This can happen hours after readyset starts up, making it a less helpful error than if we checked ulimit -n at startup and errored when it looks too low.
We could either recommend that folks set a ulimit of 32k and fail startup if it's lower than that (perhaps with a flag to allow a smaller value without erroring), or query the upstream database to assess its size, estimate how many fds we need based on that, and error before we actually spend a lot of time snapshotting and begin compaction.
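A minimal sketch of the first option, a startup check against the current soft limit. This reads the Linux-only `/proc/self/limits` file using only the standard library (a real implementation would more likely call `getrlimit(2)` via the `libc` crate); the threshold and messages are placeholders, and where a real check would refuse to start, this sketch only prints a warning:

```rust
use std::fs;

/// Placeholder for the 32K value proposed above.
const RECOMMENDED_NOFILE: u64 = 32 * 1024;

/// Extract the soft "Max open files" value from the contents
/// of /proc/self/limits. Returns None if the line is missing.
fn parse_soft_nofile(limits: &str) -> Option<u64> {
    for line in limits.lines() {
        if line.starts_with("Max open files") {
            // After the name, the columns are: soft limit, hard limit, units.
            let soft = line["Max open files".len()..].split_whitespace().next()?;
            if soft == "unlimited" {
                return Some(u64::MAX);
            }
            return soft.parse().ok();
        }
    }
    None
}

/// Read the soft open-file limit for this process (Linux only).
fn soft_nofile_limit() -> Option<u64> {
    let limits = fs::read_to_string("/proc/self/limits").ok()?;
    parse_soft_nofile(&limits)
}

fn main() {
    match soft_nofile_limit() {
        Some(soft) if soft < RECOMMENDED_NOFILE => {
            // A real startup check would exit with an error here.
            eprintln!(
                "System misconfiguration detected: ulimit -n is {soft}, but ReadySet \
                 may need up to {RECOMMENDED_NOFILE} file descriptors."
            );
        }
        Some(soft) => println!("ulimit -n is {soft}, which looks sufficient"),
        None => eprintln!("could not determine the open-file limit"),
    }
}
```

The optional escape-hatch flag mentioned above would simply lower `RECOMMENDED_NOFILE` (or skip the check) when set.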
From SyncLinear.com | REA-2899