Fail starting up if ulimit is too low #86

Closed
Tracked by #443
lukoktonos opened this issue Jun 27, 2023 · 8 comments
Labels
first-issue Created by Linear-GitHub Sync High priority Created by Linear-GitHub Sync
Milestone

Comments

@lukoktonos
Contributor

When we finish snapshotting and run compaction, we open one file descriptor per SST file, which can mean roughly $DB_SIZE / 66M file descriptors. So a 2 TB database could potentially open ~32k file descriptors. I am not sure whether RocksDB actually opens them all at once or processes things in smaller chunks, but the default of 1024 is too low for even 100 GiB.
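
As a rough illustration of the arithmetic above, here is a minimal sketch that turns a database size into a worst-case descriptor count. The function and constant names are made up for this example, the 66 MiB figure is just the "66M" from this issue read as MiB, and it assumes every SST file is open at the same time, which may overestimate what RocksDB really does.

```rust
/// Approximate size of one SST file, taken from the "~66M" figure in this issue.
/// This is an assumption for illustration, not a value read from RocksDB.
pub const ASSUMED_SST_FILE_BYTES: u64 = 66 * 1024 * 1024;

/// Worst-case estimate of the file descriptors needed if every SST file
/// is open at once (hypothetical helper, not part of readyset).
pub fn estimated_sst_fds(db_size_bytes: u64, sst_file_bytes: u64) -> u64 {
    assert!(sst_file_bytes > 0, "SST file size must be non-zero");
    // Round up: a partially filled SST file still needs its own descriptor.
    db_size_bytes / sst_file_bytes + u64::from(db_size_bytes % sst_file_bytes != 0)
}
```

With these assumptions, a 100 GiB database already lands above the common soft default of 1024, and 2 TiB lands a little under 32k.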

If ulimit -n is less than the number of SST files, compaction fails. This can happen hours after readyset starts up, which makes the error much less helpful than checking ulimit -n at startup and erroring if we think it is too low.

We could either recommend that folks set a ulimit of 32k and fail startup if it's lower than that (perhaps with a flag to allow a smaller value without erroring), or query the upstream database to assess its size, estimate how many fds we need based on that, and error before we spend a lot of time snapshotting and begin compaction.

From SyncLinear.com | REA-2899

@lukoktonos lukoktonos self-assigned this Jun 27, 2023
@lukoktonos lukoktonos added the High priority Created by Linear-GitHub Sync label Jun 27, 2023
@lukoktonos
Contributor Author

Moved to a different Project.

@lukoktonos
Contributor Author

@Gautam Gopinadhan

@lukoktonos
Contributor Author

Awesome! All that sounds great. 😀

@lukoktonos
Contributor Author

@imeyer yes, that was the approach I was going to take at startup

Makes sense about providing precise guidance being tricky. Right now the signature we see is "I/O Error: Too many open files", which happens hours into readyset being up. So even a slightly more descriptive error, raised sooner (e.g. "System misconfiguration detected: ulimit -n is set too low $current_value. ReadySet may need up to $recommended_value file descriptors.") and letting operators figure out how to fix the error themselves would probably prevent most of the pain in this case.
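
A minimal sketch of what such an early check could look like, assuming the current soft limit has already been obtained somehow. `check_nofile_limit`, `RECOMMENDED_NOFILE`, and the `allow_low_limit` opt-out are hypothetical names for illustration, not existing readyset code or flags; the message text follows the wording proposed above.

```rust
/// Recommended minimum discussed in this thread (32K); an assumption,
/// not a measured requirement.
pub const RECOMMENDED_NOFILE: u64 = 32 * 1024;

/// Returns an error message if the soft limit is below the recommendation.
/// `allow_low_limit` models a hypothetical opt-out flag for operators who
/// knowingly run with a smaller limit.
pub fn check_nofile_limit(
    current: u64,
    recommended: u64,
    allow_low_limit: bool,
) -> Result<(), String> {
    if current >= recommended || allow_low_limit {
        return Ok(());
    }
    Err(format!(
        "System misconfiguration detected: ulimit -n is set too low ({current}). \
         ReadySet may need up to {recommended} file descriptors."
    ))
}
```

Calling this before snapshotting starts would surface the problem in seconds rather than hours; the caller decides whether an `Err` aborts startup or only logs a warning.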

@lukoktonos
Contributor Author

Was your plan to run ulimit -Hn && ulimit -Sn from within readyset itself? I think the closest approach would be to get the running PID of readyset immediately at startup (in the readyset Rust code) and parse the /proc/<pid>/limits file for that information; that would cover pretty much all the cases I mentioned. (I'm sure there is an idiomatic Rust way, I just don't know it 😄)

The tricky part is informing the user how to set it appropriately based on how they're running readyset, as limits across distros etc. are not uniform. For example the defaults for the following distros…

Amazon Linux 2: sysctl fs.file-max && ulimit -Hn && ulimit -Sn
    fs.file-max = 9223372036854775807
    32768
    32768
Ubuntu 22.04: sysctl fs.file-max && ulimit -Hn && ulimit -Sn
    fs.file-max = 9223372036854775807
    1048576
    1024
Fedora 38: sysctl fs.file-max && ulimit -Hn && ulimit -Sn
    fs.file-max = 9223372036854775807
    524288
    1024

Are quite different!
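
For the /proc/<pid>/limits idea mentioned in this comment, a sketch using only the Rust standard library might look like the following. It assumes Linux, reads /proc/self/limits (so no PID lookup is needed), and all type and function names are illustrative rather than readyset code. Calling getrlimit(2) through a binding crate is probably the more idiomatic route; this version just shows the parsing approach described here.

```rust
/// Soft and hard "Max open files" limits; `None` means "unlimited".
#[derive(Debug, PartialEq, Eq)]
pub struct NofileLimits {
    pub soft: Option<u64>,
    pub hard: Option<u64>,
}

/// Parses one limit column: a number, or the literal "unlimited".
/// The outer `None` signals a field that could not be parsed.
fn parse_limit(field: &str) -> Option<Option<u64>> {
    if field == "unlimited" {
        Some(None)
    } else {
        field.parse().ok().map(Some)
    }
}

/// Extracts the "Max open files" row from the text of a limits file.
pub fn parse_max_open_files(limits: &str) -> Option<NofileLimits> {
    const LABEL: &str = "Max open files";
    let line = limits.lines().find(|l| l.starts_with(LABEL))?;
    // After the label come the soft limit, hard limit, and units columns.
    let mut fields = line[LABEL.len()..].split_whitespace();
    let soft = parse_limit(fields.next()?)?;
    let hard = parse_limit(fields.next()?)?;
    Some(NofileLimits { soft, hard })
}

/// Reads the limits of the current process (Linux only).
pub fn read_own_nofile_limits() -> Option<NofileLimits> {
    let text = std::fs::read_to_string("/proc/self/limits").ok()?;
    parse_max_open_files(&text)
}
```

Because this reads the limits of the running process itself, it should reflect whatever mechanism set them (shell, PAM, systemd, or a container runtime), which is what matters for the check.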

@lukoktonos
Contributor Author

@imeyer From my reading, for 1, 2, and 4 we should still be able to detect if readyset won't have enough file descriptors by running ulimit -n on startup, but I'm not sure about (3). Do you have a recommendation for how to detect the file descriptor configuration in that use case?

@lukoktonos
Contributor Author

Some things to consider for limits precedence (at least on Linux, unclear on the BSDs)

  1. sysctl is the facility for kernel tunables (fs.file-max takes precedence over values set via limits.conf(5))
  2. PAM controls user/group level limits, which cannot exceed those defined by the kernel
    1. There are hard and soft limits: ulimit -Hn and ulimit -Sn respectively
    2. Limits set via PAM are per process (i.e. you run readyset from your logged-in terminal); they are not global to a user/group/session.
  3. systemd has its own mechanism for setting limits via LimitNOFILE=<int> in the [Service] section of the unit file.
  4. The limit can be set before the process is started without changing anything in limits.conf(5), provided it does not exceed the hard limit (e.g. ulimit -n 32768 && readyset … )
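
To make the hard/soft distinction in the list above concrete, here is a small sketch that classifies what an operator would need to do for a given requirement. The enum and function are hypothetical helpers for illustration; they only encode the rule that a soft limit can be raised up to, but not beyond, the hard limit.

```rust
/// What an operator would have to change for a given requirement.
#[derive(Debug, PartialEq, Eq)]
pub enum NofileAction {
    /// The soft limit already covers the requirement.
    Sufficient,
    /// The soft limit can be raised (e.g. `ulimit -n`) within the hard limit.
    RaiseSoftLimit,
    /// The hard limit itself is too low and must be raised first.
    RaiseHardLimit,
}

/// Compares the requirement against the soft limit, then the hard limit.
pub fn classify_nofile(soft: u64, hard: u64, needed: u64) -> NofileAction {
    if soft >= needed {
        NofileAction::Sufficient
    } else if hard >= needed {
        NofileAction::RaiseSoftLimit
    } else {
        NofileAction::RaiseHardLimit
    }
}
```

An error message could use this distinction to tell operators whether `ulimit -n 32768` before launch is enough or whether the hard limit has to change first.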

@lukoktonos
Contributor Author

Let's avoid getting fancy and just require ulimit to be set to a large number (32K); if it's not set, we will fail to start up.

cc: @emrysal - we should set the ulimit in the docker scripts if this is not already happening.

cc: @coredb-service-userstanton - Similar setting may be needed in the helm installer as well.

@lukoktonos lukoktonos added the first-issue Created by Linear-GitHub Sync label Jun 27, 2023
@lukoktonos lukoktonos added this to the v.3 milestone Jul 3, 2023
@lukoktonos lukoktonos removed their assignment Jul 3, 2023
@lukoktonos lukoktonos modified the milestones: v.3, v.4 Jul 17, 2023
@lukoktonos lukoktonos modified the milestones: v.4, v.5 Jul 31, 2023
@zoready zoready changed the title [REA-2899] Fail starting up if ulimit is too low Fail starting up if ulimit is too low Aug 7, 2023
@gvsg-rs gvsg-rs closed this as completed Sep 29, 2023

2 participants