
scylla does not start when kernel inotify limits are exceeded #7700

Closed
avikivity opened this issue Nov 25, 2020 · 7 comments

@avikivity
Member

Each TLS instance consumes an inotify instance (limited by the fs.inotify.max_user_instances sysctl), and there can be multiple TLS instances per shard. A large machine can run out and will then fail to start. The default limit is 128, which covers only 64 shards when both RPC and CQL are encrypted.

@avikivity changed the title from "scylla can exhaust inotify kernel limits" to "scylla does not start when kernel inotify limits are exceeded" on Nov 25, 2020
avikivity added a commit to avikivity/scylladb that referenced this issue Nov 25, 2020
Since f3bcd4d ("Merge 'Support SSL Certificate Hot
Reloading' from Calle"), we reload certificates as they are
modified on disk. This uses inotify, which is limited by a
sysctl fs.inotify.max_user_instances, with a default of 128.

This is enough for 64 shards only, if both rpc and cql are
encrypted; above that startup fails.

Increase to 1200, which is enough for 6 instances * 200 shards.

Fixes scylladb#7700.
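
For reference, the kernel limit involved can be inspected and raised by hand; a minimal sketch, assuming a systemd-style /etc/sysctl.d drop-in (the file name below is illustrative, not necessarily the one Scylla's packaging writes):

```sh
# Show the current per-user limit on inotify instances (kernel default: 128)
sysctl fs.inotify.max_user_instances

# Raise it immediately, using the value chosen in the commit above
sudo sysctl -w fs.inotify.max_user_instances=1200

# Persist it across reboots; the drop-in file name here is illustrative
echo 'fs.inotify.max_user_instances = 1200' | sudo tee /etc/sysctl.d/99-inotify-instances.conf
sudo sysctl --system
```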
@psarna
Contributor

psarna commented Nov 26, 2020

Maybe this error should also not be fatal, and instead print an error message in the logs? I guess it depends on whether the mechanisms that rely on inotify also allow other ways of reloading the observed files (e.g. via a signal, REST, or similar).

@avikivity
Member Author

@psarna it threw an exception, but the exception was converted to an assert() when a sharded<> instance was destroyed incorrectly. @elcallio promised to fix that.

@avikivity
Member Author

Backported to 4.1, 4.2, 4.3.

avikivity added a commit that referenced this issue Nov 29, 2020
Since f3bcd4d ("Merge 'Support SSL Certificate Hot
Reloading' from Calle"), we reload certificates as they are
modified on disk. This uses inotify, which is limited by a
sysctl fs.inotify.max_user_instances, with a default of 128.

This is enough for 64 shards only, if both rpc and cql are
encrypted; above that startup fails.

Increase to 1200, which is enough for 6 instances * 200 shards.

Fixes #7700.

Closes #7701

(cherry picked from commit 390e07d)
avikivity added two more commits that referenced this issue on Nov 29, 2020, with the same commit message (the remaining backports, also cherry picked from commit 390e07d).
@elcallio
Contributor

Question: should we try to address this at the seastar level? While the basic problem of shard multiplication cannot be solved, we could maybe mitigate it for the usage pattern of a shard-shared credentials builder generating a reloadable credentials object per shard.
With some (terrible) juggling of foreign pointers it should be possible to make only a single shard actually use inotify, and have the rest cross-shard subscribe to the originating shard's notifications:
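
A minimal sketch of that shape, assuming Seastar's smp::invoke_on_all for the cross-shard fan-out; the on_certs_reloaded hook and the simulated trigger are hypothetical placeholders, not the actual seastar::tls reloading API:

```cpp
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <seastar/core/smp.hh>
#include <seastar/core/reactor.hh>
#include <iostream>

using namespace seastar;

// Hypothetical per-shard hook: refresh this shard's credentials object
// from the shared, already-reloaded state.
static void on_certs_reloaded() {
    std::cout << "shard " << this_shard_id() << ": refreshing credentials\n";
}

// Runs only on the shard that owns the single inotify instance (shard 0 here):
// when its watcher fires, fan the notification out to every shard.
static future<> notify_all_shards() {
    return smp::invoke_on_all([] {
        on_certs_reloaded();
    });
}

int main(int argc, char** argv) {
    app_template app;
    return app.run(argc, argv, [] {
        // Simulate: shard 0's inotify watcher just reported a modified cert file.
        return notify_all_shards();
    });
}
```

In a real version, shard 0 would own the single inotify-backed watcher (or the polling fallback mentioned below), and the per-shard hook would rebuild that shard's reloadable credentials from the shared builder state.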

There will be a lot of cross-shard calls when stuff changes, but...

We can also add a fallback option for the originating shard's reloader to use polling iff inotify is not available.

@avikivity
Member Author

We could, but it's a huge amount of work compared to writing to a sysctl file.

@elcallio
Contributor

I take that as a down-prioritization of the idea.

@avikivity
Member Author

Yes.
