High cpu usage on some threads after the "nodetool drain" command on linux kernels 3.x / 4.x #13377
Hello.
tzach referenced this issue on Dec 5, 2023:
* seastar 830ce8673...55a821524 (34):
  > Revert "reactor/scheduling_group: Handle at_destroy queue special in init_new_scheduling_group_key etc"
  > epoll: Avoid spinning on aborted connections
    Fixes #12774
    Fixes #7753
    Fixes #13337
  > Merge 'Sanitize test-only reactor facilities' from Pavel Emelyanov
  > test/unit: fix fmt version check
  > reactor/scheduling_group: Handle at_destroy queue special in init_new_scheduling_group_key etc
  > build: add spaces before () and after commands
  > reactor: use zero-initialization to initialize io_uring_params
  > Merge 'build: do not return a non-false condition if the option is off ' from Kefu Chai
  > memory: do not use variable length array
  > build: use tri_state_option() to link against Sanitizers
  > build: do not define SEASTAR_TYPE_ERASE_MORE on all builds
  > Revert "shared_future: make available() immediate after set_value()"
  > test_runner: do not throw when seastar.app fails to start
  > Merge 'Address issue where Seastar faults in toeplitz hash when reassembling fragment' from John Hester
  > defer, closeable: do not use [[nodiscard(str)]]
  > Merge 'build: generate config-specific rules using generator expressions' from Kefu Chai
  > treewide: use *_v and *_t for better readability
  > build: use different names for .pc files for each build mode
  > perftune.py: skip discovering IRQs for iSCSI disks
  > io-tester: explicit use uint64_t for boost::irange(...)
  > gate: correct the typo in doxygen comment
  > shared_future: make available() immediate after set_value()
  > smp: drop unused templates
  > include fmt/ostream.h to make headers self-sufficient
  > Support ccache in ./configure.py
  > rpc_tester: Disable -Wuninitialized when including boost.accumulators
  > file: construct directory_entry with aggregated ctor
  > file: s/ino64_t/ino_t/, s/off64_t/off_t/
  > sstring_test: include fmt/std.h only if fmtlib >= 10.0.0
  > file: do not include coroutine headers if coroutine is disabled
  > fair_queue::unregister_priority_class:fix assertion
  > Merge 'Generalize `net::udp_channel` into `net::datagram_channel`' from Michał Sala
  > Merge 'Add file::list_directory() that co_yields entries' from Pavel Emelyanov
  > http/file_handler: remove unnecessary cast

Closes #16201
Installation details
Scylla version: 5.1.5 Open Source; 4.6.x versions are affected as well.
OS (RHEL/CentOS/Ubuntu/AWS AMI): RHEL/CentOS/Ubuntu - any of them on kernels 3.x / 4.x, but not on 5.x.
We see high CPU usage on some ScyllaDB threads after running the "nodetool drain" command.
The problem is easily reproducible.
We believe this is unexpected behavior.
Below are some OS commands whose output can be compared across kernel versions.
top [-1] -H -n1 -b -p $(pidof scylla)
Linux kernels 3.x / 4.x
Ubuntu 18.04, Centos 7/8, RHEL 8.1
Linux kernels 5.x
Ubuntu 20.04, Centos 7 (5.x kernel is installed manually)
On distros where top supports the -1 option, we see that the first one or two threads are 100% busy.
The situation is slightly different in a multi-node environment:
On the drained node, the reactor-1 thread also consumes 100% CPU (two threads are 100% busy in this case).
On the other nodes, only the main scylla thread consumes 100% CPU.
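To make the per-thread comparison easier to repeat across nodes, the batch output of top can be filtered down to just the busy threads. The following is a minimal sketch, not part of the original report: `busy_threads` is a hypothetical helper name, and it assumes the default procps column layout in which %CPU is the 9th field and the thread name is the 12th.

```shell
# busy_threads [THRESHOLD]
# Hypothetical helper: reads `top -H -b -n1` output on stdin and prints
# "TID %CPU NAME" for every thread at or above THRESHOLD percent CPU
# (default 90). Assumes the default procps field order.
busy_threads() {
    awk -v min="${1:-90}" '
        # Thread rows only appear after the "PID USER ..." column header.
        seen && NF >= 12 && ($9 + 0) >= min { print $1, $9, $12 }
        $1 == "PID" { seen = 1 }
    '
}

# Usage on a live node (assumes a single scylla process; LC_ALL=C keeps
# the decimal separator a dot so the numeric comparison works):
#   LC_ALL=C top -H -b -n1 -p "$(pidof scylla)" | busy_threads 90
```

Running this on the drained node and on the other nodes would give a compact list to compare, for example whether reactor-1 shows up only on the drained node.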
strace -p $(pidof scylla) -c
Linux kernels 3.x / 4.x
Ubuntu 18.04, Centos 7/8, RHEL 8.1
Linux kernels 5.x
Ubuntu 20.04, Centos 7 (5.x kernel is installed manually)
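The `strace -c` summaries can also be reduced to the few syscalls with the highest call counts, which makes the difference between kernels easier to see at a glance. This is a sketch under assumptions: `top_syscalls` is a hypothetical helper name, and it relies on the usual `strace -c` table layout, where the call count is the 4th column and the syscall name is the last one.

```shell
# top_syscalls [N]
# Hypothetical helper: reads an `strace -c` summary on stdin and prints
# the N syscalls with the most calls (default 5) as "CALLS NAME".
top_syscalls() {
    # Keep data rows only: the first field is numeric and the row is not
    # the final "total" line; the header and dashed rulers are skipped.
    awk '$1 ~ /^[0-9.]+$/ && $NF != "total" { print $4, $NF }' |
        sort -k1,1nr |
        head -n "${1:-5}"
}

# Usage (strace writes the summary to stderr; -f follows all threads,
# which the plain `strace -p` form does not; the 10-second window is
# an arbitrary choice):
#   timeout -s INT 10 strace -f -c -p "$(pidof scylla)" 2>&1 | top_syscalls 5
```

If the busy threads are spinning in the kernel interface, a single syscall would be expected to dominate this list on the 3.x / 4.x kernels but not on 5.x.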