init - Startup failed: seastar::sleep_aborted (Sleep is aborted) #12898
Comments
@fgelcer @yaronkaikov
In this case, I just deleted my comment so it doesn't clutter this one.
me2
What exactly is the issue? For example, for the first (decommission) case, I see:
Why is the service stopped? Is the issue that scylla is stopped, or that it doesn't stop cleanly, and it prints this error?
I re-checked: we really do restart Scylla. So the issue is that Scylla was stopped with a "Startup failed" error.
So this is very minor. Line 1765 in 1cefb66
and exit cleanly in this case.
@xemul ^^
@bhalevy When do you plan to fix it? Tests can currently fail because of this error, even though it is minor.
We can't ignore such an error: if Scylla fails to boot, that's not something we can easily dismiss.
@fruch
When scylla starts it may go to sleep along the way before the "serving" message appears. If SIGINT is sent at that time the whole thing unrolls and the main code ends up catching the sleep_aborted exception, printing the error in logs and exiting with non-zero code. However, that's not an error, just the start was interrupted earlier than it was expected by the stop_signal thing. fixes: #12898 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #14034
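The behaviour described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: `sleep_aborted` here is a hypothetical stand-in for `seastar::sleep_aborted`, and `run_startup` and its `stop_requested` flag are names invented for this example. The idea is only that an aborted sleep during startup is treated as a clean exit when a stop was requested, and as a failure otherwise.

```cpp
#include <exception>
#include <functional>
#include <iostream>

// Hypothetical stand-in for seastar::sleep_aborted (not the real Seastar type).
struct sleep_aborted : std::exception {
    const char* what() const noexcept override { return "Sleep is aborted"; }
};

// Runs a startup routine and maps its outcome to a process exit code.
// `stop_requested` models "a stop signal arrived before startup finished";
// both the function and the flag are assumptions made for this sketch.
int run_startup(const std::function<void()>& startup, bool stop_requested) {
    try {
        startup();
        return 0;  // startup completed normally
    } catch (const sleep_aborted&) {
        if (stop_requested) {
            // The sleep was interrupted on purpose: not an error, exit cleanly.
            std::cerr << "startup interrupted by stop signal\n";
            return 0;
        }
        // An aborted sleep without a stop request is still a failure.
        std::cerr << "Startup failed: Sleep is aborted\n";
        return 1;
    } catch (const std::exception& e) {
        // Any other exception remains a real startup failure.
        std::cerr << "Startup failed: " << e.what() << "\n";
        return 1;
    }
}
```

With this sketch, `run_startup([] { throw sleep_aborted(); }, true)` returns 0, while the same call with `false` returns 1, so only a deliberate interruption is reported as a clean exit.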
This issue is not relevant to production deployments, so it is not being backported.
Issue description
Problem
Startup failed on a new node that was created as replacement for terminated/decommissioned node:
Description
This failure happened twice during this test.
Test runs with workload prioritization (I am not sure if it is connected to the issue).
longevity-sla-100gb-4h-2023-1-db-node-910e5c8f-1 was decommissioned, and a new node, longevity-sla-100gb-4h-2023-1-db-node-910e5c8f-7 (3.250.19.195 | 10.4.0.18), was added and started:
12 seconds later, the system_auth.roles table was created and the node was waiting for gossip:
Immediately after that, Scylla was stopped:
And startup failed:
longevity-sla-100gb-4h-2023-1-db-node-910e5c8f-4 was terminated, and a new node, longevity-sla-100gb-4h-2023-1-db-node-910e5c8f-8 (34.244.86.108 | 10.4.1.190), was starting:
12 seconds later, the system_auth.roles table was created and the node was waiting for gossip:
Immediately after that, Scylla was stopped:
Startup failed:
No errors/coredump/aborting/faults. Scylla was started successfully after that
Impact
Scylla was started successfully after that
How frequently does it reproduce?
Twice during the same test run.
Installation details
Kernel Version: 5.15.0-1026-aws
Scylla version (or git commit hash): 2022.1.5-20230108.8c2c21866 with build-id 90676755bb7af26527b54cf1a5afb6498162afba
Relocatable Package: http://downloads.scylladb.com/downloads/scylla-enterprise/relocatable/scylladb-2022.1/scylla-enterprise-x86_64-package-2022.1.5.0.20230108.8c2c21866.tar.gz
Cluster size: 6 nodes (i3.4xlarge)
Scylla Nodes used in this run:
OS / Image: ami-0287067067652acb9 (aws: eu-west-1)
Test: longevity-sla-100gb-4h-test
Test id: 910e5c8f-b2de-44e8-bf21-de82c605135d
Test name: enterprise-2022.1/SCT_Enterprise_Features/Workload_Prioritization/longevity-sla-100gb-4h-test
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 910e5c8f-b2de-44e8-bf21-de82c605135d
$ hydra investigate show-logs 910e5c8f-b2de-44e8-bf21-de82c605135d
Logs:
Jenkins job URL