Scylla crash after encountering transient filesystem errors #737

Closed
benoit-canet opened this issue Dec 30, 2015 · 5 comments

Comments

@benoit-canet

The following crash was triggered by running cassandra-stress against Scylla,
then using CharybdeFS to inject filesystem errors for 10 seconds, and then
clearing the fault.

[root@localhost scylla]# ./build/release/scylla
Scylla version development-20151217.89dcc5d starting ...
WARN [shard 0] config - Option partitioner is not (yet) used.
Start Storage service ...
WARN [shard 0] init - /var/lib/scylla/commitlog is not on XFS. This is a non-supported setup, and performance is expected to be very bad.
For better performance, placing your data on XFS-formatted directories is strongly recommended
WARN [shard 0] init - /var/lib/scylla/data is not on XFS. This is a non-supported setup, and performance is expected to be very bad.
For better performance, placing your data on XFS-formatted directories is strongly recommended
Start gossiper service ...
Starting listening for CQL clients on localhost:9042...
Seastar HTTP server listening on localhost:10000 ...
ERROR [shard 2] commitlog - Failed to persist commits to disk for CommitLog-1-36028797055990612.log: std::system_error (error system:27, File too large)
WARN [shard 2] init - Shutdown communications until operator examinate the situation.
WARNING: exceptional future ignored of type 'std::system_error': Error system:27 (File too large)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
ERROR [shard 6] commitlog - Failed to persist commits to disk for CommitLog-1-108086391093918548.log: std::system_error (error system:43, Identifier removed)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:43 (Identifier removed)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARNING: exceptional future ignored of type 'std::system_error': Error system:103 (Software caused connection abort)
WARN [shard 0] storage_service - CQL server stopped
WARN [shard 0] storage_service - Stopping gossip by operator request
ERROR [shard 5] database - failed to write sstable: std::system_error (error system:45, Level 2 not synchronized)
ERROR [shard 3] commitlog - Failed to persist commits to disk for CommitLog-1-54043195565472596.log: std::system_error (error system:36, File name too long)
WARN [shard 3] commitlog - Exception in segment reservation: std::system_error (error system:11, Resource temporarily unavailable)
Segmentation fault (core dumped)
[root@localhost scylla]#

@gleb-cloudius
Contributor

On Wed, Dec 30, 2015 at 10:37:57AM -0800, Benoît Canet wrote:

> Segmentation fault (core dumped)

Anything interesting there?

        Gleb.

@benoit-canet
Author

I don't have a coredump at hand.


@benoit-canet
Author

Ahh, it says (core dumped); I will look for it.


@benoit-canet
Author

#0 0x00007f555ecca8d9 in std::rethrow_exception(std::__exception_ptr::exception_ptr) () from /lib64/libstdc++.so.6
#1 0x00000000005b9dc6 in main_ns::is_enomem () at main.cc:66
#2 0x00000000005e4508 in main_ns::isolate_on_error (commit_log=) at main.cc:80
#3 0x0000000000b8e05b in db::commitlog::segment_manager::<lambda(auto:30)>::operator()std::__exception_ptr::exception_ptr (__closure=, ep=...) at db/commitlog/commitlog.cc:1077
#4 do_void_futurize_apply<const db::commitlog::segment_manager::on_timer()::<lambda(auto:30)>&, std::__exception_ptr::exception_ptr> (func=...) at /home/benoit/scylla/seastar/core/future.hh:1138
#5 futurize::apply<const db::commitlog::segment_manager::on_timer()::<lambda(auto:30)>&, std::__exception_ptr::exception_ptr> (func=...) at /home/benoit/scylla/seastar/core/future.hh:1186
#6 future<>::<lambda(auto:5&&)>::operator()<future<> > (__closure=, fut=<unknown type in /home/benoit/scylla/build/release/scylla, CU 0x8f9bd4f, DIE 0x90f10f0>) at /home/benoit/scylla/seastar/core/future.hh:1010
#7 futurize<future<> >::apply<future::handle_exception(Func&&) [with Func = db::commitlog::segment_manager::on_timer()::<lambda(auto:30)>; T = {}]::<lambda(auto:5&&)>, future<> > (func=) at /home/benoit/scylla/seastar/core/future.hh:1203
#8 future<>::<lambda(auto:2&&)>::operator()<future_state<> > (state=<unknown type in /home/benoit/scylla/build/release/scylla, CU 0x8f9bd4f, DIE 0x9114167>, __closure=0x60400074fe80) at /home/benoit/scylla/seastar/core/future.hh:881
#9 continuation<future::then_wrapped(Func&&) [with Func = future::handle_exception(Func&&) [with Func = db::commitlog::segment_manager::on_timer()::<lambda(auto:30)>; T = {}]::<lambda(auto:5&&)>; Result = future<>; T = {}]::<lambda(auto:2&&)> >::run(void) (this=0x60400074fe70) at /home/benoit/scylla/seastar/core/future.hh:399
#10 0x000000000047026d in reactor::run_tasks (this=this@entry=0x6040000c3000, tasks=...) at core/reactor.cc:1317
#11 0x000000000049bfab in reactor::run (this=0x6040000c3000) at core/reactor.cc:1621
#12 0x00000000004bc217 in smp::<lambda()>::operator()(void) const (__closure=0x6000000b3900) at core/reactor.cc:2376
#13 0x00000000004fbf2e in std::function<void ()>::operator()() const (this=) at /usr/include/c++/5.3.1/functional:2271
#14 posix_thread::start_routine (arg=) at core/posix.cc:51
#15 0x00007f555c30260a in start_thread () from /lib64/libpthread.so.0
#16 0x00007f555c03ca9d in clone () from /lib64/libc.so.6
(gdb) frame 0

For some reason it does not like rethrowing the current exception in the
pattern I use in isolate_on_error.

This is my fault; I will dig into this.
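
One reading of the backtrace, offered as an assumption since main.cc is not shown in this thread: frame #3 receives the exception as `ep`, but `is_enomem ()` is called without arguments and ends in `std::rethrow_exception`. If it obtains its exception from `std::current_exception()`, that call only returns a non-null pointer while a `catch` handler is active, which is not the case inside a continuation that is merely handed an `exception_ptr`. The sketch below shows that behaviour in isolation; `has_current_exception` and `demo` are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical helper (not from main.cc): true only while an exception is
// being handled, i.e. inside a catch block or in code called from one.
bool has_current_exception() {
    return std::current_exception() != nullptr;
}

// Walks through the three situations of interest.
void demo() {
    // No handler is active here, so current_exception() is null.
    assert(!has_current_exception());

    try {
        throw std::runtime_error("injected disk error");
    } catch (...) {
        // Inside the handler the in-flight exception is visible.
        assert(has_current_exception());
    }

    // The handler has finished, so the pointer is null again.
    assert(!has_current_exception());
}
```

Calling `demo()` from a plain `main` with no enclosing handler passes all three checks.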


@benoit-canet
Author

I must be rethrowing a null exception pointer.
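
Passing a null `std::exception_ptr` to `std::rethrow_exception` violates its precondition and is undefined behaviour, which would be consistent with the segfault in frame #0. A minimal sketch of one way to guard against it, assuming the helper can take the pointer explicitly: the signature of `is_enomem` below is hypothetical and is not the one in main.cc, and `check_is_enomem` is only an illustrative usage function.

```cpp
#include <cassert>
#include <cerrno>
#include <exception>
#include <system_error>

// Hypothetical null-safe variant: the caller passes the exception_ptr it was
// given instead of relying on std::current_exception().
bool is_enomem(const std::exception_ptr& ep) {
    if (!ep) {
        return false;  // nothing to inspect; rethrowing null would be UB
    }
    try {
        std::rethrow_exception(ep);
    } catch (const std::system_error& e) {
        // Matches errors logged as "system:12", i.e. ENOMEM.
        return e.code() == std::error_code(ENOMEM, std::system_category());
    } catch (...) {
        return false;  // some other exception type
    }
}

// Illustrative usage covering the null, matching and non-matching cases.
void check_is_enomem() {
    const auto& sys = std::system_category();
    assert(!is_enomem(std::exception_ptr{}));
    assert(is_enomem(std::make_exception_ptr(std::system_error(ENOMEM, sys))));
    assert(!is_enomem(std::make_exception_ptr(std::system_error(EFBIG, sys))));
}
```

With a signature like this, the lambda in frame #3 could forward its `ep` argument directly, so the check no longer depends on whether a handler happens to be active.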

