Storaged crash when scanEdge called #3467

Closed
kikimo opened this issue Dec 14, 2021 · 2 comments · Fixed by #3427
Labels: type/bug (Type: something is unexpected)

Comments

@kikimo
Contributor

kikimo commented Dec 14, 2021

storaged crashes when ScanEdge is called with the following request parameters (HasNext is true but NextCursor is nil):

	req := &storage.ScanEdgeRequest{
		SpaceID: globalSpaceID,
		Parts: map[int32]*storage.ScanCursor{
			globalPartitionID: {
				HasNext:    true,
				NextCursor: nil,
			},
		},
...

The crash stack from the core dump:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
--Type <RET> for more, q to quit, c to continue without paging--c
Core was generated by `/root/src/nebula/build/bin/nebula-storaged --flagfile /root/nebula-chaos-cluste'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000004f15958 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
[Current thread is 1 (Thread 0x7f873f1ff700 (LWP 154))]
(gdb) bt
#0  0x0000000004f15958 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#1  0x0000000002d13411 in nebula::storage::ScanEdgeProcessor::runInSingleThread (this=0x7f8770e7d600, req=...) at /root/src/nebula/src/storage/query/ScanEdgeProcessor.cpp:139
#2  0x0000000002d125cd in nebula::storage::ScanEdgeProcessor::doProcess (this=0x7f8770e7d600, req=...) at /root/src/nebula/src/storage/query/ScanEdgeProcessor.cpp:53
#3  0x0000000002d11e81 in nebula::storage::ScanEdgeProcessor::<lambda()>::operator()(void) const (__closure=0x7f8770e211c0) at /root/src/nebula/src/storage/query/ScanEdgeProcessor.cpp:19
#4  0x0000000002d19073 in folly::detail::function::FunctionTraits<void()>::callBig<nebula::storage::ScanEdgeProcessor::process(const nebula::storage::cpp2::ScanEdgeRequest&)::<lambda()> >(folly::detail::function::Data &) (p=...)
    at /root/src/nebula/build/third-party/install/include/folly/Function.h:385
#5  0x000000000457b8a7 in virtual thunk to apache::thrift::concurrency::FunctionRunner::run() ()
#6  0x00000000046b0ca3 in apache::thrift::concurrency::ThreadManager::Impl::Worker::run() ()
#7  0x00000000046b48fd in apache::thrift::concurrency::PthreadThread::threadMain(void*) ()
#8  0x00007f877f03f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#9  0x00007f877ef66293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) up
#1  0x0000000002d13411 in nebula::storage::ScanEdgeProcessor::runInSingleThread (this=0x7f8770e7d600, req=...) at /root/src/nebula/src/storage/query/ScanEdgeProcessor.cpp:139
139	    auto ret = plan.go(partId, cursor.get_has_next() ? *cursor.get_next_cursor() : "");
(gdb) down
#0  0x0000000004f15958 in std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
(gdb) up
#1  0x0000000002d13411 in nebula::storage::ScanEdgeProcessor::runInSingleThread (this=0x7f8770e7d600, req=...) at /root/src/nebula/src/storage/query/ScanEdgeProcessor.cpp:139
139	    auto ret = plan.go(partId, cursor.get_has_next() ? *cursor.get_next_cursor() : "");
(gdb) p partId
$1 = 1
(gdb) p cursor.get_has_next()
You can't do that without a process to debug.
(gdb) p cursor
$2 = {static __fbthrift_cpp2_gen_json = false, static __fbthrift_cpp2_gen_nimble = false, static __fbthrift_cpp2_gen_has_thrift_uri = false, static __fbthrift_cpp2_is_union = false, has_next = true, next_cursor = {static npos = 18446744073709551615,
    _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x7f873f1fc828 ""}, _M_string_length = 0, {_M_local_buf = '\000' <repeats 15 times>, _M_allocated_capacity = 0}}, __isset = {has_next = true,
    next_cursor = false}}
(gdb)
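
The `p cursor` output above shows `has_next = true` while `__isset.next_cursor = false`. Line 139 dereferences `cursor.get_next_cursor()` whenever `has_next` is true, so if that getter yields a null pointer for an unset field (an assumption, though the dereference and the SIGSEGV in the `std::string` copy constructor both point that way), the crash follows directly. Below is a minimal C++17 sketch of a defensive check. `FakeScanCursor` and `resumeCursor` are hypothetical stand-ins written for illustration, not the Thrift-generated class, and this is not necessarily what #3427 does.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the generated ScanCursor type (illustration only).
struct FakeScanCursor {
  bool has_next = false;
  std::optional<std::string> next_cursor;

  bool get_has_next() const { return has_next; }

  // Assumed behaviour: returns nullptr when the optional field is unset.
  const std::string* get_next_cursor() const {
    return next_cursor.has_value() ? &next_cursor.value() : nullptr;
  }
};

// Returns the cursor to resume from, or an empty string when there is none.
// Unlike the expression on line 139, the pointer is checked before use.
std::string resumeCursor(const FakeScanCursor& cursor) {
  const std::string* next = cursor.get_next_cursor();
  if (cursor.get_has_next() && next != nullptr) {
    return *next;
  }
  return "";
}
```

With this check, a request that sets `has_next` without a cursor is treated as a scan from the start instead of crashing. Rejecting such a request with an error would be an equally reasonable choice.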

@kikimo kikimo added the type/bug Type: something is unexpected label Dec 14, 2021
@Shylock-Hg
Contributor

When has_next is set to true, next_cursor can't be null.

@Shylock-Hg
Contributor

A fix is in #3427.
