memory corruption - "Segmentation fault on shard 2" on 4 nodes after nodetool refresh #14618
Comments
This is a memory corruption bug, so I'm setting P1.
I looked at 3 out of 4 cores. In all cases the 32-byte small pool is corrupted, but the crashes happen in unrelated code paths. It seems that the corruption happened some time before the crash, and its source won't be easily figured out from the dump. But since it happened on multiple nodes at once, it's likely reproducible, and we could bisect it.
Misaligned pointers can happen when we have a seastar shared pointer that crossed shards. This can cause an early free, or a wild increment or decrement that can hit something.
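For illustration, here is a minimal sketch of the forbidden pattern (hypothetical code, not the code from this issue): an lw_shared_ptr copy that is destroyed on a different shard than the one that owns the pointer, so the non-atomic refcount is touched from two shards concurrently:

```cpp
// Hypothetical sketch of a cross-shard lw_shared_ptr bug (not the actual
// Scylla code). The lambda below copies `ranges`; that copy is destroyed on
// shard 1, while the original may be destroyed on shard 0 at the same time.
// Both operations touch the same NON-atomic refcount, so an update can be
// lost, leading to an early free and later a wild decrement into freed
// memory (i.e. into the allocator's free list, as seen in these cores).
#include <seastar/core/app-template.hh>
#include <seastar/core/future.hh>
#include <seastar/core/shared_ptr.hh>
#include <seastar/core/smp.hh>
#include <vector>

seastar::future<> cross_shard_bug() {
    auto ranges = seastar::make_lw_shared<std::vector<int>>(std::vector<int>{1, 2, 3});
    // BUG: `ranges` is captured by copy and shipped to shard 1.
    return seastar::smp::submit_to(1, [ranges] {
        // ... reads *ranges on the remote shard ...
    });
}

int main(int argc, char** argv) {
    seastar::app_template app;
    return app.run(argc, argv, [] { return cross_shard_bug(); });
}
```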
Absolutely
We can try a version that has SEASTAR_DEBUG_SHARED_PTR set, if it indeed looks like this sort of bug.
Wow, it really is a wild decrement. I verified this by reconstructing the presumed true head of the free list from the garbage and following it until the end. The length of this list is consistent with the pool's recorded length. In other words, there are some wild decrements-after-free. (In 2 of the 3 cores the wild decrement happened once; in the third it happened thrice.) But I'm not sure I see how cross-shard seastar shared pointers explain this. Do you mean their non-atomic increments and decrements? That would explain what we see, but isn't it unlikely (although totally possible) that they made the count off by 3?
Seems a duplicate of #14475 (comment).
We've seen this (wild decrements due to improperly shared pointers) multiple times.
Yes
My gut feeling is that this is a recent regression.
For sure
That's indeed a good candidate. lw_shared_ptr doesn't use atomic refcounting, so it's susceptible to races.
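To make the race concrete, here is a standalone demonstration (plain std::thread, no Seastar) of the lost-update problem any non-atomic refcount has; the plain `long` here stands in for the refcount:

```cpp
// Demonstrates the lost-update race of a non-atomic counter. `++count` is
// a load/add/store sequence, so two threads can interleave and one update
// is lost -- exactly how a cross-shard lw_shared_ptr copy can leave the
// refcount off by a small number.
// Note: this program deliberately contains a data race (undefined behavior)
// in order to illustrate the point.
#include <cstdio>
#include <thread>

long count = 0;  // non-atomic, like lw_shared_ptr's refcount

void bump(int n) {
    for (int i = 0; i < n; ++i) {
        ++count;  // racy read-modify-write
    }
}

int main() {
    std::thread a(bump, 1000000);
    std::thread b(bump, 1000000);
    a.join();
    b.join();
    // Typically prints less than 2000000: some increments were lost.
    std::printf("count = %ld (expected 2000000)\n", count);
    return 0;
}
```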
Well, that's deleted code. I think it's
Could be tested by forcing a reshard with a debug build.
/cc @Deexie
That deleted code made a local copy of the vector.
But I don't see anything wrong there. The shared_ptr doesn't come from the caller shard, nor is it leaked back into it.
Yes
@michoecho, please file a revert pull request with this analysis.
Revert? Or just fix the bug by making a local copy of the vector, as it was before the bad patch?
This reverts commit 2a58b4a, reversing changes made to dd63169.

After patch 87c8d63, table_resharding_compaction_task_impl::run() performs the forbidden action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard, which is a data race that can cause a use-after-free, typically manifesting as allocator corruption.

Note: before the bad patch, this was avoided by copying the _contents_ of the lw_shared_ptr into a new, local lw_shared_ptr.

Fixes scylladb#14475
Fixes scylladb#14618
table_resharding_compaction_task_impl::run() performs the forbidden action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard, which is a data race that can cause a use-after-free, typically manifesting as allocator corruption. With this fix, the content of _owned_ranges_ptr is instead copied into local lw_shared_ptrs.

Fixes scylladb#14475
Fixes scylladb#14618
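For clarity, a sketch contrasting the two patterns, with illustrative types (the stand-in names are hypothetical; the real type behind _owned_ranges_ptr is a Scylla range vector):

```cpp
// Sketch of the buggy vs. fixed pattern. `ranges_vector` is a hypothetical
// stand-in for the real owned-ranges vector type.
#include <seastar/core/shared_ptr.hh>
#include <vector>

using ranges_vector = std::vector<int>;
using ranges_ptr = seastar::lw_shared_ptr<ranges_vector>;

// BAD (the regression): copying the lw_shared_ptr itself on a remote shard
// bumps its NON-atomic refcount from two shards concurrently -- a data race.
ranges_ptr share_across_shards(const ranges_ptr& remote_ptr) {
    return remote_ptr;
}

// GOOD (the pre-regression / fixed pattern): copy the *contents* into a new
// lw_shared_ptr created on the current shard. Reading *remote_ptr is a plain
// read and is safe as long as the owning shard keeps the object alive.
ranges_ptr copy_for_this_shard(const ranges_ptr& remote_ptr) {
    return seastar::make_lw_shared<ranges_vector>(*remote_ptr);
}
```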
We need to reshard on refresh, as the owned ranges must be propagated. I think we can even make this a Python test in core.
Task manager tasks covering reshard compaction. Reattempt of #14044. Bugfix for #14618 is squashed with 95191f4. Regression test added.

Closes #14739

* github.com:scylladb/scylladb:
  test: add test for resharding with non-empty owned_ranges_ptr
  test: extend test_compaction_task.py to test resharding compaction
  compaction: add shard_reshard_sstables_compaction_task_impl
  compaction: invoke resharding on sharded database
  compaction: move run_resharding_jobs into reshard_sstables_compaction_task_impl::run()
  compaction: add reshard_sstables_compaction_task_impl
  compaction: create resharding_compaction_task_impl
No vulnerable branches, not backporting.
Issue description
Issued a nodetool refresh on all 6 cluster nodes.
A few seconds later, 4 nodes got a segmentation fault.
(not sure it's not related to #14299)
core dump details:
Installation details
Kernel Version: 5.15.0-1037-gcp
Scylla version (or git commit hash): 5.4.0~dev-20230703.1ab2bb69b8a6 with build-id 1054adfa55f238441c3044e6213fd02d31d43279
Cluster size: 6 nodes (n1-highmem-16)
Scylla Nodes used in this run:
OS / Image: (gce: undefined_region)
Test: longevity-10gb-3h-gce-test
Test id: 996fb5c0-da43-41b1-9a72-4b1ee54f4fac
Test name: scylla-master/longevity/longevity-10gb-3h-gce-test
Test config file(s):
Logs and commands
$ hydra investigate show-monitor 996fb5c0-da43-41b1-9a72-4b1ee54f4fac
$ hydra investigate show-logs 996fb5c0-da43-41b1-9a72-4b1ee54f4fac
Logs:
Jenkins job URL
Argus