[branch 5.2] Iterator invalidation in lsa_partition_reader::reset_state() #14696

michoecho · 2023-07-14T06:26:34Z

do_refresh_state() keeps iterators to rows_entry in a vector.
This vector might be resized during the procedure, triggering memory reclaim and invalidating the iterators, which can cause arbitrarily long loops and/or a segmentation fault during make_heap().

michoecho · 2023-07-14T06:26:52Z

Refs https://github.com/scylladb/scylla-enterprise/issues/3125

…_state do_refresh_state() keeps iterators to rows_entry in a vector. This vector might be resized during the procedure, triggering memory reclaim and invalidating the iterators, which can cause arbitrarily long loops and/or a segmentation fault during make_heap(). To fix this, do_refresh_state has to always be called from the allocating section. Additionally, it turns out that the first do_refresh_state is useless, because reset_state() doesn't set _change_mark. This causes do_refresh_state to be needlessly repeated during a next_row() or next_range_tombstone() which happens immediately after it. Therefore this patch moves the _change_mark assignment from maybe_refresh_state to do_refresh_state, so that the change mark is properly set even after the first refresh. Fixes scylladb#14696

…_state do_refresh_state() keeps iterators to rows_entry in a vector. This vector might be resized during the procedure, triggering memory reclaim and invalidating the iterators, which can cause arbitrarily long loops and/or a segmentation fault during make_heap(). To fix this, do_refresh_state has to always be called from the allocating section. Additionally, it turns out that the first do_refresh_state is useless, because reset_state() doesn't set _change_mark. This causes do_refresh_state to be needlessly repeated during a next_row() or next_range_tombstone() which happens immediately after it. Therefore this patch moves the _change_mark assignment from maybe_refresh_state to do_refresh_state, so that the change mark is properly set even after the first refresh. Fixes #14696 Closes #14697

bhalevy · 2023-07-23T06:26:09Z

@michoecho I understand that #14696 was merged to 5.2
Can you please document what fixed this issue in later release?

@scylladb/scylla-maint should we close this issue and add the Backport Candidate label to facilitate backport to 5.1?

michoecho · 2023-07-23T19:18:59Z

Can you please document what fixed this issue in later release?

It was fixed by #12048, which rewrote the entire implementation of memtable readers (unifying it with partition_snapshot_row_cursor).

denesb · 2023-07-24T06:54:40Z

Fixed in 41aef6d (Gihub didn't auto-close the issue because the fix was targeted at 5.2).

denesb · 2023-07-24T06:56:25Z

@scylladb/scylla-maint should we close this issue and add the Backport Candidate label to facilitate backport to 5.1?

Should we only backport to 5.1? How about enterprise? @michoecho what are the user-visible symptoms and how hard it is to trigger?

michoecho · 2023-07-24T07:22:33Z

Should we only backport to 5.1? How about enterprise?

We should backport it to all supported versions, including enterprise.

@michoecho what are the user-visible symptoms

Segmentation fault.

how hard it is to trigger?

Very hard — it took us a few years to discover it.

But on the other hand, it was discovered when it happened in production, and it took down two nodes at once.

In other words, you have to be very unlucky to trigger it, but there are some workloads which make it likely enough.

denesb · 2023-07-24T12:12:18Z

I see no test attached to the fix. I presume this is very hard to reproduce in a unit test. Was this patch tested manually, perhaps on the cluster this bug affected?

michoecho · 2023-07-24T12:46:49Z

I see no test attached to the fix. I presume this is very hard to reproduce in a unit test.

Yes, see #14697 (comment).

In general it's impossible to test for invalidation bugs reliably. This is a systemic problem. It would require additional LSA machinery to fix. For example, we could force all accesses to LSA to go through handles (rather than raw pointers/references) which would check (at least in debug mode) that no invalidation happened. (The key is that tests should also check for potential invalidation — any instance of allocation with reclaiming enabled, even if the allocation didn't happened to trigger reclaim).

Was this patch tested manually, perhaps on the cluster this bug affected?

The only "testing" here was me fooling around trying to write the unit test for this. The patch relied on review rather than testing.

You can't test it on the affected cluster, because it's not reproducing the bug reliably. It happened once — on two nodes, but only once — and didn't happen again.

denesb · 2023-07-24T12:59:55Z

Ok, let's let the fix see at least one OSS release, to make sure there are no adverse effects, then we can backport.

…_state do_refresh_state() keeps iterators to rows_entry in a vector. This vector might be resized during the procedure, triggering memory reclaim and invalidating the iterators, which can cause arbitrarily long loops and/or a segmentation fault during make_heap(). To fix this, do_refresh_state has to always be called from the allocating section. Additionally, it turns out that the first do_refresh_state is useless, because reset_state() doesn't set _change_mark. This causes do_refresh_state to be needlessly repeated during a next_row() or next_range_tombstone() which happens immediately after it. Therefore this patch moves the _change_mark assignment from maybe_refresh_state to do_refresh_state, so that the change mark is properly set even after the first refresh. Fixes #14696 Closes #14697 (cherry picked from commit 41aef6d)

denesb · 2023-08-09T06:08:51Z

Backported to 5.1.

michoecho mentioned this issue Jul 14, 2023

partition_snapshot_reader.hh: fix iterator invalidation in do_refresh_state #14697

Closed

michoecho self-assigned this Jul 14, 2023

michoecho added type/bug Requires-Backport-to-5.1 backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed labels Jul 14, 2023

denesb closed this as completed Jul 24, 2023

denesb added Backport candidate and removed backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed labels Jul 24, 2023

denesb removed Backport candidate Requires-Backport-to-5.1 labels Aug 9, 2023

DoronArazii added this to the 5.4 milestone Aug 29, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[branch 5.2] Iterator invalidation in lsa_partition_reader::reset_state() #14696

[branch 5.2] Iterator invalidation in lsa_partition_reader::reset_state() #14696

michoecho commented Jul 14, 2023

michoecho commented Jul 14, 2023

bhalevy commented Jul 23, 2023

michoecho commented Jul 23, 2023

denesb commented Jul 24, 2023

denesb commented Jul 24, 2023

michoecho commented Jul 24, 2023 •

edited

denesb commented Jul 24, 2023

michoecho commented Jul 24, 2023

denesb commented Jul 24, 2023

denesb commented Aug 9, 2023

[branch 5.2] Iterator invalidation in lsa_partition_reader::reset_state() #14696

[branch 5.2] Iterator invalidation in lsa_partition_reader::reset_state() #14696

Comments

michoecho commented Jul 14, 2023

michoecho commented Jul 14, 2023

bhalevy commented Jul 23, 2023

michoecho commented Jul 23, 2023

denesb commented Jul 24, 2023

denesb commented Jul 24, 2023

michoecho commented Jul 24, 2023 • edited

denesb commented Jul 24, 2023

michoecho commented Jul 24, 2023

denesb commented Jul 24, 2023

denesb commented Aug 9, 2023

michoecho commented Jul 24, 2023 •

edited