New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
table: Fix quadratic behavior when inserting sstables into tracker on schema change #12593
table: Fix quadratic behavior when inserting sstables into tracker on schema change #12593
Conversation
CI state |
replica/table.cc
Outdated
}); | ||
} | ||
|
||
void execute() noexcept { | ||
new_bt.replace_sstables({}, std::move(new_sstables_for_backlog_tracker)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new_sstables_for_backlog_tracker can also be computed on-the-fly here using boost::copy_range<sstables::shared_sstable>(*new_sstables->all)
, no?
It might be slightly more efficient since it will reserve the required capacity once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is noexcept path though. also not a hot path. I can reserve new_sstables_for_backlog_tracker upfront. what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is noexcept path though.
ok, so let's do that in prepare
.
also not a hot path. I can reserve new_sstables_for_backlog_tracker upfront. what do you think?
Do we know the size upfront?
I don't think sstable_set gives you that, does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is noexcept path though.
ok, so let's do that in
prepare
.also not a hot path. I can reserve new_sstables_for_backlog_tracker upfront. what do you think?
Do we know the size upfront? I don't think sstable_set gives you that, does it?
I think we can easily implement a size() method for sstable_set. Today, we have to call all() which can copy elements, but that's possibly better than reallocating log base_2 N, where N is number of elements inserted into container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compaction_backlog_tracker::replace_sstables() can fail, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is noexcept path though.
ok, so let's do that in
prepare
.also not a hot path. I can reserve new_sstables_for_backlog_tracker upfront. what do you think?
Do we know the size upfront? I don't think sstable_set gives you that, does it?
I think we can easily implement a size() method for sstable_set. Today, we have to call all() which can copy elements, but that's possibly better than reallocating log base_2 N, where N is number of elements inserted into container.
Let's consider this for follow up then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compaction_backlog_tracker::replace_sstables() can fail, no?
Yup,
scylladb/compaction/compaction_manager.cc
Line 1725 in 5417074
ret.push_back(sst); |
Why does execute
need to be noexcept?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, technically errors are handled their in
scylladb/compaction/compaction_manager.cc
Lines 1731 to 1737 in 5417074
try { | |
_impl->replace_sstables(filter_and_revert_charges(old_ssts), filter_and_revert_charges(new_ssts)); | |
} catch (...) { | |
cmlog.error("Disabling backlog tracker due to exception {}", std::current_exception()); | |
// FIXME: tracker should be able to recover from a failure, e.g. OOM, by having its state reset. More details on https://github.com/scylladb/scylla/issues/10297. | |
disable(); | |
} |
But I'm not sure if that's the best course of action.
Seems harsh to disable the backlog tracker on e.g. transient memory pressure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is noexcept path though.
ok, so let's do that in
prepare
.also not a hot path. I can reserve new_sstables_for_backlog_tracker upfront. what do you think?
Do we know the size upfront? I don't think sstable_set gives you that, does it?
I think we can easily implement a size() method for sstable_set. Today, we have to call all() which can copy elements, but that's possibly better than reallocating log base_2 N, where N is number of elements inserted into container.
Let's consider this for follow up then.
sure.
f57c507
to
7149404
Compare
v2: replace sstables in new bt in prepare() phase, then execute() remains noexcept. |
… schema change Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables. On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity. Fixes scylladb#12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
7149404
to
ed7dbfe
Compare
cg.main_sstables()->for_each_sstable([this] (const sstables::shared_sstable& s) { | ||
add_sstable_to_backlog_tracker(new_bt, s); | ||
std::vector<sstables::shared_sstable> new_sstables_for_backlog_tracker; | ||
new_sstables_for_backlog_tracker.reserve(cg.main_sstables()->all()->size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't all build a sstable_list?
If so, we can just keep it and use it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not always. It builds a new list if the set is compound (maintenance + main or sets of all groups). Here we're working on main set only of a single group.
For partititioned_sstable_set, for example, we just return its local copy of all.
I think the interface is fragile as user can modify local copy if it doesn't copy on write.
lw_shared_ptr<sstable_list> all() const;
should be changed to
lw_shared_ptr<const sstable_list> all() const;
CI state |
CI state |
@scylladb/scylla-maint ping. |
… schema change Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables. On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity. Fixes scylladb#12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#12593
… schema change Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables. On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity. Fixes scylladb#12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#12593 (cherry picked from commit 87ee547) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
… schema change Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables. On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity. Fixes scylladb#12499. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb#12593 (cherry picked from commit 87ee547) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Each time backlog tracker is informed about a new or old sstable, it will recompute the static part of backlog which complexity is proportional to the total number of sstables.
On schema change, we're calling backlog_tracker::replace_sstables() for each existing sstable, therefore it produces O(N ^ 2) complexity.
Fixes #12499.
Signed-off-by: Raphael S. Carvalho raphaelsc@scylladb.com