Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast path for detecting unchanged prefix_extractor #9407

Closed

Conversation

pdillinger
Copy link
Contributor

@pdillinger pdillinger commented Jan 20, 2022

Summary: Fixes a major performance regression in 6.26, where
extra CPU is spent in SliceTransform::AsString when reads involve
a prefix_extractor (Get, MultiGet, Seek). Common case performance
is now better than 6.25.

This change creates a "fast path" for verifying that the current prefix
extractor is unchanged and compatible with what was used to
generate a table file. This fast path detects the common case by
pointer comparison on the current prefix_extractor and a "known
good" prefix extractor (if applicable) that is saved at the time the
table reader is opened. The "known good" prefix extractor is saved
as another shared_ptr copy (in an existing field, however) to ensure
the pointer is not recycled.

When the prefix_extractor has changed to a different instance but
same compatible configuration (rare, odd), performance is still a
regression compared to 6.25, but this is likely acceptable because
of the oddity of such a case. The performance of incompatible
prefix_extractor is essentially unchanged.

Also fixed a minor case (ForwardIterator) where a prefix_extractor
could be used via a raw pointer after being freed as a shared_ptr,
if replaced via SetOptions.

Test Plan:

Performance

Populate DB with TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -benchmarks=fillrandom -num=10000000 -disable_wal=1 -write_buffer_size=10000000 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12

Running head-to-head comparisons simultaneously with TEST_TMPDIR=/dev/shm/rocksdb ./db_bench -use_existing_db -readonly -benchmarks=seekrandom -num=10000000 -duration=20 -disable_wal=1 -bloom_bits=16 -compaction_style=2 -fifo_compaction_max_table_files_size_mb=10000 -fifo_compaction_allow_compaction=0 -prefix_size=12

Below each is compared by ops/sec vs. baseline which is version 6.25 (multiple baseline runs because of variable machine load)

v6.26: 4833 vs. 6698 (<- major regression!)
v6.27: 4737 vs. 6397 (still)
New: 6704 vs. 6461 (better than baseline in common case)
Disabled fastpath: 4843 vs. 6389 (e.g. if prefix extractor instance changes but is still compatible)
Changed prefix size (no usable filter) in new: 787 vs. 5927
Changed prefix size (no usable filter) in new & baseline: 773 vs. 784

Summary: A proposed alternate fix to performance regression associated
with SliceTransform::AsString, vs. facebook#9401

Test Plan: performance testing TODO
@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this logic is the same as the original. The original checked to see if the input prefix extractor was the same one as the one used in building the table. This new logic checks to see if the input prefix extractor is the same one as used by the last operation.

What happens if you build a table with one extractor and then do multiple operations using a different extractor? It seems like the results could be different with this new code compared to the original.

@pdillinger
Copy link
Contributor Author

This new logic checks to see if the input prefix extractor is the same one as used by the last operation.

No it doesn't. known_good_prefix_extractor_instance_id is only assigned non-zero in BlockBasedTable::Open, and only if !PrefixExtractorChangedHelper(rep->table_properties.get(), prefix_extractor); in other words, if the current prefix extractor matches that in the table file.

@pdillinger
Copy link
Contributor Author

I considered getting fancy with atomics to optimize more cases, but decided to keep it simple.

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pdillinger
Copy link
Contributor Author

@mrambacher pointed out that we can safely do pointer comparison (rather than instance id stuff) if we plumb through shared_ptr<SliceTransform> to just a few more places, which should be fine and save public API additions.

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@pdillinger has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this change better, though I am worried there are race conditions if the MutableCFOptions.prefix_extractor changes (though these are probably not new).

I am not a fan of storing a reference to the shared_ptr (as opposed to a copy), as I think the reference will cause problems on updates.

Also, can you run and post the same performance test results with this change as the earlier one?

@@ -211,7 +211,8 @@ class ForwardLevelIterator : public InternalIterator {
Status status_;
InternalIterator* file_iter_;
PinnedIteratorsManager* pinned_iters_mgr_;
const SliceTransform* prefix_extractor_;
// Kept alive by ForwardIterator::sv_->mutable_cf_options
const std::shared_ptr<const SliceTransform>& prefix_extractor_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a const std::shared_ptr & or just a shared_ptr? I am thinking that if someone changed the mutable option, the value stored here would also change whereas it would not if it was a copy (and the ownership/keep alive would not appy).

Copy link
Contributor Author

@pdillinger pdillinger Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, SuperVersion::mutable_cf_options, Version::mutable_cf_options_ and Compaction::mutable_cf_options_ are actually immutable. For this change, they are generally referenced in a context where something is holding on the that owning entity. For example, if an iterator doesn't hold on to a SuperVersion, it can't guarantee other data stays around.

The UAF case I ran into above is an iterator refresh. I should probably look for more similar to that.

Copying shared_ptr in too many places would likely negate most of the performance gains.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UAF case I ran into above is an iterator refresh. I should probably look for more similar to that.

I inspected ArenaWrappedDBIter::Refresh and it looks safe. It re-loads everything including prefix_extractor after changing SuperVersion.

file_options.compaction_readahead_size,
allow_unprepared_value);
result = table_reader->NewIterator(
options, prefix_extractor.get(), arena, skip_filters, caller,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a potential bug to me (not necessarily new/related to this) but it seems like the prefix_extractor can go out of scope while this operation is active potentially causing a crash. What happens if an Iterator is open and the prefix_extractor is changed through a mutable option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to come from a SuperVersion, Version, or Compaction

@@ -451,7 +452,7 @@ Status TableCache::Get(const ReadOptions& options,
}
if (s.ok()) {
get_context->SetReplayLog(row_cache_entry); // nullptr if no cache.
s = t->Get(options, k, get_context, prefix_extractor, skip_filters);
s = t->Get(options, k, get_context, prefix_extractor.get(), skip_filters);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the same race as above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Owned & immutable by Version for life of Get

@@ -1011,7 +1012,7 @@ class LevelIterator final : public InternalIterator {
// `prefix_extractor_` may be non-null even for total order seek. Checking
// this variable is not the right way to identify whether prefix iterator
// is used.
const SliceTransform* prefix_extractor_;
const std::shared_ptr<const SliceTransform>& prefix_extractor_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as before. Should this be a ref or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has several other const ref fields. Best I can tell, the lifetime of LevelIterator is bounded within a function call (Version::OverlapWithLevelIterator) or to a Compaction (VersionSet::MakeInputIterator).

@@ -564,7 +564,7 @@ Status VersionEditHandler::LoadTables(ColumnFamilyData* cfd,
cfd->internal_stats(),
version_set_->db_options_->max_file_opening_threads,
prefetch_index_and_filter_in_cache, is_initial_load,
cfd->GetLatestMutableCFOptions()->prefix_extractor.get(),
cfd->GetLatestMutableCFOptions()->prefix_extractor,
Copy link
Contributor Author

@pdillinger pdillinger Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a case of using prefix_extractor from ColumnFamilyData, but this appears to only be called under DB::Open so not actually mutable

@@ -544,7 +544,7 @@ class Repairer {
InternalIterator* iter = table_cache_->NewIterator(
ropts, file_options_, cfd->internal_comparator(), t->meta,
nullptr /* range_del_agg */,
cfd->GetLatestMutableCFOptions()->prefix_extractor.get(),
cfd->GetLatestMutableCFOptions()->prefix_extractor,
Copy link
Contributor Author

@pdillinger pdillinger Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a case of using prefix_extractor from ColumnFamilyData, but under Repairer, it should not be mutable (from SetOptions).

@facebook-github-bot
Copy link
Contributor

@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is potentially dangerous how the prefix extractor is being passed/stored but trust that your analysis is complete and accurate.

@pdillinger
Copy link
Contributor Author

pdillinger commented Jan 21, 2022

I think it is potentially dangerous how the prefix extractor is being passed/stored but trust that your analysis is complete and accurate.

I agree there's non-zero risk, and I don't like having the internal APIs in a precarious state either. There is an up side however: if we are to stand by SetOptions as a feature, these changes make it easier to surface UAF bugs involving SetOptions on prefix_extractor because more of them can show without SetOptions--like the one I fixed here in ForwardIterator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants