Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Iterator::Refresh() with a snapshot #10487

Closed
rockeet opened this issue Aug 5, 2022 · 4 comments
Closed

Feature Request: Iterator::Refresh() with a snapshot #10487

rockeet opened this issue Aug 5, 2022 · 4 comments

Comments

@rockeet
Copy link
Contributor

rockeet commented Aug 5, 2022

Current Iterator::Refresh() does not support snapshot, we have no way to refresh an iterator to a specified snapshot, instead we create a new iterator, but creating new iterator is heavy.

Refresh iterator to a specified snapshot, we can avoid creating new iterator thus improving performance.

@cbi42
Copy link
Member

cbi42 commented Aug 5, 2022

I see the current Refresh() is actually pretty heavyweight and takes similar amount of work as creating a new iterator when superversion number is changed:

if (sv_number_ != cur_sv_number) {
Env* env = db_iter_->env();
db_iter_->~DBIter();
arena_.~Arena();
new (&arena_) Arena();
SuperVersion* sv = cfd_->GetReferencedSuperVersion(db_impl_);
SequenceNumber latest_seq = db_impl_->GetLatestSequenceNumber();
if (read_callback_) {
read_callback_->Refresh(latest_seq);
}
Init(env, read_options_, *(cfd_->ioptions()), sv->mutable_cf_options,
sv->current, latest_seq,
sv->mutable_cf_options.max_sequential_skip_in_iterations,
cur_sv_number, read_callback_, db_impl_, cfd_, expose_blob_index_,
allow_refresh_);
InternalIterator* internal_iter = db_impl_->NewInternalIterator(
read_options_, cfd_, sv, &arena_, db_iter_->GetRangeDelAggregator(),
latest_seq, /* allow_unprepared_value */ true);
SetIterUnderDBIter(internal_iter);

So supporting iterator refresh to a new snapshot might only get performance benefit when superversion number is not changed. Is that the performance improvement you expected? Or you expect some improvements could be done when superversion changes too?

@rockeet
Copy link
Contributor Author

rockeet commented Aug 6, 2022

I seeked for a feature to avoid creating new iterator when I need to update an iterator to a specified snapshot, then I found Refresh, I expect Refresh is much lighter than creating new iterator.

As an online service, super version change should be much less frequent than snapshot change, so Refresh with snapshot is much lighter in most cases. If super version is really changed, with carefully coding, the Refresh computation should be still lighter than creating a new iterator.

@cbi42 cbi42 added the up-for-grabs Up for grabs label Aug 9, 2022
@cbi42
Copy link
Member

cbi42 commented Aug 11, 2022

Hi @rockeet, I agree this is a reasonable feature to add, and optimizing the part where superversion is not changed should be easier to implement. It's not planned yet, so I'll leave it as up-for-grabs for now.

@rockeet
Copy link
Contributor Author

rockeet commented Aug 11, 2022

Hi @rockeet, I agree this is a reasonable feature to add, and optimizing the part where superversion is not changed should be easier to implement. It's not planned yet, so I'll leave it as up-for-grabs for now.

Very Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants