New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: have sequenced reads use the intent history #33244

Open
wants to merge 4 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@ridwanmsharif
Copy link
Collaborator

ridwanmsharif commented Dec 18, 2018

Reads can now use the intent history to serve reads from
values written at a lower sequence. Reads no longer read
all writes in a transaction but instead read all writes
in the transaction with a higher sequence than it. If a key
is written to multiple times within a transaction, the read
can now get the appropriate value using the intent history.

Release note: None

@ridwanmsharif ridwanmsharif requested a review from cockroachdb/core-prs as a code owner Dec 18, 2018

@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Dec 18, 2018

This change is Reviewable

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif:sequenced-reads branch 5 times, most recently from f782b0c to c185bf0 Dec 18, 2018

@ridwanmsharif ridwanmsharif requested a review from nvanbenschoten Dec 19, 2018

@nvanbenschoten
Copy link
Member

nvanbenschoten left a comment

Reviewed 3 of 5 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 169 at r1 (raw file):

  }

  bool getFromIntentHistory(cockroach::storage::engine::enginepb::MVCCMetadata meta, int32_t seq) {

It's strange that we pass two fields into this method but then set another field - so it's not a fully static method. I'd recommend passing no fields in here.

@petermattis might also have other comments about the C++ style.


c-deps/libroach/mvcc.h, line 170 at r1 (raw file):

  bool getFromIntentHistory(cockroach::storage::engine::enginepb::MVCCMetadata meta, int32_t seq) {
    int32_t largestSeq = -1;

We should use a binary_search to accomplish this lookup.


c-deps/libroach/mvcc.h, line 290 at r1 (raw file):

    if (txn_epoch_ == meta_.txn().epoch()) {
      if (txn_sequence_ >= meta_.txn().sequence()) {

We're going to need to make this logic conditional on a cluster version. To do this, we'll need to go through a few steps, outlined in cockroach_versions.go.

We'll then want to check whether this version is "active" somewhere in or above the mvcc layer. This is going to be a little tricky because we'll need the right dependencies in a few places. Here's my suggestion:

  1. Add a new IgnoreSeq field to MVCCGetOptions and MVCCScanOptions
  2. Comment with a TODO(nvanbenschoten) that the field can go away in 2.3.
  3. Set this field when the new cluster version is NOT active in cmd_get.go, cmd_scan.go, and cmd_reverse_scan.go (I just counted and I think we'll need to set this in 5 places, but please double check). Here's an example of how to check whether a cluster setting is active.
  4. Pass this option down through C++
  5. Make this condition (ignore_seq_ || (txn_sequence_ >= meta_.txn().sequence()))

c-deps/libroach/mvcc.h, line 291 at r1 (raw file):

    if (txn_epoch_ == meta_.txn().epoch()) {
      if (txn_sequence_ >= meta_.txn().sequence()) {
        // 8. We're reading our own txn's intent at a higher sequence.

"an equal or higher sequence"


c-deps/libroach/mvcc.h, line 299 at r1 (raw file):

        // 9. We're reading our own txn's intent at a lower sequence.
        // This means the intent we're seeing was written after the read.
        // So we try and read a value from the intent history that is

s/So//


c-deps/libroach/mvcc.h, line 300 at r1 (raw file):

        // This means the intent we're seeing was written after the read.
        // So we try and read a value from the intent history that is
        // appropriate instead. If no value is found still, read the value

s/still//


c-deps/libroach/mvcc.h, line 303 at r1 (raw file):

        // as though the intent was not present.
        bool found = getFromIntentHistory(meta_, txn_sequence_);
        if (!found) {

nit: invert this condition so that we can separate this into a separate "steps" (e.g. step 10.)


pkg/kv/txn_interceptor_sequence_nums.go, line 69 at r1 (raw file):

// populateMetaLocked is part of the txnInterceptor interface. Since reads now
// require the sequence number, we must pass it through as well.

No need for this comment.


pkg/kv/txn_interceptor_sequence_nums.go, line 77 at r1 (raw file):

// augmentMetaLocked is part of the txnInterceptor interface. Since reads now
// require the sequence number, we must ensure we don't ever underestimate
// the sequence.

No need for this comment.

@ridwanmsharif
Copy link
Collaborator

ridwanmsharif left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 169 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's strange that we pass two fields into this method but then set another field - so it's not a fully static method. I'd recommend passing no fields in here.

@petermattis might also have other comments about the C++ style.

Done.


c-deps/libroach/mvcc.h, line 170 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should use a binary_search to accomplish this lookup.

Done. Using std:lower_bound that does almost exactly what we want (internally uses binary search).


c-deps/libroach/mvcc.h, line 290 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We're going to need to make this logic conditional on a cluster version. To do this, we'll need to go through a few steps, outlined in cockroach_versions.go.

We'll then want to check whether this version is "active" somewhere in or above the mvcc layer. This is going to be a little tricky because we'll need the right dependencies in a few places. Here's my suggestion:

  1. Add a new IgnoreSeq field to MVCCGetOptions and MVCCScanOptions
  2. Comment with a TODO(nvanbenschoten) that the field can go away in 2.3.
  3. Set this field when the new cluster version is NOT active in cmd_get.go, cmd_scan.go, and cmd_reverse_scan.go (I just counted and I think we'll need to set this in 5 places, but please double check). Here's an example of how to check whether a cluster setting is active.
  4. Pass this option down through C++
  5. Make this condition (ignore_seq_ || (txn_sequence_ >= meta_.txn().sequence()))

Adding a Commit to the PR that'll add a cluster setting.


c-deps/libroach/mvcc.h, line 291 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"an equal or higher sequence"

Done.


c-deps/libroach/mvcc.h, line 299 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/So//

Done.


c-deps/libroach/mvcc.h, line 300 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/still//

Done.


pkg/kv/txn_interceptor_sequence_nums.go, line 69 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for this comment.

Done.


pkg/kv/txn_interceptor_sequence_nums.go, line 77 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for this comment.

Done.

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif:sequenced-reads branch from a96a543 to 776e08f Dec 20, 2018

@ridwanmsharif ridwanmsharif requested a review from nvanbenschoten Dec 20, 2018

@nvanbenschoten
Copy link
Member

nvanbenschoten left a comment

Reviewed 3 of 3 files at r2, 11 of 11 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 29 at r2 (raw file):

// compareIntents compares two sequenced intents to check if the first one
// has a sequence equal or lower than the second.
bool compareIntents(cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent1,

make the args const &.


c-deps/libroach/mvcc.h, line 31 at r2 (raw file):

bool compareIntents(cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent1,
    cockroach::storage::engine::enginepb::MVCCMetadata_SequencedIntent intent2) {
  if (intent1.sequence() <= intent2.sequence()) {

This can be return intent1.sequence() <= intent2.sequence();, right?

I think it should also be strictly less, based on https://en.cppreference.com/w/cpp/algorithm/lower_bound.


c-deps/libroach/mvcc.h, line 195 at r2 (raw file):

      return false;
    }
    const auto intent = *(low - 1);

std::lower_bound returns " the first element in the range [first, last) that is not less than value". So does this logic work if we query at the same sequence as an intent? I think we might want std::upper_bound.


c-deps/libroach/mvcc.h, line 312 at r2 (raw file):

        // This means the intent we're seeing was written after the read.
        // If there exists a value in the intent history that has a sequence
        // number lower than the read sequence, read that value.

equal to or lower than!


c-deps/libroach/mvcc.h, line 317 at r2 (raw file):

          return true;
        }
        // 10. If no value in the intent history has a sequence number lower

ditto

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif:sequenced-reads branch from 81f41f3 to 1bac244 Dec 21, 2018

@ridwanmsharif
Copy link
Collaborator

ridwanmsharif left a comment

TFTR! Your last review was too chill @nvanbenschoten. You gave me some hope that this gets through before I leave cause there was nothing to iterate on. Give it another 👀 to see if you find anything else, pretty excited now!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


c-deps/libroach/mvcc.h, line 29 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

make the args const &.

Done.


c-deps/libroach/mvcc.h, line 31 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This can be return intent1.sequence() <= intent2.sequence();, right?

I think it should also be strictly less, based on https://en.cppreference.com/w/cpp/algorithm/lower_bound.

Well that isn't really necessary because making it <= would allow for intents written at the same seq to be read as well. But I changed it anyway as I'm now using upper_bound.


c-deps/libroach/mvcc.h, line 195 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

std::lower_bound returns " the first element in the range [first, last) that is not less than value". So does this logic work if we query at the same sequence as an intent? I think we might want std::upper_bound.

Actually lower_bound used to work correctly even in that case as the compare predicate function would consider <= instead of just <. Since, upper_bound achieves the same thing, and is the convention we should stick with it. Done.


c-deps/libroach/mvcc.h, line 312 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

equal to or lower than!

Done.


c-deps/libroach/mvcc.h, line 317 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

ditto

Done.

@ridwanmsharif ridwanmsharif requested a review from nvanbenschoten Dec 21, 2018

@petermattis
Copy link
Contributor

petermattis left a comment

@ridwanmsharif Have you run any of the MVCC benchmarks (in storage/engine) to see the impact of this change? This is performance sensitive code and extra care needs to be taken. I haven't had a chance to look at this change yet, so perhaps there is nothing of concern.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@petermattis
Copy link
Contributor

petermattis left a comment

PS The relevant benchmarks for this change are the various MVCCScan benchmarks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@ridwanmsharif ridwanmsharif force-pushed the ridwanmsharif:sequenced-reads branch from 1bac244 to e7d2599 Dec 21, 2018

@ridwanmsharif ridwanmsharif requested a review from cockroachdb/sql-rest-prs as a code owner Dec 21, 2018

@ridwanmsharif
Copy link
Collaborator

ridwanmsharif left a comment

Ran a few benchmarks and there was virtually no hit in the benchmarks you mention. I believe that makes sense as the extra logic for reading the right value from history would apply on when there are a few writes on the same key by the same transaction. We've not been writing history for a while and the history is expected to be pretty small as well. I defer judgement to @nvanbenschoten now. Doesn't seem like I will be able to iterate on this anymore. I trust you will review this and push this through. I'll check back on this from my flight. Great working with you all. Happy Halloween! And Happy Holidays!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

storage: have sequenced reads use the intent history
Reads can now use the intent history to serve reads from
values written at a lower sequence. Reads no longer read
all writes in a transaction but instead read all writes
in the transaction with a higher sequence than it. If a key
is written to multiple times within a transaction, the read
can now get the appropriate value using the intent history.

Release note: None
@nvanbenschoten

This comment has been minimized.

Copy link
Member

nvanbenschoten commented Jan 11, 2019

@ridwanmsharif sorry for letting this sit. I rebased this on master, fixed a small bug, and cleaned it up a bit. We're close to being ready to merge! The only issue is that I don't have permission to push to the pull request. Do you mind following these steps: https://help.github.com/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork/?

ridwanmsharif and others added some commits Dec 20, 2018

storage: wrap sequential reads around a cluster version
We need a cluster version for this change. We make sure
to set this newly created `IgnoreSequence` to true each time we see a
cluster version is too low.

Release note: None
storage: fix bug and cleanup around sequenced reads using intent history
This also fixes one bug, where MVCCScan would get stuck reading the
same key and never advancing. The added test cases caught this.

Release note: None
mvcc: make MVCCDeleteRange idempotent
To do this, it needed to read at the previous sequence number before
writing at the current sequece number.

Release note: None
@ridwanmsharif

This comment has been minimized.

Copy link
Collaborator

ridwanmsharif commented Jan 11, 2019

That's great to hear! I do see that this PR already checks the "Allow edits from maintainers" option. I don't quite understand why you can't push to it. Also sent you an invitation to be a collaborator on the fork, maybe that fixes this? @nvanbenschoten

@nvanbenschoten nvanbenschoten force-pushed the ridwanmsharif:sequenced-reads branch from e7d2599 to 49b37d8 Jan 11, 2019

@nvanbenschoten

This comment has been minimized.

Copy link
Member

nvanbenschoten commented Jan 11, 2019

I think the issue was actually that I was pushing to the wrong branch name. Pushing to the correct one fixed it! I left the first two commits alone (other than a rebase) and added two more. @ridwanmsharif you might be interested in looking them over. Otherwise, @petermattis, mind reviewing those two changes when you get a chance?

I'm gone for the next two weeks, so @petermattis feel free to merge this if everything checks out on your end. I'm also happy to merge when I return in case there's any fallout.

The third commit addressed the rest of my feedback.

@ridwanmsharif
Copy link
Collaborator

ridwanmsharif left a comment

Changes LGTM. Thanks for fixing the bug too! I'll defer to @petermattis for the final ✔️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment