Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bug that causes iterator to return wrong result in a rare data race #6973

Closed
wants to merge 3 commits into from

Conversation

siying
Copy link
Contributor

@siying siying commented Jun 12, 2020

Summary:
The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong.
Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version.

Test Plan: Will run stress tests for a while to make sure there is no general regression.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @siying for the fix.

db/db_test2.cc Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

// to this snapshot, but in that case it can see all the data in the
// super version, which is a valid consistent state after the user
// calls NewIterator().
snapshot = versions_->LastSequence();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a couple SyncPoints at the end of the LastSequence() function to enforce the flush happens immediately after sequence number is obtained. RIght now the sync points are in the right places but it's not obvious to maintainers how they should preserve the order in the future (esp. since their names are just 1, 2, 3, 4).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a try and found that it would make the code much more complicated. The reason is that write also calls versions_->LastSequence(). The sync-point would be complicated if we place inside. I can move the sync point to here anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SyncPoint marker could be useful as the write/flush happen in a different thread. Up to you.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

// to this snapshot, but in that case it can see all the data in the
// super version, which is a valid consistent state after the user
// calls NewIterator().
snapshot = versions_->LastSequence();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SyncPoint marker could be useful as the write/flush happen in a different thread. Up to you.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

@siying
Copy link
Contributor Author

siying commented Jun 17, 2020

Although the test failures are worth investigating, I don't think it can be caused by this PR. Will land it.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

…case.

Summary:
The bug fixed in facebook#1816 is applicable to iterator too. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong.
Fix it in the same way as the fix in facebook#1816, that is to get the sequence number after referencing the super version.

Test Plan: Will run stress tests for a while to make sure there is no general regression.
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

jbowens added a commit to jbowens/rocksdb that referenced this pull request Jun 18, 2020
jbowens added a commit to cockroachdb/rocksdb that referenced this pull request Jun 18, 2020
riversand963 pushed a commit that referenced this pull request Jun 18, 2020
…race (#6973)

Summary:
The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong.
Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version.
Pull Request resolved: #6973

Test Plan: Will run stress tests for a while to make sure there is no general regression.

Reviewed By: ajkr

Differential Revision: D22029348

fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
jbowens added a commit to cockroachdb/rocksdb that referenced this pull request Jun 18, 2020
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in d6b7b77.

jbowens added a commit to cockroachdb/rocksdb that referenced this pull request Jun 18, 2020
jbowens added a commit to jbowens/cockroach that referenced this pull request Jun 18, 2020
069b29cae backport of facebook/rocksdb#6973
f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (cockroachdb#5304)

Release note: None
jbowens added a commit to jbowens/cockroach that referenced this pull request Jun 18, 2020
069b29cae backport of facebook/rocksdb#6973
f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (cockroachdb#5304)

Release note (bug fix): Fixed a RocksDB that could result in
inconsistencies in rare circumstances.
craig bot pushed a commit to cockroachdb/cockroach that referenced this pull request Jun 18, 2020
50394: vendor: bump Pebble to bc7bb07c r=jbowens a=jbowens

```
bc7bb07 tool: Enhance lsm visualization to show L0 sublevels
6b96ab6 internal/manifest: rename L0Sublevels.Files to L0Sublevels.Levels
0b22ec2 internal/manifest: rename Version.Files to Version.Levels
3b241b7 db: implement min-overlapping ratio heuristic
7ad569c use lockfree rand for skiplist
eb0a851 db: fix race in NewSnapshot
e941125 db: fix Checkpoint race condition
81f4ebe cmd/pebble: replace keys/sec with ops/sec(cum)
```

50395: c-deps/rocksdb: update to 2512e96f r=jbowens a=jbowens

```
069b29cae backport of facebook/rocksdb#6973
f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization (#5304)
```

Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
riversand963 pushed a commit that referenced this pull request Jun 19, 2020
…race (#6973)

Summary:
The bug fixed in #1816 is now applicable to iterator too. This was not an issue but #2886 caused the regression. If a put and DB flush happens just between iterator to get latest sequence number and getting super version, empty result for the key or an older value can be returned, which is wrong.
Fix it in the same way as the fix in #1816, that is to get the sequence number after referencing the super version.
Pull Request resolved: #6973

Test Plan: Will run stress tests for a while to make sure there is no general regression.

Reviewed By: ajkr

Differential Revision: D22029348

fbshipit-source-id: 94390f93630906796d6e2fec321f44a920953fd1
jbowens added a commit to jbowens/cockroach that referenced this pull request Jun 22, 2020
069b29cae backport of facebook/rocksdb#6973
f928a7207 RocksDB Cmake changes for Arm64 CRC32 Optimization

Release note (bug fix): Fixed a RocksDB that could result in
inconsistencies in rare circumstances.
jbowens added a commit to jbowens/rocksdb that referenced this pull request Jun 22, 2020
jbowens added a commit to cockroachdb/rocksdb that referenced this pull request Jun 22, 2020
jbowens added a commit to jbowens/cockroach that referenced this pull request Jun 22, 2020
364ab2b1 backport of facebook/rocksdb#6973

Release note (bug fix): Fixed a RocksDB that could result in
inconsistencies in rare circumstances.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants