Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend Transaction::GetForUpdate with do_validate #4680

Closed

Conversation

maysamyabandeh
Copy link
Contributor

@maysamyabandeh maysamyabandeh commented Nov 14, 2018

Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot (if there is any) before doing the read. After the read it also returns the latest value (expects the ReadOptions::snapshot to be nullptr). This allows RocksDB applications to use GetForUpdate similarly to how InnoDB does. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. It true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false).
The Java APIs are accordingly updated.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

4 similar comments
@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

s = Get(read_options, column_family, key, &pinnable_val);
if (skip_validate && read_options.snapshot != nullptr) {
ReadOptions ro(read_options);
ro.snapshot = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we need to do it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

skip_validate has two functions: skip validating snapshot, which is done by passing it to TryLock, read the latest committed value independent of what the snapshot is. The latter is missing in the inline documentation. If the explanation above is satisfaction, I go ahead with improving the inline docs with that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the caller expects to read the latest, not from the snapshot, should they not set read_options.snapshot? Users passing in a snapshot but we have an argument to skip it is too complicated to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hermanlee @lth would it work if we return error if snapshot and skip_validate both are set?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that would be fine. If skip_validate is set, then snapshot should be null.

s = Get(read_options, column_family, key, pinnable_val);
if (skip_validate && read_options.snapshot != nullptr) {
ReadOptions ro(read_options);
ro.snapshot = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here. Why do we need to do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above


// An overload of the above method that receives a PinnableSlice
// For backward compatibility a default implementation is provided
virtual Status GetForUpdate(const ReadOptions& options,
ColumnFamilyHandle* column_family,
const Slice& key, PinnableSlice* pinnable_val,
bool exclusive = true) {
bool exclusive = true,
bool skip_validate = false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Callaghan had a suggestion which might loosely apply here: avoid argument/option name which contains negative in it, because if it is false, it is double negative and makes it difficulty for users to tune. "Skip" is kind of such thing. skip_validate = false is harder to understand than something like do_validation = false.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

bool exclusive,
const bool do_validate) {
assert(do_validate || read_options.snapshot == nullptr);
if (UNLIKELY(!do_validate && read_options.snapshot != nullptr)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As always, I'm not a fun of UNLIKELY hint. Do we really need it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen cases that it improves performance and here it might too. I do not see it to be hurtful in anyway.

Copy link
Contributor

@siying siying Dec 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hurtful because it tries to dictate compiler behavior unnecessarily. It's just like that we don't encourage people to deep tune RocksDB behavior unless it's needed, I am not a fan of doing it to compiler either.

Furthermore, just have this macro inside the code for no reason makes it less readable. Imagine someone grabs RocksDB source code and read this function, and sees a "UNLIKELY()".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it is a "hint" rather than "dictation". And it is not "unnecessary" as the previous benchmarks has shown. I actually find them to make the code more readable if they are used properly. Here is a proper use case as the if-then branch is not taken unless there is a user bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I even feel that we should take the hit even we see some minor regression. "Previous benchmark" is even weaker argument. It's more readable to you because you wrote so many UNLIKELY() in the code base so that you are so familiar with what it is for when you see it. It's not the case for someone new to the code base. In fact, I'm always reluctant to the use of macros in general. This is also the rule written in the Google C++ Style we are following: https://google.github.io/styleguide/cppguide.html#Preprocessor_Macros

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to remove UNLIKELY here as I do not want this minor point to become a blocker for this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh I appreciate it.

Status s = TryLock(column_family, key, true /* read_only */, exclusive);
bool exclusive,
const bool do_validate) {
assert(do_validate || read_options.snapshot == nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should assert user input. Crashing user program (while running in debug mode) when passing invalid argument doesn't feel like something we should do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is undefined behavior in the API and assert is a way to specify the assumption on the input. We leave the runtime error return just in case there was a corner case that we did not anticipate but during the test we would prefer a highly visible crash over a probably silent error return.
I let @hermanlee , @lth, and @yoshinorim chime in if they prefer it differently.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming assert is debug only, then this is fine. I would prefer it return an error on non-debug builds though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The next line returns the error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't assume that RocksDB users never run production with debug build. It has happened and is very likely to be happening somewhere too. Turning user input validation into application crashing will prevent users to do it.

@maysamyabandeh why is it an undefined behavior? The behavior is deterministic: return error to users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it easier for users to find the corner case by crashing the program, than return them an error message telling them what mistake they have made?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I let MyRocks team as a user of rocksdb chime in of what behavior would make their life easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not about this concrete case or concrete user. It's about the general principle we follow in this code base. It needs to be all consistent in the code base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is prepared for MyRocks primarily. I think it is fair to hear out their experience as the user before making strong opinions about carved-in-stone principles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can hear it as an input for revisiting the model of crashing application while users provide invalid arguments, but I won't change opinion in here, even if they say they love it a lot.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@maysamyabandeh maysamyabandeh changed the title Extend Transaction::GetForUpdate with skip_validate Extend Transaction::GetForUpdate with do_validate Dec 6, 2018
s = Status::InvalidArgument(
"assume_exclusive_tracked is set but it is not tracked yet");
} else if (lock_upgrade) {
s = Status::InvalidArgument(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed offline. Shared locking should be enough. The goal of validation is just to check whether writes have happened, and a shared lock would be able to prevent that.

HISTORY.md Outdated
@@ -3,6 +3,7 @@
### New Features

### Public API Change
* Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot before doing the read. Similar ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. It true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: typo It -> If

Status s = TryLock(column_family, key, true /* read_only */, exclusive);
bool exclusive,
const bool do_validate) {
assert(do_validate || read_options.snapshot == nullptr);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming assert is debug only, then this is fine. I would prefer it return an error on non-debug builds though.

@@ -559,7 +561,18 @@ Status PessimisticTransaction::TryLock(ColumnFamilyHandle* column_family,
// any writes since this transaction's snapshot.
// TODO(agiardullo): could optimize by supporting shared txn locks in the
// future
if (skip_validate || snapshot_ == nullptr) {
if (!do_validate || snapshot_ == nullptr) {
assert(!assume_exclusive_tracked || (previously_locked && !lock_upgrade));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this assert could just be in an additional else clause after "else if (lock_upgrade)" below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the if-then clause. Let me know if the update has addressed your comment. I kept assert before the if-then clause so that we crash during the tests and the corner cases that have violated our assumptions would show up in tests.

bool exclusive, bool skip_validate) {
bool exclusive, const bool do_validate,
const bool assume_exclusive_tracked) {
assert(!assume_exclusive_tracked || (!do_validate && exclusive));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Manuel mentioned acquiring read locks should be able to set do_validate = false if we want to retrieve the value with snapshot set to nullptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The latest PR is updated with manuel's comment. Let me know if the latest is correct.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@siying siying left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic makes sense by itself. I'll defer to @lth @yoshinorim @hermanlee to decide whether it makes sense to MyRocks and accept.

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link

@hermanlee hermanlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes seems fine with me. I'll let Manuel have the final approval.

if (skip_validate || snapshot_ == nullptr) {
if (!do_validate || snapshot_ == nullptr) {
assert(!assume_tracked || previously_locked);
if (assume_tracked) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion. Seems like this could be rewritten as:

if (assume_tracked && !previous_locked) {
s = Status::InvalidArgument();
}

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@siying siying left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

HISTORY.md Outdated
@@ -3,6 +3,7 @@
### New Features

### Public API Change
* Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot before doing the read. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_tracked with default value of false. If true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more accurate to say that assume_tracked means the call is assumed to be called after any GetForUpdate call right?

Status s = TryLock(column_family, key, true /* read_only */, exclusive);
bool exclusive,
const bool do_validate) {
if (UNLIKELY(!do_validate && read_options.snapshot != nullptr)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have UNLIKELY only in one place?

const Slice& key, const Slice& value) {
const Slice& key, const Slice& value,
const bool assume_exclusive_tracked) {
const bool do_validate = !assume_exclusive_tracked;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename to just assume_tracked, if we agree that shared locks also imply that we can skip validation?

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

@facebook-github-bot
Copy link
Contributor

@maysamyabandeh has updated the pull request. Re-import the pull request

Copy link
Contributor

@lth lth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maysamyabandeh is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -287,6 +293,9 @@ class Transaction {
// functions in WriteBatch, but will also do conflict checking on the
// keys being written.
//
// assume_tracked=false expects the key be already tracked. If valid then it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant assume_tracked=true here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. will fix it.

facebook-github-bot pushed a commit that referenced this pull request Dec 10, 2018
Summary:
Fix the definition of assumed_tracked in Transaction that was introduced in #4680
Pull Request resolved: #4762

Differential Revision: D13399150

Pulled By: maysamyabandeh

fbshipit-source-id: 2a30fe49e3c44adacd7e45cd48eae95023ca9dca
@@ -834,11 +909,23 @@ public void delete(final byte[] key) throws RocksDBException {
* @throws RocksDBException when one of the TransactionalDB conditions
* described above occurs, or in the case of an unexpected error
*/
public void delete(final ColumnFamilyHandle columnFamilyHandle, final byte[][] keyParts,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could you please rename assume_tracked to assumeTracked so it follows the Java naming conventions.

  2. Could you please add the javadoc for the additional parameter assumeTracked.

  3. This occurs in several places, could you correct each please?

@@ -433,6 +433,33 @@ public void rollbackToSavePoint() throws RocksDBException {
* @param key the key to retrieve the value for.
* @param exclusive true if the transaction should have exclusive access to
* the key, otherwise false for shared access.
* @param do_validate true if it should validate the snapshot before doing the read
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could you please rename do_validate to doValidate so it follows the Java naming conventions.

  2. Could you please add the javadoc for the additional parameter doValidate on the other methods which are missing it?

  3. This occurs in several places, could you correct each please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants