Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: non-blocking write-read conflicts for weak isolation txns #103267

Merged

Conversation

nvanbenschoten
Copy link
Member

Fixes #102014.

This commit updates write-read conflict rules to allow non-blocking behavior for weak isolation transactions. Specifically, PUSH_TIMESTAMP requests are now allowed to succeed when encountering PENDING pushees if any of the following conditions is true:

  • the pushee is a weak isolation transaction
  • OR the pusher is a weak isolation transaction with an equal priority as the pushee
  • OR the pusher has a greater priority than the pushee (previous behavior)

The rationale for this behavior is that weak isolation transactions can tolerate write skew without a refresh or retry. Because of this, they face no consequence from being pushed and also expect others to be pushable (non-blocking).

Longer-term, we would like all write-read conflicts to become non-blocking. For now, these rules ensure that all write-read conflicts between weak isolation transactions are non-blocking, and also that write-read conflicts between a weak isolation transaction and a strong isolation (serializable) transaction has reasonable, tunable behavior.

Release note: None

@nvanbenschoten nvanbenschoten requested a review from a team as a code owner May 14, 2023 22:15
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/pushSnapshotTxn2 branch from 9cabf3a to db455f7 Compare May 18, 2023 22:38
Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 10 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

	}
	var pushType kvpb.PushTxnType
	if s.guardStrength == lock.None {

nit: consider using a switch statement on the guard's strength to assign pushType. That way, we can panic if any of the strengths are unexpected (for now, we only expect Intent and None, but soon we'll expand this to include Exclusive and Shared as well).


pkg/kv/kvserver/txnwait/queue.go line 107 at r1 (raw file):

	case kvpb.PUSH_TIMESTAMP:
		// If the pushee transaction tolerates write skew, the PUSH_TIMESTAMP is
		// harmless, so let it through. If the pushee transaction does not tolerate

nit: consider breaking this comment and placing the relevant bits closer to where the condition is being checked.

I saw this over at IsolatedAtLaterTimestamps and have since been a fan. It's got your blame on it, so maybe you like it too? Feel free to disregard if you no longer agree.


pkg/kv/kvclient/kvcoord/txn_test.go line 369 at r1 (raw file):

// TestTxnWriteReadConflict verifies that write-read conflicts are non-blocking
// to the reader, except when both the writer and reader are both serializable
// transactions. In that case, the reader will block until the writer completes.

Should we add a TODO here? :trollface:


pkg/kv/kvserver/txnwait/queue_test.go line 188 at r1 (raw file):

	}{
		// PUSH_ABORT
		{kvpb.PUSH_ABORT, SSI, SSI, min, min, false},

How do you feel about pulling the PUSH_ABORT and PUSH_TOUCH cases into a (or 2) different test, and running them with all isolation level combinations? That allows us to test they're agnostic to isolation levels, and we can test the more interesting PUSH_TIMESTAMP cases here.

Fixes cockroachdb#102014.

This commit updates write-read conflict rules to allow non-blocking
behavior for weak isolation transactions. Specifically, PUSH_TIMESTAMP
requests are now allowed to succeed when encountering PENDING pushees
if any of the following conditions is true:
- the pushee is a weak isolation transaction
- OR the pusher is a weak isolation transaction with an equal priority
  as the pushee
- OR the pusher has a greater priority than the pushee (previous behavior)

The rationale for this behavior is that weak isolation transactions can
tolerate write skew without a refresh or retry. Because of this, they
face no consequence from being pushed and also expect others to be
pushable (non-blocking).

Longer-term, we would like all write-read conflicts to become non-blocking. For
now, these rules ensure that all write-read conflicts between weak isolation
transactions are non-blocking, and also that write-read conflicts between a weak
isolation transaction and a strong isolation (serializable) transaction has
reasonable, tunable behavior.

Release note: None
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/pushSnapshotTxn2 branch from db455f7 to 5a53d25 Compare May 19, 2023 19:27
Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: consider using a switch statement on the guard's strength to assign pushType. That way, we can panic if any of the strengths are unexpected (for now, we only expect Intent and None, but soon we'll expand this to include Exclusive and Shared as well).

While I'm generally a strong proponent of switch statements, I think the if-else structure here is more suggestive of the meaning behind this logic. If the caller is not locking (lock.None), then it wants to use a PUSH_TIMESTAMP. Otherwise, if the caller is locking, regardless of what the specific locking strength is, it wants to use a PUSH_ABORT.

I'm actually going to switch the other instance of this over as well.


pkg/kv/kvserver/txnwait/queue.go line 107 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

nit: consider breaking this comment and placing the relevant bits closer to where the condition is being checked.

I saw this over at IsolatedAtLaterTimestamps and have since been a fan. It's got your blame on it, so maybe you like it too? Feel free to disregard if you no longer agree.

Done.


pkg/kv/kvclient/kvcoord/txn_test.go line 369 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

Should we add a TODO here? :trollface:

Heh, only if you want your name attached to it.


pkg/kv/kvserver/txnwait/queue_test.go line 188 at r1 (raw file):

Previously, arulajmani (Arul Ajmani) wrote…

How do you feel about pulling the PUSH_ABORT and PUSH_TOUCH cases into a (or 2) different test, and running them with all isolation level combinations? That allows us to test they're agnostic to isolation levels, and we can test the more interesting PUSH_TIMESTAMP cases here.

That's a great idea. Done.

Copy link
Collaborator

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/kv/kvserver/concurrency/lock_table_waiter.go line 1252 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

While I'm generally a strong proponent of switch statements, I think the if-else structure here is more suggestive of the meaning behind this logic. If the caller is not locking (lock.None), then it wants to use a PUSH_TIMESTAMP. Otherwise, if the caller is locking, regardless of what the specific locking strength is, it wants to use a PUSH_ABORT.

I'm actually going to switch the other instance of this over as well.

That's a fair point, sgtm.

@nvanbenschoten
Copy link
Member Author

TFTR!

bors r=arulajmani

@craig
Copy link
Contributor

craig bot commented May 19, 2023

Build failed:

@nvanbenschoten
Copy link
Member Author

Bors' CI has not failed, so it's confused. I now canceled the CI runs. Let's try again.

bors r=arulajmani

@craig
Copy link
Contributor

craig bot commented May 19, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv: read committed transactions should not wait on PENDING intents
3 participants