sql,kv: add preliminary SQL savepoints support #45566

Merged

Conversation

andreimatei
Contributor

@andreimatei andreimatei commented Mar 2, 2020

Informs (but does not resolve) #10735.

This patch adds support for SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>,
ROLLBACK TO SAVEPOINT <foo>.
Before this patch, we only had support for the special savepoint
cockroach_restart, which had to be placed at the beginning of the
transaction and was specifically intended for dealing with transaction
retries. This patch implements general support for savepoints, which
provide an error recovery mechanism.

The connExecutor now maintains a stack of savepoints. Rolling back to a
savepoint uses the recent KV api for ignoring a range of write sequence
numbers.
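
To make the mechanism concrete, here is a minimal, self-contained Go
sketch (toy types, not the actual pkg/kv implementation): a savepoint
captures the current write sequence number, and a rollback masks every
write issued after it by recording an ignored seqnum range.

package main

import "fmt"

// savepoint captures the state needed to roll the transaction back to
// the point where it was created. Illustrative only.
type savepoint struct {
	seqNum int32 // write sequence number at creation time
}

// txn is a toy stand-in for a KV transaction.
type txn struct {
	writeSeq int32
	ignored  [][2]int32 // ranges of write seqnums that reads skip over
}

func (t *txn) createSavepoint() savepoint {
	return savepoint{seqNum: t.writeSeq}
}

func (t *txn) rollbackToSavepoint(s savepoint) {
	// Writes after the savepoint are not physically removed; they are
	// hidden by marking their sequence numbers as ignored.
	if t.writeSeq > s.seqNum {
		t.ignored = append(t.ignored, [2]int32{s.seqNum + 1, t.writeSeq})
	}
}

func main() {
	t := &txn{writeSeq: 3} // three writes so far
	sp := t.createSavepoint()
	t.writeSeq = 7 // four more writes after the savepoint
	t.rollbackToSavepoint(sp)
	fmt.Println(t.ignored) // [[4 7]]
}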

At the SQL level, savepoints differ in two characteristics:

  1. savepoints placed at the beginning of a transaction (i.e. before any
    KV operations) are marked as "initial". Rolling back to an initial
    savepoint is always possible. Rolling back to a non-initial savepoint is
    not possible after the transaction restarts (see below).
  2. the savepoint named "cockroach_restart" retains special RELEASE
    semantics: releasing it (attempts to) commit the underlying KV txn.
    This continues to allow for the discovery of deferred serializability
    errors (i.e. write timestamp pushes by the ts cache). As before, this
    RELEASE can fail with a retriable error, at which point the client can
    do ROLLBACK TO SAVEPOINT cockroach_restart (which is guaranteed to work
    because cockroach_restart needs to be an "initial" savepoint). The
    transaction continues to maintain all its locks after such an error.
    This is all in contrast to releasing any other savepoints, which cannot
    commit the txn and also never fails. See below for more discussion.
    The cockroach_restart savepoint is only special in its release behavior,
    not in its rollback behavior.
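
To illustrate the special RELEASE semantics end to end, here is a hedged
sketch of the classic client-side retry loop in Go's database/sql with
lib/pq (the DSN, the accounts table, and the error-classification helper
are assumptions for illustration, not part of this patch):

package main

import (
	"database/sql"
	"errors"
	"log"

	"github.com/lib/pq"
)

// isRetriable reports whether err is a serialization failure (SQLSTATE
// 40001), the class of errors that ROLLBACK TO SAVEPOINT cockroach_restart
// recovers from.
func isRetriable(err error) bool {
	var pqErr *pq.Error
	return errors.As(err, &pqErr) && pqErr.Code == "40001"
}

func main() {
	// The connection string is an assumption for illustration.
	db, err := sql.Open("postgres",
		"postgresql://root@localhost:26257/defaultdb?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	// cockroach_restart must be "initial": placed before any KV operations.
	if _, err := tx.Exec("SAVEPOINT cockroach_restart"); err != nil {
		log.Fatal(err)
	}
	for {
		_, err := tx.Exec("UPDATE accounts SET balance = balance - 1 WHERE id = 1")
		if err == nil {
			// RELEASE (attempts to) commit the underlying KV txn; this is
			// where deferred serializability errors surface.
			if _, err = tx.Exec("RELEASE SAVEPOINT cockroach_restart"); err == nil {
				break
			}
		}
		if !isRetriable(err) {
			log.Fatal(err)
		}
		// Guaranteed to work for an initial savepoint; locks are kept.
		if _, err := tx.Exec("ROLLBACK TO SAVEPOINT cockroach_restart"); err != nil {
			log.Fatal(err)
		}
	}
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}

Because the rollback keeps the transaction's locks, each retry of the loop
body is likely to make progress instead of losing its position to
contending transactions.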

With the implementation of savepoints, the state machine driving a SQL
connection's transactions becomes a lot simpler. There's no longer a
distinction between an "Aborted" transaction and one that's in
"RestartWait". Rolling back to a savepoint now works the same way across
the two states, so RestartWait is gone.

This patch also improves the KV savepoints. They now capture and restore
the state of the read spans and the in-flight writes.
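
Concretely, the hook names below match the diff under review (see the
pkg/kv/txn_coord_sender.go comments further down), though this standalone
snippet paraphrases the surrounding types rather than reproducing them:

package kv

import "context"

// savepoint accumulates the state contributed by each interceptor: the
// write seqnum from the seqnum allocator, the refresh spans from the span
// refresher, and the in-flight write set from the pipeliner. Fields are
// elided in this sketch.
type savepoint struct{}

// createSavepoint populates a savepoint with all the state that needs to
// be restored on a rollback; rollbackToSavepoint restores that state (and
// may update the savepoint itself to optimize future rollbacks to it).
type txnReqInterceptor interface {
	createSavepoint(context.Context, *savepoint)
	rollbackToSavepoint(context.Context, *savepoint)
}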

Some things don't work (yet):
a) Rolling back to a savepoint created after a schema change will error
out. This is because we don't yet snapshot the transaction's schema
change state.
b) After a retriable error, you can only rollback to an initial
savepoint. Attempting to rollback to a non-initial savepoint generates a
retriable error again. If the transaction has been aborted, I think this
is the right behavior; no recovery is possible since the transaction has
lost its write intents. In the usual case where the transaction has not
been aborted, I think we want something better but it will take more
work to get it. I think the behavior we want is the following:

  • after a serializability failure, retrying just part of the transaction
    should be doable by attempting a ROLLBACK TO SAVEPOINT. This rollback
    should succeed if all the non-rolled-back reads can be refreshed to the
    desired timestamp. If they can be refreshed, then the client can simply
    retry the rolled back part of the transaction. If they can't, then the
    ROLLBACK should return a retriable error again, allowing the client to
    attempt a deeper rollback - and so on until the client rolls back to an
    initial savepoint (which succeeds by definition).
    Implementing this would allow for the following nifty pattern:
func fn_n() {
  for {
    SAVEPOINT savepoint_n
    try {
      fn_n+1()
    } catch retriable error {
      err := ROLLBACK TO SAVEPOINT savepoint_n
      if err != nil {
        throw err
      }
      continue
    }
    RELEASE SAVEPOINT savepoint_n
    break
  }
}

The idea here is that the client is trying to re-do as little work as
possible by successively rolling back to earlier and earlier savepoints.
This pattern will technically work with the current patch already,
except it will not actually help the client in any way since all the
rollbacks will fail until we get to the very first savepoint.
There's an argument to be made for making RELEASE SAVEPOINT check for
deferred serializability violations (and perhaps other deferred checks -
like deferred constraint validation), although Postgres doesn't do any
of these.
Anyway, I've left implementing this for a future patch because I want to
do some KV work for supporting it nicely. Currently, the automatic
restart behavior that KV transactions have is a pain in the ass since it
works against what we're trying to do.

For the time-being, non-initial savepoints remember their txn ID and
epoch and attempting to rollback to them after these changes produces a
retriable error automatically.
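
A compilable sketch of that validity check (illustrative types and names;
the real logic lives in the TxnCoordSender rollback path reviewed below):

package main

import (
	"errors"
	"fmt"
)

// savepoint is illustrative; a real one stores a transaction UUID.
type savepoint struct {
	initial bool
	txnID   string
	epoch   int
}

var errRetriable = errors.New("restart transaction: cannot rollback")

// Rolling back to a non-initial savepoint is only possible while the txn
// ID and epoch still match the values captured at creation; otherwise the
// writes the savepoint covers are gone and a retriable error is returned.
func checkRollbackValid(sp savepoint, curTxnID string, curEpoch int) error {
	if sp.initial {
		return nil // always possible
	}
	if sp.txnID != curTxnID || sp.epoch != curEpoch {
		return errRetriable
	}
	return nil
}

func main() {
	sp := savepoint{txnID: "txn-1", epoch: 0}
	fmt.Println(checkRollbackValid(sp, "txn-1", 0)) // <nil>
	fmt.Println(checkRollbackValid(sp, "txn-1", 1)) // retriable: epoch bumped
}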

Release note (sql change): SQL savepoints are now supported. SAVEPOINT
<foo>, RELEASE SAVEPOINT <foo>, and ROLLBACK TO SAVEPOINT <foo> now work.
SHOW SAVEPOINT STATUS can be used to inspect the current stack of active
savepoints.

Co-authored-by: Raphael 'kena' Poss knz@thaumogen.net
Co-authored-by: Andrei Matei andrei@cockroachlabs.com

@andreimatei andreimatei requested review from a team as code owners March 2, 2020 02:43
@cockroach-teamcity
Member

This change is Reviewable

Contributor Author

@andreimatei andreimatei left a comment

Everything except r5,r6,r7 is elsewhere.

This is based on #43051, but I've changed a lot of stuff so I think it's better to start with a new PR. Compared to that, this version has support for rolling back after an error, different behavior on rolling back after retriable errors and removes the special treatment of the savepoint cockroach_restart when it comes to rollbacks.

I still need to see what more tests need to be written (but you added a bunch of great ones already, Rafa) and go over everything and spruce it up, but I think it's ready for a first pass. Not ready for nits yet though.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch 6 times, most recently from 56e7ce6 to 7b4a5e3 on March 3, 2020 19:26
Contributor Author

@andreimatei andreimatei left a comment

I've been adding various tests. Ready for review.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)

Member

@nvanbenschoten nvanbenschoten left a comment

I gave this a first pass. It's looking good and I'm very excited for this to be merged, although the PR still needs some polish. I also think it deserves a review from @knz, given that he was the other person most connected to this change.

s/descovering/discovering/ in the commit message.

Reviewed 4 of 4 files at r1, 1 of 1 files at r2, 3 of 3 files at r3, 37 of 37 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @knz)


pkg/kv/txn_coord_sender.go, line 179 at r4 (raw file):

	epochBumpedLocked()

	// createSavepoint is used to populate a savepoint with all the state the

s/the needs/the transaction needs/


pkg/kv/txn_coord_sender.go, line 181 at r4 (raw file):

	// createSavepoint is used to populate a savepoint with all the state the
	// needs to be restored on a rollback.
	createSavepoint(context.Context, *savepoint)

Do these need Locked suffixes?


pkg/kv/txn_coord_sender.go, line 184 at r4 (raw file):

	// rollbackToSavepoint is used to restore the state previously saved by createSavepoint().
	// implementations are allowed to modify the savepoint if they want to, optimizing it for future

s/implementations/Implementations/


pkg/kv/txn_coord_sender_savepoints.go, line 33 at r4 (raw file):

	seqNum enginepb.TxnSeq

	// txnSpanRefresher fields

nit, period after both of these comments.


pkg/kv/txn_coord_sender_savepoints.go, line 42 at r4 (raw file):

	ifBytes  int64

	// txnID is used to verify that a rollback is not used to paper

Move these fields up above seqNum to give the ordering here some notion of logical precedence.


pkg/kv/txn_coord_sender_savepoints.go, line 80 at r4 (raw file):

		epoch: tc.mu.txn.Epoch,
	}
	for _, reqInt := range tc.interceptorStack {

👍 I like this approach.


pkg/kv/txn_coord_sender_savepoints.go, line 172 at r4 (raw file):

	}

	if st.seqNum > 0 && st.txnID != tc.mu.txn.ID {

Why do we need these seqNum checks?


pkg/kv/txn_coord_sender_savepoints.go, line 180 at r4 (raw file):

	}

	if st.seqNum < 0 || st.seqNum > tc.interceptorAlloc.txnSeqNumAllocator.writeSeq {

When is seqNum less than zero?


pkg/kv/txn_interceptor_committer.go, line 465 at r4 (raw file):

func (*txnCommitter) importLeafFinalState(*roachpb.LeafTxnFinalState) {}

// epochBumpedLocked implements the txnReqInterceptor interface.

nit: not your change, but if we wanted to rename epochBumpedLocked to bumpEpochLocked to fit the naming scheme here more appropriately, I'd be very supportive. If not, I'll make the change after this PR lands.


pkg/kv/txn_interceptor_pipeliner.go, line 607 at r4 (raw file):

func (tp *txnPipeliner) createSavepoint(ctx context.Context, s *savepoint) {
	if tp.ifWrites.len() > 0 {
		s.ifWrites = tp.ifWrites.t.Clone()

Let's add a clone method to *inFlightWriteSet.


pkg/kv/txn_interceptor_pipeliner.go, line 615 at r4 (raw file):

// rollbackToSavepoint is part of the txnReqInterceptor interface.
func (tp *txnPipeliner) rollbackToSavepoint(ctx context.Context, s *savepoint) {
	// Intersect the inflight writes from the savepoint to the ones from the

s/to/with/


pkg/kv/txn_interceptor_pipeliner.go, line 627 at r4 (raw file):

			// higher seqnum for the same key. We don't keep track of what sequence
			// numbers we've verified and which we haven't, so we're going to assume
			// that the savepoint's write has not been verified.

I'm confused, we do keep track of what sequence numbers we've verified and which we haven't. There's a Sequence field in inFlightWrite.


pkg/kv/txn_interceptor_pipeliner.go, line 635 at r4 (raw file):

			return true
		})
		// TODO(andrei): Can I delete directly during the iteration above?

No, you can't. The btree library doesn't like that.


pkg/kv/txn_interceptor_pipeliner.go, line 654 at r4 (raw file):

	// Restore the inflightWrites from the savepoint.
	if s.ifWrites == nil {

Should you be setting cloned here?


pkg/kv/txn_interceptor_pipeliner.go, line 655 at r4 (raw file):

	// Restore the inflightWrites from the savepoint.
	if s.ifWrites == nil {
		tp.ifWrites.t = nil

Should we clear(true) here instead?


pkg/kv/txn_interceptor_pipeliner.go, line 825 at r4 (raw file):

	s.t.Clear(reuse /* addNodesToFreelist */)
	s.bytes = 0
	s.alloc.clear()

Is this safe when s.cloned == true?


pkg/kv/txn_interceptor_pipeliner_test.go, line 1231 at r4 (raw file):

}

func TestTxnPipelinerSavepoints(t *testing.T) {

Give this a comment. We've been good about doing so in these tests so far.

Good test though.


pkg/kv/txn_interceptor_pipeliner_test.go, line 1244 at r4 (raw file):

	tp.createSavepoint(ctx, s)

	// Some more write after the savepoint. One of them is on key "c" that is part

s/write/writes/


pkg/kv/txn_interceptor_pipeliner_test.go, line 1267 at r4 (raw file):

	}

	// Now verify one of the writes. When we'll rollback to the savepoint below,

s/we'll/we/


pkg/kv/txn_interceptor_pipeliner_test.go, line 1292 at r4 (raw file):

	require.NotNil(t, br)
	require.Equal(t, []roachpb.Span{{Key: roachpb.Key("a")}}, tp.footprint.asSlice())
	require.Equal(t, 3, tp.ifWrites.len()) // We've verified one out of 4 writes.

Want to check the footprint?


pkg/kv/txn_interceptor_pipeliner_test.go, line 1307 at r4 (raw file):

	{
		var savepointWrites []inFlightWrite
		s.ifWrites.Ascend(func(i btree.Item) bool {

This is interesting. It begs the question of why we need to update the savepoint itself. Is it not enough to update the TxnCoordSender?


pkg/kv/txn_interceptor_seq_num_allocator.go, line 180 at r4 (raw file):

// createSavepoint is part of the txnReqInterceptor interface.
func (s *txnSeqNumAllocator) createSavepoint(ctx context.Context, sp *savepoint) {
	sp.seqNum = s.writeSeq

It's worth adding a very small test for this. I know the implementation is trivial, but the role the txnSeqNumAllocator plays in populating savepoints is critical.


pkg/kv/txn_interceptor_span_refresher_test.go, line 559 at r4 (raw file):

}

func TestTxnSpanRefresherSavepoint(t *testing.T) {

Same point about adding a comment.


pkg/kv/testdata/savepoints, line 322 at r4 (raw file):

subtest end

# !!!

!!!

Can this test be revived?


pkg/sql/conn_executor.go, line 2483 at r3 (raw file):

	case *tree.ReleaseSavepoint:
		// TODO(knz): Sanitize this.
		if err := ex.validateSavepointName(t.Savepoint); err == nil {

I don't understand this. Why does validating the name indicate whether this is the cockroach_restart savepoint?


pkg/sql/conn_executor_exec.go, line 945 at r4 (raw file):

	}

	// TODO(andrei/cuongdo): Figure out what statements to count here.

TODO(andrei): ...

I don't think you're gonna get any help with that one any time soon.

Or just delete the TODO. Is it useful anymore?


pkg/sql/conn_executor_exec.go, line 1006 at r4 (raw file):

}

// execStmtInRollbackWaitState executes a statement in a txn that's in state

Is this used? I thought we just got rid of the restartWait state?


pkg/sql/conn_executor_savepoints.go, line 88 at r1 (raw file):

}

// execSavepointInAbortedState runs a SAVEPOINT statement when a txn is aborted.

Why do we support running a SAVEPOINT while in the Aborted state? Postgres doesn't appear to:

nathan=# begin;
BEGIN
nathan=#
nathan=# savepoint a;
SAVEPOINT
nathan=#
nathan=# savepodint a;
ERROR:  syntax error at or near "savepodint"
LINE 1: savepodint a;
        ^
nathan=# savepoint b;
ERROR:  current transaction is aborted, commands ignored until end of transaction block

EDIT: this was answered later.


pkg/sql/conn_executor_savepoints.go, line 62 at r4 (raw file):

			ev, payload := ex.makeErrEvent(err, s)
			return ev, payload, nil
		}

We actually create a kv.savepoint object for cockroach_restart now? I'm surprised. Isn't there a cost to this? Is this imposing strange behavior on the TxnCoordSender since we still want to restart when we roll back to these savepoints?


pkg/sql/conn_executor_savepoints.go, line 200 at r4 (raw file):

		curID, curEpoch := ex.state.mu.txn.ID(), ex.state.mu.txn.Epoch()
		if !curID.Equal(entry.txnID) {
			return ex.makeErrEvent(roachpb.NewTransactionRetryWithProtoRefreshError(

Should we push these errors into client.RollbackToSavepoint?


pkg/sql/conn_executor_savepoints.go, line 264 at r4 (raw file):

	// a savepoint if there was DDL performed "under it".
	// TODO(knz): support partial DDL cancellation in pending txns.
	numDDL int

Add more to this comment. This is the number of DDL statements that had been issued at the time that the savepoint was created, right?


pkg/sql/conn_executor_savepoints.go, line 276 at r4 (raw file):

	// - if the epoch has changed, then we can only rollback if we performed a
	// refresh for the reads that are not rolled back. We currently don't do this.
	txnID uuid.UUID

I'm still confused about why we need this here and in the kv.savepoint object.


pkg/sql/conn_executor_savepoints.go, line 310 at r4 (raw file):

func (stack savepointStack) clone() savepointStack {
	cpy := make(savepointStack, len(stack))

This copy is unfortunate. Can we do something smarter with aliasing the slices and copying on write? I suspect that even without reference counting to eliminate redundant copies, it would be cheaper to simply copy on every call to push.

Do you mind comparing the relative frequency of these method calls so we can decide on the right solution here? I'm working under the assumption that setTxnRewindPos is going to be called far more often than execSavepointInOpenState.


pkg/sql/conn_executor_savepoints.go, line 337 at r4 (raw file):

}

// commitOnReleaseSavepointName is the name of the savepoint with special

Could you explain again what those semantics are?

Also, move this to the top of the file. It's buried down here but pretty important to understand in order to read the rest of the file.


pkg/sql/conn_executor_savepoints_test.go, line 193 at r4 (raw file):

func isOpenTxn(status string) bool {
	return status == "Open" || status == "NoTxn"

Can we use sql.OpenStateStr and sql.NoTxnStr here?


pkg/sql/conn_fsm.go, line 58 at r4 (raw file):

}

// stateAborted is entered on retriable errors. A ROLLBACK TO SAVEPOINT can

And non-retryable errors, right?


pkg/sql/conn_fsm.go, line 92 at r4 (raw file):

func (stateOpen) State()    {}
func (stateAborted) State() {}

nit: stray line.


pkg/sql/conn_fsm.go, line 115 at r4 (raw file):

// makeEventTxnStartPayload creates an eventTxnStartPayload.
//
// Pass noRolledBackSavepoint for rolledBackSavepoint when the transaction is

What is this referring to?


pkg/sql/conn_fsm.go, line 144 at r4 (raw file):

// eventSavepointRollback is generated when we want to move from Aborted to Open
// through a ROLLBACK TO SAVEPOINT <not cockroach_restart>. Note that it is not
// generated when such a savepoint is rolled back to from the Open state. In

"to from"


pkg/sql/conn_fsm.go, line 190 at r4 (raw file):

// eventTxnRestart is generated by a rollback to a savepoint placed at the
// beginning of the transaction (commonly SAVEPOINT cockroach_restart).
type eventTxnRestart struct{}

What went into the decision to have a separate event instead of a single type eventSavepointRollback struct{Initial fsm.Bool}?


pkg/sql/conn_fsm.go, line 193 at r4 (raw file):

// eventTxnReleased is generated after a successful
// RELEASE SAVEPOINT cockroach_restart. It moves the state to CommitWait.

Are these two events limited to cockroch_restart or just commonly linked to it like eventTxnRestart?


pkg/sql/conn_fsm.go, line 204 at r4 (raw file):

}

func (eventTxnStart) Event()          {}

nit: keep these in the same order as the type defs. Also ideally in the same order as the cases for a given state in the state transition map.


pkg/sql/logictest/testdata/logic_test/manual_retry, line 59 at r4 (raw file):

# BEGIN
#
# !!! rewrite this test somehow to assert the release behavior, not the initial status

!!!


pkg/sql/logictest/testdata/logic_test/manual_retry, line 141 at r4 (raw file):

BEGIN TRANSACTION; SAVEPOINT foo

statement error pq: savepoint bar does not exist

Might as well quote the savepoint name so that we exactly mirror PG:

nathan=# BEGIN;
BEGIN
nathan=# ROLLBACK TO SAVEPOINT bar;
ERROR:  savepoint "bar" does not exist

pkg/sql/logictest/testdata/logic_test/manual_retry, line 165 at r4 (raw file):

ROLLBACK TO SAVEPOINT "Foo Bar"

query TB

Could you query with the column names here and below?


pkg/sql/opt/exec/execbuilder/builder.go, line 76 at r4 (raw file):

	// IsDDL is set to true if the statement contains DDL.
	IsDDL bool

Someone else should sign off on whether this is a valid use of execbuilder.Builder.

cc. @RaduBerinde


pkg/sql/opt/exec/execbuilder/relational.go, line 144 at r4 (raw file):

	var err error

	isDDL := opt.IsDDLOp(e)

nit: no need for the variable.


pkg/sql/testdata/savepoints, line 1 at r4 (raw file):

# This test exercises the savepoint state in the conn executor.

Nice tests!


pkg/sql/testdata/savepoints, line 303 at r4 (raw file):

subtest rollback_after_error

# check that we can rollback after an error

nit: s/check/Check/ here and below.


pkg/sql/testdata/savepoints, line 330 at r4 (raw file):

2: SELECT crdb_internal.force_retry('100ms') -- pq: restart transaction: crdb_internal.force_retry(): TransactionRetryWithProtoRefreshError: forced by crdb_internal.force_retry()
-- Open        -> Aborted     XXXX  init(r)
3: ROLLBACK TO SAVEPOINT init -- 0 rows

It's too bad we aren't testing that this causes a txn retry and that the epoch is larger.


pkg/sql/testdata/savepoints, line 335 at r4 (raw file):

-- Open        -> NoTxn       #...  (none)

# Check that, after a retriable error, rolling back to anything an initial

"other than"?


pkg/sql/testdata/savepoints, line 526 at r4 (raw file):

# Test that the rewinding we do when performing an automatic retry restores the
# savepoint stack properly.
subtest rewing_on_automatic_restarts

s/rewing/rewind/

@knz
Contributor

knz commented Mar 3, 2020

Generally I'll trust you on implementation but I'll sit tomorrow over this PR to add more testing too. I believe remaining issues should be rooted out via testing, including via the sqlsmith tests that Rohan already prepared for us.

Also Rafi recommends that we trigger the nightly run on the PR manually to pick up all the new feedback from ORM tests.

@knz
Contributor

knz commented Mar 4, 2020

I have kicked a full nightly run on the PR ahead of further changes, to pick up any news from ORMs:

https://teamcity.cockroachdb.com/viewLog.html?buildId=1782977&

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch 5 times, most recently from 0f5c697 to a3ddc11 on March 5, 2020 00:15
Contributor Author

@andreimatei andreimatei left a comment

s/descovering/discovering/ in the commit message.

done

I haven't finished all the comments, but responding to some. Take a look if you get around here. The rest tomorrow.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)


pkg/kv/txn_coord_sender.go, line 181 at r4 (raw file):

the needs
done


pkg/kv/txn_coord_sender.go, line 184 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/implementations/Implementations/

done


pkg/kv/txn_coord_sender_savepoints.go, line 33 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit, period after both of these comments.

done


pkg/kv/txn_coord_sender_savepoints.go, line 42 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Move these fields up above seqNum to give the ordering here some notion of logical precedence.

done


pkg/kv/txn_coord_sender_savepoints.go, line 172 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why do we need these seqNum checks?

because "initial savepoint" allow rollbacks after retries. See now.


pkg/kv/txn_coord_sender_savepoints.go, line 180 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

When is seqNum less than zero?

never, I guess that's the point of the assertion. Rafa added it, but I'd leave it.


pkg/kv/txn_interceptor_committer.go, line 465 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: not your change, but if we wanted to rename epochBumpedLocked to bumpEpochLocked to fit the naming scheme here more appropriately, I'd be very supportive. If not, I'll make the change after this PR lands.

meh


pkg/kv/txn_interceptor_pipeliner.go, line 607 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's add a clone method to *inFlightWriteSet.

well but I'm only cloning the tree, not other crap...


pkg/kv/txn_interceptor_pipeliner.go, line 615 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/to/with/

done


pkg/kv/txn_interceptor_pipeliner.go, line 627 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm confused, we do keep track of what sequence numbers we've verified and which we haven't. There's a Sequence field in inFlightWrite.

indeed. See now.


pkg/kv/txn_interceptor_pipeliner.go, line 635 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No, you can't. The btree library doesn't like that.

ack


pkg/kv/txn_interceptor_pipeliner.go, line 654 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should you be setting cloned here?

It's already set from the moment we've created the savepoint. Or maybe I'm missing your point.


pkg/kv/txn_interceptor_pipeliner.go, line 655 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should we clear(true) here instead?

Sometimes. See now.
Perhaps we could always do it if we relied on s.ifWrites == nil (the special nil value) to imply that no earlier savepoint has any writes, but it seems pretty fragile.
This is all pretty nasty; curious for your thoughts.


pkg/kv/txn_interceptor_pipeliner.go, line 825 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this safe when s.cloned == true?

It is not safe when you can still rollback to any savepoints that might have writes in them. But we don't; we only call this on epoch bump (after which you can only roll back to "initial" savepoints), on commit, and, since this review, on a special case of rollbackToSavepoint when we're rolling back to an initial savepoint.

Added a comment to the method about it. And accepting thoughts.


pkg/kv/txn_interceptor_pipeliner_test.go, line 1231 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Give this a comment. We've been good about doing so in these tests so far.

Good test though.

done


pkg/kv/txn_interceptor_pipeliner_test.go, line 1244 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/write/writes/

done


pkg/kv/txn_interceptor_pipeliner_test.go, line 1267 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/we'll/we/

the future tense here is meant to signal that the comment does not apply to the lines immediately below, but to something that comes later.


pkg/kv/txn_interceptor_pipeliner_test.go, line 1292 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Want to check the footprint?

I did one line above, no?


pkg/kv/txn_interceptor_pipeliner_test.go, line 1307 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This is interesting. It begs the question of why we need to update the savepoint itself. Is it not enough to update the TxnCoordSender?

Updating the savepoint too is an optimization for the case when we'll rollback to it again.


pkg/kv/testdata/savepoints, line 322 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

!!!

Can this test be revived?

done


pkg/sql/conn_executor_exec.go, line 945 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

TODO(andrei): ...

I don't think you're gonna get any help with that one any time soon.

Or just delete the TODO. Is it useful anymore?

:)

gone


pkg/sql/conn_executor_exec.go, line 1006 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this used? I thought we just got rid of the restartWait state?

not used. gone.


pkg/sql/conn_executor_savepoints.go, line 200 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should we push these errors into client.RollbackToSavepoint?

done


pkg/sql/conn_executor_savepoints.go, line 276 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm still confused about why we need this here and in the kv.savepoint object.

they're gone, left just in kv

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch 3 times, most recently from 13b6c4c to 3c811c6 on March 5, 2020 16:28
Contributor Author

@andreimatei andreimatei left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)


pkg/kv/txn_coord_sender.go, line 179 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/the needs/the transaction needs/

done


pkg/kv/txn_interceptor_span_refresher_test.go, line 559 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Same point about adding a comment.

done


pkg/sql/conn_executor.go, line 2483 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I don't understand this. Why does validating the name indicate whether this is the cockroach_restart savepoint?

I don't know what was going on at the time of this commit, but in the latter commit this is changed to isCommitOnReleaseSavepoint.


pkg/sql/conn_executor_savepoints.go, line 62 at r4 (raw file):

We actually create a kv.savepoint object for cockroach_restart now? I'm surprised. Isn't there a cost to this?

I've optimized the creation of kv.savepoints for initial savepoints - since the empty value suffices for those.

Is this imposing strange behavior on the TxnCoordSender since we still want to restart when we roll back to these savepoints?

We don't need to do anything special when rolling back to these savepoints - which is nice. If we're rolling back to them from the Open state, then they behave just like regular savepoints (we don't need to restart anything; we used to before this patch, but not any more). When rolling back to them from the Aborted state, the transaction will already have been restarted if a retriable error occurred earlier.


pkg/sql/conn_executor_savepoints.go, line 310 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This copy is unfortunate. Can we do something smarter with aliasing the slices and copying on write? I suspect that even without reference counting to eliminate redundant copies, it would be cheaper to simply copy on every call to push.

Do you mind comparing the relative frequency of these method calls so we can decide on the right solution here? I'm working under the assumption that setTxnRewindPos is going to be called far more often than execSavepointInOpenState.

setTxnRewindPos should be called less often than execSavepointInOpenState; setTxnRewindPos is called on the order of once per transaction.
Still, I've tried for a hot minute to optimize the common handling of a stack of savepoints consisting only of the commitOnRelease one, but it's not entirely trivial because the name of that savepoint is not always cockroach_restart, depending on session variables. I've optimized the cloning of the nil slice, though, which should be useful for connections that never use savepoints.
I might try harder, but after this PR.
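
For reference, a compilable sketch of that nil-slice fast path (the entry
type is a placeholder; the real stack entries carry the savepoint name,
the KV token, and the numDDL counter discussed elsewhere in this review):

package sql

// savepointEntry is a placeholder for the per-savepoint bookkeeping.
type savepointEntry struct{ name string }

type savepointStack []savepointEntry

// clone copies the stack. A nil stack (a connection that never issued
// SAVEPOINT) clones to nil for free, so the roughly once-per-transaction
// call to setTxnRewindPos costs nothing in the common case.
func (stack savepointStack) clone() savepointStack {
	if stack == nil {
		return nil // fast path
	}
	cpy := make(savepointStack, len(stack))
	copy(cpy, stack)
	return cpy
}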


pkg/sql/conn_executor_savepoints.go, line 337 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Could you explain again what those semantics are?

Also, move this to the top of the file. It's buried down here but pretty important to understand in order to read the rest of the file.

done


pkg/sql/conn_executor_savepoints_test.go, line 193 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can we use sql.OpenStateStr and sql.NoTxnStr here?

done


pkg/sql/conn_fsm.go, line 58 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

And non-retryable errors, right?

done


pkg/sql/conn_fsm.go, line 92 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: stray line.

done


pkg/sql/conn_fsm.go, line 115 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What is this referring to?

stale. removed.


pkg/sql/conn_fsm.go, line 144 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"to from"

but it's correct :)


pkg/sql/conn_fsm.go, line 190 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What went into the decision to have a separate event instead of a single type eventSavepointRollback struct{Initial fsm.Bool}?

pretty arbitrary, but I think it looks better in the state machine diagram


pkg/sql/logictest/testdata/logic_test/manual_retry, line 59 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

!!!

done


pkg/sql/opt/exec/execbuilder/relational.go, line 144 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: no need for the variable.

done


pkg/sql/testdata/savepoints, line 1 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Nice tests!

Mostly Raphael.

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch from 3c811c6 to e8b3ad4 on March 5, 2020 18:16
Contributor Author

@andreimatei andreimatei left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @knz, and @nvanbenschoten)


pkg/kv/txn_interceptor_seq_num_allocator.go, line 180 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's worth adding a very small test for this. I know the implementation is trivial, but the role the txnSeqNumAllocator plays in populating savepoints is critical.

done


pkg/sql/conn_executor_savepoints.go, line 264 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Add more to this comment. This is the number of DDL statements that had been issued at the time that the savepoint was created, right?

done


pkg/sql/conn_fsm.go, line 193 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Are these two events limited to cockroch_restart or just commonly linked to it like eventTxnRestart?

the former. Emphasized more in the comment. Releasing a normal savepoint doesn't generate an event.


pkg/sql/conn_fsm.go, line 204 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: keep these in the same order as the type defs. Also ideally in the same order as the cases for a given state in the state transition map.

shuffled a bit


pkg/sql/logictest/testdata/logic_test/manual_retry, line 141 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Might as well quote the savepoint name so that we exactly mirror PG:

nathan=# BEGIN;
BEGIN
nathan=# ROLLBACK TO SAVEPOINT bar;
ERROR:  savepoint "bar" does not exist

done


pkg/sql/logictest/testdata/logic_test/manual_retry, line 165 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Could you query with the column names here and below?

done


pkg/sql/testdata/savepoints, line 303 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: s/check/Check/ here and below.

meh


pkg/sql/testdata/savepoints, line 330 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It's too bad we aren't testing that this causes a txn retry and that the epoch is larger.

I've added a check that we don't see writes from before the rollback


pkg/sql/testdata/savepoints, line 335 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"other than"?

done


pkg/sql/testdata/savepoints, line 526 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/rewing/rewind/

done

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch from e8b3ad4 to a99e49e on March 5, 2020 19:53
Member

@RaduBerinde RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)


pkg/sql/opt/exec/execbuilder/builder.go, line 76 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Someone else should sign off on whether this is a valid use of execbuilder.Builder.

cc. @RaduBerinde

Looks fine to me. The whole setsystemconfig trigger thing could be passed to the execbuilder as a function since it's really higher-level logic IMO but no need to do that now.

Member

@nvanbenschoten nvanbenschoten left a comment

:lgtm_strong:

Reviewed 2 of 3 files at r12, 30 of 41 files at r13, 3 of 3 files at r14.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @knz)


pkg/kv/txn_interceptor_pipeliner.go, line 683 at r14 (raw file):

	}

	// If the tree has not been cloned before, we can attempt a fast path where we

Remove this comment.


pkg/kv/txn_interceptor_pipeliner.go, line 801 at r14 (raw file):

}

// AsSlice returns the in-flight writes, ordered by key.

Since we're not exporting any of these methods, name this asSlice.

@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch from 06fc1d2 to ebe8e55 on March 6, 2020 00:42
Contributor Author

@andreimatei andreimatei left a comment

TFTR!

@rafiss, I've updated the pgjdbc blacklist removing the 56 tests that are now passing.
I've also run the django test, and it passed a bunch of tests and wanted

 var djangoBlacklist20_1 = blacklist{
   "admin_views.tests.GroupAdminTest.test_group_permission_performance": "unknown",
}

but it also claimed that only 10 tests failed unexpectedly and making that change would remove more than 10. So I don't know what's going on and I haven't touched the blacklist.

I also ran the hibernate test, which claims that 20 tests failed unexpectedly. I don't immediately see a rhyme in the failed ones, so I'm gonna bury my head in the sand and move on.

sqlalchemy and typeorm passed without any fuss.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @nvanbenschoten)


pkg/kv/txn_interceptor_pipeliner.go, line 683 at r14 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Remove this comment.

done


pkg/kv/txn_interceptor_pipeliner.go, line 801 at r14 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Since we're not exporting any of these methods, name this asSlice.

done

@andreimatei
Contributor Author

bors r+

@craig
Contributor

craig bot commented Mar 6, 2020

Merge conflict

andreimatei and others added 4 commits March 5, 2020 21:06
Before this patch, a transaction was releasing its locks immediately
after a statement encountered an error, when entering the Aborted
state. With the upcoming support for savepoints, which will allow error
recovery, this is no longer desirable (or, at least, not always
desirable). This patch removes the rollback that happened when
transitioning to the Aborted state. Instead, we defer the rollback to
the final ROLLBACK statement, which transitions us out of the Aborted
state.

Release note: None
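
In sketch form, the behavior change looks like this (state and field names
are illustrative; the real transitions live in pkg/sql/conn_fsm.go):

package sql

// txnState is a toy model of the connExecutor's transaction FSM.
type txnState int

const (
	stateOpen txnState = iota
	stateAborted
	stateNoTxn
)

type connSketch struct {
	state      txnState
	rollbackKV func() error // finalizes the KV txn, releasing its locks
}

// onStatementError moves to Aborted WITHOUT rolling back the KV txn, so
// its locks survive and ROLLBACK TO SAVEPOINT can still recover it.
func (c *connSketch) onStatementError() {
	c.state = stateAborted
}

// onRollbackStmt performs the deferred KV rollback once the client gives
// up on the transaction entirely.
func (c *connSketch) onRollbackStmt() error {
	c.state = stateNoTxn
	return c.rollbackKV()
}
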
We have some metrics around savepoints which are referenced both in some
SQL category, and in a KV transactions category. I'm removing the
references from the KV section, I don't think they belong.
And generally I understand that this catalog is not used for anything.

Release note: None
Release note (sql change): CockroachDB now collects separate sets of
metrics for usage of SAVEPOINT: one set for regular SQL savepoints and
one set for uses dedicated to CockroachDB's client-side transaction
retry protocol.
This patch adds support for SAVEPOINT <foo>, RELEASE SAVEPOINT <foo>,
ROLLBACK TO SAVEPOINT <foo>.
Before this patch, we only had support for the special savepoint
cockroach_restart, which had to be placed at the beginning of the
transaction and was specifically intended for dealing with transaction
retries. This patch implements general support for savepoints, which
provide an error recovery mechanism.

The connExecutor now maintains a stack of savepoints. Rolling back to a
savepoint uses the recent KV api for ignoring a range of write sequence
numbers.

At the SQL level, savepoints differ in two characteristics:
1) savepoints placed at the beginning of a transaction (i.e. before any
KV operations) are marked as "initial". Rolling back to an initial
savepoint is always possible. Rolling back to a non-initial savepoint is
not possible after the transaction restarts (see below).
2) the savepoint named "cockroach_restart" retains special RELEASE
semantics: releasing it (attempts to) commit the underlying KV txn.
This continues to allow for the discovery of deferred serializability
errors (i.e. write timestamp pushes by the ts cache). As before, this
RELEASE can fail with a retriable error, at which point the client can
do ROLLBACK TO SAVEPOINT cockroach_restart (which is guaranteed to work
because cockroach_restart needs to be an "initial" savepoint). The
transaction continues to maintain all its locks after such an error.
This is all in contrast to releasing any other savepoints, which cannot
commit the txn and also never fails. See below for more discussion.
The cockroach_restart savepoint is only special in its release behavior,
not in its rollback behavior.

With the implementation of savepoints, the state machine driving a SQL
connection's transactions becomes a lot simpler. There's no longer a
distinction between an "Aborted" transaction and one that's in
"RestartWait". Rolling back to a savepoint now works the same way across
the two states, so RestartWait is gone.

This patch also improves the KV savepoints. They now capture and restore
the state of the read spans and the in-flight writes.

Some things don't work (yet):
a) Rolling back to a savepoint created after a schema change will error
out. This is because we don't yet snapshot the transaction's schema
change state.
b) After a retriable error, you can only rollback to an initial
savepoint. Attempting to rollback to a non-initial savepoint generates a
retriable error again. If the transaction has been aborted, I think this
is the right behavior; no recovery is possible since the transaction has
lost its write intents. In the usual case where the transaction has not
been aborted, I think we want something better but it will take more
work to get it. I think the behavior we want is the following:
- after a serializability failure, retrying just part of the transaction
should be doable by attempting a ROLLBACK TO SAVEPOINT. This rollback
should succeed if all the non-rolled-back reads can be refreshed to the
desired timestamp. If they can be refreshed, then the client can simply
retry the rolled back part of the transaction. If they can't, then the
ROLLBACK should return a retriable error again, allowing the client to
attempt a deeper rollback - and so on until the client rolls back to an
initial savepoint (which succeeds by definition).
Implementing this would allow for the following nifty pattern:

func fn_n() {
  for {
    SAVEPOINT savepoint_n
    try {
      fn_n+1()
    } catch retriable error {
      err := ROLLBACK TO SAVEPOINT savepoint_n
      if err != nil {
        throw err
      }
      continue
    }
    RELEASE SAVEPOINT savepoint_n
    break
  }
}

The idea here is that the client is trying to re-do as little work as
possible by successively rolling back to earlier and earlier savepoints.
This pattern will technically work with the current patch already,
except it will not actually help the client in any way since all the
rollbacks will fail until we get to the very first savepoint.
There's an argument to be made for making RELEASE SAVEPOINT check for
deferred serializability violations (and perhaps other deferred checks -
like deferred constraint validation), although Postgres doesn't do any
of these.
Anyway, I've left implementing this for a future patch because I want to
do some KV work for supporting it nicely. Currently, the automatic
restart behavior that KV transactions have is a pain in the ass since it
works against what we're trying to do.

For the time-being, non-initial savepoints remember their txn ID and
epoch and attempting to rollback to them after these changes produces a
retriable error automatically.

Fixes cockroachdb#45477
Touches cockroachdb#10735

Release note (sql change): SQL savepoints are now supported. SAVEPOINT
<foo>, RELEASE SAVEPOINT <foo>, and ROLLBACK TO SAVEPOINT <foo> now work.
`SHOW SAVEPOINT STATUS` can be used to inspect the current stack of active
savepoints.

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
@andreimatei andreimatei force-pushed the savepoint.implement-savepoints branch from ebe8e55 to f1e2a00 on March 6, 2020 02:15
@andreimatei
Contributor Author

bors r+

@craig
Contributor

craig bot commented Mar 6, 2020

Build succeeded

@craig craig bot merged commit ad84601 into cockroachdb:master Mar 6, 2020
@knz knz changed the title sql,kv: add SQL savepoints support sql,kv: add preliminary SQL savepoints support Mar 6, 2020
@knz
Contributor

knz commented Mar 6, 2020

There was no reason for you to rush this merge before I had a look, was there?

In fact the release note here is incomplete and there are still a few missing tests. I'll issue the complementary PR.

@knz
Contributor

knz commented Mar 6, 2020

actually you even botched it... I'll file the additional issues

@andreimatei
Contributor Author

I merged it cause I'm away until Tuesday. I thought you told me you had looked...

@knz
Contributor

knz commented Mar 6, 2020

all good - we can fix things next week. Enjoy your weekend

@andreimatei andreimatei deleted the savepoint.implement-savepoints branch March 10, 2020 18:32
rmloveland added a commit to rmloveland/cockroach that referenced this pull request Mar 13, 2020
This change updates the syntax diagram definitions and generated BNF for
several SAVEPOINT-related statements, specifically:

- Add the SHOW SAVEPOINT STATUS statement to the list of syntax diagrams
  generated by pkg/cmd/docgen

- Add the SHOW SAVEPOINT STATUS BNF file to the other generated BNF
  files

- Update ROLLBACK TO SAVEPOINT to note that the savepoint name does not
  have to be 'cockroach_restart'

It uses the changes in cockroachdb#45794, which enabled docgen for SHOW SAVEPOINT
STATUS.

It is part of the work surrounding cockroachdb#45566, which added preliminary SQL
savepoints support.

Release justification: low-risk update to documentation diagrams

Release note: None
craig bot pushed a commit that referenced this pull request Mar 16, 2020
45962:  sql: re-add GC job on schema element deletion r=pbardea a=pbardea

This commit creates GC jobs upon the deletion of an index, table or
database. Similarly to the previous implementation, it considers the
walltime at which the schema change completed to be the drop time of the
schema element.

Release note (sql change): Previously, after deleting an index, table,
or database the relevant schema change job would change its running
status to waiting for GC TTL. The schema change and the GC process are
now decoupled into 2 jobs.

Release justification: This is a follow up to the migration of turning
schema changes into actual jobs. This commit re-adds the ability to
properly GC indexes and tables.

46048: docgen: update savepoint-related definitions, bnfs r=rmloveland a=rmloveland

This change updates the syntax diagram definitions and generated BNF for
several SAVEPOINT-related statements, specifically:

- Add the SHOW SAVEPOINT STATUS statement to the list of syntax diagrams
  generated by pkg/cmd/docgen

- Add the SHOW SAVEPOINT STATUS BNF file to the other generated BNF
  files

- Update ROLLBACK TO SAVEPOINT to note that the savepoint name does not
  have to be 'cockroach_restart'

It uses the changes in #45794, which enabled docgen for SHOW SAVEPOINT
STATUS.

It is part of the work surrounding #45566, which added preliminary SQL
savepoints support.

Release justification: low-risk update to documentation diagrams

Release note: None

Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Rich Loveland <rich@cockroachlabs.com>