Update txn layer docs with Parallel Commits info #5394

rmloveland · 2019-09-09T18:38:44Z

Addresses #4723.

Summary of changes:

- Add a new section 'Parallel commits' to the transaction layer documentation that describes what it is and how it works, with diagrams.
- Update other areas of the document that mention behavior that is affected by this feature.

cockroach-teamcity · 2019-09-09T18:41:06Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/624e7f4e29b47787771c09b6526f6fd2a262f49a/

Edited pages:

v19.2/architecture/transaction-layer.md

cockroach-teamcity · 2019-09-09T18:55:22Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/9d6848a3a4511a735dc40a17587d834d4d8c9d99/

Edited pages:

v19.2/architecture/transaction-layer.md

nvb

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)

v19.2/architecture/transaction-layer.md, line 29 at r1 (raw file):
s/STAGED/STAGING/ throughout

semantically identical to COMMITTED

This isn't really correct. The status isn't semantically identical to COMMITTED. A STAGING transaction record may or may not be committed; that isn't clear from the transaction record alone. Meanwhile, a COMMITTED transaction record is definitely committed.

can be ignored for the purposes of understanding transaction semantics

I don't think it really can be ignored. It implies a large change to the protocol and dramatically changes what it means for a transaction to be committed.
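
To make the distinction concrete, here is a minimal sketch, not CockroachDB's actual code. The names `TxnStatus`, `TxnRecord`, `in_flight_writes`, and `is_committed` are hypothetical, and only the statuses quoted in this thread are modeled. It shows that a STAGING record alone does not answer "is this committed?"; the answer also depends on whether the writes it references succeeded.

```python
from dataclasses import dataclass
from enum import Enum


class TxnStatus(Enum):
    # Only the statuses discussed in this review; others are omitted.
    PENDING = "PENDING"
    STAGING = "STAGING"
    COMMITTED = "COMMITTED"


@dataclass
class TxnRecord:
    status: TxnStatus
    # Keys of the writes referenced by the record (hypothetical field name).
    in_flight_writes: frozenset = frozenset()


def is_committed(record, succeeded_writes):
    """Return True if the transaction counts as committed.

    COMMITTED is decidable from the record alone. STAGING is not:
    it is committed only if every referenced write has succeeded.
    """
    if record.status is TxnStatus.COMMITTED:
        return True
    if record.status is TxnStatus.STAGING:
        return record.in_flight_writes <= set(succeeded_writes)
    return False


staging = TxnRecord(TxnStatus.STAGING, frozenset({"apple", "berry"}))
print(is_committed(staging, {"apple"}))           # one write still missing
print(is_committed(staging, {"apple", "berry"}))  # all writes present
```

The same STAGING record yields two different answers depending on the state of the writes, which is why the status cannot be treated as a synonym for COMMITTED.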

v19.2/architecture/transaction-layer.md, line 240 at r1 (raw file):
s/are/is/

~~for the final batch~~

in which SQL transactions incurred a consensus latency on every write

This isn't true. 2.1 and 19.1 both had transaction pipelining, so they didn't incur a consensus latency on every write. They only incurred (2x) consensus latency when committing.

v19.2/architecture/transaction-layer.md, line 242 at r1 (raw file):

actual commit operation

I wouldn't use this phrasing. The STAGING operation is the actual commit operation. That's what commits the transaction. The async operation that moves the record to the COMMITTED status is just consolidating this information into a single place.

It is able to do this because it populates the transaction record with enough information to prove that all writes in the transaction are present.

for other transactions to determine whether all writes in the transaction are present and thus prove whether or not the transaction is committed.

rmloveland

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

v19.2/architecture/transaction-layer.md, line 29 at r1 (raw file):

s/STAGED/STAGING/ throughout

Fixed.

This isn't really correct. The status isn't semantically identical

Ah, OK. I was reaching there based on a mental "shorthand" I was using to try to understand it when reading the RFC.

I was trying to update the doc in a "layered" approach where you could leave most of the existing doc as-is and act like STAGING wasn't there for purposes of understanding the basic protocol, i.e. it was layered on as a pure optimization that doesn't change the things you need to understand, unless you need to go deep into the optimization.

I don't think it really can be ignored. It implies a large change to the protocol and dramatically changes what it means for a transaction to be committed.

This makes my idea that it was a pure optimization clearly not so. Therefore I think there will be more changes to do on this PR to update the various mentions of the transaction record states, how the transaction moves through the various states to COMMITTED etc.

I will schedule time with you to ask some more questions. I think it will be faster to go through the existing doc together and look at the changes holistically than to ask you 99 questions here. :-}

v19.2/architecture/transaction-layer.md, line 240 at r1 (raw file):

s/are/is/

Fixed.

~~for the final batch~~

Fixed.

This isn't true.

Fixed by removing that sentence altogether.

v19.2/architecture/transaction-layer.md, line 242 at r1 (raw file):

I wouldn't use this phrasing. The STAGING operation is the actual commit operation. That's what commits the transaction. The async operation that moves the record to the COMMITTED status is just consolidating this information into a single place.

Fixed by removing "actual".

for other transactions to determine whether all writes in the transaction are present and thus prove whether or not the transaction is committed.

Fixed by updating the final sentence in that paragraph to read "It is able to do this because it populates the transaction record with enough information for other transactions to determine whether all writes in the transaction are present and thus prove whether or not the transaction is committed." (which is what I think you meant?)

cockroach-teamcity · 2019-09-09T20:34:43Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/b86ab73323ca44aee42ae25ceda54c908cdba79a/

Edited pages:

v19.2/architecture/transaction-layer.md

cockroach-teamcity · 2019-09-23T21:10:47Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/29416d114a768880e3ef45d3ee665ba5207b9ac7/

Edited pages:

v19.2/architecture/transaction-layer.md

rmloveland · 2019-09-23T21:11:26Z

Nathan, I made changes based on our other discussions outside this PR. Please take a look at your convenience.

FYI I'm changing this PR slightly to target just the 'Transaction Layer' update, so I can audit 'Life of a Distributed Transaction' and 'Reads and Writes' and work on them as needed separately.

nvb

This is much better! Great job capturing all of this @rmloveland.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)

v19.2/architecture/transaction-layer.md, line 132 at r2 (raw file):

- `PENDING`: Indicates that the write intent's transaction is still in progress.
- `COMMITTED`: Once a transaction has completed, this status indicates that write intents can be treated as committed values.
- `STAGING`: Used to enable the [Parallel Commits](#parallel-commits) feature.  Depending on the state of the write intents attached to this record, the transaction may or may not be in a committed state.

Write intents aren't really attached to this record. Instead, I'd phrase this more like "write intents referenced by this record".

v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):

### Parallel commits

<span class="version-tag">New in v19.2</span>: The *Parallel commits* feature introduces a new, optimized atomic commit protocol that cuts the commit latency of a transaction in half, from two rounds of consensus down to one. Combined with [Transaction pipelining](#transaction-pipelining), this brings the latency incurred by common OLTP transactions to near the theoretical minimum: the sum of all read latencies plus one round of consensus latency.

Should this be "Parallel commits" or "Parallel Commits"? We're a little inconsistent about this.

v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):

### Parallel commits

<span class="version-tag">New in v19.2</span>: The *Parallel commits* feature introduces a new, optimized atomic commit protocol that cuts the commit latency of a transaction in half, from two rounds of consensus down to one. Combined with [Transaction pipelining](#transaction-pipelining), this brings the latency incurred by common OLTP transactions to near the theoretical minimum: the sum of all read latencies plus one round of consensus latency.

Very nice summary!
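
The arithmetic in that summary can be sketched as follows. This is a back-of-the-envelope model only: the function name and the millisecond figures are made up for illustration and are not measurements.

```python
def txn_latency_ms(read_latencies_ms, consensus_ms, commit_rounds):
    """Rough latency model: all reads, plus the consensus rounds paid at commit.

    Assumes pipelined writes add no consensus latency before commit,
    as described for transaction pipelining in this thread.
    """
    return sum(read_latencies_ms) + commit_rounds * consensus_ms


reads = [2, 3]   # hypothetical read latencies, in ms
consensus = 10   # hypothetical latency of one round of consensus, in ms

before = txn_latency_ms(reads, consensus, commit_rounds=2)  # two-round commit
after = txn_latency_ms(reads, consensus, commit_rounds=1)   # Parallel Commits
print(before, after)
```

Under these assumed numbers the commit portion drops from 20 ms to 10 ms, and what remains is the sum of the read latencies plus one round of consensus.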

v19.2/architecture/transaction-layer.md, line 298 at r2 (raw file):

![parallel-commits-04.png](../../images/{{page.version.version}}/parallel-commits-04.png "Parallel Commits Diagram #4")

The transaction is now considered atomically committed, even though its state is still `STAGING`. The reason this is still considered an atomic commit condition is that a transaction is considered committed if it is in one of the following logically equivalent states:

s/its state/the state of its transaction record/

v19.2/architecture/transaction-layer.md, line 304 at r2 (raw file):

2. The transaction record's state is `COMMITTED`. Transactions in this state are *explicitly committed*.

Despite their logical equivalence, the transaction coordinator now works as quickly as possible to move the transaction record from the `STAGING` to the `COMMITTED` state so that other transactions do not encounter a possibly conflicting transaction in the `STAGING` state and then have to do the work of verifying that the staging transaction's list of pending writes has succeeded. Doing that verification (also known as the "transaction status recovery process") would be very slow.

s/very slow/slow/ it's not that bad 😉

rmloveland

Thank you Nathan! And thank you for your help.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)

v19.2/architecture/transaction-layer.md, line 132 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Write intents aren't really attached to this record. Instead, I'd phrase this more like "write intents referenced by this record".

Fixed by updating to use that language.

v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Should this be "Parallel commits" or "Parallel Commits"? We're a little inconsistent about this.

Good point.

Fixed by updating all uses of the term to say "Parallel Commits", since as the name of this feature/protocol it's a proper noun.

v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Very nice summary!

Thank you!

v19.2/architecture/transaction-layer.md, line 298 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/its state/the state of its transaction record/

Fixed.

v19.2/architecture/transaction-layer.md, line 304 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/very slow/slow/ it's not that bad 😉

Fixed. 😀

cockroach-teamcity · 2019-09-24T18:53:20Z

Online preview: http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/9fee72467da2773ed15bd6b50a4b8c683834d6d7/

Edited pages:

v19.2/architecture/transaction-layer.md

rmloveland · 2019-09-25T15:11:30Z

@ericharmeling I think this is ready for the first round of Docs review.

@nvanbenschoten please let me know if you have additional comments. And thanks again for your help, and the thorough review(s)!

ericharmeling

Reviewed 6 of 7 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling, @nvanbenschoten, and @rmloveland)

images/v19.2/parallel-commits-00.png, line 0 at r3 (raw file):
I like these diagrams! They are really barebones and provide a lot of clarity to the workflow.

I do have a couple more general questions/comments:

Have there been any discussions about standardizing "workflow" diagrams so we are all using the same tool(s) and the reader is looking at the same kind of graph?
Some of these graphs visualize common steps in the transaction layer workflow as a whole. Do you think the more general discussion of the transaction workflow (i.e. the "Overview" section), and then some more specific components of transaction workflows (i.e. "Transaction pipelining") could benefit from some of these graphs, or at least some subset or modified version of these graphs?

v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):

Parallel Commits

Throughout this page we capitalize parallel commits. We don't always do this for features (e.g. transaction pipelining)... is that something that we want to change?

v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):

a new,

You mentioned that this protocol is "new" quite a bit. Do you think most people that are reading this know about the "old" protocol? If not, I don't know how much it helps to compare the "new" and the "old". I honestly don't know though, that's why I'm asking.

v19.2/architecture/transaction-layer.md, line 244 at r3 (raw file):

The optimization is achieved by introducing a new atomic commit protocol that allows the transaction coordinator to return to the client eagerly when it knows that the writes in the transaction have succeeded. Once this occurs, the transaction coordinator can set the transaction record's state to `COMMITTED` and resolve the transaction's write intents asynchronously.

The transaction coordinator is able to do this while maintaining correctness guarantees because it populates the transaction record with enough information (via a new `STAGING` state, and an array of in-flight writes) for other transactions to determine whether all writes in the transaction are present, and thus prove whether or not the transaction is committed.

Here are my readability nit suggestions (to take or leave!):

Mainly simplifying sentence structure...

"Under the atomic commit protocol, the transaction coordinator eagerly returns to the client when it knows that the writes in a transaction have succeeded. After returning to the client, the transaction coordinator sets the transaction record's state to COMMITTED and resolves the transaction's write intents asynchronously."

This is a single sentence, so I think we should split it up...

"While processing an open transaction, the transaction coordinator populates the transaction record with a STAGING state, and an array of in-flight writes. Other transactions can read from the transaction state to see whether writes in a transaction have been committed or not."
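
For reviewers who want to check the ordering described in these two paragraphs, here is a toy simulation. It is a sketch under stated assumptions: the event names and the helpers `consensus_write`, `finalize`, and `commit` are hypothetical, and `asyncio.sleep(0)` merely stands in for a round of consensus.

```python
import asyncio


async def consensus_write(log, name):
    await asyncio.sleep(0)  # stand-in for one round of consensus
    log.append(f"done:{name}")


async def finalize(log):
    # Happens after the client already has its answer.
    log.append("record:COMMITTED")
    log.append("intents:resolved")


async def commit(log, keys):
    # The STAGING record and the pending writes go out in parallel.
    await asyncio.gather(
        consensus_write(log, "record:STAGING"),
        *(consensus_write(log, f"write:{key}") for key in keys),
    )
    log.append("ack:client")  # eager return to the client
    return asyncio.create_task(finalize(log))  # asynchronous follow-up


async def main():
    log = []
    cleanup = await commit(log, ["apple", "berry"])
    await cleanup
    return log


events = asyncio.run(main())
print(events)
```

The point of the sketch is only the ordering: the client acknowledgement comes after the staged record and writes succeed, and before the record moves to `COMMITTED`.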

v19.2/architecture/transaction-layer.md, line 266 at r3 (raw file):

Apple

I assume this is referring to the 'apple' key from the Transaction pipelining section... if so, shouldn't it be lowercase? Also, should we mention that this is continuing on from the Transaction pipelining example, or maybe be explicit here about what the write to the 'apple' key looks like, in SQL terms (maybe not because it could also be from an ORM or something, but a SQL example might still be nice as an example)?

v19.2/architecture/transaction-layer.md, line 268 at r3 (raw file):

 as an optimization.

Not sure what this means.

v19.2/architecture/transaction-layer.md, line 278 at r3 (raw file):

issues a write to the "Berry" key

OK OK, so this is a separate example from the Transaction pipelining section example. Perhaps we should introduce the example (e.g. "Suppose that you have a table named "Fruits", with a "Key" column and a "Value" column, and you want to write values to the "Apple" and "Berry" keys..." or something like that).

v19.2/architecture/transaction-layer.md, line 300 at r3 (raw file):
Readability nit

its list of pending writes  (i.e., `InFlightWrites`) have all succeeded (i.e., achieved consensus across the cluster)

Maybe...

"its list of pending writes have all succeeded (i.e., the InFlightWrites have achieved consensus across the cluster)..."

v19.2/architecture/transaction-layer.md, line 304 at r3 (raw file):

(also known as the "transaction status recovery process")

Is this a term we use elsewhere? If not, I don't think it's necessary...

v19.2/architecture/transaction-layer.md, line 306 at r3 (raw file):
Readability nit...

other transactions, when they encounter a transaction in `STAGING` state, check whether the staging transaction is still in progress by verifying that the transaction coordinator is still heartbeating that staging transaction’s record.

Maybe

"When other transactions encounter a transaction in STAGING state, they check whether the staging transaction..."

ericharmeling

Reviewed 1 of 1 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)

ericharmeling

I added some general questions/comments, and a few nits (all to take or leave). This looks great though!!

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)

ericharmeling

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)

Addresses #4723. Fixes #5465.

Summary of changes:

- Add a new section 'Parallel commits' to the transaction layer documentation that describes what it is and how it works, with diagrams.
- Update other areas of the document that mention behavior that is affected by this feature.

rmloveland

Thanks for the review Eric!

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ericharmeling and @nvanbenschoten)

images/v19.2/parallel-commits-00.png, line at r3 (raw file):

Have there been any discussions about standardizing "workflow" diagrams so we are all using the same tool(s) and the reader is looking at the same kind of graph?

Sean L. told me there had been some discussion a long time ago, but AFAICT it never ended with a decision getting made.

Some of these graphs visualize common steps in the transaction layer workflow as a whole. Do you think the more general discussion of the transaction workflow (i.e. the "Overview" section), and then some more specific components of transaction workflows (i.e. "Transaction pipelining") could benefit from some of these graphs, or at least some subset or modified version of these graphs?

Yes, but I don't want to increase the scope of this PR. I totes agree that as a separate work item we should add WAY more diagrams to our architecture docs, to make them more accessible and understandable. Right now we are making the reader do the work of creating a diagram in their head / on a whiteboard, piece of paper, etc.

v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):

is that something that we want to change?

IMO yes. The proper noun name of the feature is "Parallel Commits" with a capitalized C. Our style guide already says to capitalize proper nouns; it appears we have not been doing that consistently in all cases. But I'd like to update the uncapitalized proper nouns as a separate work item for sure.

v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):