-
Notifications
You must be signed in to change notification settings - Fork 472
Update txn layer docs with Parallel Commits info #5394
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
624e7f4
to
9d6848a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
v19.2/architecture/transaction-layer.md, line 29 at r1 (raw file):
s/STAGED/STAGING/
throughout
semantically identical to
COMMITTED
This isn't really correct. The status isn't semantically identical to COMMITTED
. A STAGING
transaction record may or may not be committed, it's not clear from the transaction record. Meanwhile, a COMMITTED
transaction record is definitely committed.
can be ignored for the purposes of understanding transaction semantics
I don't think it really can be ignored. It implies a large change to the protocol and dramatically changes what it means for a transaction to be committed.
v19.2/architecture/transaction-layer.md, line 240 at r1 (raw file):
s/are/is/
for the final batch
in which SQL transactions incurred a consensus latency on every write
This isn't true. 2.1 and 19.1 both had transaction pipelining, so they didn't incur a consensus latency on every write. They only incurred (2x) consensus latency when committing.
v19.2/architecture/transaction-layer.md, line 242 at r1 (raw file):
actual commit operation
I wouldn't use this phrasing. The STAGING
operation is the actual commit operation. That's what commits the transaction. The async which moves the record to the COMMITTED status is just consolidating this information into a single place.
It is able to do this because it populates the transaction record with enough information to prove that all writes in the transaction are present.
for other transactions to determine whether all writes in the transaction are present and thus prove whether or not the transaction is committed.
9d6848a
to
b86ab73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
v19.2/architecture/transaction-layer.md, line 29 at r1 (raw file):
s/STAGED/STAGING/ throughout
Fixed.
This isn't really correct. The status isn't semantically identical
Ah, OK. I was reaching there based on a mental "shorthand" I was using to try to understand it when reading the RFC.
I was trying to update the doc in a "layered" approach where you could leave most of the existing doc as-is and act like STAGING
wasn't there for purposes of understanding the basic protocol, i.e. it was layered on as a pure optimization that doesn't change the things you need to understand, unless you need to go deep into the optimization.
I don't think it really can be ignored. It implies a large change to the protocol and dramatically changes what it means for a transaction to be committed.
This makes my idea that it was a pure optimization clearly not so. Therefore I think there will be more changes to do on this PR to update the various mentions of the transaction record states, how the transaction moves through the various states to COMMITTED
etc.
I will schedule time with you to ask some more questions. I think it will be faster to go through the existing doc together and look at the changes holistically than to ask you 99 questions here. :-}
v19.2/architecture/transaction-layer.md, line 240 at r1 (raw file):
s/are/is/
Fixed.
for the final batch
Fixed.
This isn't true.
Fixed by removing that sentence altogether.
v19.2/architecture/transaction-layer.md, line 242 at r1 (raw file):
I wouldn't use this phrasing. The STAGING operation is the actual commit operation. That's what commits the transaction. The async which moves the record to the COMMITTED status is just consolidating this information into a single place.
Fixed by removing "actual".
for other transactions to determine whether all writes in the transaction are present and thus prove whether or not the transaction is committed.
Fixed by updating final sentence in that para to read "It is able to do this because it populates the transaction record with enough information for other transactions to determine whether all writes in the transaction are present and thus prove whether or the transaction is committed." (which is what I think you meant?)
b86ab73
to
29416d1
Compare
Nathan, I made changes based on our other discussions outside this PR. Please take a look at your convenience. FYI I'm changing this PR slightly to target just the 'Transaction Layer' update, so I can audit 'Life of a Distributed Transaction' and 'Reads and Writes' and work on them as needed separately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better! Great job capturing all of this @rmloveland.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
v19.2/architecture/transaction-layer.md, line 132 at r2 (raw file):
- `PENDING`: Indicates that the write intent's transaction is still in progress. - `COMMITTED`: Once a transaction has completed, this status indicates that write intents can be treated as committed values. - `STAGING`: Used to enable the [Parallel Commits](#parallel-commits) feature. Depending on the state of the write intents attached to this record, the transaction may or may not be in a committed state.
Write intents aren't really attached to this record. Instead, I'd phrase this more like "write intents referenced by this record".
v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):
### Parallel commits <span class="version-tag">New in v19.2</span>: The *Parallel commits* feature introduces a new, optimized atomic commit protocol that cuts the commit latency of a transaction in half, from two rounds of consensus down to one. Combined with [Transaction pipelining](#transaction-pipelining), this brings the latency incurred by common OLTP transactions to near the theoretical minimum: the sum of all read latencies plus one round of consensus latency.
Should this be "Parallel commits" or "Parallel Commits"? We're a little inconsistent about this.
v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):
### Parallel commits <span class="version-tag">New in v19.2</span>: The *Parallel commits* feature introduces a new, optimized atomic commit protocol that cuts the commit latency of a transaction in half, from two rounds of consensus down to one. Combined with [Transaction pipelining](#transaction-pipelining), this brings the latency incurred by common OLTP transactions to near the theoretical minimum: the sum of all read latencies plus one round of consensus latency.
Very nice summary!
v19.2/architecture/transaction-layer.md, line 298 at r2 (raw file):
 The transaction is now considered atomically committed, even though its state is still `STAGING`. The reason this is still considered an atomic commit condition is that a transaction is considered committed if it is one of the following logically equivalent states:
s/its state/the state of its transaction record/
v19.2/architecture/transaction-layer.md, line 304 at r2 (raw file):
2. The transaction record's state is `COMMITTED`. Transactions in this state are *explicitly committed*. Despite their logical equivalence, the transaction coordinator now works as quickly as possible to move the transaction record from the `STAGING` to the `COMMITTED` state so that other transactions do not encounter a possibly conflicting transaction in the `STAGING` state and then have to do the work of verifying that the staging transaction's list of pending writes has succeeded. Doing that verification (also known as the "transaction status recovery process") would be very slow.
s/very slow/slow/
it's not that bad 😉
29416d1
to
9fee724
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Nathan! And thank you for your help.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)
v19.2/architecture/transaction-layer.md, line 132 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Write intents aren't really attached to this record. Instead, I'd phrase this more like "write intents referenced by this record".
Fixed by updating to use that language.
v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should this be "Parallel commits" or "Parallel Commits"? We're a little inconsistent about this.
Good point.
Fixed by updating all uses of the term to say "Parallel Commits", since as the name of this feature/protocol it's a proper noun.
v19.2/architecture/transaction-layer.md, line 242 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Very nice summary!
Thank you!
v19.2/architecture/transaction-layer.md, line 298 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/its state/the state of its transaction record/
Fixed.
v19.2/architecture/transaction-layer.md, line 304 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
s/very slow/slow/
it's not that bad 😉
Fixed. 😀
@ericharmeling I think this is ready for the first round of Docs review. @nvanbenschoten please let me know if you have additional comments. And thanks again for your help, and the thorough review(s)! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 7 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling, @nvanbenschoten, and @rmloveland)
images/v19.2/parallel-commits-00.png, line 0 at r3 (raw file):
I like these diagrams! They are really barebones and provide a lot of clarity to the workflow.
I do have a couple more general questions/comments:
-
Have there been any discussions about standardizing "workflow" diagrams so we are all using the same tool(s) and the reader is looking at the same kind of graph?
-
Some of these graphs visualize common steps in the transaction layer workflow as a whole. Do you think the more general discussion of the transaction workflow (i.e. the "Overview" section), and then some more specific components of transaction workflows (i.e. "Transaction pipelining") could benefit from some of these graphs, or at least some subset or modified version of these graphs?
v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):
Parallel Commits
Throughout this page we capitalize parallel commits. We don't always do this for features (e.g. transaction pipelining)... is that something that we want to change?
v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):
a new,
You mentioned that this protocol is "new" quite a bit. Do you think most people that are reading this know about the "old" protocol? If not, I don't know how much it helps to compare the "new" and the "old". I honestly don't know though, that's why I'm asking.
v19.2/architecture/transaction-layer.md, line 244 at r3 (raw file):
The optimization is achieved by introducing a new atomic commit protocol that allows the transaction coordinator to return to the client eagerly when it knows that the writes in the transaction have succeeded. Once this occurs, the transaction coordinator can set the transaction record's state to `COMMITTED` and resolve the transaction's write intents asynchronously. The transaction coordinator is able to do this while maintaining correctness guarantees because it populates the transaction record with enough information (via a new `STAGING` state, and an array of in-flight writes) for other transactions to determine whether all writes in the transaction are present, and thus prove whether or not the transaction is committed.
Here are my readability nit suggestions (to take or leave!):
- Mainly simplifying sentence structure...
"Under the atomic commit protocol, the transaction coordinator eagerly returns to the client when it knows that the writes in a transaction have succeeded. After returning to the client, the transaction coordinator sets the transaction record's state to COMMITTED
and resolves the transaction's write intents asynchronously."
- This is a single sentence, so I think we should split it up...
"While processing an open transaction, the transaction coordinator populates the transaction record with a STAGING
state, and an array of in-flight writes. Other transactions can read from the transaction state to see whether writes in a transaction have been committed or not."
v19.2/architecture/transaction-layer.md, line 266 at r3 (raw file):
Apple
I assume this is referring to the 'apple' key from the Transaction pipelining section... if so, shouldn't it be lowercase? Also, should we mention that this is continuing on from the Transaction pipelining example, or maybe be explicit here about what the write to the 'apple' key looks like, in SQL terms (maybe not because it could also be from an ORM or something, but a SQL example might still be nice as an example)?
v19.2/architecture/transaction-layer.md, line 268 at r3 (raw file):
as an optimization.
Not sure what this means.
v19.2/architecture/transaction-layer.md, line 278 at r3 (raw file):
issues a write to the "Berry" key
OK OK, so this is a separate example from the Transaction pipelining section example. Perhaps we should introduce the example (e.g. "Suppose that you have a table named "Fruits", with a "Key" column and a "Value" column, and you want to write values to the "Apple" and "Berry" key..." or something like that.
v19.2/architecture/transaction-layer.md, line 300 at r3 (raw file):
Readability nit
its list of pending writes (i.e., `InFlightWrites`) have all succeeded (i.e., achieved consensus across the cluster)
Maybe...
"its list of pending writes have all succeeded (i.e., the InFlightWrites
have achieved consensus across the cluster)..."
v19.2/architecture/transaction-layer.md, line 304 at r3 (raw file):
(also known as the "transaction status recovery process")
Is this a term we use elsewhere? If not, I don't think it's necessary...
v19.2/architecture/transaction-layer.md, line 306 at r3 (raw file):
Readability nit...
other transactions, when they encounter a transaction in `STAGING` state, check whether the staging transaction is still in progress by verifying that the transaction coordinator is still heartbeating that staging transaction’s record.
Maybe
"When other transactions encounter a transaction in STAGING
state, they check whether the staging transaction..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some general questions/comments, and a few nits (all to take or leave). This looks great though!!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @rmloveland)
9fee724
to
a9e1873
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review Eric!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @ericharmeling and @nvanbenschoten)
images/v19.2/parallel-commits-00.png, line at r3 (raw file):
Have there been any discussions about standardizing "workflow" diagrams so we are all using the same tool(s) and the reader is looking at the same kind of graph?
Sean L. told me there had been some discussion a long time ago, but AFAICT it never ended with a decision getting made.
Some of these graphs visualize common steps in the transaction layer workflow as a whole. Do you think the more general discussion of the transaction workflow (i.e. the "Overview" section), and then some more specific components of transaction workflows (i.e. "Transaction pipelining") could benefit from some of these graphs, or at least some subset or modified version of these graphs?
Yes, but I don't want to increase the scope of this PR. I totes agree that as a separate work item we should add WAY more diagrams to our architecture docs, to make them more accessible and understandable. Right now we are making the reader do the work of creating a diagram in their head / on a whiteboard, piece of paper, etc.
v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):
is that something that we want to change?
IMO yes. The proper noun name of the feature is "Parallel Commits" with a capitalized C. Our style guide already says to capitalize proper nouns, it appears we have not been doing that consistently in all cases. But I'd like to update the uncapitalized proper nouns as a separate work item for sure.
v19.2/architecture/transaction-layer.md, line 242 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
a new,
You mentioned that this protocol is "new" quite a bit. Do you think most people that are reading this know about the "old" protocol? If not, I don't know how much it helps to compare the "new" and the "old". I honestly don't know though, that's why I'm asking.
It may not help a ton in terms of understanding the technical content. However, since the feature is new in 19.2 I'd like to argue for keeping it in, since it provides a contrast between version 19.2 and all older versions, which will hopefully play a small role in getting people excited to upgrade -- for better performance, in this case. This type of language will get edited out as we move into 20.1 (for example, I edited out some dated language from the Transaction Pipelining section elsewhere in this PR that landed back in 2.1 days, but which doesn't make sense anymore).
v19.2/architecture/transaction-layer.md, line 244 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
The optimization is achieved by introducing a new atomic commit protocol that allows the transaction coordinator to return to the client eagerly when it knows that the writes in the transaction have succeeded. Once this occurs, the transaction coordinator can set the transaction record's state to `COMMITTED` and resolve the transaction's write intents asynchronously. The transaction coordinator is able to do this while maintaining correctness guarantees because it populates the transaction record with enough information (via a new `STAGING` state, and an array of in-flight writes) for other transactions to determine whether all writes in the transaction are present, and thus prove whether or not the transaction is committed.
Here are my readability nit suggestions (to take or leave!):
- Mainly simplifying sentence structure...
"Under the atomic commit protocol, the transaction coordinator eagerly returns to the client when it knows that the writes in a transaction have succeeded. After returning to the client, the transaction coordinator sets the transaction record's state to
COMMITTED
and resolves the transaction's write intents asynchronously."
- This is a single sentence, so I think we should split it up...
"While processing an open transaction, the transaction coordinator populates the transaction record with a
STAGING
state, and an array of in-flight writes. Other transactions can read from the transaction state to see whether writes in a transaction have been committed or not."
I took a slightly modified suggestion #1 , thanks! Simpler and clearer.
I'm more wary of edit #2 because it changes the meaning slightly and I don't want to introduce a subtle error. This language is pretty consistent across some of the internal materials (blog post draft, RFC). For example "while processing an open transaction" introduces a notion of "when" which is not entirely accurate in this case, since the coordinator does not populate the record yet, it waits until as late as possible in the transaction lifecycle. Same with saying "after returning to the client", since the timing of that is more concurrent/async.
(PS def not trying to be annoying, I'm just worried about staying super precise with these descriptions, esp. around what happens when.)
v19.2/architecture/transaction-layer.md, line 266 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Apple
I assume this is referring to the 'apple' key from the Transaction pipelining section... if so, shouldn't it be lowercase? Also, should we mention that this is continuing on from the Transaction pipelining example, or maybe be explicit here about what the write to the 'apple' key looks like, in SQL terms (maybe not because it could also be from an ORM or something, but a SQL example might still be nice as an example)?
They're not meant to be the same example, sorry for introducing that confusion. Each section is kinda standalone. In this section we are only concerned with writes in the key-value layer sense. I think this makes it a little easier to understand than using SQL, since we just mean "a write" in a KV sense, whereas a SQL statement could be much more complex and write to multiple keys. However I do think this could be improved in future by using a similar, minimal transaction example across the whole page. I'd rather not try to do that in this PR for scope reasons tho.
v19.2/architecture/transaction-layer.md, line 268 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
as an optimization.
Not sure what this means.
It means that it avoids creating a transaction record "object" or "struct" (or whatever it actually is in the implementation, a struct I think) which would mean writing stuff in memory and linking it to other stuff in memory, which might impose a time/space cost.
Do you think it's worth adding a sentence or two to clarify that?
Nathan do you have an opinion on the above?
v19.2/architecture/transaction-layer.md, line 278 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
issues a write to the "Berry" key
OK OK, so this is a separate example from the Transaction pipelining section example. Perhaps we should introduce the example (e.g. "Suppose that you have a table named "Fruits", with a "Key" column and a "Value" column, and you want to write values to the "Apple" and "Berry" key..." or something like that.
I think that's a good suggestion, but I'd like to avoid it if possible for the reasons mentioned above about trying to stay mostly out of SQL (aside from BEGIN/COMMIT) and stick closer to KV layer concepts since that means we have to explain less stuff in this case.
v19.2/architecture/transaction-layer.md, line 300 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Readability nit
its list of pending writes (i.e., `InFlightWrites`) have all succeeded (i.e., achieved consensus across the cluster)
Maybe...
"its list of pending writes have all succeeded (i.e., the
InFlightWrites
have achieved consensus across the cluster)..."
Updated with that, thanks! Much better - dang parentheticals!
v19.2/architecture/transaction-layer.md, line 304 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
(also known as the "transaction status recovery process")
Is this a term we use elsewhere? If not, I don't think it's necessary...
It's used in the next paragraph a couple times. It's sort of a succinct way of referring to a whole sequence of steps that may need to be taken in the hopefully rare situation that a crash happens. The term is introduced and defined in the RFC for this feature and in Nathan's blog post (yet to be published), so having it here will help searchability for folks who want to know more, I think.
v19.2/architecture/transaction-layer.md, line 306 at r3 (raw file):
Previously, ericharmeling (Eric Harmeling) wrote…
Readability nit...
other transactions, when they encounter a transaction in `STAGING` state, check whether the staging transaction is still in progress by verifying that the transaction coordinator is still heartbeating that staging transaction’s record.
Maybe
"When other transactions encounter a transaction in
STAGING
state, they check whether the staging transaction..."
Ah, better phrasing - thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work, @rmloveland. Sorry for the delay. Please merge this so we can call out these docs!
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ericharmeling, @jseldess, and @nvanbenschoten)
Thanks @jseldess !!!1! |
Update txn layer docs with Parallel Commits info
Addresses #4723.
Fixes #5465.
Summary of changes:
Add a new section 'Parallel commits' to the transaction layer documentation that describes what it is and how it works, with diagrams.
Update other areas of the document that mention behavior that is affected by this feature.