New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPEC-1185: Apply majority write concern when retrying commitTransaction #442
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you missed this one test ("multiple commits in a row" in commit.yml):
FAIL: test_transactions_commit_multiple_commits_in_a_row (test.test_transactions.TestTransactions)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/shane/git/mongo-python-driver/test/__init__.py", line 411, in wrap
return f(*args, **kwargs)
File "/Users/shane/git/mongo-python-driver/test/test_transactions.py", line 594, in run_scenario
self.check_events(test, listener, session_ids)
File "/Users/shane/git/mongo-python-driver/test/test_transactions.py", line 427, in check_events
key, actual))
AssertionError: Unexpected key [writeConcern] in SON([('commitTransaction', 1), ('writeConcern', {'w': 'majority'}), ('lsid', 'session0'), ('txnNumber', 1L), ('autocommit', False)])
specified via TransactionOptions during the ``startTransaction`` call or | ||
otherwise inherited), any other write concern options (e.g. ``wtimeout``) MUST | ||
be left as-is when applying ``w: majority``. See | ||
`Majority write concern is used when retrying commitTransaction`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any subtleties with replacing the w
option with majority
? If w
is a tag set then it would be possible for the write to be confirmed with w:majority
but not necessarily to the set of nodes the user requested. Is there any way to combine majority with a tag set?
I think this is an acceptable trade off to make. Rerunning a transaction by mistake seems worse than possibly not acknowledging the transaction has replicated to the desired nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A write concern tag set is expressed as a string via the w
option, so it's mutually exclusive with a majority write concern. Furthermore, you actually can't have a custom tag set named "majority". I'm not sure if the server actually prevents you from defining one, but I do expect that a "majority" string for w
will be interpreted as a "majority of the replica set" before consulting for a tag set with that name. See: https://docs.mongodb.com/manual/reference/write-concern/#w-option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, but what about my other question? If the application is using w:3
on a 3 node replica set and the driver replaces this with w:majority
then the transaction will only wait for the write to be present on 2 nodes. Should we add a note explaining this subtlety?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to add a note about the edge case of altering a write concern to the Q&A section. Your question here overlaps with some discussion @p-mongo and I had in the Slack channel for the withTransaction()
spec:
Oleg: So for 1185, we are going to downgrade user's write concern to majority if they specified a number which is greater than majority?
Jeremy: i don't recall if "majority" is special in this respect, beyond its normal specialness of always being the computed majority for the particular RS topology. I think this might have come up in since-resolved comments on the design document, but Tess could certainly confirm. That said, since a replica set could have hidden nodes (which afaik do count towards a majority), I don't think a driver could even reliably calculate the majority -- assuming Oleg is asking about having the driver possibly compare if the original write concern was > majority in order to preserve it
Jeremy: i'm also not sure if there's a case for a 5 node replica set (just for example) where w:4 has any benefit over w:majority. Possibly for geographic distribution but then I'd expect users to use a custom write concern string -- and with those a driver certainly doesn't know if such a string would be rollback-proof, which majority affords us. therefore, i think "majority" is always reasonable for our purpose of retrying
Aly pointed out that another special property of "majority" is that it implies journaling. That should probably be mentioned here as well. It also makes me wonder what would happen if the commit was originally using something like {w: 2, j: false}
and we "upgraded" it to {w: "majority", j: false}
-- would that result in an error, or would the server ignore j: false
? Something I should test before finalizing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this SGTM. Note that another option could be for drivers to issue a retry with w:majority
and then perform a final attempt with the user's original write concern. This would ensure that both w:majority
and the user's write concern are respected but I think this idea is too complex for too little benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur that the final attempt with the original write concern would invite more complexity than it's worth. What if that fails? Do we end up retrying a w:majority
for no tangible benefit?
I've also noted discussion in Slack about writeConcernMajorityJournalDefault)
. While w:majority
may default to j:true
, it's still kosher to pass { w: "majority", j: false }
as a write concern. Since it's unlikely for users to customize writeConcernMajorityJournalDefault
, I think we're fine relying on the existing design of only overriding w
and leaving other WC options as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShaneHarvey: Did you get confirmation regarding your Slack comment?
Also AFAIK a w:majority is “special” in that it updates the primary’s majority commit point whereas a w:N write does not.
I'd like to mention that in any note added to justify why we feel overriding w
with "majority" is safe, even if the user's original write concern was more strict (e.g. w:4
in a 5 node replica set).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a final paragraph in "Majority write concern is used when retrying commitTransaction" to elaborate on why we're OK with applying "majority" over any existing w
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Judah on slack (paraphrased a bit):
Me:
We have a question that’s related to a drivers spec change: how does writeConcern:{w:“majority”} differ from say {w:2}? AFAIK w:majority is “special” in that it updates the primary’s majority commit point whereas a w:N write may not.
More specifically, we are wondering whether the edge case described in https://jira.mongodb.org/browse/SPEC-1185 would still be possible if the transaction was committed with {w:2} instead of {w:1} (assuming this is a 3 node replica set).
Judah:
w:majority waits on the majority commit point, rather than just what gets replicated to a majority of nodes. In general those are mostly the same. w:majority waits a little longer to get read your own writes for read concern majority, but that doesn’t give actual guarantees. nodes also wait for the “first optime of their term” to be committed for w:majority but not for w:2.
So the short answer is, w:2 may be safe, but w:majority definitely is, without me looking a lot more into that Spec
Siyuan:
Because [nodes also wait for the “first optime of their term” to be committed for w:majority but not for w:2], w:2 isn’t safe since it may mistake something written in earlier terms but replicated in the new term as “committed”, even though the write can roll back as it’s before the “first optime of their term”. We could make w:2 to wait longer to consider “first optime of their term” though
@ShaneHarvey: I saw the same failure with "multiple commits in a row" in mongodb/mongo-c-driver#558. Should be fixed now. |
of committing the transaction twice. Note that users may still be vulnerable to | ||
rollbacks by using ``w:1`` (as with any write operation). | ||
|
||
While it's possible that the original write concern may provide greater |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShaneHarvey: Justification for overriding w
with "majority" is here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Is there still a chance for the txn to be reapplied with {w: "majority", j: False}
? If so, I think we should also set "j: True".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, very good point. Do you think we should rename this section and the SPEC ticket to talk about applying a durable write concern instead of just "majority"? Alternatively, we can just include "j" overrides (as-needed) in the implementation details. This is probably an edge case, although not as edge-casey as a user setting writeConcernMajorityJournalDefault
to false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, I think we should omit overriding j
. In the event that the server is running with journaling disabled the write concern might never be satisfiable. If the user has explicitly provided a j
value in their original write concern, I think we should leave it as-is. Otherwise, we should rely on writeConcernMajorityJournalDefault
. If writeConcernMajorityJournalDefault
, we should trust that it was done so for good reason (again, perhaps journaling is disabled) and not interfere with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! For those wondering a server started with nojournal
gives an error with j:true
:
> db.test.runCommand({insert:'test', documents:[{}], writeConcern: {w:'majority', j: true}})
{
"ok" : 0,
"errmsg" : "cannot use 'j' option when a host does not have journaling enabled",
"code" : 2,
"codeName" : "BadValue"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we still worried about potentially blocking for a long time when the set includes arbiters? Should this PR be on hold until we figure out what to do about wtimeout
?
of committing the transaction twice. Note that users may still be vulnerable to | ||
rollbacks by using ``w:1`` (as with any write operation). | ||
|
||
While it's possible that the original write concern may provide greater |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Is there still a chance for the txn to be reapplied with {w: "majority", j: False}
? If so, I think we should also set "j: True".
I've updated the spec and tests for applying a default Also adding @jyemin as a reviewer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of the tests that used an explicit writeConcern: {w:majority} are now failing because they are missing the wtimeout:10000
. Rather than pointing them all out in this review I've opened a PR that fixes them in your branch: jmikola#1.
Add missing wtimeout:10000 to remaining tests
https://jira.mongodb.org/browse/SPEC-1185
Opening the PR with just the spec language change. Test changes will be added in time.