-
Notifications
You must be signed in to change notification settings - Fork 9.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Separate persisted responses without knowing their revision to prevent duplicating state during linearization #18214
Separate persisted responses without knowing their revision to prevent duplicating state during linearization #18214
Conversation
@serathius: GitHub didn't allow me to request PR reviews from the following users: fykaa. Note that only etcd-io members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
51cfad7
to
df4b8bd
Compare
I have idea for big rewrite of history patching, however it would require more tests. As this PR would also benefit from them I wait with it until we have tests. |
// MaybeEtcdResponse extends EtcdResponse to include partial information about responses to a request. | ||
// Possible response state information: | ||
// * Normal response. Client observed response. Only EtcdResponse is set. | ||
// * Persisted. Client didn't observe response, but we know it was persisted by etcd. Only Persisted is set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q. So, meaning, the PUT was finalized on the consensus + storage, but response wasn't received from a client point of view? Seems confusing, given the function above describeEtcdResponse
strings this as "unknown"? Should we make the naming more consistent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A great question, I have been thinking about changing the naming here.
So, meaning, the PUT was finalized on the consensus + storage, but response wasn't received from a client point of view?
Yes, from client perspective it got error so we don't know the response, but we know that request was persisted thanks to looking up WAL.
Seems confusing, given the function above describeEtcdResponse strings this as "unknown"? Should we make the naming more consistent?
The naming in robustness tests is constantly evolving based on feedback. Name "unknown" was meant to just mean we don't know the exact response but we knew it's revision due to matching watch response. Name "persisted" was introduced only recently into code with PR that added reading the WAL.
I think it would make sense to rename "unknown" to "persisted" in describeEtcdResponse. Does that sound good for everyone? @MadhavJivrajani @siyuanfoundation @fuweid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. "persisted" makes sense to me. Defer to other contributors :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"persisted" makes sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to "persisted"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming to 'persisted' makes sense in the context of WAL.
I am trying to understand could there be any edge case where 'persisted' might not fully capture the state, or scenarios where additional qualifiers might be necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to understand could there be any edge case where 'persisted' might not fully capture the state, or scenarios where additional qualifiers might be necessary?
Good question, I think the second case would be when request generates a event and we observe it on watch. This is where the "PersistedRevision" comes from. So name "persisted" might not be capturing all cases. Maybe "observed" would be capturing the whole meaning. Observed not by client, but by other side effects, like WAL or WATCH. Don't know if "observed" is better than "persisted" though.
Let's not block this PR on picking the name. If someone is interested feel free to file a separate PR to propose and discuss a new name.
I am confused by the PR. Can you give some context why the code before this change would result in duplicating states? |
This code is meant to remove the duplicated states caused by This is codes in the following function, the switch is from the most lenient to the most strict option of validating response:
If you check the
If we know that request was persisted by etcd WAL, then we can just assume that it executed and use different apply code:
|
44921d3
to
972d2be
Compare
…t duplicating state during linearization Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
972d2be
to
1870222
Compare
/retest |
Any reviewers willing to review? Just making sure we merge it soon as the linearization timeouts have become frequent enough that they happen on presubmit https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/etcd-io_etcd/18227/pull-etcd-robustness-amd64/1805320012408295424 |
Thanks for explaining the details! LGTM |
/cc @siyuanfoundation @MadhavJivrajani @fuweid @fykaa @henrybear327