Fix progress notification for watch that doesn't get any events #17557

Merged on Mar 11, 2024 (1 commit)

serathius
Member

When implementing the fix for progress notifications (#15237), we made an incorrect assumption that unsynced watches will always get at least one event.

Unsynced watches include not only slow watchers but also newly created watches that requested an old revision. If none of the events match the watch filter, those newly created watches might become synced without any event going through.

Fixes #17507
/cc @scpmw @ahrtr
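
To illustrate the scenario described above, here is a toy model in Go. It is not etcd's implementation: the `event` and `watcher` types and the `catchUp` function are hypothetical names made up for this sketch. It only shows how a newly created watch that starts at an old revision can finish catching up, and so become synced, without a single event passing its filter.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a key change at a given revision.
type event struct {
	key string
	rev int64
}

// watcher is a toy model of a watch; it is not etcd's internal watcher type.
type watcher struct {
	key       string // filter: only events on this key are delivered
	nextRev   int64  // next revision the watcher still has to scan
	synced    bool   // true once the watcher has caught up
	delivered int    // number of events that passed the filter
}

// catchUp replays history from w.nextRev up to currentRev. The watcher is
// marked synced once it reaches currentRev, whether or not any event matched.
func catchUp(w *watcher, history []event, currentRev int64) {
	for _, ev := range history {
		if ev.rev < w.nextRev || ev.rev > currentRev {
			continue
		}
		if ev.key == w.key {
			w.delivered++
		}
	}
	w.nextRev = currentRev + 1
	w.synced = true
}

func main() {
	history := []event{{"a", 2}, {"a", 3}, {"b", 4}}

	// A new watch on key "c" that requested the old revision 2.
	w := &watcher{key: "c", nextRev: 2}
	catchUp(w, history, 4)

	fmt.Printf("synced=%v delivered=%d\n", w.synced, w.delivered)
	// prints "synced=true delivered=0"
}
```

Any logic that waits for "the next event on an unsynced watch" before sending a deferred progress notification would never fire for a watcher like `w` in this model.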

@k8s-ci-robot k8s-ci-robot requested a review from ahrtr March 8, 2024 10:24
@k8s-ci-robot

@serathius: GitHub didn't allow me to request PR reviews from the following users: scpmw.

Note that only etcd-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

@serathius
Member Author

/cc @wojtek-t @p0lyn0mial

@k8s-ci-robot

@serathius: GitHub didn't allow me to request PR reviews from the following users: wojtek-t, p0lyn0mial.

Note that only etcd-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

@serathius
Member Author

/retest

@serathius
Member Author

ping @ahrtr

Two outdated, resolved review threads on tests/integration/v3_watch_test.go.

@p0lyn0mial

/cc @wojtek-t @p0lyn0mial

Thanks @serathius for putting together this PR!

@k8s-ci-robot

@p0lyn0mial: GitHub didn't allow me to request PR reviews from the following users: p0lyn0mial, wojtek-t.

Note that only etcd-io members and repo collaborators can review this PR, and authors cannot review their own PRs.

When implementing the fix for progress notifications
(etcd-io#15237), we made an incorrect
assumption that unsynced watches will always get at least one event.

Unsynced watches include not only slow watchers but also newly created
watches that requested a current or older revision. If none of the events
match the watch filter, those newly created watches might become synced
without any event going through.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
sws.deferredProgress = true
}
}
sws.watchStream.RequestProgressAll()
Member

nit: print a debug log when it fails to send out the progress notification.

Member Author

Let me address it in a separate PR; I think more places in the watch code need debug logging.

Member

@ahrtr ahrtr left a comment

lgtm

@scpmw
Contributor

scpmw commented Apr 5, 2024

So the logic here is that because the server failed to answer the progress notification request on newly created watches, we now also don't send it in other scenarios where it is unsynced? If so it should really be documented, because this effectively changes the semantics of WatchProgressRequest to:

// Requests a watch stream progress status be sent in the watch response stream. The server does
// not guarantee such a progress notification WatchResponse to be sent, and may even not send
// any WatchResponse at all. The client should re-try as required.

Which - as #15237 - is okay with me, as I had to implement the re-try workaround anyway. But this might be confusing to others.

Development

Successfully merging this pull request may close these issues.

Etcd not respond to progress notification on newly created watch RPC until there is a event in a watch
5 participants