Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] cacher: short circuit conditional progress notifier #124928

Conversation

MadhavJivrajani
Copy link
Contributor

@MadhavJivrajani MadhavJivrajani commented May 17, 2024

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

There can exist an extremely unlikely scenario where
the progress requester can repeatedly request progress
notifications even when the feature might not be supported.

The feature checker returns true if any node in an etcd
cluster has the feature enabled. However, it may change this
to false if it eventually detects a node which does not
have this enabled. Ideally, till we detect false we work
under the assumption that this feature is supported, however
we should stop requesting progress notifications as soon as
we get a false in order to prevent loading etcd with
redundant requests.

Which issue(s) this PR fixes:

xref: #124867 (comment)

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/sig api-machinery scalability
/cc @p0lyn0mial @wojtek-t

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 17, 2024
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 17, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MadhavJivrajani
Once this PR has been reviewed and has the lgtm label, please assign liggitt for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

There can exist an extremely unlikely scenario where
the progress requester can repeatedly request progress
notifications even when the feature might not be supported.

The feature checker returns `true` if any node in an etcd
cluster has the feature enabled. However, it may change this
to `false` if it eventually detects a node which does not
have this enabled. Ideally, till we detect `false` we work
under the assumption that this feature is supported, however
we should stop requesting progress notifications as soon as
we get a `false` in order to prevent loading etcd with
redundant requests.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
// redundant requests.
if !etcdfeature.DefaultFeatureSupportChecker.Supports(storage.RequestWatchProgress) {
pr.mux.Lock()
pr.stopped = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we should stop the requestor here, maybe we could just change the shouldRequest function.

The question is whether we want to check the feature gate in the requestor or simply reserve/delegate the checking of that flag to the higher layers.

/cc @wojtek-t @serathius

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup - this is not a place where we should check it.

We're aware of the issue where it may change, but it was conscious decision and we generally don't expect setups where there are multiple etcd clusters backing a single cluster being in different versions.
So we consciously went for simplicity and accept a corner case we're not handling that should never happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MadhavJivrajani does the above make sense to you? We could always revisit this PR in the future if our assumptions are wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me @p0lyn0mial @wojtek-t, thanks.

For my curiosity, if we were to revisit this in the future sometime, what would be the right place to check this if not the progress checker? I would have assumed that since it can change at runtime, its better to check it at the place where its periodically requesting the additional request.

/close

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 26, 2024
@k8s-ci-robot
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Contributor

@MadhavJivrajani: Closed this PR.

In response to this:

That makes sense to me @p0lyn0mial @wojtek-t, thanks.

For my curiosity, if we were to revisit this in the future sometime, what would be the right place to check this if not the progress checker? I would have assumed that since it can change at runtime, its better to check it at the place where its periodically requesting the additional request.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants