[WIP] data consistency checker for list requests #124963

Open: p0lyn0mial wants to merge 4 commits into master

p0lyn0mial
Contributor

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the following labels on May 20, 2024:
- size/XXL (Denotes a PR that changes 1000+ lines, ignoring generated files.)
- do-not-merge/release-note-label-needed (Indicates that a PR should not merge because it's missing one of the release note labels.)
- area/code-generation
- cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
- sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery.)
- do-not-merge/needs-kind (Indicates a PR lacks a `kind/foo` label and requires one.)
- sig/auth (Categorizes an issue or PR as relevant to SIG Auth.)
- do-not-merge/needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one.)

@k8s-ci-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on May 20, 2024
@k8s-ci-robot requested a review from deads2k on May 20, 2024, 10:14
@k8s-ci-robot added the sig/instrumentation label (Categorizes an issue or PR as relevant to SIG Instrumentation.) on May 20, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

@k8s-ci-robot added the needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one.) and needs-priority (Indicates a PR lacks a `priority/foo` label and requires one.) labels, and removed the do-not-merge/needs-sig label, on May 20, 2024
@p0lyn0mial
Contributor Author

/assign @wojtek-t
/cc @serathius

//
// if ResourceVersion = "" and ConsistentListFromCache is disabled or RequestWatchProgress isn't supported,
// then the request will be served from the storage.
func wasListRequestServedFromStorage(opts metav1.ListOptions) bool {
Contributor Author

Copied from cacher.go. It looks like it does what we want, except for the case mentioned in the comment, and once rows 1 and 3 from the KEP are clarified.
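
For readers following along, the rule stated in the code comment above can be sketched in isolation. This is a minimal illustration, not the code from cacher.go: `listOptions`, `serverState`, and `matchesEmptyRVStorageRule` are hypothetical names, and the sketch models only the empty-ResourceVersion case that the comment describes.

```go
package main

import "fmt"

// listOptions is a hypothetical stand-in for metav1.ListOptions,
// reduced to the single field the quoted comment mentions.
type listOptions struct {
	ResourceVersion string
}

// serverState is a hypothetical bundle of the two conditions named in the comment.
type serverState struct {
	ConsistentListFromCacheEnabled bool
	RequestWatchProgressSupported  bool
}

// matchesEmptyRVStorageRule reports whether the rule from the comment applies:
// ResourceVersion is empty AND (the feature is disabled OR watch progress
// requests are unsupported). Any other reason a request might reach the
// storage is deliberately not modeled here.
func matchesEmptyRVStorageRule(opts listOptions, s serverState) bool {
	if opts.ResourceVersion != "" {
		return false
	}
	return !s.ConsistentListFromCacheEnabled || !s.RequestWatchProgressSupported
}

func main() {
	s := serverState{ConsistentListFromCacheEnabled: true, RequestWatchProgressSupported: false}
	fmt.Println(matchesEmptyRVStorageRule(listOptions{}, s))
}
```

Note the grouping: the "or" binds the two server-side conditions together, and the empty ResourceVersion is required in both cases.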

//
// Note that this function will panic when data inconsistency is detected.
// This is intentional because we want to catch it in the CI.
func CheckListAgainstCacheDataConsistencyIfRequested[T runtime.Object](ctx context.Context, identity string, listItemsFn listItemsFunc[T], optionsUsedToReceiveList metav1.ListOptions, receivedList runtime.Object) {
Contributor Author

We need to find a better home for this function. Depending on where it is defined and whether it will be public, we could consider adding more validation for opts (e.g., checking if it was actually a list request).
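
To make the "panic so CI catches it" behavior concrete, here is a rough sketch of an opt-in checker. It assumes an environment-variable guard like the one the next hunk mentions; the variable name `EXAMPLE_DATA_CONSISTENCY_CHECK` and the helpers `consistencyCheckRequested`, `checkIfRequested`, and `didPanic` are invented for this illustration and are not the PR's API.

```go
package main

import (
	"fmt"
	"os"
	"reflect"
)

// envVar is a hypothetical variable name, chosen only for this sketch.
const envVar = "EXAMPLE_DATA_CONSISTENCY_CHECK"

// consistencyCheckRequested reports whether the opt-in guard is set.
func consistencyCheckRequested() bool {
	return os.Getenv(envVar) == "true"
}

// checkIfRequested compares two snapshots of the same collection and panics
// on a mismatch, but only when the check was requested.
func checkIfRequested[T any](requested bool, identity string, received, expected []T) {
	if !requested {
		return
	}
	if !reflect.DeepEqual(received, expected) {
		panic(fmt.Sprintf("data inconsistency detected for %s", identity))
	}
}

// didPanic runs f and reports whether it panicked.
func didPanic(f func()) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	f()
	return false
}

func main() {
	requested := consistencyCheckRequested()
	checkIfRequested(requested, "demo list", []string{"a", "b"}, []string{"a", "b"})
	fmt.Println("no inconsistency detected")
}
```

Keeping the guard outside the comparison means production callers pay only for one environment lookup when the check is off.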

@@ -64,7 +115,7 @@ func checkWatchListDataConsistencyIfRequested(ctx context.Context, identity stri
// it is guarded by an environmental variable.
// we cannot manipulate the environmental variable because
// it will affect other tests in this package.
-func checkDataConsistency(ctx context.Context, identity string, lastSyncedResourceVersion string, listItemsFn listItemsFunc, retrieveCollectedItemsFn retrieveCollectedItemsFunc) {
+func checkDataConsistency[T runtime.Object, U any](ctx context.Context, identity string, lastSyncedResourceVersion string, listItemsFn listItemsFunc[T], retrieveCollectedItemsFn retrieveCollectedItemsFunc[U]) {
Contributor Author

We need to find a better home for this function. It is shared by checkWatchListDataConsistencyIfRequested and CheckListAgainstCacheDataConsistencyIfRequested.

@p0lyn0mial force-pushed the upstream-data-consistency-checker-for-list-requests branch from 6fe1b36 to 6583cc6 on May 20, 2024, 10:27
@k8s-ci-robot removed the needs-rebase label on May 20, 2024
@@ -461,6 +462,11 @@ func new$.type|publicPlural$(c *$.GroupGoName$$.Version$Client) *$.type|privateP
var listTemplate = `
// List takes label and field selectors, and returns the list of $.resultType|publicPlural$ that match those selectors.
func (c *$.type|privatePlural$) List(ctx context.Context, opts $.ListOptions|raw$) (result *$.resultType|raw$List, err error) {
defer func() {
if err != nil {
Member

Is this really what you want? We want to call it when the error is nil, right?

Contributor Author

Oops, yes, it should have been `if err == nil`. Thanks.
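
The point of this exchange is easy to see in a stripped-down version of the pattern: a deferred closure reads the named `err` result after the function body has returned, so the condition decides whether the hook runs on success or on failure. The names `list`, `postListCheck`, and `checks` below are placeholders for this sketch, not the generated client code.

```go
package main

import (
	"errors"
	"fmt"
)

// checks counts how often the hypothetical post-list hook ran.
var checks int

// postListCheck is a hypothetical stand-in for the consistency checker.
func postListCheck(result []string) { checks++ }

// list mimics the shape of the generated List method: named results plus a
// deferred hook that inspects err once the return values have been set.
func list(fail bool) (result []string, err error) {
	defer func() {
		if err == nil { // the corrected condition: only check successful lists
			postListCheck(result)
		}
	}()
	if fail {
		return nil, errors.New("list failed")
	}
	return []string{"a", "b"}, nil
}

func main() {
	_, _ = list(true)  // failure: the hook is skipped
	_, _ = list(false) // success: the hook runs once
	fmt.Println("checks run:", checks)
}
```

With `err != nil`, the hook would run only for failed requests, where there is no result to verify.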

@@ -697,7 +697,7 @@ func (r *Reflector) watchList(stopCh <-chan struct{}) (watch.Interface, error) {
// we utilize the temporaryStore to ensure independence from the current store implementation.
// as of today, the store is implemented as a queue and will be drained by the higher-level
// component as soon as it finishes replacing the content.
-checkWatchListConsistencyIfRequested(stopCh, r.name, resourceVersion, r.listerWatcher, temporaryStore)
+checkWatchListDataConsistencyIfRequested(wait.ContextForChannel(stopCh), fmt.Sprintf("watch-list reflector with name: %q", r.name), resourceVersion, wrapListFuncWithContext(r.listerWatcher.List), temporaryStore.List)
Member

Actually, seeing how you want to use it now and where the problems are, let's first merge this commit as #124446.

}
`

var privateListTemplate = `
Member

I'm not sure I understand the reasoning behind this change.

Contributor Author

It allows passing c.list to cache.CheckListAgainstCacheDataConsistencyIfRequested:

cache.CheckListAgainstCacheDataConsistencyIfRequested(ctx, "list request for examples", c.list, opts, result)
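
A simplified sketch of that split follows. One plausible motivation, which is an assumption here and not stated in the thread, is that the checker needs a function that performs a plain list without triggering the check again. The types `examples` and `exampleList` and the helper `checkList` are hypothetical, and the options are reduced to a string for brevity.

```go
package main

import (
	"fmt"
	"reflect"
)

// exampleList is a hypothetical result type for this sketch.
type exampleList struct{ Items []string }

// examples is a hypothetical typed client; calls counts requests sent.
type examples struct{ calls int }

// list performs the request without any consistency checking.
func (c *examples) list(opts string) (*exampleList, error) {
	c.calls++
	return &exampleList{Items: []string{"a", "b"}}, nil
}

// checkList is a hypothetical checker: it re-issues the request through the
// function value it is given and panics if the two results differ.
func checkList[T any](identity string, listFn func(opts string) (T, error), opts string, received T) {
	again, err := listFn(opts)
	if err != nil {
		return // this sketch simply skips the comparison on error
	}
	if !reflect.DeepEqual(again, received) {
		panic(fmt.Sprintf("data inconsistency detected for %s", identity))
	}
}

// List is the exported entry point; it delegates to list and then hands the
// unexported method to the checker as a function value.
func (c *examples) List(opts string) (result *exampleList, err error) {
	defer func() {
		if err == nil {
			checkList("list request for examples", c.list, opts, result)
		}
	}()
	return c.list(opts)
}

func main() {
	c := &examples{}
	result, err := c.List("")
	fmt.Println(len(result.Items), err, c.calls)
}
```

Had the checker been given the exported `List` instead, every verification request would schedule another verification.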

checkDataConsistency(ctx, identity, lastSyncedResourceVersion, listItemsFn, func() []runtime.Object { return rawListItems })
}

// wasListRequestServedFromStorage based on the passed ListOptions determines
Member

We should rather evolve it to something like:

"if continuation wasn't set, then call LIST with RV= and ResourceVersionMatch=Exact and the rest of the parameters unchanged"

That seems simpler. It may run the check when it is not strictly needed, but that's not a big deal.
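
As a rough sketch of that suggestion: derive the options for the verification request from the original ones, skipping requests that carry a continuation token. The `listOptions` struct is a hypothetical, trimmed-down stand-in for metav1.ListOptions, `exactOptions` is an invented helper, and the resource version is passed in as a parameter because the comment above leaves its value unspecified.

```go
package main

import "fmt"

// listOptions is a hypothetical, trimmed-down stand-in for metav1.ListOptions.
type listOptions struct {
	LabelSelector        string
	FieldSelector        string
	Limit                int64
	Continue             string
	ResourceVersion      string
	ResourceVersionMatch string
}

// exactOptions returns the options for the verification LIST and reports
// whether a verification should be attempted at all.
func exactOptions(opts listOptions, rv string) (listOptions, bool) {
	if opts.Continue != "" {
		// A continuation token is set: this sketch skips such requests.
		return listOptions{}, false
	}
	out := opts // struct copy, so the caller's options stay untouched
	out.ResourceVersion = rv
	out.ResourceVersionMatch = "Exact"
	return out, true
}

func main() {
	out, ok := exactOptions(listOptions{LabelSelector: "app=demo", Limit: 10}, "42")
	fmt.Printf("%+v %v\n", out, ok)
}
```

Everything except the two resource-version fields is carried over as is, which matches the "rest of the parameters unchanged" part of the suggestion.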

@k8s-ci-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on May 21, 2024
@p0lyn0mial force-pushed the upstream-data-consistency-checker-for-list-requests branch from 6583cc6 to f7457cb on May 27, 2024, 13:41
@p0lyn0mial changed the title from "[POC][WIP] data consistency checker for list requests" to "[WIP] data consistency checker for list requests" on May 27, 2024
@k8s-ci-robot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on May 27, 2024
@p0lyn0mial
Contributor Author

/retest

@k8s-ci-robot removed the needs-rebase label on May 27, 2024
@p0lyn0mial force-pushed the upstream-data-consistency-checker-for-list-requests branch from f7457cb to 57c4f77 on May 29, 2024, 09:16
@@ -172,6 +172,7 @@ func (g *genClientForType) GenerateType(c *generator.Context, t *types.Type, w i
"RESTClientInterface": c.Universe.Type(types.Name{Package: "k8s.io/client-go/rest", Name: "Interface"}),
"schemeParameterCodec": c.Universe.Variable(types.Name{Package: path.Join(g.clientsetPackage, "scheme"), Name: "ParameterCodec"}),
"jsonMarshal": c.Universe.Type(types.Name{Package: "encoding/json", Name: "Marshal"}),
"CheckListFromCacheDataConsistencyIfRequested": c.Universe.Function(types.Name{Package: "k8s.io/client-go/util/consistencydetector", Name: "CheckListFromCacheDataConsistencyIfRequested"}),
Member

nit: let's merge the first two commits together

@p0lyn0mial
Contributor Author

/test pull-kubernetes-node-e2e-containerd

@p0lyn0mial force-pushed the upstream-data-consistency-checker-for-list-requests branch from 57c4f77 to 472043e on May 29, 2024, 10:33
@p0lyn0mial force-pushed the upstream-data-consistency-checker-for-list-requests branch from 472043e to 42da3cf on May 29, 2024, 11:04
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: p0lyn0mial
Once this PR has been reviewed and has the lgtm label, please ask for approval from wojtek-t. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Contributor

@p0lyn0mial: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-unit | 42da3cf | link | true | `/test pull-kubernetes-unit` |
| pull-kubernetes-integration | 42da3cf | link | true | `/test pull-kubernetes-integration` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Labels: area/apiserver, area/code-generation, cncf-cla: yes, do-not-merge/needs-kind, do-not-merge/release-note-label-needed, do-not-merge/work-in-progress, needs-priority, needs-triage, sig/api-machinery, sig/auth, sig/instrumentation, size/XXL