TestWebhookConverter storage version wait is flaky #78913
/cc @roycaihw |
Several of the webhook converter tests have similar flakes. It looks like the wait for the storage version to become effective occasionally times out: https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&test=TestWebhookConverter Rare flakes seen in all of these tests:
|
cc @jpbetz |
Yikes, I’ll sort it out. |
IIUC, sending an empty patch gets short-circuited and doesn't bump the CR generation, even if the storage version has changed, which means we do not rewrite to etcd and go through the encoding path (kubernetes/staging/src/k8s.io/apiextensions-apiserver/test/integration/conversion/conversion_test.go, line 958 in ec02afb).
This would be racing and doing no-ops if the first pass of [...] I will send a fix to make the patch do an actual mutation. |
It doesn't bump generation, but the bytes to persist in etcd would still be different if the storage version had changed, so they should be persisted. You can verify that by changing the storage version on a CRD yourself, doing an empty patch, and checking whether the resourceVersion changes. |
You're right. Generation is bumped for non-metadata changes only, and the RV does change. I queried etcd locally and the apiVersion did change after an empty patch. (Sidetracking: another observation is that after changing the storage version (say v1 -> v2, with a no-op converter), empty-patching the v2 endpoint bumps generation, while empty-patching the v1 endpoint doesn't.) |
Another theory: our cached storage strategy could be stale if the CRD informer merges multiple events together (e.g. the informer does a re-list). Two scenarios:
The strategy with UID:b is supposed to be deleted and re-created on demand, to reflect the last update. But if the CRD informer merges the delete-create-update events into a single update event, the strategy with UID:a will be deleted, but the strategy with UID:b won't.
The strategy with UID:a is supposed to be deleted and re-created on demand, to reflect the last update. But if the CRD informer merges the create-update events into a single create event, the strategy with UID:a won't be deleted (as we don't react on create events).
Another race was found in #79114 (comment), which could happen in a single update. |
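The two scenarios above can be sketched with a toy cache keyed by CRD UID. This is only an illustration of the theory, not the apiserver's code: `crd`, `strategyCache`, and the handler names are hypothetical, and it assumes the strategy was built on demand from the intermediate CRD state before the merged event was delivered.

```go
package main

import "fmt"

// crd is a hypothetical, stripped-down CRD: identity plus storage version.
type crd struct {
	UID            string
	StorageVersion string
}

// strategyCache maps a CRD UID to the storage version its cached strategy
// was built with.
type strategyCache struct {
	byUID map[string]string
}

func newStrategyCache() *strategyCache {
	return &strategyCache{byUID: map[string]string{}}
}

// get returns the cached strategy for the CRD, building it on demand.
func (c *strategyCache) get(current crd) string {
	if v, ok := c.byUID[current.UID]; ok {
		return v
	}
	c.byUID[current.UID] = current.StorageVersion
	return current.StorageVersion
}

// Event handlers: update and delete drop the entry for the OLD object's UID;
// create events are ignored, as described in the thread.
func (c *strategyCache) onAdd(crd)           {}
func (c *strategyCache) onUpdate(old, _ crd) { delete(c.byUID, old.UID) }
func (c *strategyCache) onDelete(old crd)    { delete(c.byUID, old.UID) }

// scenarioOne: CRD a is deleted, CRD b is created at v1 and updated to v2.
func scenarioOne(merged bool) string {
	a := crd{UID: "a", StorageVersion: "v1"}
	b1 := crd{UID: "b", StorageVersion: "v1"}
	b2 := crd{UID: "b", StorageVersion: "v2"}
	c := newStrategyCache()
	c.get(a)
	c.get(b1) // strategy for b built from the intermediate state (assumption)
	if merged {
		c.onUpdate(a, b2) // delete-create-update seen as one update
	} else {
		c.onDelete(a)
		c.onAdd(b1)
		c.onUpdate(b1, b2)
	}
	return c.get(b2)
}

// scenarioTwo: CRD a is created at v1 and then updated to v2.
func scenarioTwo(merged bool) string {
	a1 := crd{UID: "a", StorageVersion: "v1"}
	a2 := crd{UID: "a", StorageVersion: "v2"}
	c := newStrategyCache()
	c.get(a1) // strategy for a built from the intermediate state (assumption)
	if merged {
		c.onAdd(a2) // create-update seen as one create, which is ignored
	} else {
		c.onAdd(a1)
		c.onUpdate(a1, a2)
	}
	return c.get(a2)
}

func main() {
	fmt.Println("scenario 1:", scenarioOne(false), "vs merged:", scenarioOne(true))
	fmt.Println("scenario 2:", scenarioTwo(false), "vs merged:", scenarioTwo(true))
}
```

In both merged cases the model keeps serving the strategy built for v1, which is the staleness the theory describes.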
I think I just saw this:
|
Which jobs are failing:
pull-kubernetes-integration
https://storage.googleapis.com/k8s-gubernator/triage/index.html?pr=1&test=TestWebhookConverter
Which test(s) are failing:
TestWebhookConverterWithDefaulting
/assign @sttts
/priority important-soon
/area custom-resources
/sig api-machinery
/kind flake
/milestone v1.16