bug: flake: workspace(s) not initialized in time #2479
Comments
We need to amend that
One specific (new?) reason this happens: #2440 (comment)
Saw this happen again in https://github.com/kcp-dev/kcp/actions/runs/3968949742/jobs/6802786983. Did some digging into the logs, and it looks like the
Then the apibinder initializer tries multiple times to remove its initializer. I see this over and over:
It's trying to patch at resourceVersion 3861 to remove the
This is the timing of the attempts from the apibinder controller to remove its initializer, each resulting in a conflict:
It looks like it's gone into backoff. I am not clear why the reconciler continues to try to reconcile from RV 3861 instead of seeing a newer version in the cache. The last entry (and another one that followed) were after the test had failed, cleanup had occurred, and the LogicalCluster no longer existed. Note that it's still trying RV 3861. I'm a little confused why it's showing both
The
Is there any chance that the forwarding storage (pkg/virtual/framework/forwardingregistry) is causing this staleness?
Seeing stuff like this
Here's a filtered set of logs showing reflector activity plus httplog for list/watch of logicalclusters for the time period in question
Can repro the 1-minute timeout easily locally with cmd/sharded-test-server. Looking into why some reverse proxy somewhere is terminating watches after 1 minute.
Fun fact: if you use Line 353 in ab2757a
Hmm, that's not it. Still digging
Aha, so:
Because of this, the
Describe the bug
We've lately been seeing random test failures that look like this:
Steps To Reproduce
Expected Behaviour
No failure
Additional Context
No response