Flaking test: setup e2e test environment #3667
Firstly, we recorded several similar but not identical errors:
corresponding kind logs:
corresponding kind logs:
I may have found the root cause.

Summary in one sentence: when deleting a cluster, it is necessary to modify the kubeconfig.

My conclusion is based on the logs above.

Firstly, the failure logs give us a clue that this failure is related to deleting the cluster and file locks.

Secondly, reading the source code of kind, we further learned that kind takes a file lock on the kubeconfig while it updates it during cluster deletion.

Thirdly, reading the deletion code path further, we can see that when several deletions run against the same kubeconfig file, they contend on that lock, so in this case neither deletion is guaranteed to succeed.
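To make the suspected failure mode concrete, here is a minimal sketch of deleting two kind clusters in parallel against one shared kubeconfig; the file path and cluster names are illustrative, not the exact commands used in CI:

```bash
# Minimal sketch (assumed paths/names): both deletions rewrite the same
# kubeconfig, and kind guards that rewrite with a file lock, so running
# them in parallel can fail when one process cannot take the lock in time.
kind delete cluster --name member1 --kubeconfig /root/.kube/members.config &
kind delete cluster --name member2 --kubeconfig /root/.kube/members.config &
wait
```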
If my inference holds, how can we solve this problem?

Method one: I considered that … It cannot completely solve the problem, but it can greatly reduce the probability of occurrence. The advantage is that the changes are minimal.

Method two: … It can solve the problem, but the changes may be relatively major.

Thanks!
Nice finding!!! I tend to …
It sounds good! Thanks for your excellent work @chaosi-zju! This tool kubecm seems helpful. Perhaps we could consider using it, but that would introduce a new dependency.

This is a sample using kubecm to merge the kubeconfigs:

kubecm merge member1.config member2.config member3.config --config /root/.kube/members.config -y

However, we can see that the context names in the merged file get modified:

$ cat merge.config
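A quick way to see what the merge does to the context names (this check is not from the original comment, and kubecm's exact renaming behavior may vary by version) is to list the contexts of the merged file:

```bash
# Illustrative check, assuming the merged file from the kubecm sample above:
# list the contexts to see how kubecm has named them after merging.
kubectl config get-contexts --kubeconfig /root/.kube/members.config
```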
Thanks for your commendable advice! @yike21 @RainbowMango I have carefully considered your advice, taken my own concerns into account, and come up with another method!

Method three

1. Create every cluster with its own kubeconfig (file lock issues also exist in the creation process):

kind create cluster --name member1 --kubeconfig="/root/.kube/member-member1.config" --image kindest/node:v1.26.0
kind create cluster --name member2 --kubeconfig="/root/.kube/member-member2.config" --image kindest/node:v1.26.0

As you can see, every cluster uses an individual kubeconfig.

2. Merge the kubeconfigs with kubectl:

# export KUBECONFIG=/root/.kube/member-member1.config:/root/.kube/member-member2.config
export KUBECONFIG=$(find /root/.kube -maxdepth 1 -type f | grep member- | tr '\n' ':')
kubectl config view --flatten > /root/.kube/members.config

No additional dependency, no context-name modification issue, and the state after installation stays the same as before.

3. When deleting clusters, we do it like this:

kind delete clusters --kubeconfig /root/.kube/members.config --all

or

kind delete clusters --kubeconfig /root/.kube/members.config member1 member2

It is worth noting that you must specify one of them; however, with …

Here I just use the member clusters as an example; the host cluster is handled the same way. Perhaps there are other problems here, please point them out, thanks!
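As a hedged side note (not part of the original comment), the "no context-name modification" claim can be spot-checked by listing the contexts of the merged file; assuming kind's default naming scheme, they should still be kind-member1 and kind-member2:

```bash
# Illustrative verification of step 2's merged kubeconfig.
# Context names assume kind's default "kind-<cluster-name>" convention.
kubectl config get-contexts --kubeconfig /root/.kube/members.config
```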
Cool! I agree with you, this method is more reasonable. 👍

/assign @chaosi-zju
There are two types of errors here, and I only solved the No.1 error before. However, the No.2 error occurred frequently yesterday.

You can see the related failed CI: https://github.com/karmada-io/karmada/actions/runs/5367319510/jobs/9737422778

Now I have found the reason for this: we expect the kind version to be v0.17.0, but the Ubuntu image in our CI runner now comes with kind@v0.20.0. The error mentioned above almost always occurs on ubuntu-20.04 when using kind@v0.20.0, but it runs normally on ubuntu-22.04.

Our CI logic checks whether a kind command already exists and, if so, does not install it again, so we are using the kind@v0.20.0 that comes with the image by default.

Therefore, the solution is either to upgrade to ubuntu-22.04 or to force installation of kind@v0.17.

https://github.com/kubernetes-sigs/kind/releases/tag/v0.20.0
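As a sketch of the second option (force-installing a pinned kind), something along these lines could be added to the CI setup; the install path and download URL pattern follow kind's upstream install docs and are assumptions here, not the exact change made in the repo:

```bash
# Force-install the pinned kind version even if the runner image ships one.
KIND_VERSION=v0.17.0
sudo curl -Lo /usr/local/bin/kind "https://kind.sigs.k8s.io/dl/${KIND_VERSION}/kind-linux-amd64"
sudo chmod +x /usr/local/bin/kind
kind version   # should now report v0.17.0 instead of the preinstalled v0.20.0
```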
I got it. Thanks for your excellent work!
Which jobs are flaking:
e2e test
Which test(s) are flaking:
e2e test (setup e2e test environment)
Reason for failure:
Anything else we need to know: