Pods test failures after master upgrade 0.19.3 → 0.21.2 #11355
Comments
Never mind, the defaulting would have happened in the old podspec as well... so there wouldn't be a diff. I can see now that the submitted podspec's serviceAccountName is empty, because the 19.3 e2e client code didn't know about it. Here's the sequence:
Yes, I believe the failing e2e code is at the 0.19.3 level as well. I think @mbforbes is re-attempting to pivot the e2e versions after the master upgrades as well, but we might run into other issues if the nodes are version-skewed. (My hope is fewer.)
Filed #11380 for the serviceAccount issue.
Since a couple of issues have popped up already, I decided to look through the API changes between 0.19 and 0.20. pod.status.reason is additive: kubectl will just print less information if it isn't populated. So hopefully there aren't more incompatibilities lurking.
Thanks for checking, Brian. With luck we won't need more patch releases into the 0.20 branch.
Important: running a "pure" 0.19.3 (master, nodes, and e2e code all at 0.19.3) has zero flakes for the three flaky pod tests above. This seems to imply that #10523 didn't prevent these, and that the 0.21.2 master is causing the flakes. Could the flakes also be related to the
The field name change caused the pod update failure. The liveness exec failures were caused by the e2e code in that test not using the standard method for constructing a test namespace, which meant the e2e test wasn't waiting for the namespace to be ready before creating pods. In 19.3, the admission plugin would let pods get created before their service account and API token were ready, but that was fixed post 19.3.
@liggitt but by that logic, wouldn't all 0.19.3 e2es see the same flakiness in those tests? In #11355 (comment) I'm describing that we have a Jenkins job (pictured) that's running pure 0.19.3 and never seeing those tests flake. (Sorry if I'm confused; attempting to understand!)
@mbforbes admission control changed since 19.3 to require the service account token to exist before admitting a pod. Prior to that, pods that didn't make use of the token (like the liveness-exec pod) were getting admitted when they shouldn't have been, but nobody noticed because they never attempted to use the API token they were supposed to have available.

In a pure 19.3 env, liveness-exec pods get admitted (incorrectly) before the namespace is ready, but don't fail their test because they aren't depending on anything service-account-token related. In a 19.3 e2e against a 20.0+ master, liveness-exec pods get rejected (correctly) because the namespace's service accounts and their tokens aren't finished initializing yet.

The liveness-exec e2e test in particular needed to use the common method to correctly get a test namespace. The e2e test fixes in https://github.com/GoogleCloudPlatform/kubernetes/pull/10523/files#diff-92d176a1025dcbee0981bb7f16cda942 are applicable.
@liggitt ahhhhh I understand now. Thank you so much for being patient with me.
Can you please tell me what the solution to this is? I am seeing the same error when running kubectl create -f nginx.yml.
@n1603 make sure you're starting the apiserver and controller manager with the service account arguments needed to auto-generate service account tokens. See local-up-cluster.sh for an example.
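For reference, a minimal sketch of the two flags involved (the key path is a placeholder and the `...` stands in for whatever other flags you already pass, not anything from this thread):

```sh
# kube-apiserver verifies service account tokens against this key
kube-apiserver ... --service-account-key-file=/path/to/serviceaccount.key

# kube-controller-manager signs newly generated tokens with this key
kube-controller-manager ... --service-account-private-key-file=/path/to/serviceaccount.key
```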
Thanks, I am using Fedora 22 Atomic. The apiserver and controller manager are running. I must be missing something, but I'm not sure what.
Any further suggestions on fixing this?
Can you show the command-line options you are using to start the apiserver and controller manager?
Used the command below to start the services:

for SERVICES in docker kube-proxy.service kubelet.service etcd.service kube-apiserver.service kube-controller-manager.service kube-scheduler.service; do systemctl restart $SERVICES; systemctl enable $SERVICES; systemctl status $SERVICES; done

Here is how "kube-apiserver.service" looks in my install:

[Unit] [Service] [Install]

$ cat "/usr/lib/systemd/system/kube-controller-manager.service"

[Unit] [Service] [Install]
@n1603 looks like the systemd unit files include the ServiceAccount admission controller without specifying the needed signing key. Not sure what to do about that, since those files don't really have a setup script that can create that key... To get your setup working, you can do the same thing local-up-cluster.sh is doing:
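Roughly, that sequence looks like the following (a sketch, not the literal contents of local-up-cluster.sh; the key path is a placeholder):

```sh
# 1. generate a key the controller manager can use to sign service account tokens
openssl genrsa -out /tmp/serviceaccount.key 2048

# 2. point both components at the same key (in your systemd unit/config files):
#    kube-apiserver:          --service-account-key-file=/tmp/serviceaccount.key
#    kube-controller-manager: --service-account-private-key-file=/tmp/serviceaccount.key

# 3. restart both services so tokens start getting generated
systemctl restart kube-apiserver kube-controller-manager
```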
This works, thanks a lot. |
@liggitt thanks for your answer, I encountered the same issue and solved it by following your steps.
Hi,

4: restarted the kube-apiserver and kube-controller-manager services.

When I run the command, I get:

No API token found for service account "default", retry after the token is automatically created and added to the service account

How the environment is set up:

Please suggest what additional checks I need to do to get it resolved.
There's a typo in your openssl command above; I assume the file was actually created without the typo. What service accounts and secrets show up in your namespace?
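For example (generic commands, not taken from this thread), in the namespace where the pod is being created:

```sh
# list service accounts and secrets in the current namespace
kubectl get serviceaccounts,secrets

# check whether the "default" service account has a token secret attached
kubectl describe serviceaccount default
```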
Thanks, Liggitt. Regards
The typo was "serviceaccount.ket" vs "serviceaccount.key". Do you have the logs from the controller manager?
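On a systemd-based install like the one described above, the controller manager usually logs to the journal rather than to a file under /var/log; a general way to pull those logs (assuming the unit name used earlier in the thread):

```sh
# show the most recent kube-controller-manager log entries
journalctl -u kube-controller-manager --no-pager | tail -n 100
```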
Sorry to say, but I don't have a /var/log/kube-controller-manager.log file. I ran the command:

Please suggest.
This is resolved. I created the secret manually and attached it to the "default" service account. After that, I restarted the controller manager and the pods started running.
@irfanjs, how did you create the secrets and attach them to the "default" service account?
I also want to know how to create the secrets and attach them to the "default" service account.
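For anyone else landing here, a rough sketch of the manual workaround described above (the secret name is a placeholder; this shouldn't be necessary once token auto-generation is configured correctly):

```sh
# create a token secret tied to the "default" service account; the token
# controller populates it once a signing key is configured
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: default-token-manual
  annotations:
    kubernetes.io/service-account.name: default
type: kubernetes.io/service-account-token
EOF

# attach the secret to the "default" service account
kubectl patch serviceaccount default -p '{"secrets": [{"name": "default-token-manual"}]}'
```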
The service account shows 0 secrets attached to it.
Is this problem resolved in Kubernetes 1.10? It is still complaining about the API token. How does the API token get generated?
In the apiserver config file:

Or this approach:

In the controller-manager config file:
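The gist, sketched against the Fedora/RPM-style config files under /etc/kubernetes (the variable names and key path are assumptions about that packaging, so adjust for your install):

```sh
# /etc/kubernetes/apiserver
KUBE_API_ARGS="--service-account-key-file=/etc/kubernetes/serviceaccount.key"

# /etc/kubernetes/controller-manager
KUBE_CONTROLLER_MANAGER_ARGS="--service-account-private-key-file=/etc/kubernetes/serviceaccount.key"

# then restart both services so the token controller starts issuing tokens:
# systemctl restart kube-apiserver kube-controller-manager
```

The token itself is generated by the controller manager's token controller once it has a private key to sign with; the apiserver uses the corresponding key to verify the tokens.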
@liggitt thanks, I encountered the same issue and solved it by following your steps. But now the
Conditions
Tests
Flake history
Output

Pods should be restarted with a docker exec "cat /tmp/health" liveness probe
Pod "liveness-exec" is forbidden: no API token found for service account
e2e-test-5d05436c-2a5a-11e5-916e-42010af01555/default, retry after the token
is automatically created and added to the service account
Pods should not be restarted with a docker exec "cat /tmp/health" liveness probe
Pod "liveness-exec" is forbidden: no API token found for service
account e2e-test-10b92925-2b3f-11e5-8ace-42010af01555/default, retry after the token
is automatically created and added to the service account
Pods should be restarted with a /healthz http liveness probe
Pod "liveness-http" is forbidden: no API token found for service account
e2e-test-50513bd8-2a76-11e5-948c-42010af01555/default, retry after the token
is automatically created and added to the service account
Pods should be updated
may not update fields other than container.image
(#11343 improves this error message)