Container failed to run because Kubelet failed to apply oom-score-adj #22639

Closed
wojtek-t opened this issue Mar 7, 2016 · 8 comments
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@wojtek-t
Member

wojtek-t commented Mar 7, 2016

STEP: Creating a pod to test consume service account root CA
Mar  4 21:17:10.419: INFO: Waiting up to 5m0s for pod pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007 status to be success or failure
Mar  4 21:17:10.452: INFO: No Status.Info for container 'token-test' in pod 'pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007' yet
Mar  4 21:17:10.452: INFO: Waiting for pod pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007 in namespace 'e2e-tests-svcaccounts-iiqzt' status to be 'success or failure'(found phase: "Pending", readiness: false) (33.779421ms elapsed)
Mar  4 21:17:12.485: INFO: Pod "pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007" in namespace "e2e-tests-svcaccounts-iiqzt" disappeared. Error: pods "pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007" not found
Mar  4 21:17:12.486: INFO: Unexpected error occurred: pods "pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007" not found

https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/22578/kubernetes-pull-build-test-e2e-gce/31815/build-log.txt

@wojtek-t wojtek-t added the area/test, priority/important-soon, and kind/flake labels Mar 7, 2016
@ncdc
Member

ncdc commented Mar 7, 2016

Not sure if this is the root cause, but I see several log messages like this one for the pod in question:

Error running pod "pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007_e2e-tests-svcaccounts-iiqzt(809deedc-e291-11e5-bb48-42010af00002)" container "token-test": failed to apply oom-score-adj to container "exceeded maxTries, some processes might not have desired OOM score"- /k8s_token-test.db003057_pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007_e2e-tests-svcaccounts-iiqzt_809deedc-e291-11e5-bb48-42010af00002_4216abff

@lavalamp
Member

Sounds like the pod never started. I don't think the line @ncdc mentions is a root cause, but maybe @dchen1107 can verify that?

@dchen1107
Member

The failure is that the container was not started properly:

Mar  4 21:17:12.586: INFO: At 2016-03-04 21:17:07 -0800 PST - event for pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007: {kubelet e2e-gce-master-1-minion-2zwf} FailedSync: Error syncing pod, skipping: [failed to "StartContainer" for "token-test" with RunContainerError: "failed to apply oom-score-adj to container \"exceeded maxTries, some processes might not have desired OOM score\"- /k8s_token-test.db003057_pod-service-account-809f8d5c-e291-11e5-8df7-42010af00007_e2e-tests-svcaccounts-iiqzt_809deedc-e291-11e5-bb48-42010af00002_4216abff"

The root cause of the container start failure is a known race in applying oom_score_adj: the Kubelet scans cgroup.procs, and a process might be forked right before the Kubelet writes its oom_score_adj. The race should be rare, and the problem should be handled by #21741.
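
The race described above can be illustrated with a minimal sketch. This is not the Kubelet's actual code: the function names, the callable-based interface, and the `MAX_TRIES` value are assumptions made for illustration. Only the files `cgroup.procs` and `/proc/<pid>/oom_score_adj` and the error text come from this thread and the Linux kernel interface. The idea is to rescan the cgroup after every round of writes and to stop only when a scan finds no process that has not yet been adjusted.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Set

MAX_TRIES = 5  # hypothetical retry budget; the Kubelet's real value is not stated here


def read_cgroup_procs(cgroup_dir: str) -> List[int]:
    """Return the PIDs currently listed in <cgroup_dir>/cgroup.procs."""
    text = Path(cgroup_dir, "cgroup.procs").read_text()
    return [int(line) for line in text.split()]


def write_oom_score_adj(pid: int, score: int) -> None:
    """Write the score to /proc/<pid>/oom_score_adj (needs sufficient privileges)."""
    Path(f"/proc/{pid}/oom_score_adj").write_text(str(score))


def apply_oom_score_adj(
    list_pids: Callable[[], Iterable[int]],
    write_score: Callable[[int, int], None],
    score: int,
    max_tries: int = MAX_TRIES,
) -> None:
    """Adjust every process in a cgroup, rescanning until no new PIDs appear."""
    adjusted: Set[int] = set()
    for _ in range(max_tries):
        # A process forked after the previous scan shows up here as a new PID.
        new_pids = [pid for pid in list_pids() if pid not in adjusted]
        if not new_pids:
            return  # a full scan found nothing new, so the set is stable
        for pid in new_pids:
            write_score(pid, score)
            adjusted.add(pid)
    # Every scan kept finding new processes, so give up as the Kubelet log does.
    raise RuntimeError(
        "exceeded maxTries, some processes might not have desired OOM score"
    )
```

On a real node the two helpers could be wired in with `apply_oom_score_adj(lambda: read_cgroup_procs(cgroup_dir), write_oom_score_adj, score)`, where `cgroup_dir` and `score` are placeholders for the container's cgroup path and the desired value. A container that forks faster than the loop can rescan exhausts the retry budget, which matches the error reported in this issue.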

@dchen1107
Member

I am leaving this one open to see if #21741 fixed the issue.

@dchen1107 dchen1107 added the sig/node label and removed the kind/flake and area/test labels Mar 28, 2016
@dchen1107 dchen1107 changed the title from "e2e flake: ServiceAccounts [It] should mount an API token into pods" to "Container failed to run because Kubelet failed to apply oom-score-adj" Mar 28, 2016
@smarterclayton
Contributor

Is #23607 a dup of this?

@maverick-racheal

Is there any workaround for this? I see this issue every time I try to bring up an RC/pod.

@vgs24

vgs24 commented Jun 14, 2016

I have the same issue:
Error syncing pod, skipping: failed to "StartContainer" for "hello" with RunContainerError: "failed to apply oom-score-adj to container "exceeded maxTries, some processes might not have desired OOM score"- /k8s_hello.5ba915eb_pod11_default_649b02e2-3283-11e6-8624-1add237a3002_3900ea72"
Can someone help?

@dchen1107
Member

The issue should be fixed for anyone running Docker 1.11+ or Kubernetes 1.3.
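
For anyone unsure whether a node meets either threshold, a rough check could look like the sketch below. The function names are hypothetical and the version strings are assumed to be plain dotted numbers with an optional `v` prefix and an optional `-` or `+` suffix. Treating Kubernetes 1.3 as "1.3 or later" is also an assumption. The comparison is numeric rather than textual, so that 1.11 sorts after 1.9.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'v1.3.0' or '1.11.2-rc1' into a tuple of ints, dropping any suffix."""
    core = text.strip().lstrip("v").split("-", 1)[0].split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def has_fix(docker_version: str, kubernetes_version: str) -> bool:
    """True if either version meets the thresholds named in the comment above."""
    return (
        parse_version(docker_version) >= (1, 11)
        or parse_version(kubernetes_version) >= (1, 3)
    )
```

The version strings themselves would come from whatever the node reports, for example the output of `docker version` and `kubectl version`.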


7 participants