Bug 1823967: Add the --pod-infra-container-image flag to the kubelet service #3712
@umohnani8: This pull request references Bugzilla bug 1823967, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug.
@abhinavdahiya @rphillips PTAL
Hmm. I don't think it is fair to require all users of the kubelet to set this value :( Is there no way the kubelet can offload this responsibility to the container runtime?
@abhinavdahiya not really. The runtime overrides this value when you set it there, but the kubelet needs to know about the pause image being used, especially since it changes hash references, so that it can get the images and accurately gc the old ones.
/retest
lgtm
@abhinavdahiya @patrickdillon this is ready, PTAL
/lgtm
# Need to set the --pod-infra-container-image flag for the kubelet to point to the pause image from the payload
# So we add MACHINE_CONFIG_INFRA_IMAGE to an environment file and source that in the kubelet service

. /usr/local/bin/release-image.sh
For this to load and finish correctly, the release image must already have been downloaded by the release-image.service unit.
How will we ensure that the kubelet waits, or keeps retrying, until that unit succeeds?
If there is a failure, will the kubelet service retry, or just fail and sit?
The kubelet can't start until crio has been started, and https://github.com/openshift/installer/blob/master/data/data/bootstrap/systemd/units/crio-configure.service.template#L5 ensures that the release image has been downloaded before crio is started.
But to make sure, I can also add After/Wants=release-image.service to the kubelet service here https://github.com/openshift/installer/pull/3712/files#diff-367cd6ecdc6c4f5bbe9560468fad15f3R4 so that the kubelet can't start until release-image.service has succeeded. Wdyt?
I added release-image.service to After and Wants in the kubelet service template below.
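To illustrate the ordering discussed in this thread, here is a minimal sketch that writes the two directives into a systemd drop-in. It is not the change made in this PR, which edits the kubelet service template directly; the drop-in file name and the scratch directory are assumptions made only so the snippet can run anywhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch location so the sketch can run anywhere; a real
# drop-in would live under /etc/systemd/system/kubelet.service.d/.
DROPIN_DIR="${DROPIN_DIR:-$(mktemp -d)}"
DROPIN_FILE="${DROPIN_DIR}/10-release-image.conf"

# Wants= pulls release-image.service in when the kubelet is started;
# After= delays the kubelet until that unit has finished starting.
cat > "${DROPIN_FILE}" <<'EOF'
[Unit]
Wants=release-image.service
After=release-image.service
EOF

# Show the generated fragment.
cat "${DROPIN_FILE}"
```

One thing worth keeping in mind: Wants= is a weak dependency, so systemd still starts the kubelet if release-image.service fails. Requires= would be the stricter option if the kubelet must never start without it.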
Override the --pod-infra-container-image flag to point to the pause image from the release payload. By default, this flag is set to k8s.gcr.io/pause:3.1 and the kubelet asks the runtime for the image status of this image every few minutes, which is why we were seeing the warning logs in cri-o saying that this image was not found. This is avoided by making the pod-infra-container-image flag point to the actual pause image being used from the release payload.

Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: abhinavdahiya
/retest
/lgtm
/retest Please review the full test history for this PR and help us cut down flakes.
@umohnani8: All pull requests linked via external trackers have merged: openshift/machine-config-operator#1776, openshift/installer#3712. Bugzilla bug 1823967 has been moved to the MODIFIED state.
/cherry-pick release-4.5
@umohnani8: new pull request created: #3731
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1823967
Override the --pod-infra-container-image flag to point to the pause
image from the release payload. By default, this flag is set to k8s.gcr.io/pause:3.1
and the kubelet asks the runtime for the image status of this image every few minutes,
which is why we were seeing the warning logs in cri-o saying that this image was not
found. This is avoided by making the pod-infra-container-image flag point to the actual
pause image being used from the release payload.
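
The mechanism described above can be sketched roughly as follows. This is a sketch under stated assumptions, not the code from this PR: `image_for` is a hypothetical stand-in for whatever helper resolves an image name in the release payload, the payload name `pod` and the pullspec it returns are placeholders, and the environment file path is a temporary file because the real path is not shown here. Only the `MACHINE_CONFIG_INFRA_IMAGE` variable and the `--pod-infra-container-image` flag come from the change itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for a helper that resolves an image name in the
# release payload to a pullspec. A real bootstrap script would query the
# release image instead of returning a fixed placeholder string.
image_for() {
    echo "registry.example.com/release/${1}:placeholder"
}

# Hypothetical location; a temporary file keeps the sketch runnable anywhere.
ENV_FILE="${ENV_FILE:-$(mktemp)}"

# Record the pause image pullspec so that a kubelet unit could load it,
# for example through a systemd EnvironmentFile= directive.
MACHINE_CONFIG_INFRA_IMAGE="$(image_for pod)"
echo "MACHINE_CONFIG_INFRA_IMAGE=${MACHINE_CONFIG_INFRA_IMAGE}" > "${ENV_FILE}"

# Print the flag the kubelet would be started with, rather than running it.
echo "kubelet --pod-infra-container-image=${MACHINE_CONFIG_INFRA_IMAGE}"
```

With the flag pointing at the image the runtime actually uses, the kubelet's periodic image status checks are made against an image that exists on the node, instead of the default k8s.gcr.io/pause:3.1.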
The MCO PR is openshift/machine-config-operator#1776
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>