Kubelet status manager sync the status of local Pods #57106
Conversation
/ok-to-test
LGTM /assign @Random-Liu
@Random-Liu Could you take another look at this?
/sig node
LGTM
Another option is to call
@dashpole I tried this approach and it works too. I drafted something like:

diff --git a/pkg/kubelet/kubelet_getters.go b/pkg/kubelet/kubelet_getters.go
index fefd8aac14..d4a3b174e6 100644
--- a/pkg/kubelet/kubelet_getters.go
+++ b/pkg/kubelet/kubelet_getters.go
@@ -142,7 +142,13 @@ func (kl *Kubelet) getPodContainerDir(podUID types.UID, ctrName string) string {
// GetPods returns all pods bound to the kubelet and their spec, and the mirror
// pods.
func (kl *Kubelet) GetPods() []*v1.Pod {
- return kl.podManager.GetPods()
+ pods := kl.podManager.GetPods()
+ for _, po := range pods {
+ if status, ok := kl.statusManager.GetPodStatus(po.UID); ok {
+ po.Status = status
+ }
+ }
+ return pods
}

Personally I may prefer updating the pod statuses in the
pkg/kubelet/status/status_manager.go
Outdated
 // on the master, where the kubelet is responsible for bootstrapping the pods
 // of the master components.
 if m.kubeClient == nil {
-	glog.Infof("Kubernetes client is nil, not starting status manager.")
+	glog.Infof("Kubernetes client is nil, sync pod status locally")
 	go wait.Forever(func() {
I think you can just do for syncRequest := range m.podStatusChannel
@dashpole Of course, I made the changes.
@JulienBalestra @dashpole I think we should keep the pod manager reflecting the newest pod from the apiserver, which is the assumption we've always been making. The
pkg/kubelet/status/status_manager.go
Outdated
// updateLocalPod updates the kubelet pod in the podManager if this pod isn't from an apiserver
// The common use-case is providing a manifest-path containing Pod resources to bootstrap the Master role
func (m *manager) updateLocalPod(uid types.UID, status versionedPodStatus) {
pod, podOk := m.podManager.GetPodByUID(uid) |
There is still a race: the pod can be updated by the kubelet sync loop after this point but before the underlying UpdatePod call. In that case, the pod spec might still be changed accidentally.
@Random-Liu thank you for your feedback! In the meantime I caught a missing field. This is the PodList of a node running a single static Pod from a file manifest, with both implementations (the initial one and the one from the comment):

{
"kind": "PodList",
"apiVersion": "v1",
"metadata": {},
"items": [
{
"metadata": {
"name": "nginx-192.168.128.135",
"namespace": "default",
"selfLink": "/api/v1/namespaces/default/pods/nginx-192.168.128.135",
"uid": "d00b60dec4eb99d4e04099e43459a100",
"creationTimestamp": null,
"labels": {
"app": "nginx"
},
"annotations": {
"kubernetes.io/config.hash": "d00b60dec4eb99d4e04099e43459a100",
"kubernetes.io/config.seen": "2018-01-03T11:01:26.642008795+01:00",
"kubernetes.io/config.source": "file"
}
},
"spec": {
"containers": [
{
"name": "nginx",
"image": "nginx:latest",
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Always",
"terminationGracePeriodSeconds": 30,
"dnsPolicy": "ClusterFirst",
"nodeName": "192.168.128.135",
"securityContext": {},
"schedulerName": "default-scheduler",
"tolerations": [
{
"operator": "Exists",
"effect": "NoExecute"
}
]
},
"status": {
"phase": "Running",
"conditions": [
{
"type": "Initialized",
"status": "True",
"lastProbeTime": null,
"lastTransitionTime": "2018-01-03T10:01:31Z"
},
{
"type": "Ready",
"status": "True",
"lastProbeTime": null,
"lastTransitionTime": "2018-01-03T10:01:45Z"
},
{
"type": "PodScheduled",
"status": "True",
"lastProbeTime": null,
"lastTransitionTime": "2018-01-03T10:01:31Z"
}
],
"podIP": "172.17.0.2",
"startTime": "2018-01-03T10:01:31Z",
"containerStatuses": [
{
"name": "nginx",
"state": {
"running": {
"startedAt": "2018-01-03T10:01:45Z"
}
},
"lastState": {},
"ready": true,
"restartCount": 0,
"image": "nginx:latest",
"imageID": "docker-pullable://nginx@sha256:cf8d5726fc897486a4f628d3b93483e3f391a76ea4897de0500ef1f9abcd69a1",
"containerID": "docker://38027962ea06462f28ca390d9ccc3f494e221ba00efeb2c1b806f59196842682"
}
],
"qosClass": "BestEffort"
}
}
]
}

To populate this field, I created the following implementation:

diff --git a/pkg/kubelet/kubelet_getters.go b/pkg/kubelet/kubelet_getters.go
index fefd8aac14..d10a48b56f 100644
--- a/pkg/kubelet/kubelet_getters.go
+++ b/pkg/kubelet/kubelet_getters.go
@@ -139,10 +139,36 @@ func (kl *Kubelet) getPodContainerDir(podUID types.UID, ctrName string) string {
return filepath.Join(kl.getPodDir(podUID), config.DefaultKubeletContainersDirName, ctrName)
}
+// setStatusHostIP ensures the field status.hostIP is set when the status phase is Running
+func (kl *Kubelet) setStatusHostIP(status *v1.PodStatus) error {
+ if status.Phase != v1.PodRunning {
+ return nil
+ }
+ if status.HostIP != "" {
+ return nil
+ }
+ ip, err := kl.GetHostIP()
+ if err != nil {
+ return err
+ }
+ status.HostIP = ip.String()
+ return nil
+}
+
// GetPods returns all pods bound to the kubelet and their spec, and the mirror
// pods.
func (kl *Kubelet) GetPods() []*v1.Pod {
- return kl.podManager.GetPods()
+ pods := kl.podManager.GetPods()
+ for _, p := range pods {
+ if status, ok := kl.statusManager.GetPodStatus(p.UID); ok {
+ p.Status = status
+ err := kl.setStatusHostIP(&p.Status)
+ if err != nil {
+ glog.Warningf("fail to update the status.hostIP for Pod %s: %s", p.Name, err)
+ }
+ }
+ }
+ return pods
}

Another option could be to create a new getter like:

diff --git a/pkg/kubelet/kubelet_getters.go b/pkg/kubelet/kubelet_getters.go
index fefd8aac14..f6b1686f2a 100644
--- a/pkg/kubelet/kubelet_getters.go
+++ b/pkg/kubelet/kubelet_getters.go
@@ -139,10 +139,41 @@ func (kl *Kubelet) getPodContainerDir(podUID types.UID, ctrName string) string {
return filepath.Join(kl.getPodDir(podUID), config.DefaultKubeletContainersDirName, ctrName)
}
+// getUpdatedPodStatus gets an updated status of the Pod and ensures the field status.hostIP
+// is set when the status phase is Running
+func (kl *Kubelet) getUpdatedPodStatus(podUID types.UID) (v1.PodStatus, error) {
+ var status v1.PodStatus
+ status, ok := kl.statusManager.GetPodStatus(podUID)
+ if !ok {
+ return status, fmt.Errorf("fail to get PodStatus for Pod %s", podUID)
+ }
+ if status.Phase != v1.PodRunning {
+ return status, nil
+ }
+ if status.HostIP != "" {
+ return status, nil
+ }
+ ip, err := kl.GetHostIP()
+ if err != nil {
+ return status, err
+ }
+ status.HostIP = ip.String()
+ return status, nil
+}
+
// GetPods returns all pods bound to the kubelet and their spec, and the mirror
// pods.
func (kl *Kubelet) GetPods() []*v1.Pod {
- return kl.podManager.GetPods()
+ pods := kl.podManager.GetPods()
+ for _, p := range pods {
+ status, err := kl.getUpdatedPodStatus(p.UID)
+ if err != nil {
+ glog.Warningf("fail to update the status for Pod %s: %s", p.Name, err)
+ continue
+ }
+ p.Status = status
+ }
+ return pods
}

But I'm not 💯 sure it's the most appropriate place to do that. Also, as you said, the pod manager is dedicated to working with the apiserver only, so the interface created here isn't well adapted either.
WRT HostIP, it looks like we intentionally don't set HostIP when we don't have a kubeClient.

Looks like that check was added in #9722, which appears to be because at that point the kubelet would only get the node object from the API server.
Thanks for the relevant feedback @dashpole!

/lgtm
pkg/kubelet/kubelet_getters.go
Outdated
@@ -142,7 +142,13 @@ func (kl *Kubelet) getPodContainerDir(podUID types.UID, ctrName string) string {
 // GetPods returns all pods bound to the kubelet and their spec, and the mirror
 // pods.
 func (kl *Kubelet) GetPods() []*v1.Pod {
-	return kl.podManager.GetPods()
+	pods := kl.podManager.GetPods()
+	for _, p := range pods {
nit: can you add a comment explaining that this is required to get the status of static pods when there is no apiserver?
/lgtm
/test pull-kubernetes-unit
/test pull-kubernetes-unit
/test pull-kubernetes-e2e-kops-aws
/approve
@JulienBalestra Thanks for the fix!
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, JulienBalestra, Random-Liu

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these OWNERS files:

Approvers can indicate their approval by writing
/test all

[submit-queue is verifying that this PR is safe to merge]

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@JulienBalestra @dashpole I think this PR broke the reboot test: https://k8s-testgrid.appspot.com/google-gce#gci-gce-reboot

I guess it is related to the
Let's revert or fix.

@Random-Liu sgtm
Automatic merge from submit-queue (batch tested with PRs 59877, 59886, 59892). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

kubelet: revert the status HostIP behavior

**What this PR does / why we need it**: This PR partially reverts #57106 to fix #59889. The PR #57106 changed the behavior of `generateAPIPodStatus` when a **kubeClient** is nil.

**Release note**:
```release-note
NONE
```
…t-pod-status

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

kubelet: revert the get pod with updated status

**What this PR does / why we need it**: Following kubernetes#59892, this PR finishes the revert of kubernetes#57106. The first revert didn't fix the reboot test in the test grid (https://k8s-testgrid.appspot.com/google-gce#gci-gce-reboot).

**Special notes for your reviewer**: cc @dashpole @Random-Liu

**Release note**:
```release-note
None
```
What this PR does / why we need it:
In the kubelet, when using --pod-manifest-path, the kubelet creates static pods but doesn't update the status accordingly in the PodList.

This PR fixes the incorrect status of each Pod in the kubelet's PodList.

This is the setup used to reproduce the issue:

manifest:
cat ~/kube/staticpod.yaml

kubelet:

You can observe this by querying the kubelet API /pods: the status of the nginx Pod will remain in the Pending phase.

It's reported as expected on the apiserver side:
It doesn't work either with a standalone kubelet:
Special notes for your reviewer:
Release note: