cpumanager will remove all entries in state file after restart. #81324

Open
choury opened this issue Aug 13, 2019 · 6 comments

Comments

@choury
Contributor

commented Aug 13, 2019

What happened:
Since PR #68619, the cpumanager removes "inactive containers" while reconciling. But the cpumanager starts very early in kubelet startup, so it gets an empty activePods list on its first run and removes every entry. This makes the state file meaningless.
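
To illustrate the failure mode, here is a minimal sketch. It is not the kubelet's code: `removeStale`, the map-based state, and the sample container IDs and CPU sets are all hypothetical stand-ins for the cleanup step in reconcileState.

```go
package main

import "fmt"

// removeStale is a simplified, hypothetical stand-in for the cleanup step:
// it deletes every assignment whose container ID is not in the active set
// and returns the IDs it removed.
func removeStale(assignments map[string]string, active map[string]bool) []string {
	var removed []string
	for containerID := range assignments {
		if !active[containerID] {
			// Deleting from a map while ranging over it is safe in Go.
			delete(assignments, containerID)
			removed = append(removed, containerID)
		}
	}
	return removed
}

func main() {
	// State as restored from the state file after a kubelet restart.
	assignments := map[string]string{"c1": "2-3", "c2": "4-5"}

	// First reconcile pass: no active pods have been reported yet.
	active := map[string]bool{}

	removed := removeStale(assignments, active)
	fmt.Println(len(removed), len(assignments)) // prints "2 0"
}
```

With an empty active set, every restored entry counts as inactive, so the whole state is wiped even though the containers are still running.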

What you expected to happen:
We should wait a moment, or wait for a notification from podmanager, before starting to reconcile.
A log message could also be added before removing "inactive containers".
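
One possible shape for this, as a sketch only: gate the cleanup on a readiness check and log each removal. `reconcileOnce`, the `sourcesReady` callback, and the log messages are hypothetical names chosen for illustration, not existing kubelet APIs.

```go
package main

import (
	"fmt"
	"log"
)

// reconcileOnce is a hypothetical sketch, not the kubelet's code. It skips
// cleanup until sourcesReady reports true, and logs every entry it removes.
func reconcileOnce(assignments map[string]string, active map[string]bool, sourcesReady func() bool) (removed []string, skipped bool) {
	if !sourcesReady() {
		// Illustrative message; the wording is an assumption.
		log.Printf("[cpumanager] reconcile: pod sources not ready, skipping cleanup")
		return nil, true
	}
	for containerID := range assignments {
		if !active[containerID] {
			log.Printf("[cpumanager] reconcile: removing inactive container (container id: %s)", containerID)
			delete(assignments, containerID)
			removed = append(removed, containerID)
		}
	}
	return removed, false
}

func main() {
	assignments := map[string]string{"c1": "2-3", "c2": "4-5"}
	ready := false
	isReady := func() bool { return ready }

	// Right after restart: the active set is empty, but nothing is removed.
	_, skipped := reconcileOnce(assignments, map[string]bool{}, isReady)
	fmt.Println(skipped, len(assignments)) // prints "true 2"

	// Once pods have been reported, only the truly inactive entry is removed.
	ready = true
	removed, _ := reconcileOnce(assignments, map[string]bool{"c1": true}, isReady)
	fmt.Println(removed, len(assignments)) // prints "[c2] 1"
}
```

The design choice here is to skip the cleanup rather than block, so the reconcile loop keeps running and simply retries on its next pass.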

How to reproduce it (as minimally and precisely as possible):
restart kubelet

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@choury

Contributor Author

commented Aug 13, 2019

/sig node

@k8s-ci-robot added sig/node and removed needs-sig labels Aug 13, 2019

@tedyu

Contributor

commented Aug 13, 2019

In reconcileState :

	for containerID := range m.state.GetCPUAssignments() {

You mean the CPU assignments would be wiped on the first reconciliation?

@choury

Contributor Author

commented Aug 13, 2019

@tedyu Yes, you can test it by restarting kubelet.

@tedyu

Contributor

commented Aug 13, 2019

Do you see any of the following log messages?

				klog.Warningf("[cpumanager] reconcileState: skipping pod; status not found (pod: %s)", pod.Name)

				klog.Warningf("[cpumanager] reconcileState: skipping container; ID not found in status (pod: %s, container: %s, error: %v)", pod.Name, container.Name, err)

					klog.V(4).Infof("[cpumanager] reconcileState: container is not present in state - trying to add (pod: %s, container: %s, container id: %s)", pod.Name, container.Name, containerID)
@choury

Contributor Author

commented Aug 13, 2019

I see this one:
klog.V(4).Infof("[cpumanager] reconcileState: container is not present in state - trying to add (pod: %s, container: %s, container id: %s)", pod.Name, container.Name, containerID)

@zouyee

Member

commented Aug 13, 2019

/assign
