New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use GlobalMemoryStatusEx to get total physical memory on Windows node #57124

Merged
merged 1 commit into from Feb 6, 2018

Conversation

@JiangtianLi
Contributor

JiangtianLi commented Dec 13, 2017

What this PR does / why we need it:
This PR fixes issue #57110 due to failure in getting total physical memory on some Windows VM such as in VMWare Fusion or Virtualbox. This change uses GlobalMemoryStatusEx instead of GetPhysicallyInstalledSystemMemory to retrieve total physical memory on Windows node. The amount obtained this way is also closer in parity with reading MemTotal from /proc/meminfo on Linux node.
(thanks to @martinivanov and @marono for the help)

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #57110

Special notes for your reviewer:

Release note:

Use a more reliable way to get total physical memory on windows nodes
@dims

This comment has been minimized.

Member

dims commented Dec 13, 2017

/release-note-none
/ok-to-test

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Dec 13, 2017

@marono @trondhindenes Can you help to do sanity check to run kubelet? I don't have VMware fusion/Mac and VirtualBox/Linux handy. I've verified with HyperV/Windows.

@marono

This comment has been minimized.

marono commented Dec 13, 2017

@JiangtianLi I can do VMware Fusion tomorrow

@marono

This comment has been minimized.

marono commented Dec 14, 2017

@JiangtianLi delighted to confirm kubelet.exe starts on VMWare Fusion (MAC) running Windows Server 1709, with your changes ...

Thanks and well done!

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Dec 14, 2017

@marono Thanks so much for the help!

@alinbalutoiu

This comment has been minimized.

Contributor

alinbalutoiu commented Dec 20, 2017

This was previously also failing on Windows Server RTM. Confirmed that it works on Windows Server RTM and 1709 now on VMware Workstation. The change looks good to me, good job!

I am still thinking if kubelet should fail with an error if it couldn't read the memory. On the Linux node from what I see, they ignore the error if it couldn't read the memory. https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockershim/cm/container_manager_linux.go#L104-L107
It will always return nil as error (https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/dockershim/cm/container_manager_linux.go#L122)

Should the error be ignored if kubelet couldn't fetch the memory?

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Dec 22, 2017

@alinbalutoiu Thanks for the confirmation and good suggestion! As to handling reading memory capacity failure, what I read from code is that reading memory capacity on Linux calls cadvisor: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cadvisor/cadvisor_linux.go#L117

cadvisor gets machine info in https://github.com/google/cadvisor/blob/master/manager/manager.go#L220 and https://github.com/google/cadvisor/blob/master/machine/info.go#L107

And if there is an error in https://github.com/google/cadvisor/blob/master/machine/machine.go#L86-L97, the error is surfaced up and not ignored.

What you pointed out seems to be ignoring reading memory capacity error in setting memory limit when creating cgroup manager.

Please feel free to correct me and let me know your thoughts.

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Dec 27, 2017

Kindly ping @dashpole @tallclair

@dashpole

This comment has been minimized.

Contributor

dashpole commented Jan 2, 2018

cc @kubernetes/sig-windows-bugs
We could use someone with some context on windows nodes to review and lgtm

@michmike

This comment has been minimized.

michmike commented Jan 5, 2018

@bsteciuk can you please review?

@michmike

This comment has been minimized.

michmike commented Jan 5, 2018

@alinbalutoiu as well

@alinbalutoiu

Thanks @JiangtianLi for the clarifications.

The changes looks good to me.

@dashpole

This comment has been minimized.

Contributor

dashpole commented Jan 8, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Jan 8, 2018

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Jan 19, 2018

/test pull-kubernetes-cross

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Jan 23, 2018

I restarted the tests and it is green (I think there was flakiness in cross test before and it is skipped this time)
/cc @michmike

@k8s-ci-robot k8s-ci-robot requested a review from michmike Jan 23, 2018

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Jan 24, 2018

/retest

@k8s-merge-robot k8s-merge-robot removed the lgtm label Jan 25, 2018

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Jan 25, 2018

/cc @dashpole @kubernetes/sig-windows-bugs @michmike again. Need approve after rebase.

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Jan 25, 2018

@JiangtianLi: Reiterating the mentions to trigger a notification:
@kubernetes/sig-windows-bugs

In response to this:

/cc @dashpole @kubernetes/sig-windows-bugs @michmike again. Need approve after rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@JiangtianLi

This comment has been minimized.

Contributor

JiangtianLi commented Jan 26, 2018

/assign @tallclair

@k8s-ci-robot k8s-ci-robot requested a review from tallclair Jan 26, 2018

@michmike

This comment has been minimized.

michmike commented Jan 30, 2018

/lgtm

@michmike

This comment has been minimized.

michmike commented Jan 30, 2018

/assign @tallclair

@aserdean

This comment has been minimized.

Contributor

aserdean commented Feb 6, 2018

Acked-by: Alin Gabriel Serdean aserdean@ovn.org

I tested the changes using VMware Workstation 12.

/lgtm
/ping @tallclair

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Feb 6, 2018

@aserdean: changing LGTM is restricted to assignees, and only kubernetes org members may be assigned issues.

In response to this:

Acked-by: Alin Gabriel Serdean aserdean@ovn.org

I tested the changes using VMware Workstation 12.

/lgtm
/ping @tallclair

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@michmike

This comment has been minimized.

michmike commented Feb 6, 2018

@tallclair can you please look into this and approve

@tallclair

This comment has been minimized.

Member

tallclair commented Feb 6, 2018

/approve

@k8s-ci-robot

This comment has been minimized.

Contributor

k8s-ci-robot commented Feb 6, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aserdean, dashpole, JiangtianLi, michmike, tallclair

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@fejta-bot

This comment has been minimized.

fejta-bot commented Feb 6, 2018

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented Feb 6, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-merge-robot

This comment has been minimized.

Contributor

k8s-merge-robot commented Feb 6, 2018

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-merge-robot k8s-merge-robot merged commit 8201e4b into kubernetes:master Feb 6, 2018

13 of 14 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-e2e-gce
Details
cla/linuxfoundation JiangtianLi authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
Details
pull-kubernetes-e2e-kubeadm-gce-canary Skipped
pull-kubernetes-kubemark-e2e-gce Job succeeded.
Details
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-unit Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
@feiskyer

This comment has been minimized.

Member

feiskyer commented Feb 7, 2018

1.9 cherry-pick PR created at #59455

k8s-merge-robot added a commit that referenced this pull request Feb 24, 2018

Merge pull request #59455 from feiskyer/automated-cherry-pick-of-#57124
…-upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #57124: Use GlobalMemoryStatusEx to get total physical memory on

Cherry pick of #57124 on release-1.9.

#57124: Use GlobalMemoryStatusEx to get total physical memory on
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment