for every huge page resource, we need to remove it from allocatable memory when Updating Node Allocatable limit across pods #86758

mysunshine92
Contributor

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind bug

What this PR does / why we need it:

We need to subtract every huge page resource from allocatable memory when updating the Node Allocatable limit across pods.
here:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/node_container_manager_linux.go#L177
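
As a rough illustration of the intended arithmetic (a sketch only, not the kubelet code), the snippet below subtracts every huge page quantity from the memory entry of a resource map. The helper name `subtractHugePages`, the plain `map[string]int64` of byte counts, and the clamp at zero are assumptions made for this example; the real code works on `v1.ResourceList` quantities.

```go
package main

import (
	"fmt"
	"strings"
)

// hugePagesPrefix mirrors the "hugepages-" prefix of Kubernetes huge page
// resource names such as "hugepages-2Mi".
const hugePagesPrefix = "hugepages-"

// subtractHugePages is a hypothetical helper, not the kubelet implementation.
// It returns a copy of allocatable (bytes per resource) in which every huge
// page quantity has been subtracted from "memory", clamped at zero.
func subtractHugePages(allocatable map[string]int64) map[string]int64 {
	result := make(map[string]int64, len(allocatable))
	for name, qty := range allocatable {
		result[name] = qty
	}
	for name, qty := range allocatable {
		if strings.HasPrefix(name, hugePagesPrefix) {
			result["memory"] -= qty
		}
	}
	// Assumption: negative allocatable memory is meaningless, so clamp it.
	if result["memory"] < 0 {
		result["memory"] = 0
	}
	return result
}

func main() {
	const gi = int64(1) << 30
	node := map[string]int64{"memory": 16 * gi, "hugepages-2Mi": 4 * gi}
	fmt.Println(subtractHugePages(node)["memory"] / gi) // prints 12
}
```

With 16Gi of memory and 4Gi of pre-allocated 2Mi huge pages, the memory limit applied across pods would be 12Gi under these assumptions.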

Which issue(s) this PR fixes:

Fixes #84426

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

/sig node

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 1, 2020
Contributor

@mattjmcnaughton mattjmcnaughton left a comment

@boddumanohar I see you commented on the issue - do you mind weighing in here? From your comment, I'm having a little trouble determining if you were in favor of, or opposed to, this proposed change :)

@mysunshine92 mysunshine92 force-pushed the cgroup-remove-hugepage-from-memory branch from 35d124f to b2865ab Compare January 2, 2020 01:20
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 2, 2020
@mysunshine92 mysunshine92 force-pushed the cgroup-remove-hugepage-from-memory branch from b2865ab to 706dcaf Compare January 2, 2020 01:39
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 2, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mysunshine92
To complete the pull request process, please assign vishh
You can assign the PR to them by writing /assign @vishh in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mysunshine92
Contributor Author

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Jan 2, 2020

@drsantos20 drsantos20 left a comment

Hi @mysunshine92 do we have tests for this change?

@mysunshine92 mysunshine92 force-pushed the cgroup-remove-hugepage-from-memory branch from 706dcaf to 90cf46d Compare January 2, 2020 06:17
@mysunshine92
Contributor Author

> Hi @mysunshine92 do we have tests for this change?

I have tested this on my k8s 1.13 cluster, and it works.

@mysunshine92
Contributor Author

/test pull-kubernetes-e2e-gce

@mattjmcnaughton
Contributor

> > Hi @mysunshine92 do we have tests for this change?
>
> I have tested this on my k8s 1.13 cluster, and it works.

Gotcha! Do you think unit tests would also be beneficial? The advantage of unit tests is that they run continuously, ensuring this fix keeps working in the future.
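
To make the suggestion concrete, here is a minimal table-driven sketch of what such a check could cover. It is only an illustration: `memoryAfterHugePages`, `testCase`, and `failures` are hypothetical names, quantities are plain byte counts rather than `resource.Quantity` values, and in the repository this would normally live in a `_test.go` file using the `testing` package.

```go
package main

import (
	"fmt"
	"strings"
)

// memoryAfterHugePages is a hypothetical stand-in for the logic under test:
// it subtracts every "hugepages-*" quantity from "memory" and clamps at zero.
func memoryAfterHugePages(resources map[string]int64) int64 {
	mem := resources["memory"]
	for name, qty := range resources {
		if strings.HasPrefix(name, "hugepages-") {
			mem -= qty
		}
	}
	if mem < 0 {
		return 0
	}
	return mem
}

// testCase pairs an input resource map with the expected allocatable memory.
type testCase struct {
	name      string
	resources map[string]int64
	want      int64
}

// failures runs every case and returns one message per mismatch.
func failures(cases []testCase) []string {
	var out []string
	for _, tc := range cases {
		if got := memoryAfterHugePages(tc.resources); got != tc.want {
			out = append(out, fmt.Sprintf("%s: got %d, want %d", tc.name, got, tc.want))
		}
	}
	return out
}

func main() {
	cases := []testCase{
		{"no huge pages", map[string]int64{"memory": 100}, 100},
		{"one page size", map[string]int64{"memory": 100, "hugepages-2Mi": 30}, 70},
		{"two page sizes", map[string]int64{"memory": 100, "hugepages-2Mi": 30, "hugepages-1Gi": 50}, 20},
		{"clamped at zero", map[string]int64{"memory": 10, "hugepages-1Gi": 50}, 0},
	}
	for _, msg := range failures(cases) {
		fmt.Println(msg)
	}
}
```

The cases cover the situations most likely to regress: no huge pages at all, several page sizes at once, and huge pages exceeding memory.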

…emory when Updating Node Allocatable limit across pods
@mysunshine92
Contributor Author

/test pull-kubernetes-e2e-gce

@k8s-ci-robot
Contributor

k8s-ci-robot commented Jan 3, 2020

@mysunshine92: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
pull-kubernetes-conformance-kind-ipv6-parallel | b2865abb319ebcd2b062abd50289888e080ff6cc | link | /test pull-kubernetes-conformance-kind-ipv6-parallel

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mysunshine92
Contributor Author

/test pull-kubernetes-e2e-gce

@mysunshine92
Contributor Author

/lgtm

@k8s-ci-robot
Contributor

@mysunshine92: you cannot LGTM your own PR.

In response to this:

/lgtm

@mysunshine92
Contributor Author

/priority backlog

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 3, 2020
@mysunshine92
Contributor Author

/assign @vishh

Member

@odinuge odinuge left a comment

This is exactly the same as we do when calculating node.Status.Allocatable[v1.ResourceMemory] here: https://github.com/kubernetes/kubernetes/blob/0599ca2/pkg/kubelet/nodestatus/setters.go#L354-L366

I think I would prefer a test to avoid regressions, though.

Other than the comments this looks good to me

/priority important-longterm

@@ -190,6 +191,21 @@ func (cm *containerManagerImpl) getNodeAllocatableAbsoluteImpl(capacity v1.Resou
		}
		result[k] = value
	}

	// for every huge page reservation, we need to remove it from allocatable memory
	for k, v := range result {
Member

Suggested change
-	for k, v := range result {
+	for k, v := range capacity {

Member

I believe this should be calculated from capacity, not from allocatable?

Contributor Author

Since memory usage does not include huge page usage, we should use allocatable here.

Member

> Since memory usage does not include huge page usage, we should use allocatable here.

I'm not sure I follow. If we reserve huge page memory in system-reserved and/or kube-reserved, we would want to subtract all huge page memory (capacity), not only the allocatable part. Or am I missing something?
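
A small numeric sketch may help show where the two readings diverge. It assumes, as the question above supposes, that a reservation exists for a huge page resource; `resourceList` and `allocatableMemory` are hypothetical simplifications (plain byte counts instead of `v1.ResourceList` quantities), not the kubelet's API.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceList maps a resource name to a quantity in bytes. It is a
// simplified stand-in for v1.ResourceList.
type resourceList map[string]int64

// allocatableMemory is a hypothetical helper. It computes memory capacity
// minus the memory reservation, then subtracts every huge page quantity found
// in hugePageSource, clamping the result at zero.
func allocatableMemory(capacity, reserved, hugePageSource resourceList) int64 {
	mem := capacity["memory"] - reserved["memory"]
	for name, qty := range hugePageSource {
		if strings.HasPrefix(name, "hugepages-") {
			mem -= qty
		}
	}
	if mem < 0 {
		return 0
	}
	return mem
}

func main() {
	const gi = int64(1) << 30
	capacity := resourceList{"memory": 16 * gi, "hugepages-2Mi": 4 * gi}
	reserved := resourceList{"memory": 1 * gi, "hugepages-2Mi": 1 * gi}

	// Allocatable huge pages: capacity minus the reservation (3Gi here).
	allocatable := resourceList{
		"hugepages-2Mi": capacity["hugepages-2Mi"] - reserved["hugepages-2Mi"],
	}

	// Subtracting allocatable huge pages (ranging over result).
	fmt.Println(allocatableMemory(capacity, reserved, allocatable) / gi) // prints 12
	// Subtracting all pre-allocated huge pages (ranging over capacity).
	fmt.Println(allocatableMemory(capacity, reserved, capacity) / gi) // prints 11
}
```

When no huge page reservation is configured, both loops give the same answer; the 1Gi gap above appears only because of the assumed reservation.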

	// for every huge page reservation, we need to remove it from allocatable memory
	for k, v := range result {
		if v1helper.IsHugePageResourceName(k) {
			allocatableMemory := result[v1.ResourceMemory]

This comment was marked as resolved.

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jan 7, 2020
@odinuge
Member

odinuge commented Feb 20, 2020

gentle ping @mysunshine92.

Would be nice to get this into 1.18 😄

@mysunshine92
Contributor Author

> gentle ping @mysunshine92.
>
> Would be nice to get this into 1.18

Please add the lgtm label, thanks.

@odinuge
Member

odinuge commented Jun 8, 2020

ping @mysunshine92

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 8, 2020
@k8s-ci-robot
Contributor

@mysunshine92: PR needs rebase.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 6, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 6, 2020
@mysunshine92
Contributor Author

/close

@k8s-ci-robot
Contributor

@mysunshine92: Closed this PR.

In response to this:

/close
