Extending Hugepage Feature #1539

Closed
bg-chun opened this issue Feb 5, 2020 · 32 comments
Labels: kind/feature, sig/node, stage/stable, tracked/no

@bg-chun
Member

bg-chun commented Feb 5, 2020

Enhancement Description

  • One-line enhancement description:

    • Extend the hugepages feature to overcome its limitations. This enhancement consists of (1) container isolation of hugepages and (2) support for multiple hugepage sizes.
  • Kubernetes Enhancement Proposal: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/20190129-hugepages.md

  • Primary contact (assignee): @bg-chun

  • Responsible SIGs: sig-node

  • Enhancement target (which target equals to which milestone):

    • Alpha release target (1.18) // This enhancement extends a feature that is already at the GA stage.
    • Beta release target (x.y)
    • Stable release target (x.y)

PR Tracker

The hugepages KEP has been updated for several enhancements.

  • Support container isolation of hugepages / KEP Update1(merged)
  • Support multi size hugepages at host level / KEP Update2(merged)
  • Support multi size hugepages at container level / KEP Update2(merged)
  • Support hugepage reservation for system-level service / part of original KEP

PRs for container isolation of hugepages

Kubernetes Side

| PR | Description | Status | Target | Owner |
|---|---|---|---|---|
| kubernetes/kubernetes#83614 | Update CRI to support hugepages | merged | 1.18 | @bg-chun |
| kubernetes/kubernetes#84154 | Support for setting hugepages limit during container creation | merged | 1.18 | @ohsewon |
| kubernetes/kubernetes#87118 | e2e_node test for container isolation of hugepage | need review | 1.19 | @ohsewon |

CRI Runtime Side

| PR | Description | Status | Target | Owner |
|---|---|---|---|---|
| kubernetes/kubernetes#84701 | Update Dockershim | WIP | 1.19 | @admanV |
| moby/moby#40160 | Add hugepages field to resource (moby) | Approved | 1.18 | @bg-chun |
| cri-o/cri-o#2940 | Update container runtimes (cri-o) | Merged | 1.18 | @bg-chun |
| containerd/cri#1332 | Update container runtimes (containerd) | Merged | 1.18 | @bg-chun |

PRs for supporting multiple sizes of hugepages

| PR | Description | Status | Target | Owner |
|---|---|---|---|---|
| kubernetes/kubernetes#82820 | Support for pre-allocated hugepages with 2+ sizes (for host) | Merged | 1.18 | @odinuge |
| kubernetes/kubernetes#84051 | Support for multiple sizes huge pages (for pod) | Merged | 1.18 | @bart0sh |

Hugepages feature related PRs (out of scope of the KEP updates)

| PR | Description | Status | Target | Owner |
|---|---|---|---|---|
| #80831 | Add support for removing unsupported huge page sizes | got lgtm | | @odinuge |
| #83541 | Support for reserving hugepages for system and kubelet | needs 2nd review | | @odinuge |
| #80605 | Add huge page usage stats to kubectl describe node | needs CLI review | | @odinuge |
| #81774 | Bugfix: Kubelet doesn’t update /sys/fs/cgroup/hugetlb/kubepods/hugetlb.2MB.limit_in_bytes upon Node Status Update | got lgtm | | @rojkov |
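
For readers unfamiliar with the cgroup file named in #81774, the following is a minimal Python sketch of how such a path could be built for a given page size. The function name `hugetlb_limit_file` is hypothetical, and the KB/MB/GB suffix rule is an assumption modeled on the cgroup v1 hugetlb naming seen in the path above; it is not taken from the kubelet code.

```python
def hugetlb_limit_file(page_size_bytes: int, cgroup: str = "kubepods") -> str:
    """Build a cgroup v1 hugetlb limit file path for one page size.

    Assumption: sizes are labelled with the largest of KB/MB/GB that
    does not exceed the page size, as in `hugetlb.2MB.limit_in_bytes`.
    """
    if page_size_bytes < 1024:
        raise ValueError("page size must be at least 1 KiB")
    if page_size_bytes >= 1 << 30:
        label = f"{page_size_bytes >> 30}GB"
    elif page_size_bytes >= 1 << 20:
        label = f"{page_size_bytes >> 20}MB"
    else:
        label = f"{page_size_bytes >> 10}KB"
    return f"/sys/fs/cgroup/hugetlb/{cgroup}/hugetlb.{label}.limit_in_bytes"


if __name__ == "__main__":
    # One limit file per pre-allocated page size on the node.
    for size in (2 * 1024 * 1024, 1024 * 1024 * 1024):
        print(hugetlb_limit_file(size))
```

With multiple pre-allocated sizes on a node, there would be one such file per size under the same cgroup directory.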
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 5, 2020
@bg-chun
Member Author

bg-chun commented Feb 5, 2020

/milestone v1.18

@k8s-ci-robot
Contributor

k8s-ci-robot commented Feb 5, 2020

@bg-chun: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone v1.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@bg-chun
Member Author

bg-chun commented Feb 5, 2020

/sig node
/kind feature

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/feature Categorizes issue or PR as related to a new feature. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 5, 2020
@bg-chun
Member Author

bg-chun commented Feb 5, 2020

/assign @bg-chun

@bg-chun
Member Author

bg-chun commented Feb 5, 2020

Adding a related conversation with @justaugustus and @jeremyrickard from Slack:
https://kubernetes.slack.com/archives/C2C40FMNF/p1580827241393800

justaugustus:kubecat:  11:40 PM
So the biggest thing I see is that this is missing a way for the Release Team to track it  
The tracking issue that you have open should really be in the enhancements repo  
The KEP is missing the Release Team checklist, which might have been implemented after the enhancement went GA  
but given that this is a revisit of the KEP, I'd suggest opening another enhancements issue and following the template.  
From there, you'll need to add the Release Team Checklist: https://github.com/kubernetes/enhancements/blob/master/keps/YYYYMMDD-kep-template.md#release-signoff-checklist

https://kubernetes.slack.com/archives/C2C40FMNF/p1580827722403800

jerickar  11:48 PM
Was just about to write that :) @bg.chun the enhancement freeze for 1.18 was last week.  
 You’d need to get the Issue created and the KEP updated ASAP and we would need to grant you an exception to get into the release.  
You should for sure do what @justaugustus has called out above and file the exception request : https://github.com/kubernetes/sig-release/blob/master/releases/EXCEPTIONS.md  
The new issue and the KEP will need to happen regardless of release though, so even if we can’t grant an exception for 1.18 you will need those for 1.19

@bg-chun
Member Author

bg-chun commented Feb 5, 2020

/cc @justaugustus, @jeremyrickard @derekwaynecarr, @bart0sh ,@odinuge, @kad
sig-release: @justaugustus, @jeremyrickard
sig-node: @derekwaynecarr, @bart0sh ,@odinuge, @kad , @bg-chun

Following the guidance from sig-release, I created this issue for release tracking and added a checklist to it.

I have some questions about the release checklist.
We are extending the hugepages feature, which is already implemented and at the GA stage.

So it is hard to meet the checklist right now.
I have organized my questions below.

  1. KEP approvers have set the KEP status to implementable
    The KEP is already implemented/GA.
    What status should the KEP have in this case?
    Should we change the status and then start alpha v2/beta v2/GA v2?

  2. Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
    I opened a PR to update the KEP to have a test plan. Is it sufficient?
    Update hugepages KEP for 1.18 rel #1540
    => Done

  3. "Implementation History" section is up-to-date for milestone
    I opened a PR to update the KEP's implementation history. Is it sufficient?
    Update hugepages KEP for 1.18 rel #1540
    => Done

  4. User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
    Update hugepages documentation website#19008
    => Done

@bg-chun
Member Author

bg-chun commented Feb 6, 2020

The PR for the website docs is open :)
kubernetes/website#19008

@bg-chun
Member Author

bg-chun commented Feb 10, 2020

We requested a release exception for 1.18.

The exception has been granted by sig-release.

@jeremyrickard
Contributor

jeremyrickard commented Feb 10, 2020

/milestone v1.18

@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Feb 10, 2020
@jeremyrickard jeremyrickard added the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Feb 10, 2020
@VineethReddy02

VineethReddy02 commented Feb 10, 2020

Hello, @bg-chun, I'm the 1.18 docs lead.
Does this enhancement work planned for 1.18 require any new docs (or modifications to existing docs)? If not, can you please update the 1.18 Enhancement Tracker Sheet (or let me know and I'll do so)
If so, just a friendly reminder we're looking for a PR against k/website (branch dev-1.18) due by Friday, Feb 28th, it can just be a placeholder PR at this time. Let me know if you have any questions!

@bart0sh
Contributor

bart0sh commented Feb 11, 2020

@VineethReddy02 Yes, this enhancement requires a documentation update. Here is the PR for it: kubernetes/website#19008

@jeremyrickard
Contributor

jeremyrickard commented Feb 11, 2020

Hey @bg-chun @bart0sh,

Thanks so much for all the effort in getting this through! Just a friendly reminder that code freeze for 1.18 is March 05, 2020. Is there anything we should track, aside from your very helpful PR tracker up at the top of the issue?

@bg-chun
Member Author

bg-chun commented Feb 12, 2020

Is there anything we should track
=> I think so. We have one unmerged PR (kubernetes/kubernetes#84051),
and @liggitt requested a change to its validation logic.

@jeremyrickard
Contributor

jeremyrickard commented Mar 3, 2020

@bg-chun thanks for getting that PR merged. You mentioned @liggitt suggested a change for the validation logic, do you have a PR for that?

@bart0sh
Contributor

bart0sh commented Mar 4, 2020

@jeremyrickard The validation logic change was included in that PR; no other changes have been requested.

The only 2 PRs that still need to be reviewed and merged are:

I've asked the sig-node maintainers to review and merge them at the last two sig-node meetings.

@palnabarun palnabarun added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Apr 17, 2020
@palnabarun
Member

palnabarun commented Apr 29, 2020

/milestone clear

(removing this enhancement issue from the v1.18 milestone as the milestone is complete)

@k8s-ci-robot k8s-ci-robot removed this from the v1.18 milestone Apr 29, 2020
@fejta-bot

fejta-bot commented Jul 28, 2020

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot
Contributor

k8s-ci-robot commented Dec 27, 2020

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ehashman
Member

ehashman commented Jan 8, 2021

/reopen
/stage alpha

@k8s-ci-robot k8s-ci-robot added the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Jan 8, 2021
@ehashman
Member

ehashman commented Jan 8, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 8, 2021
@k8s-ci-robot
Contributor

k8s-ci-robot commented Jan 8, 2021

@ehashman: Reopened this issue.

In response to this:

/reopen
/stage alpha

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fejta-bot

fejta-bot commented Apr 9, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 9, 2021
@ehashman
Member

ehashman commented Apr 9, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 9, 2021
@ehashman
Member

ehashman commented Apr 28, 2021

/milestone v1.22

@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone Apr 28, 2021
@JamesLaverack JamesLaverack added tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels May 1, 2021
@ehashman
Member

ehashman commented May 4, 2021

/stage stable

This was alpha in 1.18, beta in 1.19, and should graduate this release, 1.22.

  • The HugePages feature was alpha in 1.8, beta in ??? and graduated in 1.14.
  • This feature, HugePageStorageMediumSize, was alpha in 1.18, beta in 1.19, and seeks to graduate in 1.22.

It's a bit confusing because both are using the same design document.
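
To make the distinction concrete, here is a minimal Python sketch of a pod manifest that requests two hugepage sizes at once, which is the case HugePageStorageMediumSize covers. The helper name `hugepage_pod`, the image, the amounts, and the mount paths are all illustrative assumptions; only the `hugepages-<size>` resource names and the `HugePages-<size>` emptyDir medium follow the Kubernetes API.

```python
import json


def hugepage_pod(name: str, image: str, sizes: dict) -> dict:
    """Build a pod manifest using several hugepage sizes.

    `sizes` maps a page size to the amount requested,
    e.g. {"2Mi": "100Mi", "1Gi": "2Gi"}.
    """
    # A regular memory limit is included alongside the hugepages limits,
    # mirroring the upstream documentation example.
    limits = {"memory": "100Mi"}
    volumes, mounts = [], []
    for size, amount in sizes.items():
        limits[f"hugepages-{size}"] = amount
        volume_name = f"hugepage-{size.lower()}"
        # The size-specific medium is what distinguishes this from
        # the original single-size `medium: HugePages` volume.
        volumes.append(
            {"name": volume_name, "emptyDir": {"medium": f"HugePages-{size}"}}
        )
        mounts.append({"name": volume_name, "mountPath": f"/hugepages-{size}"})
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {"limits": limits},
                    "volumeMounts": mounts,
                }
            ],
            "volumes": volumes,
        },
    }


if __name__ == "__main__":
    pod = hugepage_pod("hugepages-example", "fedora:latest",
                       {"2Mi": "100Mi", "1Gi": "2Gi"})
    print(json.dumps(pod, indent=2))
```

The printed JSON can be passed to `kubectl apply -f -` on a cluster whose nodes have both page sizes pre-allocated.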

@k8s-ci-robot k8s-ci-robot added stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status and removed stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status labels May 4, 2021
@k8s-triage-robot

k8s-triage-robot commented Aug 5, 2021

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 5, 2021
@salaxander salaxander added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Aug 12, 2021
@salaxander

salaxander commented Aug 18, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2021
@salaxander

salaxander commented Aug 18, 2021

With the KEP now marked as implemented, I'm going to close out this issue. Great job everyone!
