Fix cgroup hugetlb size prefix for kB #78495
What type of PR is this?
What this PR does / why we need it:
The behavior in the kernel has not changed since the introduction, so the current code using "kB" will fail on devices with small amounts of RAM (see #77169) running a kernel with the config flag CONFIG_HUGETLBFS=y.
As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
Here is a real world example of the files inside the
And the corresponding cgroup files:
Noticed this when tinkering around with Kubernetes on a Raspberry Pi (1GB RAM) running Arch Linux (
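To make the naming rule concrete, here is a minimal Go sketch of how the kernel names hugetlb cgroup sizes, modeled on "mem_fmt" in hugetlb_cgroup.c. The function names `kernelHugePageName` and `limitFileName` are hypothetical and are not part of the kubelet or runc; the page sizes in `main` are only example values.

```go
package main

import "fmt"

// kernelHugePageName mirrors the naming logic of mem_fmt in the kernel's
// hugetlb_cgroup.c: the size is rendered with "GB", "MB" or "KB" only,
// never with a lowercase "kB".
func kernelHugePageName(sizeBytes uint64) string {
	switch {
	case sizeBytes >= 1<<30:
		return fmt.Sprintf("%dGB", sizeBytes>>30)
	case sizeBytes >= 1<<20:
		return fmt.Sprintf("%dMB", sizeBytes>>20)
	default:
		return fmt.Sprintf("%dKB", sizeBytes>>10)
	}
}

// limitFileName is a hypothetical helper that builds the cgroup v1 limit
// file name for a given huge page size.
func limitFileName(sizeBytes uint64) string {
	return "hugetlb." + kernelHugePageName(sizeBytes) + ".limit_in_bytes"
}

func main() {
	// Example sizes only; the sizes actually available depend on the
	// architecture and kernel configuration.
	for _, size := range []uint64{64 << 10, 2 << 20, 32 << 20, 1 << 30} {
		fmt.Println(limitFileName(size))
	}
}
```

A lookup for a file named with "kB" cannot match anything produced this way, which is why only sizes below 1MB are affected.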
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?:
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Hi @odinuge. Thanks for your PR.
I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with
Once the patch is verified, the new status will be reflected by the
I understand the commands that are listed here.
mattjmcnaughton left a comment
Thanks for your work on this issue @odinuge :)
I think the largest question I'm trying to wrap my head around is whether this feature ever worked, or if it has been broken since it was first introduced?
If the former, will this change cause issues for those for whom it was working before?
Additionally, thoughts on how hard it would be to add a unit/integration test to assert this behavior is working?
Lmk when the change on which this depends is merged, and I'll mark this as ok-to-test.
I have twisted my head around this quite a few times myself. HugePages smaller than 1MB don't make that much sense after all... And the code is kinda strange, as some stuff is done inside runc, and more or less the same is done inside k8s.
After a bit more research I think I found a relevant patch (doing changes inside architecture code of arm64 only), https://lkml.org/lkml/2018/10/23/143. The line
So, it looks like that patch introduced the HugePage size
Yee, I do indeed prefer writing some tests. The file containing the code has ~18% coverage, but the func using it has 0.. :( Guess I can write a simple test ensuring KB is used, but I don't think that would give any value, or, what do you think?
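For reference, such a check could be as small as the sketch below. It is not the actual kubelet or runc code: `formatWithUnits` is a hypothetical stand-in for the go-units style formatting, and the two unit lists are assumptions that differ only in "kB" versus "KB".

```go
package main

import "fmt"

// formatWithUnits is a hypothetical stand-in for go-units style size
// formatting: divide by 1024 until the value fits, then append the unit.
func formatWithUnits(sizeBytes uint64, units []string) string {
	i := 0
	for sizeBytes >= 1024 && i < len(units)-1 {
		sizeBytes /= 1024
		i++
	}
	return fmt.Sprintf("%d%s", sizeBytes, units[i])
}

// Assumed unit lists: the old one uses the human-readable "kB",
// the fixed one uses the "KB" spelling the kernel expects.
var (
	oldUnits   = []string{"B", "kB", "MB", "GB", "TB", "PB"}
	fixedUnits = []string{"B", "KB", "MB", "GB", "TB", "PB"}
)

func main() {
	cases := []struct {
		size uint64
		want string // name as the kernel would spell it
	}{
		{64 << 10, "64KB"},
		{2 << 20, "2MB"},
		{1 << 30, "1GB"},
	}
	for _, c := range cases {
		old := formatWithUnits(c.size, oldUnits)
		fixed := formatWithUnits(c.size, fixedUnits)
		fmt.Printf("want=%s old=%s (ok=%t) fixed=%s (ok=%t)\n",
			c.want, old, old == c.want, fixed, fixed == c.want)
	}
}
```

Only the sub-1MB case differs between the two lists, so a test like this mostly guards against someone reintroducing "kB".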
Also, as I said in the runc-PR, this looks like the result of a quick copy-paste of the ByteSizes, meant for making huge numbers more human readable, from the docker/go-units package. https://github.com/docker/go-units/blob/519db1ee28dcc9fd2474ae59fca29a810482bfb1/size.go#L37. And another copy-paste from
Thanks for your detail here :) Tbh, I don't have a ton of experience with the cgroup manager component of the kubelet, so I'm not positive what exact type of test coverage already exists. At the very least, adding a unit test would be helpful. It would also be interesting to have an integration test covering the entire huge page workflow (if that's possible). I'm not sure how difficult that integration test would be to write (and it also may be more appropriate to do it outside of this diff).
After some consideration I made the list of units public/exported in
An integration test for cgroup (all the different parts) handling inside kubelet would be nice, but that is way outside my expertise atm... Without some serious rewrites, doing that in a general way (for testing this particular problem) would be quite hard I imagine. Internally
Cool, this sounds great! Agree that an integration test is a large undertaking and we shouldn't block this immediate fix on it. Can you please ping me when your upstream runc change is merged?
Thanks for your work here :)
referenced this pull request
Jun 3, 2019
re-read your comment, sounds like the runc bump is also required to fix the issue. If that's true, then using the constant from runc is more reasonable, but makes fixing this in past releases harder.
Did a rebase now. As you see above @liggitt, @derekwaynecarr (from sig-node & approver in
Also, how far back do you think we should cherry-pick this?
[APPROVALNOTIFIER] This PR is APPROVED
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Jun 28, 2019
Hi~@odinuge @liggit again~ Penny from ARM. super excited to see this PR got merged. ;)
Hi @Pennyzct! Yes, currently, the only way to run k8s on AArch64 is to compile the master branch of k8s. I think this can become a huge pain point when people get their hands on the new Raspberry Pi and start testing with 5.0 kernels and k8s. :/
I will make a patch for the
I can certainly do all the back porting, as long as there is will in sig-node to get it merged!
This was referenced
Jul 1, 2019
I agree, let's start with 1.15.
For me, the largest question is how often someone would be using this