Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cgroup hugetlb size prefix for kB #2065

Merged
merged 3 commits into from Jun 6, 2019

Conversation

Projects
None yet
3 participants
@odinuge
Copy link
Contributor

commented May 29, 2019

The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).

The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with small
amounts of ram (see
kubernetes/kubernetes#77169) running a kernel
with config flag CONFIG_HUGETLBFS=y

As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.

Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:

  • "hugepages-64kB"
  • "hugepages-2048kB"
  • "hugepages-32768kB"
  • "hugepages-1048576kB"

And the corresponding cgroup files:

  • "hugetlb.64KB._____"
  • "hugetlb.2MB._____"
  • "hugetlb.32MB._____"
  • "hugetlb.1GB._____"

Signed-off-by: Odin Ugedal odin@ugedal.com

Screenshot from 2019-05-29 22-37-33

Noticed this when tinkering around with kubernetes on a Raspberry PI (1GB ram) running Arch Linux (Linux k8s-003 5.1.5-1-ARCH #1 SMP Sat May 25 13:23:49 MDT 2019 aarch64 GNU/Linux)

Fix cgroup hugetlb size prefix for kB
The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).

The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with small
amounts of ram (see
kubernetes/kubernetes#77169) running a kernel
with config flag CONFIG_HUGETLBFS=y

As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.

Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"

And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"

Signed-off-by: Odin Ugedal <odin@ugedal.com>

odinuge added a commit to odinuge/kubernetes that referenced this pull request May 29, 2019

Fix cgroup hugetlb size prefix for kB
The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).

The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with small
amounts of ram (see
kubernetes#77169) running a kernel
with config flag CONFIG_HUGETLBFS=y

As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.

Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"

And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"

This does also require the following changes in runc:
opencontainers/runc#2065

Signed-off-by: Odin Ugedal <odin@ugedal.com>
@odinuge

This comment has been minimized.

Copy link
Contributor Author

commented May 29, 2019

Looks like this is a result of a quick copy-paste of the ByteSizes, meant for making huge numbers more human readable, from the docker/go-units package. https://github.com/docker/go-units/blob/519db1ee28dcc9fd2474ae59fca29a810482bfb1/size.go#L37

@mrunalp

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

LGTM

Approved with PullApprove

@odinuge

This comment has been minimized.

Copy link
Contributor Author

commented May 30, 2019

Some more background info here: kubernetes/kubernetes#78495 (comment)

@odinuge

This comment has been minimized.

Copy link
Contributor Author

commented May 30, 2019

Since the kernel only use "KB", "MB" and "GB", the most correct would be to only use them (including B, in order to make go-units work). For sizes larger than 1TB that would result in index-out-of-range, so that would also require a update of the docker/go-units (fixed here: docker/go-units@a09cd47).

What do you think @mrunalp?

Also, It would be nice having the list exported, so that other users (eg. kubernetes) can import it instead of copying it. 😄

Add tests for GetHugePageSize
Add tests to avoid regressions

Signed-off-by: Odin Ugedal <odin@ugedal.com>

@odinuge odinuge force-pushed the odinuge:master branch from fc12118 to b33e947 May 30, 2019

Export list of HugePageSizeUnits
This will allow others to import it instead of copying it.

Signed-off-by: Odin Ugedal <odin@ugedal.com>

@odinuge odinuge force-pushed the odinuge:master branch from b33e947 to 6f77e35 May 30, 2019

odinuge added a commit to odinuge/linux that referenced this pull request May 30, 2019

docs cgroups: add another example size for hugetlb
Add another example to clarify that HugePages smaller than 1MB will
be displayed using "KB", with an uppercased K (eg. 20KB), and not the
normal SI prefix kilo (small k).

Because of a misunderstanding/copy-paste error inside runc
(see opencontainers/runc#2065), it tried
accessing the cgroup control file of a 64kB HugePage using
"hugetlb.64kB._____" instead of the correct "hugetlb.64KB._____".

Adding a new example will make it clear how sizes smaller than 1MB are
handled.

Signed-off-by: Odin Ugedal <odin@ugedal.com>

ColinIanKing pushed a commit to ColinIanKing/linux-next-mirror that referenced this pull request May 31, 2019

docs cgroups: add another example size for hugetlb
Add another example to clarify that HugePages smaller than 1MB will
be displayed using "KB", with an uppercased K (eg. 20KB), and not the
normal SI prefix kilo (small k).

Because of a misunderstanding/copy-paste error inside runc
(see opencontainers/runc#2065), it tried
accessing the cgroup control file of a 64kB HugePage using
"hugetlb.64kB._____" instead of the correct "hugetlb.64KB._____".

Adding a new example will make it clear how sizes smaller than 1MB are
handled.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
@crosbymichael

This comment has been minimized.

Copy link
Member

commented Jun 3, 2019

LGTM

Approved with PullApprove

@odinuge

This comment has been minimized.

Copy link
Contributor Author

commented Jun 5, 2019

Looks like you have to post LGTM again since I changed the PR @mrunalp. Would be nice to get this into master, so that we can fix the issue in k8s. 😄

odinuge added a commit to odinuge/kubernetes that referenced this pull request Jun 5, 2019

Fix cgroup hugetlb size prefix for kB
The hugetlb cgroup control files (introduced here in 2012:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abb8206cb0773)
use "KB" and not "kB"
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/mm/hugetlb_cgroup.c?h=v5.0#n349).

The behavior in the kernel has not changed since the introduction, and
the current code using "kB" will therefore fail on devices with small
amounts of ram (see
kubernetes#77169) running a kernel
with config flag CONFIG_HUGETLBFS=y

As seen from the code in "mem_fmt" inside hugetlb_cgroup.c, only "KB",
"MB" and "GB" are used, so the others may be removed as well.

Here is a real world example of the files inside the
"/sys/kernel/mm/hugepages/" directory:
- "hugepages-64kB"
- "hugepages-2048kB"
- "hugepages-32768kB"
- "hugepages-1048576kB"

And the corresponding cgroup files:
- "hugetlb.64KB._____"
- "hugetlb.2MB._____"
- "hugetlb.32MB._____"
- "hugetlb.1GB._____"

This does also require the following changes in runc:
opencontainers/runc#2065

Signed-off-by: Odin Ugedal <odin@ugedal.com>
@mrunalp

This comment has been minimized.

Copy link
Contributor

commented Jun 6, 2019

LGTM

Approved with PullApprove

@mrunalp mrunalp merged commit b4a0b1d into opencontainers:master Jun 6, 2019

3 checks passed

DCO DCO
Details
code-review/pullapprove Approved by crosbymichael, mrunalp
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

odinuge added a commit to odinuge/agent that referenced this pull request Jun 22, 2019

vendor: update dependency opencontainers/runc
This updates openconatainer/runc (and its deps). Ref. opencontainers/runc#2065,
running a version before that fix will result in some strange behavior on
Aarch64 (Linux 5.0.X+).

Changes in opencontainers/runc:
    652297c7 Update dependency libseccomp-golang
    6770c869 Allow to define `COMMIT` by env
    b54fd85b libcontainer: change seccomp test for clone syscall
    6f77e35d Export list of HugePageSizeUnits
    c6445b1c Add tests for GetHugePageSize
    273e7b74 Fix cgroup hugetlb size prefix for kB
    65032b55 libcontainer: fix TestGetContainerState to check configs.NEWCGROUP
    8383c724 main: not reopen /dev/stderr
    46351eb3 Move systemd.Manager initialization into a function in that module
    62bd2593 VERSION: back to development
    425e105d VERSION: release 1.0.0-rc8
    8362cd02 Vendor in latest selinux code for keycreate errors
    a1460818 Write logs to stderr by default
    68b4ff5b Simplify bail logic & minor nsexec improvements
    17b37ea3 libcontainer: intelrdt: add missing destroy handler in defer func
    475aef10 Remove redundant log function
    ba3cabf9 Improve nsexec logging
    da5a2dd4 `r.destroy` can defer exec in `runner.run` method.
    8296826d specconv: always set "type: bind" in case of MS_BIND
    c486e3c4 Address comments in PR 1861
    feebfac3 Remove pipe close before exec.
    9a599f62 Support for logging from children processes
    3e6688f5 add selinux label for runc exec
    dcf994b4 Fix SELinux failures on disabled SELinux Machines
    6b5ee713 VERSION: back to development
    69ae5da6 VERSION: release v1.0.0-rc7
    eab53309 Fixes regression causing zombie runc:[1:CHILD] processes
    cd96170c Need to setup labeling of kernel keyrings.

Changes in seccomp/libseccomp-golang
    689e3c1 all: Update CHANGELOG for v0.9.1
    0353a0b golang: Add ActLog test
    798ec96 golang: Add support for SCMP_ACT_LOG
    23edf06 golang: Add filterAttrLog getter/setters test
    4b17538 golang: Add support for SCMP_FLTATR_CTL_LOG
    62d5d2b golang: Add API level bindings
    f6ec81d golang: add more info to errors with fmt.Errorf()
    da59163 golang: Fix unit test failures on 32-bit systems
    84e90a9 golang: Fix compile error on Debian
    06e7a29 golang: Resolve bug with handling of multiple argument rules
    fc02980 golang: Remove TSync functions, and set unconditionally
    9814e55 golang: Use `seccomp_version` API to obtain library version

Changes in sirupsen/logrus:
    839c75f Release 1.4.2
    744fc4c fix build break for plan9
    f2849a8 add full cross compilation in travis (#963)
    1bc909a Add a checkTerminal for nacl to support running on play.golang.org
    1a601d2 remove go 1.10 from ci build matrix
    5521996 Update x/sys/unix to fix AIX support
    c1b6154 Fix solaris build
    8bdbc7b Release 1.4.1
    6c615e1 remove field if val is empty string for func and file field in text formatter
    ede5b63 Make isTerminal un-exported
    3e06420 Move files to main directory
    38bc297 return new entry for Entry.WithContext
    7d700cd Test more platforms
    c49ef1d Move terminal package
    5d8c3bf Updated travis.yml
    41ee4dd Moved moved unix-related parts into terminal
    7de3dd8 Removed golang.org/x/crypto refs
    10ff0d0 Got rid of IsTerminal call to reduce external dependencies
    c076594 Add Go 1.12 to Travis CI build matrix
    02141df Add CHANGELOG for v1.4.0
    68e41f6 Add WithContext
    cf1b9fd fix sync.Once usage instead of adding a mutex lock
    b9d4514 fix ReportCaller race condition
    99a5172 Add and example for CallerPrettyfier
    5c2b39a Remove debug trace
    ffec2f2 Add a CallerPrettyfier callback to the text formatter
    5e9b246 Add a CallerPrettyfier callback to the json formatter
    4f5fd63 Fix infinite recursion on unknown Level.String()
    c4e4882 prevent string formatting in Entry.Logf when log level is not enabled
    774bb8e Fix error formatting based on best practices from Code Review Comments
    4ea4861 Add a DeferExitHandler function
    68a2b57 Add nested-logrus-formatter to README.md
    f61e48b logger: fix wrong callback method
    0f544bf Add a unit test to ensure hook are called in their registration order
    a99ca47 Add an example hook which adds default fields
    78fb385 Remove unused variables in TextFormatter
    eef6b76 Update Changelog for 1.3.0
    bd9534b Test Log
    e8fd0ba Remove sensitivity to file line changes
    ff695da Implement TextUnmarshaller interface for Level type
    a6668e7 Add Generic Log functions with level via argument
    9abefb9 do not clear error formatting informative field
    d962013 respect ForceColor and environment variables over OS check
    08e8d65 Skip func pointer type value in fields
    0c5e33c Travis: fix checkout dir to help contributors run Travis on their fork
    f1b98e4 ignore expected color on windows
    e902658 Disable colored output on windows entirely
    eab2c44 fix hook example
    c7183bf fix missing parameter
    2cafb78 fix race condition caused by writing to entry.Data, using the same technique as JSONFormatter
    bcd833d v1.2.0 changelog
    d10c2f9 fix panic in text formatter
    5a78c38 make file name comparison os independant
    d2654b7 add file and line number in output when report caller is enabled
    fa01b53 move test functions and test utils functions in their own package
    ec57031 store a runtime.Frame in Entry instead of the caller function name
    975c406 Use a sync.Once to init the reportCaller data
    5fcd19e add a SetReportCaller on Logger object
    0c52582 Add GELF to third party formatters
    5c1f2cd Make logrus.Level implement encoding.TextUnmarshaler
    bb98c6c Fix the version of windows coloring library dependency
    ed3ffa0 PR#844: Added Trace to TestLogLevelEnabled() (requested by @dgsb)
    b54cafe Addresses @stevvooe's backward compatibility concerns.
    ef9d84e Added trace log level.
    c7a33dc Add Trace level logging
    4981d81 Added TRACE level logging.
    9c7692c disable colors on hook example
    f2ab87f Add an example for tracing global variable with hook
    ff92509 Attempt to fix build break on aix
    a13c5db Fix typo in comment
    4346c76 Remove unnecessary wrapper function on `os.Exit`
    99bc300 Add a method Exit on Logger that calls `os.Exit` or alternate exit function.
    ad15b42 Update changelog for v1.1.1 release
    3f90cee Rationalize os specific build constraints
    2be6202 Add option to panic in `test.NewNullLogger` to allow testing of calls to `Fatal*`
    7b467df Skip func type value in fields.
    a67f783 Update changelog for v1.1.0 release
    73bc94e Add custom sorting function in text formatter
    5a88d3c Add missing module dependency for windows build
    629982b DisableColors in two tests to fix AppEngine configuration
    0a8fc8d Add AppEngine test configurations to travis to a void regression
    f1ce1ba Fix copypasta
    90501cf Fix AppEngine builds
    98c898c Fix gopherjs build constraint name
    eed7c22 Fix travis build for go 1.11 with modules
    66895ce Fix module name and remove unused dependencies
    88eb166 Fix spelling in Entry.Buffer comment
    f75951b Add go module support
    4bcb47b commit to trigger appveyor build
    8b12043 Fix example build on windows
    7556e24 Use syslog instead of airbrake as syslog example
    e58aa84 bump go toolchain version in travis
    98d0f31 Add previously forgotten v1.0.6 description in changelog
    90bf2e7 feat(LogLevel): taking in account code review from David Bariod
    13d10d8 return old hooks from RelplaceHooks
    7a0120e logger.ReplaceHooks
    b5e6fae Cleanup on unit test on isColored
    cadf2ce Add unit test for TextFormatter.isColored
    eb968b6 Fix for CLICOLOR_FORCE handling
    8a6a17c Fixed missing brace after wrong merge
    d950ecd Remove unnecessary text_formatter file
    da39da2 Keep terminal check naming convention
    37d651c Add CLICOLOR support
    179037f Ensure a new entry data fields are empty
    d316277 Add logger benchmark
    54db2bb limit the build/test matrix to the two latest stable version
    6999e59 properly fix the hooks race test
    725f3be Adds WithTime to Logger and Entry types, as well as a pure module-level function.
    52b92f5 Allows overriding Entry.Time.
    fc9bbf2 [kata-containers#241] Allow to set writer during logger usage.
    eed1c0f Fix GopherJS build tags
    2ce6c0c Support for Entry data under nested JSON dictionary.
    6b28c2c error message
    5d60369 Fixed prefixFieldClashes for TextFormatter and added coverage
    4225d69 feat: new methods to check enabled log level
    070c81d Revert the change introduced in #707 and do the proper fix. Fixes #729
    098a5a7 Move the hook services list to a wiki page
    caed59e Fix Logger.WithField doscription
    aa6766a PERF: use buffer pool in json formatter
    b1e82be Update go versions in travis configuration.
    8369e2f Fix a race condition in TestLoggingWithHooksRace
    507c822 add mysql hook
    e63a8df added Anexia CloudLog to list of hooks
    5513c60 Improve documentation for Fatal* class functions
    2f58bc8 Unified terminal initialization code handling
    9bc59a5 Fixed initTerminal() was run for non-terminals
    cf5eba7 Simplified file structure
    c9a46a1 Added terminal check on Windows
    7d2a521 Extended conditions to include non-native builds
    f142d81 Improved building of non-windows code
    bb487e0 Added support for text coloring on Windows 10
    19b9c9e delete dead link
    b537da5 Fix run-on sentence
    723dd3c changed prettyprinting to use spaces as opposed to /t
    c155da1 changelog: add 1.0.5
    91b159d Add Kafka REST Proxy hook to README
    c840e59 add gopherjs build tag
    1893e9a Fixed: comment
    f4118d2 reamde: add logrus-clickhouse-hook
    efab7f3 Have prefixFieldClashes respect the JSON FieldMap
    be56909 Make fireHooks() method receive a copy of Entry structure to avoid race conditions
    178041e Fix typo in README.md
    828a649 rename fieldLogger to entry
    eeb6535 Lock mutex before formatting to avoid race
    efbfdb5 Add failing test for using a FieldLogger with hooks inside goroutines
    0cf9f0b Made text consistent with other hooks
    516f6c1 Add Application Insights hook to README
    977e033 Fix deadlock on panics at Entry.log
    92aece5 TextFormatter behaviour aligned with stdlib log (fixes kata-containers#167)
    eb15690 remove .gitignore changes and update AddHook
    20cc8e2 remove .gitignore changes
    0c03a05 mirror and wrap Logger instance methods in exported.go
    d682213 changelog: 1.0.4
    b9eceae fix example
    bf1fb70 Add FieldMap support to TestFormatter
    73a1342 Fix typo in README.md
    10d6a5b removed useless line from readme
    639325f added pretty print option for json logs
    9700beb Update README.md
    1858a85 Adds `logbeat` hook to README
    c44d524 Fix typo in docstring
    4844e58 Add promrus to list of hooks.
    7d3ddc6 Split terminal check to add build tags to support App Engine.
    cd1114d Added reference to AzureTableHook
    9bc52e3 Fix test assertion
    c830992 Take lock on mutex when firing hooks
    66230b2 Add test for race condition in hooks
    3d1341c Add AddHook method for logger
    5efed00 Update README.md to fix link to Kafka hook
    3bd397e Add Telegram hook to README.md.
    e3d1776 MD formatting
    9ce1c9e add github path to log message in readme
    b1db1b9 regex assertion rather than literal, for github path
    3cb9e18 test updates
    bc6d984 add caller logic to DisableTimestamp case
    1f59c9a Add DisableLevelTruncation description to README
    88dd8df responses to code review
    d8fd234 add syntax hilighting to new example blocks
    2e7c40e README formatting tweak
    802fba1 add note on caller-reporting overhead to README
    306956c tweak timing tests to handle slower VMs and older GoLang
    65f3af3 simplify hasCaller check
    a5c845c responses to review comments
    4575b7a revert slight added complexity in NewEntry()
    05a8f4d fix test description
    348bace doc updates, and relabel ReportMethod
    1e21450 push compilation even higher, to reduce to one call
    8161d93 performance: precompile regex before iterating
    473c344 Add README notes and CHANGELOG entries
    93af604 First cut at adding calling method
    e5b6713 Added testing for DisableLevelTruncation
    7a1f601 Added ability to disable level text truncation. Fixes kata-containers#406

Signed-off-by: Odin Ugedal <odin@ugedal.com>

thaJeztah added a commit to thaJeztah/containerd that referenced this pull request Jun 25, 2019

bump runc v1.0.0-rc8-32-gf4982d86
full diff: opencontainers/runc@v1.0.0-rc8...f4982d8

possibly relevant changes included:

- opencontainers/runc#2074 Update dependency libseccomp-golang
  - fixes https://nvd.nist.gov/vuln/detail/CVE-2017-18367
- opencontainers/runc#2065 Fix cgroup hugetlb size prefix for kB
- opencontainers/runc#2042 libcontainer: intelrdt: add missing destroy handler in defer func
- opencontainers/runc#2042 main: not reopen /dev/stderr
- opencontainers/runc#2038 `r.destroy` can defer exec in `runner.run` method
- opencontainers/runc#2035 specconv: always set "type: bind" in case of MS_BIND
- opencontainers/runc#2035 Move systemd.Manager initialization into a function in that module
- opencontainers/runc#2034 Support for logging from children processes

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

gs0622 added a commit to gs0622/linux that referenced this pull request Jun 25, 2019

docs cgroups: add another example size for hugetlb
Add another example to clarify that HugePages smaller than 1MB will
be displayed using "KB", with an uppercased K (eg. 20KB), and not the
normal SI prefix kilo (small k).

Because of a misunderstanding/copy-paste error inside runc
(see opencontainers/runc#2065), it tried
accessing the cgroup control file of a 64kB HugePage using
"hugetlb.64kB._____" instead of the correct "hugetlb.64KB._____".

Adding a new example will make it clear how sizes smaller than 1MB are
handled.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.