
[Bug]: Default setting is only 1 CPU on macOS for the QEMU machine which causes surprises on Java #17066

Closed
isenberg opened this issue Jan 11, 2023 · 7 comments · Fixed by containers/common#1659
Assignees
Labels
kind/bug: Categorizes issue or PR as related to a bug.
locked - please file new issue/PR: Assist humans wanting to comment on an old issue or PR with locked comments.
machine
remote: Problem is in podman-remote

Comments

@isenberg

isenberg commented Jan 11, 2023

Issue Description

Quick summary for those who have the same issue and need a quick workaround: Add the number of CPUs during machine init. Example: podman machine init --cpus 8

On macOS, where podman uses QEMU for the underlying Linux VM, the default machine size is only 1 CPU, not all CPUs of the hardware. When starting a container with, for example, podman run --cpus=4, the resulting cgroup CPU quota and CPU period make it look like 4 CPUs are available, but other Linux interfaces still see only 1 CPU.

That causes surprising application behavior, for example in Java: the JVM only sees 1 CPU, so Runtime.availableProcessors() returns 1, which is then used by many open source frameworks to calculate thread pool sizes.

To check that output on the command line with Java 11 or newer:
echo 'Runtime.getRuntime().availableProcessors();/exit' | jshell -q

There may be a good reason why 1 was originally chosen as the default for podman on QEMU; if so, please close the bug with a note about it. Otherwise, if the concern is overusing the hardware, maybe 50% of the hardware CPUs would be an alternative default?
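As a rough sketch of what such a 50% default could look like on the host side (the halving policy is only the suggestion above, not current podman behavior; `sysctl -n hw.ncpu` and `nproc` are the usual macOS/Linux commands for the logical CPU count):

```shell
# Sketch: size the podman machine to half of the host's logical CPUs.
# macOS reports them via `sysctl -n hw.ncpu`, Linux via `nproc`.
HOST_CPUS=$(sysctl -n hw.ncpu 2>/dev/null || nproc)
MACHINE_CPUS=$(( HOST_CPUS / 2 ))
# Never go below 1 CPU on single-core hosts.
if [ "$MACHINE_CPUS" -lt 1 ]; then MACHINE_CPUS=1; fi
echo "podman machine init --cpus ${MACHINE_CPUS}"
```

On a 12-core host this would print `podman machine init --cpus 6`.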

Steps to reproduce the issue


  1. brew install podman
  2. podman machine init
  3. podman machine start
  4. podman run --cpus=4 ...
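The mismatch from step 4 can also be seen without Java; inside the container, a quick check (exact numbers depend on the machine) would be:

```shell
# CPUs visible to the scheduler: in the default 1-CPU machine this is 1,
# regardless of --cpus, because a quota does not add CPUs to the VM.
nproc
# The cgroup v2 quota/period written by --cpus (e.g. "400000 100000"):
cat /sys/fs/cgroup/cpu.max 2>/dev/null || echo "no cgroup v2 cpu.max here"
```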

Describe the results you received

1 CPU

Describe the results you expected

  • 4 CPUs when using podman run --cpus=4
  • 12 CPUs when using podman run without any CPU parameter, which should result in an unlimited container bounded only by the hardware, which has 12 CPUs in this case

podman info output

% podman info              
host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 99.87
    systemPercent: 0.09
    userPercent: 0.04
  cpus: 8
  distribution:
    distribution: fedora
    variant: coreos
    version: "37"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 506
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.0.16-300.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1639018496
  memTotal: 2063712256
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.7.2-3.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/506/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/506/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 3h 7m 52.00s (Approximately 0.12 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 2254102528
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/506/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668178887
  BuiltTime: Fri Nov 11 07:01:27 2022
  GitCommit: ""
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

macOS Ventura 13.1 on intel hardware

Additional information

Only relevant for podman on QEMU, i.e. on macOS and most likely also on Windows.

@isenberg isenberg added the kind/bug Categorizes issue or PR as related to a bug. label Jan 11, 2023
@github-actions github-actions bot added the remote Problem is in podman-remote label Jan 11, 2023
@Luap99 Luap99 added the machine label Jan 11, 2023
@Luap99

Luap99 commented Jan 11, 2023

The problem is that finding a default that satisfies everyone is hard; we have no idea how users are using the VM. I agree, though, that 1 core might not be the best default. Maybe something like cores/2 as the default is better?

@baude @ashley-cui WDYT?

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Member

rhatdan commented Feb 19, 2023

I agree: two cores, and allow users to set it in containers.conf.

@isenberg

Two would be better than one, but why the absolute number 2? What speaks technically against a percentage of the hardware CPUs? For example, 50% would match most desktop and notebook CPUs well, as those either have hyperthreads or, on arm64 MacBooks, 50% performance cores.

The idle usage of just having the podman VM running without any container is barely noticeable on a MacBook, even when it is set to a number equivalent to 50% of all hardware cores. That might have been different in the past, when the decision for the default of 1 was made; I also remember seeing higher idle usage back then.

The common use case from the end-user side is to run a container either without CPU limits or with CPU limits, and in the latter case usually a minimum of 2 is given, for example with "podman run --cpus=2.0 ID". The --cpus use case is currently broken for end users, as they don't want to, or don't know how to, change the backend VM machine settings.

For the use case without a CPU limit, I'd say the end-user expectation is to have all, or at least a good fraction, of the hardware CPUs available for peak loads, as that's usually the point of running containers without limits: you want to run a quick configuration test, or keep a container in the background for test builds, and you want it to complete its batch jobs quickly, not be throttled to a single core.

So my recommendation would be 50% of the hardware core number as default.

Waiting for discussions...

@isenberg

isenberg commented Feb 20, 2023

To better illustrate the problem with the default of 1, especially for Java, where it is more obvious than in other applications:

Here are the results with podman's default of 1 machine CPU on macOS, where the end user unexpectedly sees 1 CPU in Java despite assigning 3 CPUs to the container.

% podman run --cpus=3.0 -i -t --rm ubuntu:22.04
root@c7d6ce81b81c:/# apt update && apt install openjdk-17-jdk-headless
root@c7d6ce81b81c:/# echo 'Runtime.getRuntime().availableProcessors()' | jshell
|  Welcome to JShell -- Version 17.0.5
jshell> Runtime.getRuntime().availableProcessors()
$1 ==> 1
root@c7d6ce81b81c:/# cat /sys/fs/cgroup/cpu.max
300000 100000

I'm installing OpenJDK 17 there and checking the number of available CPUs with the Java method Runtime.getRuntime().availableProcessors(). That method is also used internally in Java to calculate thread pool sizes for application frameworks and JVM internals. When only 1 CPU is available, the JVM switches to SerialGC, an unusual mode you will never see on any server system, and I guess the thread pool calculations of many open source frameworks will also be surprised.
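The GC choice can be verified directly; assuming a stock OpenJDK, HotSpot's ergonomics select SerialGC on machines it considers "small" (fewer than 2 available CPUs, or very little memory):

```shell
# Print the GC flags the JVM settled on. Inside a container limited to
# 1 available CPU, UseSerialGC ends up true and UseG1GC false.
if command -v java >/dev/null 2>&1; then
  java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -E 'Use(Serial|G1)GC'
else
  echo "java not installed"
fi
```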

At the same time, the Linux cgroup shows 3 CPUs; that's visible in the cpu.max parameter read in the example above (300000 / 100000 = 3). On systems with the old cgroups v1, it would be cat /sys/fs/cgroup/cpu/cpu.cfs_quota_us and cat /sys/fs/cgroup/cpu/cpu.cfs_period_us.
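The quota arithmetic above can be captured in a small helper (the function name is mine, not a podman or kernel interface):

```shell
# Turn a cgroup v2 cpu.max line ("<quota> <period>" or "max <period>")
# into the effective CPU limit, e.g. "300000 100000" -> 3.
cpus_from_cpu_max() {
  quota=${1%% *}    # text before the first space
  period=${1##* }   # text after the last space
  if [ "$quota" = "max" ]; then
    echo "unlimited"
  else
    # Integer division suffices for whole-CPU limits like --cpus=3
    echo $(( quota / period ))
  fi
}

cpus_from_cpu_max "300000 100000"   # -> 3
cpus_from_cpu_max "max 100000"      # -> unlimited
```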

When changing the podman VM machine to more CPUs, for example with podman machine set --cpus=5, and then running the same 3-CPU container as above, Java / jshell reports 3 as expected, and the cgroup likewise shows 300000 / 100000. For an unlimited container, Java then reports 5 and the cgroup shows max / 100000, both as expected.

[updated some more details with 5 CPUs]

@github-actions

A friendly reminder that this issue had no activity for 30 days.

@rhatdan

rhatdan commented Mar 27, 2023

Sure, N cores would be fine.

@ashley-cui PTAL

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Dec 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 21, 2023

4 participants