Pipeline and CI sometimes hitting EAGAIN (Resource temporarily unavailable) on fork #1186

Open
jlebon opened this issue Apr 11, 2022 · 9 comments

@jlebon
Member

jlebon commented Apr 11, 2022

Encountered this during a build of the next stream:

[2022-04-11T17:07:58.571Z] + cosa buildextend-vultr
[2022-04-11T17:07:58.646Z] [INFO]: CLI is a symlink for cmd-buildextend-openstack
[2022-04-11T17:07:58.646Z] [INFO]: Target 'OPENSTACK' is a Qemu Variant image
[2022-04-11T17:07:58.824Z] [INFO]: CLI is a symlink for cmd-buildextend-virtualbox
[2022-04-11T17:07:58.824Z] [INFO]: Targeting architecture: x86_64
[2022-04-11T17:07:58.824Z] [INFO]: Targeting build: 35.20220410.2.0
[2022-04-11T17:07:58.824Z] [ERROR]: unexpected linkcount: -1
[2022-04-11T17:07:58.824Z] [INFO]: CLI is a symlink for cmd-buildextend-vmware
[2022-04-11T17:07:58.824Z] [INFO]: Targeting architecture: x86_64
[2022-04-11T17:07:58.824Z] [INFO]: Targeting build: 35.20220410.2.0
[2022-04-11T17:07:58.824Z] [INFO]: Processed build for: Fedora CoreOS testing (FEDORA-COREOS-x86_64) 35.20220410.2.0
[2022-04-11T17:07:58.824Z] [INFO]: Processing the build artifacts
[2022-04-11T17:07:58.824Z] [INFO]: Staging temp image: /home/jenkins/agent/workspace/build/tmp/tmpijnutbt3/fedora-coreos-35.20220410.2.0-vmware.x86_64.vmdk
[2022-04-11T17:07:58.824Z] + /usr/lib/coreos-assembler/gf-platformid builds/35.20220410.2.0/x86_64/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2 /home/jenkins/agent/workspace/build/tmp/tmpijnutbt3/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2.working vmware
[2022-04-11T17:07:58.824Z] Traceback (most recent call last):
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cmd-buildextend-vmware", line 120, in <module>
[2022-04-11T17:07:58.824Z]     artifact_cli().build_artifacts()
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cosalib/build.py", line 414, in build_artifacts
[2022-04-11T17:07:58.824Z]     self._build_artifacts(*args, **kwargs)
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 314, in _build_artifacts
[2022-04-11T17:07:58.824Z]     meta_patch = self.mutate_image()
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 236, in mutate_image
[2022-04-11T17:07:58.824Z]     self.set_platform()
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 214, in set_platform
[2022-04-11T17:07:58.824Z]     run_verbose(['/usr/lib/coreos-assembler/gf-platformid',
[2022-04-11T17:07:58.824Z]   File "/usr/lib/coreos-assembler/cosalib/cmdlib.py", line 67, in run_verbose
[2022-04-11T17:07:58.824Z]     process = subprocess.run(args, **kwargs)
[2022-04-11T17:07:58.824Z]   File "/usr/lib64/python3.10/subprocess.py", line 501, in run
[2022-04-11T17:07:58.824Z]     with Popen(*popenargs, **kwargs) as process:
[2022-04-11T17:07:58.824Z]   File "/usr/lib64/python3.10/subprocess.py", line 966, in __init__
[2022-04-11T17:07:58.824Z]     self._execute_child(args, executable, preexec_fn, close_fds,
[2022-04-11T17:07:58.824Z]   File "/usr/lib64/python3.10/subprocess.py", line 1775, in _execute_child
[2022-04-11T17:07:58.824Z]     self.pid = _posixsubprocess.fork_exec(
[2022-04-11T17:07:58.824Z] BlockingIOError: [Errno 11] Resource temporarily unavailable
[2022-04-11T17:07:58.827Z] [Pipeline] }
[2022-04-11T17:07:58.829Z] [Pipeline] // withEnv
[2022-04-11T17:07:58.855Z] [Pipeline] }
[2022-04-11T17:07:58.855Z] Failed in branch VMware
[2022-04-11T17:07:59.201Z] [INFO]: Targeting architecture: x86_64
[2022-04-11T17:07:59.201Z] [INFO]: Targeting build: 35.20220410.2.0
[2022-04-11T17:07:59.201Z] [INFO]: Processed build for: Fedora CoreOS testing (FEDORA-COREOS-x86_64) 35.20220410.2.0
[2022-04-11T17:07:59.201Z] [INFO]: Processing the build artifacts
[2022-04-11T17:07:59.201Z] [INFO]: Staging temp image: /home/jenkins/agent/workspace/build/tmp/tmpqoh99aiz/fedora-coreos-35.20220410.2.0-openstack.x86_64.qcow2
[2022-04-11T17:07:59.201Z] + /usr/lib/coreos-assembler/gf-platformid builds/35.20220410.2.0/x86_64/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2 /home/jenkins/agent/workspace/build/tmp/tmpqoh99aiz/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2.working openstack
[2022-04-11T17:07:59.379Z] [INFO]: Processed build for: Fedora CoreOS testing (FEDORA-COREOS-x86_64) 35.20220410.2.0
[2022-04-11T17:07:59.379Z] [INFO]: Processing the build artifacts
[2022-04-11T17:07:59.379Z] [INFO]: Staging temp image: /home/jenkins/agent/workspace/build/tmp/tmp70_zbf9g/fedora-coreos-35.20220410.2.0-virtualbox.x86_64.vmdk
[2022-04-11T17:07:59.379Z] + /usr/lib/coreos-assembler/gf-platformid builds/35.20220410.2.0/x86_64/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2 /home/jenkins/agent/workspace/build/tmp/tmp70_zbf9g/fedora-coreos-35.20220410.2.0-qemu.x86_64.qcow2.working virtualbox
[2022-04-11T17:07:59.379Z] Traceback (most recent call last):
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cmd-buildextend-virtualbox", line 120, in <module>
[2022-04-11T17:07:59.379Z]     artifact_cli().build_artifacts()
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cosalib/build.py", line 414, in build_artifacts
[2022-04-11T17:07:59.379Z]     self._build_artifacts(*args, **kwargs)
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 314, in _build_artifacts
[2022-04-11T17:07:59.379Z]     meta_patch = self.mutate_image()
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 236, in mutate_image
[2022-04-11T17:07:59.379Z]     self.set_platform()
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cosalib/qemuvariants.py", line 214, in set_platform
[2022-04-11T17:07:59.379Z]     run_verbose(['/usr/lib/coreos-assembler/gf-platformid',
[2022-04-11T17:07:59.379Z]   File "/usr/lib/coreos-assembler/cosalib/cmdlib.py", line 67, in run_verbose
[2022-04-11T17:07:59.379Z]     process = subprocess.run(args, **kwargs)
[2022-04-11T17:07:59.379Z]   File "/usr/lib64/python3.10/subprocess.py", line 501, in run
[2022-04-11T17:07:59.379Z]     with Popen(*popenargs, **kwargs) as process:
[2022-04-11T17:07:59.379Z]   File "/usr/lib64/python3.10/subprocess.py", line 966, in __init__
[2022-04-11T17:07:59.379Z]     self._execute_child(args, executable, preexec_fn, close_fds,
[2022-04-11T17:07:59.379Z]   File "/usr/lib64/python3.10/subprocess.py", line 1775, in _execute_child
[2022-04-11T17:07:59.379Z]     self.pid = _posixsubprocess.fork_exec(
[2022-04-11T17:07:59.379Z] BlockingIOError: [Errno 11] Resource temporarily unavailable
[2022-04-11T17:07:59.384Z] [Pipeline] }
[2022-04-11T17:07:59.385Z] [Pipeline] // withEnv
[2022-04-11T17:07:59.410Z] [Pipeline] }
[2022-04-11T17:07:59.410Z] Failed in branch VirtualBox
[2022-04-11T17:08:06.622Z] **
[2022-04-11T17:08:06.622Z] ERROR:src/lib/rpmostree.c:122:rpm_ostree_get_basearch: assertion failed (local_error == NULL): Failed to invoke rpm-ostree shlib-backend: thread 'main' panicked at 'OS can't spawn a new worker thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', /builddir/build/BUILD/rpm-ostree-2022.5.80.gb7f91619/vendor/tokio/src/runtime/blocking/pool.rs:297:14
[2022-04-11T17:08:06.622Z] note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2022-04-11T17:08:06.622Z]  (g-io-error-quark, 0)
[2022-04-11T17:08:06.622Z] Bail out! ERROR:src/lib/rpmostree.c:122:rpm_ostree_get_basearch: assertion failed (local_error == NULL): Failed to invoke rpm-ostree shlib-backend: thread 'main' panicked at 'OS can't spawn a new worker thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', /builddir/build/BUILD/rpm-ostree-2022.5.80.gb7f91619/vendor/tokio/src/runtime/blocking/pool.rs:297:14
[2022-04-11T17:08:06.622Z] note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2022-04-11T17:08:06.622Z]  (g-io-error-quark, 0)
[2022-04-11T17:08:06.875Z] /home/jenkins/agent/workspace/build@tmp/durable-d83c6a4a/script.sh: line 7: 23305 Aborted                 (core dumped) cosa buildextend-vultr
[2022-04-11T17:08:06.878Z] [Pipeline] }
[2022-04-11T17:08:06.880Z] [Pipeline] // withEnv
[2022-04-11T17:08:06.896Z] [Pipeline] }
[2022-04-11T17:08:06.897Z] Failed in branch Vultr
[2022-04-11T17:08:07.252Z] **
[2022-04-11T17:08:07.252Z] ERROR:src/lib/rpmostree.c:122:rpm_ostree_get_basearch: assertion failed (local_error == NULL): Failed to invoke rpm-ostree shlib-backend: thread 'main' panicked at 'OS can't spawn a new worker thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', /builddir/build/BUILD/rpm-ostree-2022.5.80.gb7f91619/vendor/tokio/src/runtime/blocking/pool.rs:297:14
[2022-04-11T17:08:07.252Z] note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2022-04-11T17:08:07.252Z]  (g-io-error-quark, 0)
[2022-04-11T17:08:07.252Z] Error running command /usr/lib/coreos-assembler/gf-platformid
[2022-04-11T17:08:07.255Z] [Pipeline] }
[2022-04-11T17:08:07.257Z] [Pipeline] // withEnv
[2022-04-11T17:08:07.267Z] [Pipeline] }
[2022-04-11T17:08:07.267Z] Failed in branch OpenStack
[2022-04-11T17:08:09.652Z] + set -xeuo pipefail
[2022-04-11T17:08:09.652Z] ++ umask
[2022-04-11T17:08:09.652Z] + '[' 0000 = 0000 ']'
[2022-04-11T17:08:09.652Z] + umask 0022
[2022-04-11T17:08:09.652Z] + cosa buildextend-aliyun
[2022-04-11T17:08:09.652Z] **
[2022-04-11T17:08:09.652Z] ERROR:src/lib/rpmostree.c:122:rpm_ostree_get_basearch: assertion failed (local_error == NULL): Failed to invoke rpm-ostree shlib-backend: thread 'main' panicked at 'OS can't spawn a new worker thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', /builddir/build/BUILD/rpm-ostree-2022.5.80.gb7f91619/vendor/tokio/src/runtime/blocking/pool.rs:297:14
[2022-04-11T17:08:09.652Z] note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2022-04-11T17:08:09.652Z]  (g-io-error-quark, 0)
[2022-04-11T17:08:09.652Z] Bail out! ERROR:src/lib/rpmostree.c:122:rpm_ostree_get_basearch: assertion failed (local_error == NULL): Failed to invoke rpm-ostree shlib-backend: thread 'main' panicked at 'OS can't spawn a new worker thread: Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }', /builddir/build/BUILD/rpm-ostree-2022.5.80.gb7f91619/vendor/tokio/src/runtime/blocking/pool.rs:297:14
[2022-04-11T17:08:09.652Z] note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
[2022-04-11T17:08:09.652Z]  (g-io-error-quark, 0)
[2022-04-11T17:08:09.652Z] /home/jenkins/agent/workspace/build@tmp/durable-5c7742f2/script.sh: line 7: 23336 Aborted                 (core dumped) cosa buildextend-aliyun
[2022-04-11T17:08:09.659Z] [Pipeline] }
[2022-04-11T17:08:09.661Z] [Pipeline] // withEnv
[2022-04-11T17:08:09.665Z] [Pipeline] }
[2022-04-11T17:08:09.665Z] Failed in branch Aliyun
...

Hitting up against limits somehow? My first thought was we're not reaping zombies, but we do have:

[2022-04-11T16:16:27.711Z]   - name: coreos-assembler
[2022-04-11T16:16:27.711Z]     image: coreos-assembler:main
[2022-04-11T16:16:27.711Z]     imagePullPolicy: Always
[2022-04-11T16:16:27.711Z]     command:
[2022-04-11T16:16:27.711Z]     - /usr/bin/dumb-init
[2022-04-11T16:16:27.711Z]     args:
[2022-04-11T16:16:27.711Z]     - sleep
[2022-04-11T16:16:27.711Z]     - infinity

Let's keep a close eye on this and see if it happens again.
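
For the record, a quick way to check whether a pod is bumping into a pids cgroup limit (paths assume cgroup v1; under cgroup v2 the same pids.max/pids.current files live directly under the cgroup directory):

# limit and current usage of the pids controller for this container
cat /sys/fs/cgroup/pids/pids.max
cat /sys/fs/cgroup/pids/pids.current
# rough count of processes and threads visible inside the container
ps -eLf | wc -l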

jlebon changed the title from "Pipeline hit EAGAIN (Resource temporarily unavailable) on fork" to "Pipeline and CI sometimes hitting EAGAIN (Resource temporarily unavailable) on fork" Apr 27, 2022
jlebon transferred this issue from coreos/fedora-coreos-pipeline Apr 27, 2022
@jlebon
Member Author

jlebon commented Apr 27, 2022

We're seeing this also in the rpm-ostree CI: coreos/rpm-ostree#3639 (comment)

@cgwalters
Member

Isn't this just ye olde kubernetes-vs-nproc https://www.flamingspork.com/blog/2020/11/25/why-you-should-use-nproc-and-not-grep-proc-cpuinfo/ that we've patched in a few places, but probably not enough?

I think that, classically and still today, make -j and the Go runtime use cpuinfo and don't honor cpusets.
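
(Roughly, the distinction the linked post draws, if run inside a CPU-constrained container:)

# nproc honors the CPU affinity mask / cpuset the container is restricted to
nproc
# counting /proc/cpuinfo reports every CPU on the host, ignoring any limits
grep -c ^processor /proc/cpuinfo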

@lucab
Contributor

lucab commented Apr 27, 2022

I'm not sure if it is the same root cause, but Zincati CI was also endlessly deadlocking: coreos/zincati#758 (comment). It seems to be back on track for now.

jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this issue Apr 27, 2022
We're seeing issues where the pipeline sometimes hits EAGAIN on `fork`:

coreos/fedora-coreos-tracker#1186

This may have started happening more frequently after the recent
addition of VirtualBox.

Anyway, it's kinda silly to build that many images in parallel. There's
diminishing returns and eventually a performance cost. So let's do it in
two rounds instead. (Ideally instead it'd be a kind of task pool, but
let's not try to implement that in Groovy.)
jlebon added a commit to jlebon/rpm-ostree that referenced this issue Apr 27, 2022
We're seeing issues where CI sometimes hits EAGAIN on `fork`:

coreos/fedora-coreos-tracker#1186

It's possible something changed recently that made rpm-ostree use more
threads than before and more likely to hit this.

There's no way for us to control the PID limit from our side (it's a
cluster setting), so let's just lower parallelism here. (Another
approach is splitting it into multiple pods, but it doesn't seem worth
the complexity yet.)
@jlebon
Member Author

jlebon commented Apr 27, 2022

I think classically still today make -j and the Go runtime use cpuinfo and don't honor cpusets.

My recollection of that issue is that our pod CPU request wasn't matching the number of parallel threads we were spawning. But here it seems like we're hitting some PID limit, which is obviously related to how much parallelism we use, but is independent of the pod's CPU request.

But yeah, clearly one thing we could do is lower the parallelism.

OK did that in:

Hmm, I was reading https://kubernetes.io/docs/concepts/policy/pid-limiting/. I wonder if the Fedora cluster admins recently started using some of the PID limit switches mentioned there too. Will ask around.
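
(For reference, the knobs that page describes are all set on the kubelet side; the values below are purely illustrative:)

# per-pod PID limit, via the kubelet command line...
kubelet --pod-max-pids=4096
# ...or equivalently via the podPidsLimit field in the KubeletConfiguration file.
# Node-level reservations go through --system-reserved=pid=<N> and --kube-reserved=pid=<N>.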

cgwalters pushed a commit to coreos/rpm-ostree that referenced this issue Apr 27, 2022
We're seeing issues where CI sometimes hits EAGAIN on `fork`:

coreos/fedora-coreos-tracker#1186

It's possible something changed recently that made rpm-ostree use more
threads than before and more likely to hit this.

There's no way for us to control the PID limit from our side (it's a
node setting), so let's just lower parallelism here. (Another approach
is splitting it into multiple pods, but it doesn't seem worth the
complexity yet.)
jlebon added a commit to coreos/fedora-coreos-pipeline that referenced this issue Apr 27, 2022
We're seeing issues where the pipeline sometimes hits EAGAIN on `fork`:

coreos/fedora-coreos-tracker#1186

This may have started happening more frequently after the recent
addition of VirtualBox.

Anyway, it's kinda silly to build that many images in parallel. There's
diminishing returns and eventually a performance cost. So let's do it in
two rounds instead. (Ideally instead it'd be a kind of task pool, but
let's not try to implement that in Groovy.)
@jlebon
Member Author

jlebon commented Apr 28, 2022

OK, cosa also hitting this now:

[2022-04-27T01:44:46.091Z] [logs:build/coreos-ci-coreos-assembler-041cd6cc-2780-4c0b-86a3-693718e25127-1] runtime: failed to create new OS thread (have 12 already; errno=11)
[2022-04-27T01:44:46.091Z] [logs:build/coreos-ci-coreos-assembler-041cd6cc-2780-4c0b-86a3-693718e25127-1] runtime: may need to increase max user processes (ulimit -u)
[2022-04-27T01:44:46.091Z] [logs:build/coreos-ci-coreos-assembler-041cd6cc-2780-4c0b-86a3-693718e25127-1] fatal error: newosproc

This bit:

have 12 already

makes me think it's very unlikely we're hitting the limit at the pod level, but instead either at the configured kubelet level or at the actual host level.
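
A rough sketch of what to compare across those levels:

# inside the pod: the per-user process/thread limit the Go runtime complained about
ulimit -u
# on the node itself (host access required): global ceilings that could also be in play
sysctl kernel.pid_max kernel.threads-max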

@davidkirwan

We know how to increase the limits; we can create a MachineConfig. But we're wondering: if there is some issue in the pipeline, increasing the limit may not be the right fix here. Is there a way to set the limit within Jenkins instead?

@cgwalters
Member

I think we do need to try to limit parallelism inside our workloads, which... I don't think Jenkins can help with, but chasing down the things that are trying to create too many CPU threads would.
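
(For anything make-based, one minimal mitigation, assuming GNU make and coreutils nproc:)

# cap parallelism at the CPUs the container is actually allowed to use
make -j"$(nproc)"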

We also have the generic problem that we have absolutely no prioritization going on - when e.g. dependabot appears on Monday it creates a storm of PRs which will compete for resources with the production FCOS jobs, AFAIK.

Perhaps another approach is to use Kata containers by default - that inherently creates a small VM, which I think will limit what appears to be the "physical" CPU count, so things like the Go runtime will only see e.g. 4 processors and not 64 or whatever the host has.

@jlebon
Member Author

jlebon commented Apr 29, 2022

OK this is super confusing. We were investigating this a bit yesterday with @darknao, who was monitoring /sys/fs/cgroup/pids/pids.current in a pod building cosa (i.e. trying to reproduce #1186 (comment)), and we saw it go as high as 600, which seems highly suspect to me.

Today I wanted to dig in more and so pushed

diff --git a/build.sh b/build.sh
index 0629db4a5..003d08fd7 100755
--- a/build.sh
+++ b/build.sh
@@ -96,7 +96,11 @@ install_ocp_tools() {
 }

 make_and_makeinstall() {
+    while sleep 1; do ps -eLf | wc -l | xargs echo threads:; done &
+    local pid=$!
+    export GODEBUG=schedtrace=1000
     make && make install
+    kill $pid
 }

 configure_user(){

and ran a build through CoreOS CI (https://jenkins-coreos-ci.apps.ocp.fedoraproject.org/job/coreos-assembler/job/try/1/console).

At its peak, I got:

12:09:30  [logs:build/coreos-ci-coreos-assembler-36b152c1-813e-4bef-aabf-ae3a7adad503-1] threads: 74
12:09:30  [logs:build/coreos-ci-coreos-assembler-36b152c1-813e-4bef-aabf-ae3a7adad503-1] SCHED 11050ms: gomaxprocs=8 idleprocs=8 threads=22 spinningthreads=0 idlethreads=8 runqueue=0 [0 0 0 0 0 0 0 0]
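
(For reference, the schedtrace output can be reproduced against any Go program; the fields read roughly as follows:)

# GODEBUG=schedtrace=1000 makes the Go runtime print a scheduler summary to stderr
# roughly every 1000ms: gomaxprocs is the GOMAXPROCS value in effect, threads is
# the number of OS threads created so far, idlethreads are parked threads, runqueue
# is the global run queue length, and the trailing list is each P's local run queue.
GODEBUG=schedtrace=1000 ./some-go-binary   # hypothetical binary name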

So gomaxprocs is set to 8, which is not outrageous. I'm not sure if that matches the node nprocs (seems unlikely, all nodes have way more than 8 cores, right @davidkirwan?), or if there's some cgroup-aware smarts enabled here.

But also, the total number of threads was 74, which is a far cry from 600, and even farther below the default 1024 limit.

Hmm, I wonder if this is specific to some nodes only somehow, which would help explain why we don't hit this more often.

@davidkirwan Do all the nodes have the same specs?

@davidkirwan

davidkirwan commented Apr 29, 2022

@jlebon nope, it's a bit of a hodgepodge.

We have the following machines, listed by physical core count (each core can run 2-4 threads; it varies):
2 nodes with 48 cores
1 node with 8 cores
3 nodes with 16 cores
