Calculate parallel jobs based on available CPUs
We're going to enable new ARM64 runners in CI, where test-run is invoked
in a Docker container. At the same time, the runner is an LXD container
created by the service provider. In these circumstances, the Docker
container sees all the 128 online CPUs, but the runner may have
only some of them available (depending on the pricing plan).

Python 3.3+ has a function to determine available CPUs, so we can reduce
the parallelism to this value. We fall back to the online CPU count on
Python < 3.3.
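
To illustrate the difference between the two counts, here is a minimal sketch that compares them. The helper name `available_cpus` is hypothetical and not part of test-run; the sketch assumes only the standard `os` and `multiprocessing` modules.

```python
import multiprocessing
import os


def available_cpus():
    """Return the number of CPUs the current process may run on."""
    # os.sched_getaffinity exists on Python 3.3+ and only on platforms
    # that support it (for example, Linux). Elsewhere, fall back to the
    # count of online CPUs.
    getaffinity = getattr(os, "sched_getaffinity", None)
    if getaffinity is not None:
        return len(getaffinity(0))
    return multiprocessing.cpu_count()


online = multiprocessing.cpu_count()
available = available_cpus()
print("online: %d, available: %d" % (online, available))
```

In a restricted container the second number is expected to be the smaller one; on an unrestricted machine both are usually equal.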

After this change, test-run follows a CPU affinity mask set by `taskset`
(and, I guess, by `numactl`).

The change is similar to replacing `nproc --all` with `nproc`.
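
The effect of an affinity mask can be reproduced from Python without `taskset`. This is a rough sketch, assuming a platform that provides `os.sched_setaffinity` (such as Linux). The function name `count_with_single_cpu` is made up for the illustration.

```python
import os


def count_with_single_cpu():
    """Pin the process to one CPU and report how many CPUs it then sees.

    Returns None when the platform has no affinity API.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)
    try:
        # Narrow the mask to a single CPU from the currently allowed set,
        # roughly what `taskset -c <cpu>` would do for a new process.
        os.sched_setaffinity(0, {min(original)})
        return len(os.sched_getaffinity(0))
    finally:
        # Restore the original mask so the rest of the program is unaffected.
        os.sched_setaffinity(0, original)


print(count_with_single_cpu())
```

With the mask narrowed, an affinity-based count follows it, while `multiprocessing.cpu_count()` would keep reporting the online CPUs.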

The change only affects the default behavior, which can be overridden
by passing the `--jobs` (or `-j`) CLI option or using the
`TEST_RUN_JOBS` environment variable.
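
How the default interacts with the overrides can be sketched as below. The function `resolve_jobs` and its parameters are hypothetical. The precedence (CLI option first, then the environment variable, then the default) is an assumption, not something this commit states. The `2 * CPUs` default is taken from `main_loop_parallel()`.

```python
def resolve_jobs(cli_jobs, environ, available_cpus):
    """Pick the number of parallel workers.

    Assumed precedence: explicit CLI value, then TEST_RUN_JOBS,
    then twice the available CPU count.
    """
    if cli_jobs is not None and cli_jobs >= 1:
        return cli_jobs
    try:
        env_jobs = int(environ.get("TEST_RUN_JOBS", ""))
    except ValueError:
        # Unset or non-numeric value: treat as "not specified".
        env_jobs = 0
    if env_jobs >= 1:
        return env_jobs
    return 2 * available_cpus


print(resolve_jobs(None, {}, 4))
print(resolve_jobs(3, {"TEST_RUN_JOBS": "5"}, 4))
```

Only the last branch is affected by this change: the multiplier is applied to the available CPUs rather than to all online ones.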

Reported in tarantool/tarantool#10102 (see a
discussion thread).
Totktonada committed Jun 8, 2024
1 parent 84ebae5 commit bec97bf
Showing 2 changed files with 28 additions and 1 deletion.
26 changes: 26 additions & 0 deletions lib/utils.py
@@ -8,6 +8,7 @@
 import time
 import json
 import subprocess
+import multiprocessing
 from lib.colorer import color_stdout

 try:
@@ -31,6 +32,12 @@
     # Python 2.7.
     get_terminal_size = None

+try:
+    # Python 3.3+
+    from os import sched_getaffinity
+except ImportError:
+    sched_getaffinity = None
+
 UNIX_SOCKET_LEN_LIMIT = 107

 # Useful for very coarse version differentiation.
@@ -384,3 +391,22 @@ def terminal_columns():
     if get_terminal_size:
         return get_terminal_size().columns
     return 80
+
+
+def cpu_count():
+    """
+    Return the CPU count available for the current process.
+
+    The result is the same as one from the `nproc` command.
+
+    It may be smaller than the count of all online CPUs. For example,
+    an LXD container may have limited available CPUs or it may be
+    reduced by `taskset` or `numactl` commands.
+
+    If it is impossible to determine the available CPU count (for
+    example on Python < 3.3), fall back to the count of all online
+    CPUs.
+    """
+    if sched_getaffinity:
+        return len(sched_getaffinity(0))
+    return multiprocessing.cpu_count()
3 changes: 2 additions & 1 deletion test-run.py
@@ -58,6 +58,7 @@
 from lib.colorer import color_stdout
 from lib.colorer import separator
 from lib.colorer import test_line
+from lib.utils import cpu_count
 from lib.utils import find_tags
 from lib.utils import shlex_quote
 from lib.error import TestRunInitError
@@ -86,7 +87,7 @@ def main_loop_parallel():
     jobs = args.jobs
     if jobs < 1:
         # faster result I got was with 2 * cpu_count
-        jobs = 2 * multiprocessing.cpu_count()
+        jobs = 2 * cpu_count()

     if jobs > 0:
         color_stdout("Running in parallel with %d workers\n\n" % jobs,