Description
The current implementation of `-n auto` does not work as hoped in high-performance computing environments, where each job is assigned a limited number of cores it may use. For example, when a PyTest job that requests 9 cores in its job submission runs on a compute node with 36 cores, pytest-xdist starts 36 worker processes. Ideally, it should use 9.
This is related to the logic in `def pytest_xdist_auto_num_workers(...)` in `src/xdist/plugin.py`. This function first tries the `psutil` package and only falls back to `os.sched_getaffinity`, which gives the correct number, when `psutil` is unavailable. So if `psutil` happens to be installed (which is difficult to avoid), the autodetection does not produce the best answer.
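For illustration, here is a minimal sketch (assuming Linux, where `os.sched_setaffinity` is available) of why affinity-aware detection matters: restricting the current process to a subset of CPUs changes what `os.sched_getaffinity` reports, while `os.cpu_count` keeps reporting the whole machine.

```python
import os

# Pretend the queueing system gave this process only 2 CPUs
# by shrinking its own affinity mask (Linux-only).
total = os.cpu_count()
subset = set(range(min(2, total)))
os.sched_setaffinity(0, subset)

print(len(os.sched_getaffinity(0)))  # respects the affinity mask
print(os.cpu_count())                # still reports every CPU on the node
```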
For your information, in the scenario sketched above, these are the results of the various functions that report the number of CPU cores:

```python
>>> len(os.sched_getaffinity(0))
9
>>> len(psutil.Process().cpu_affinity())
9
>>> psutil.cpu_count(logical=True)
36
>>> psutil.cpu_count(logical=False)
36
>>> os.cpu_count()
36
>>> multiprocessing.cpu_count()
36
```
The function `os.sched_getaffinity` was introduced in Python 3.3, older than the oldest Python version supported by pytest-xdist. As far as I understand, however, this function is not available on all platforms. (Unclear to me; I cannot test on other OSes.) According to its documentation, `len(psutil.Process().cpu_affinity())` should work at least on Linux and Windows. There may still be a need to fall back to other functions; trying them in the order listed above seems reasonable.
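As a sketch of the suggestion, the fallback chain could look roughly like this. This is my assumption of how the detection might be written, not the actual pytest-xdist code; the helper name is made up.

```python
import os

def auto_num_workers():
    """Hypothetical worker-count detection, trying affinity-aware
    functions before total-core counts."""
    # 1. Linux: respects the affinity mask set by the queueing system.
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        pass
    # 2. psutil, if installed: cpu_affinity() is affinity-aware on
    #    Linux/Windows; fall back to its total logical-core count.
    try:
        import psutil
        proc = psutil.Process()
        if hasattr(proc, "cpu_affinity"):
            return len(proc.cpu_affinity())
        return psutil.cpu_count(logical=True) or 1
    except ImportError:
        pass
    # 3. Last resort: total core count of the machine.
    return os.cpu_count() or 1
```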
This suggestion may interfere with the option `config.option.numprocesses`. In compute environments, that option is not very relevant because the number of cores is managed by the queueing system. (Also, hyperthreading is often disabled in such scenarios because it degrades raw compute performance; it mainly helps for I/O-bound workloads.)