Please make sure these conditions are met
What happened?
Probably related to #3507. Posting it again since I could not find an existing solution. The workaround is to manually set NUMBA_THREADING_LAYER=workqueue.
Problem: When scanpy and torch are both loaded, sc.pp.normalize_total can crash (segmentation fault) with the warning:
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
import scanpy as sc
import torch
from scipy.sparse import csr_matrix
X = csr_matrix(torch.eye(100).numpy())
ad = sc.AnnData(X=X)
sc.pp.normalize_total(ad, target_sum=1e4)
Densifying ad.X before running sc.pp.normalize_total avoids the crash. Also, importing only numpy triggers the same OMP message but does not lead to a segfault.
Root cause: Numba's @njit with numba.prange (parallel execution) in _normalize_csr. Numba's default threading layer on macOS with Apple Silicon is omp (LLVM OpenMP). There, Numba 0.64.0 with the OpenMP threading layer emits the OMP warning (omp_set_nested deprecated). When torch is also imported, this somehow escalates to a segfault.
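For reference, the per-row scaling that _normalize_csr computes can be sketched serially in plain NumPy/SciPy, without the Numba parallel kernel. The function name and the zero-count handling below are illustrative assumptions, not scanpy's actual implementation:

```python
import numpy as np
from scipy.sparse import csr_matrix

def normalize_rows(X, target_sum=1e4):
    """Scale each row of a CSR matrix so it sums to target_sum.

    Serial sketch of what scanpy's Numba-parallel kernel computes;
    rows with zero counts are left as-is.
    """
    counts = np.asarray(X.sum(axis=1)).ravel()
    counts[counts == 0] = 1.0  # avoid division by zero for empty rows
    return X.multiply(target_sum / counts[:, None]).tocsr()

X = csr_matrix(np.eye(4))
Xn = normalize_rows(X, target_sum=10.0)
```

Since this path never enters a Numba-compiled parallel region, it sidesteps the threading layer entirely, which is consistent with densifying (or otherwise avoiding the parallel kernel) rescuing the crash.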
A potential culprit is libomp.dylib. sklearn bundles libomp.dylib with install name /DLC/sklearn/.dylibs/libomp.dylib, while torch ships its own with install name /opt/llvm-openmp/lib/libomp.dylib. Both may get loaded as separate OpenMP runtimes, and the LLVM OpenMP runtime can crash when it detects a duplicate.
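To check whether an environment contains multiple bundled OpenMP copies, a hypothetical diagnostic like this (not part of scanpy) can scan site-packages for libomp dylibs:

```python
import glob
import os
import site

def find_bundled_libomp():
    """Return paths of libomp*.dylib copies bundled inside installed packages."""
    hits = []
    for base in site.getsitepackages():
        pattern = os.path.join(base, "**", "libomp*.dylib")
        hits.extend(glob.glob(pattern, recursive=True))
    return hits

for path in find_bundled_libomp():
    print(path)
```

If this prints more than one path (e.g. one under sklearn's .dylibs and one under torch's), two OpenMP runtimes can end up loaded into the same process.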
Potential fix: set NUMBA_THREADING_LAYER=workqueue, which makes Numba use its own platform-agnostic thread pool (no OpenMP dependency):
export NUMBA_THREADING_LAYER=workqueue
Or, alternatively, set it in Python before importing scanpy or torch:
import os; os.environ['NUMBA_THREADING_LAYER'] = 'workqueue'
import scanpy as sc
import torch
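Note that the override only takes effect if it is set before Numba is first imported (directly, or indirectly via scanpy), since Numba reads its configuration environment variables at import time. A minimal sketch:

```python
import os

# Must run before the first `import numba` / `import scanpy`:
# Numba reads NUMBA_THREADING_LAYER when it is imported.
os.environ["NUMBA_THREADING_LAYER"] = "workqueue"

# After executing any @njit(parallel=True) function, the active layer
# can be confirmed with numba.threading_layer(), which should then
# report 'workqueue'.
```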
Minimal code sample
conda create -n scanpy_test python=3.12 r-base
conda activate scanpy_test
pip install scanpy==1.12 torch==2.11
The r-base installation is, for some reason, required to reproduce the behavior.
import scanpy as sc
import torch
from scipy.sparse import csr_matrix
X = csr_matrix(torch.eye(100).numpy())
ad = sc.AnnData(X=X)
sc.pp.normalize_total(ad, target_sum=1e4)
Error output
OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
[1] 72347 segmentation fault python
Versions
Details
scanpy 1.12
---- ----
packaging 26.0
donfig 0.8.1.post1
typing_extensions 4.15.0
joblib 1.5.3
anndata 0.12.10
kiwisolver 1.5.0
h5py 3.16.0
numpy 2.4.4
PyYAML 6.0.3
session-info2 0.4
natsort 8.4.0
cycler 0.12.1
legacy-api-wrap 1.5
six 1.17.0
threadpoolctl 3.6.0
fast-array-utils 1.4
scipy 1.17.1
python-dateutil 2.9.0.post0
pillow 12.1.1
numba 0.64.0
numcodecs 0.16.5
scikit-learn 1.8.0
setuptools 81.0.0
fsspec 2026.3.0
llvmlite 0.46.0
pandas 2.3.3
matplotlib 3.10.8
google-crc32c 1.8.0
pyparsing 3.3.2
pytz 2026.1.post1
zarr 3.1.6
---- ----
Python 3.12.13 | packaged by conda-forge | (main, Mar 5 2026, 17:06:14) [Clang 19.1.7 ]
OS macOS-26.4-arm64-arm-64bit
CPU 8 logical CPU cores, arm
GPU No GPU found
Updated 2026-04-01 02:04