Closed
Hey, I'm still seeing segfaults when calling numpy.matmul on two large matrices (NumPy v1.20.1, OpenBLAS v0.3.13.dev).
This looks potentially related to #2728.
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fd34c490fa3, pid=1, tid=0x00007fd34f134740
#
# JRE version: OpenJDK Runtime Environment (8.0_242-b08) (build 1.8.0_242-b08)
# Java VM: OpenJDK 64-Bit Server VM (25.242-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libopenblasp-r0-5bebc122.3.13.dev.so+0xe77fa3] dgemm_oncopy_HASWELL+0x193
The stack trace points to a line that does something like np.matmul(samples, samples.T).
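For anyone trying to reproduce this, here is a minimal sketch of the call pattern described above. The helper names `make_samples` and `gram_matrix` are hypothetical, and the matrix sizes are assumptions: the report only says the matrices were "big", so the real shape and contents are unknown.

```python
import numpy as np


def make_samples(n_rows, n_cols, seed=0):
    """Build a random float64 matrix (hypothetical stand-in for the real data).

    float64 is chosen because the crash frame names a double-precision kernel.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_rows, n_cols))


def gram_matrix(samples):
    """Multiply a matrix by its own transpose, as in the reported line."""
    return np.matmul(samples, samples.T)
```

Calling `gram_matrix(make_samples(n_rows, n_cols))` with sizes close to the real workload should exercise the same code path. Whether it actually crashes will depend on the OpenBLAS build, the CPU, and the container limits.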
I was running the code in a Docker container (enterprise environment), where NumPy was installed using pip.
Here's the spec of the compute cluster, from which 6 CPUs were allocated to the container.

The output of threadpool_info() from threadpoolctl is shown below. It confirms that NumPy loaded OpenBLAS v0.3.13.dev and that num_threads was correctly detected as 6.
[{'filepath': '/job/.local/lib/python3.7/site-packages/numpy.libs/libopenblasp-r0-5bebc122.3.13.dev.so',
  'internal_api': 'openblas',
  'num_threads': 6,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.13.dev'},
 {'filepath': '/job/.local/lib/python3.7/site-packages/torch/lib/libgomp-7c85b1e2.so.1',
  'internal_api': 'openmp',
  'num_threads': 6,
  'prefix': 'libgomp',
  'user_api': 'openmp',
  'version': None},
 {'filepath': '/job/.local/lib/python3.7/site-packages/scipy.libs/libopenblasp-r0-085ca80a.3.9.so',
  'internal_api': 'openblas',
  'num_threads': 6,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.9'}]
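In case it helps with narrowing this down, here is a rough sketch of how the information above can be collected and how the BLAS thread count can be capped for a single call using threadpoolctl. The helper names `blas_libraries` and `matmul_single_threaded` are hypothetical. Capping the threads is only a diagnostic step; nothing in this report establishes that it avoids the crash.

```python
import numpy as np

try:
    from threadpoolctl import threadpool_info, threadpool_limits
except ImportError:  # threadpoolctl is an optional third-party package
    threadpool_info = threadpool_limits = None


def blas_libraries():
    """Return (version, num_threads, filepath) for each loaded BLAS library."""
    if threadpool_info is None:
        return []
    return [
        (lib.get("version"), lib.get("num_threads"), lib.get("filepath"))
        for lib in threadpool_info()
        if lib.get("user_api") == "blas"
    ]


def matmul_single_threaded(a, b):
    """Run np.matmul with BLAS limited to one thread (diagnostic only)."""
    if threadpool_limits is None:
        # Without threadpoolctl the thread count cannot be capped here.
        return np.matmul(a, b)
    with threadpool_limits(limits=1, user_api="blas"):
        return np.matmul(a, b)
```

Comparing a run of `matmul_single_threaded(samples, samples.T)` against the plain `np.matmul` call would show whether the segfault depends on the multithreaded code path.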
Let me know if you need any other information!