BUG: Importing numpy interferes with signal masking #23170

Open
SeppMe opened this issue Feb 7, 2023 · 1 comment

SeppMe commented Feb 7, 2023

Describe the issue:

Problem

If numpy is imported before a POSIX signal is blocked by setting a signal mask, the signal is not actually blocked. If, however, numpy is imported after the signal is blocked, blocking works as intended.

Steps to reproduce

# Code demonstrating the issue
import numpy
import os
import signal
import time

signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
for i in range(200):
    print(os.getpid(), signal.sigpending())
    time.sleep(2)
  1. Run the script
  2. While the script is still running, send SIGUSR1 to your Python process (kill -s SIGUSR1 {the shown pid})

Expectation: The output should change from "7654321 set()" to "7654321 {<Signals.SIGUSR1: 10>}" (if the pid is 7654321) because the signal is blocked but pending
Actual result: The Python process is terminated with the message "User defined signal 1", because the default signal handler is triggered (which terminates the Python process in case of an uncaught SIGUSR1)

The problem does not arise if the numpy import occurs after the signal block:

# Code without the issue
import signal
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
import numpy
import os
import time

for i in range(200):
    print(os.getpid(), signal.sigpending())
    time.sleep(2)

Steps: See above
Expectation: See above
Result: The expectation is met!

More observations

  • The issue was reproduced on multiple machines using multiple different numpy versions. Oldest tested version was 1.16.4, newest was 1.24.1
  • The same behavior also occurs if a custom signal handler is installed. The custom handler will execute immediately despite the signal being blocked.
  • This behavior can be reproduced for other blockable signals as well.
  • Interaction with multiprocessing: If the process is forked after the numpy import, the problem does not occur in either process.
# Multiprocessing demo. Code does not show the issue despite the early numpy import
import numpy
import signal
import time
import os

pid = os.fork()
if pid > 0:
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})

for i in range(200):
    print(os.getpid(), signal.sigpending())
    time.sleep(2)

Send SIGUSR1 to either process; the signal will correctly show as pending.
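The observations above would be consistent with ordinary POSIX threading semantics: pthread_sigmask only changes the mask of the calling thread, new threads inherit the mask of their creator, and a signal sent to the process may be delivered to any thread that does not block it. Whether numpy (or a library it loads) actually starts threads at import time is an assumption here and is not confirmed in this report. The sketch below reproduces the same symptom without numpy by starting a plain Python thread before the mask is set. The names CHILD and run_child are made up for illustration.

```python
import signal
import subprocess
import sys
import textwrap

# Child script. In "thread-first" mode a helper thread is started BEFORE the
# mask is set, as a stand-in for a library that spawns threads on import
# (an assumption, not something verified for numpy here).
CHILD = textwrap.dedent("""
    import os, signal, sys, threading, time
    if sys.argv[1] == "thread-first":
        threading.Thread(target=time.sleep, args=(30,), daemon=True).start()
    # Blocks SIGUSR1 in the calling (main) thread only.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    os.kill(os.getpid(), signal.SIGUSR1)
    time.sleep(0.5)
    # Reached only if no other thread accepted the signal.
    sys.exit(0 if signal.SIGUSR1 in signal.sigpending() else 1)
""")


def run_child(mode):
    """Run the child script in a fresh interpreter and return its exit status."""
    return subprocess.run([sys.executable, "-c", CHILD, mode], timeout=60).returncode


if __name__ == "__main__":
    for mode in ("thread-first", "mask-first"):
        print(mode, run_child(mode))
```

On Linux, the "thread-first" run should end with a negative return code (killed by SIGUSR1), while the "mask-first" run should exit with 0 because the signal stays pending.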

Reproduce the code example:

import numpy
import os
import signal
import time

signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
for i in range(200):
    print(os.getpid(), signal.sigpending())
    time.sleep(2)

# 1. Run the script
# 2. While the script is still running, send SIGUSR1 to your Python process (`kill -s SIGUSR1 {the shown pid}`)

Error message:

No response

Runtime information:

Newest tested system

1.24.1
3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0]

[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Haswell',
  'filepath': '/var/SP/anaconda3/envs/env_3.10.8_e41a7bcc_e69a1b96/lib/libopenblasp-r0.3.21.so',
  'internal_api': 'openblas',
  'num_threads': 20,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.21'}]

The problem could also be reproduced using older numpy versions on other computers, all showing the same behavior. The oldest tested version was:

1.16.4
3.7.3 | packaged by conda-forge | (default, Jul  1 2019, 21:52:21)
[GCC 7.3.0]

Context for the issue:

The issue prevents applications from using advanced inter-process signaling techniques, unless special care is taken with regard to import order.

@SeppMe SeppMe added the 00 - Bug label Feb 7, 2023
Member

seberg commented Feb 8, 2023

I can reproduce it, but things work if I run with export OMP_NUM_THREADS=1 (which would set the num_threads value in the openblas information output to 1). However, for me that only makes it work if I am not using IPython.
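For anyone who wants to try the same setting from inside a script rather than the shell, a minimal sketch follows. It assumes the variable has to be set before numpy is first imported in the process; the helper name request_single_blas_thread is hypothetical, and this is only a way to test the observation above, not a confirmed fix.

```python
import os
import sys


def request_single_blas_thread():
    """Set OMP_NUM_THREADS=1 for this process.

    Returns False if numpy was already imported, in which case the setting
    probably comes too late to have any effect (assumption).
    """
    already_imported = "numpy" in sys.modules
    os.environ["OMP_NUM_THREADS"] = "1"
    return not already_imported


if __name__ == "__main__":
    in_time = request_single_blas_thread()
    print("set before numpy import:", in_time)
    # After this point one would `import numpy` and then set the signal mask.
```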

I don't really think NumPy does any messing with signals. Is there a way to track down who is modifying the signal handler (even if just by running in gdb/lldb with the correct breakpoint)? I really doubt it is within NumPy itself, but it might be nice to see if there is an obvious solution.
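One cheap check before reaching for a debugger, on Linux only, is to look at the per-thread signal masks that the kernel exposes in /proc/self/task/<tid>/status (the SigBlk field is a hex bitmask in which bit signum-1 represents the signal). The sketch below lists the threads of the current process that do not block a given signal. The function name threads_not_blocking is made up for illustration, and it assumes a Linux /proc filesystem.

```python
import os
import signal


def threads_not_blocking(signum):
    """Return native ids of threads in this process that do not block signum.

    Linux only: reads the SigBlk field of /proc/self/task/<tid>/status.
    """
    bit = 1 << (signum - 1)
    unblocked = []
    for tid in os.listdir("/proc/self/task"):
        try:
            with open(f"/proc/self/task/{tid}/status") as status:
                for line in status:
                    if line.startswith("SigBlk:"):
                        mask = int(line.split()[1], 16)
                        if not mask & bit:
                            unblocked.append(int(tid))
                        break
        except FileNotFoundError:
            # The thread exited between listing and reading; skip it.
            continue
    return sorted(unblocked)


if __name__ == "__main__":
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    print("threads that can still receive SIGUSR1:",
          threads_not_blocking(signal.SIGUSR1))
```

Running this after `import numpy` and the pthread_sigmask call from the report would show whether any extra threads exist that still accept SIGUSR1. An empty list would point away from the threading explanation.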
