Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Importing the astropy.table module prevents from setting a multiprocessing start method (caused by numcodecs in io.fits) #16351

Open
mboquien opened this issue Apr 27, 2024 · 14 comments

Comments

@mboquien
Copy link

Description

Importing the astropy.table triggers an exception if one uses multiprocessing.set_start_method() afterwards

Expected behavior

Importing the astropy.table does not trigger an exception if one uses multiprocessing.set_start_method() afterwards

How to Reproduce

The following code reproduces the issue.

import multiprocessing as mp
import astropy.table

mp.set_start_method('spawn')

It generates the following backtrace

Traceback (most recent call last):
  File "/home/mederic/scratch/start_method/test.py", line 4, in <module>
    mp.set_start_method('spawn')
  File "/usr/lib/python3.12/multiprocessing/context.py", line 247, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set

Versions

Linux-6.8.7-arch1-1-x86_64-with-glibc2.39
Python 3.12.3 (main, Apr 23 2024, 09:16:07) [GCC 13.2.1 20240417]
astropy 6.0.0
Numpy 1.26.4
pyerfa 2.0.1.1
Scipy 1.13.0
Matplotlib 3.8.3
@mboquien mboquien added the Bug label Apr 27, 2024
@mhvk
Copy link
Contributor

mhvk commented Apr 28, 2024

That's weird - I cannot reproduce this (on Debian, astropy 6.0.0, python 3.11.8; rest shouldn't matter). Could it be something with python 3.12?

@mboquien
Copy link
Author

In my case switching python 3.12 definitely triggers the bug. However I have been receiving sporadic reports of this issue with my code. While I was not always provided with the full information, one of the reporters had it on Python 3.11.8 on the win-amd64 platform.

@neutrinoceros
Copy link
Contributor

I couldn't reproduce either on macOS with Python 3.12.2 + astropy 7.0dev

@astrofrog
Copy link
Member

astrofrog commented Apr 28, 2024

@mboquien just out of curiosity do you have dask installed? What is the output of 'pip freeze'? (Just to see what other packages might matter)

@mboquien
Copy link
Author

Dask was installed but removing it does not solve the issue. To follow the idea, I have also removed many of the optional astropy packages, to no avail. I attach the list of packages if it can help.

@mboquien
Copy link
Author

I went down the rabbit hole a little more and I believe I have found the culprit in the form of the numcodecs package.

Now I am unsure whether the issue is with numcodecs itself or in the way astropy uses it.

@mhvk
Copy link
Contributor

mhvk commented Apr 28, 2024

Now I am unsure whether the issue is with numcodecs itself or in the way astropy uses it.

I don't think astropy uses it, not directly at least.
...
Just tries installing astropy in a clean environment, and I don't see numcodecs among the packages, not even when I do pip install astropy[all].

@mboquien
Copy link
Author

I looked at that with git grep and Astropy uses it optionally.

@mhvk
Copy link
Contributor

mhvk commented Apr 28, 2024

Ah, I'm silly, I did my git grep in an older branch I happened to be on... But at least we now know the culprit and can understand why @neutrinoceros and I could not reproduce it. One lesson perhaps is that we should not try and use packages that are not also listed in our dependencies. And if things really work without them, perhaps it is better just not to use them at all?

Anyway, back on topic: it would now seem that the bug is with numcodecs rather than with astropy, since in a clean environment with just numcodecs installed, I can reproduce your error:

>>> import multiprocessing as mp
>>> import numcodecs
>>> mp.set_start_method('spawn')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.11/multiprocessing/context.py", line 247, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set

@mboquien
Copy link
Author

Indeed, importing numcodecs directly triggers the issue, which I have reported to them. Hopefully it will be addressed soon. In the meantime I know what to do to work around the issue. This being said, as a former Astropy packager for Archlinux, I think numcodecs should officially be listed as an optional dependency. Hidden dependencies can make issues more difficult to debug. In any case, thank you very much for your help!

@neutrinoceros
Copy link
Contributor

Thanks for giving us proper motivation to go hunting for rogue optional dependencies in astropy, I've been meaning to do this systematically but lacked a concret reason to !

@astrofrog
Copy link
Member

Maybe worth trying out https://github.com/tweag/FawltyDeps !

@mhvk
Copy link
Contributor

mhvk commented Apr 28, 2024

@astrofrog - the particular use of numcodecs probably needs a bit of thought. From the comment, this was added to help people who use codecs elsewhere, which I think is a very good idea. But I think we do need to document and test it - and perhaps make it an explicit opt-in so we do not have to add it to astropy's dependencies - I'm not sure that we want dependencies where nothing we do depends on them, but rather we add functionality; that seems the wrong way around.

@pllim pllim changed the title Importing the astropy.table module prevents from setting a multiprocessing start method Importing the astropy.table module prevents from setting a multiprocessing start method (caused by numcodecs in io.fits) Apr 29, 2024
@pllim
Copy link
Member

pllim commented Apr 29, 2024

Re:

# If numcodecs is installed, we use Codec as a base class for the codecs below
# so that they can optionally be used as codecs in any package relying on
# numcodecs - however this is optional and if numcodecs is not installed we use
# our own base class. This does not affect any compressed data functionality
# in astropy.io.fits.
try:
from numcodecs.abc import Codec
except ImportError:
class Codec:
codec_id = None

Is it worth adding an option for user to disable numcodecs? 🤷‍♀️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants