Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Importing pyfftw and diplib in the same process can lead to segfaults #130

Closed
anntzer opened this issue Jul 12, 2023 · 19 comments
Closed

Importing pyfftw and diplib in the same process can lead to segfaults #130

anntzer opened this issue Jul 12, 2023 · 19 comments
Labels
bug component:PyDIP About the PyDIP Python interface

Comments

@anntzer
Copy link
Contributor

anntzer commented Jul 12, 2023

Component
PyDIP 3.4.0.

Describe the bug
Importing pyfftw and diplib in the same process, and having diplib try to access its own internal fftw library, can lead to segfaults.

To Reproduce

$ python -c 'import pyfftw, numpy as np, diplib as dip; dip.GaussFT(np.zeros((139, 217)), 3, [1, 1])'
Fatal Python error: Segmentation fault

(the array size above is just some awkward numbers to trigger the use of fftw)

System information:

  • What OS do you have? macSO 13.4.1
  • If PyDIP, what version of Python do you have? 3.11
  • If DIPimage, what version of MATLAB do you have? (type ver at the MATLAB command prompt, and copy-paste the result here). N/A
  • If compiled from sources, what compiler and version did you use? N/A

Note: pyFFTW was installed from their git HEAD (pip install git+https://github.com/pyFFTW/pyFFTW) as their latest release does not support py3.11 AFAICT.

I suspect we end up with two copies of fftw with incompatible ABIs which step onto another...

@anntzer anntzer added the bug label Jul 12, 2023
@crisluengo
Copy link
Member

This is interesting. You're using the version of DIPlib on PyPI? It doesn't use FFTW at all! I don't understand what could be causing this. I would expect things like this if they both dynamically link to a different version of FFTW, which you could fix by ensuring that both use the same dynamic library.

I will have to build pyFFTW myself to test this, since there's no version for Mac aarch64 on PyPI.

@crisluengo crisluengo added the component:PyDIP About the PyDIP Python interface label Jul 12, 2023
@anntzer
Copy link
Contributor Author

anntzer commented Jul 12, 2023

Yes, this is with pypi's diplib. For example, I can repro this by creating a clean conda environment conda create -n testenv python=3.11 fftw (configured to use conda-forge), activating it, and then installing everything from pypi (pip install diplib git+https://github.com/pyFFTW/pyFFTW).

@wcaarls
Copy link
Member

wcaarls commented Jul 12, 2023

Following these steps

conda create -n testenv python=3.11 fftw
conda activate testenv
pip install diplib git+https://github.com/pyFFTW/pyFFTW
python -c 'import pyfftw, numpy as np, diplib as dip; dip.GaussFT(np.zeros((139, 217)), 3, [1, 1])'

I do not get an error message on Ubuntu 22.04.2 LTS.

@emmenlau
Copy link

I just came across this thread at random, but wanted to share: even if not both are using FFTW, there can be incompatibilities with libraries that emulate FFTW. This is done, for example, by MKL (using the FFTW-Wrappers) and CUDA (to some extent). Does this help?

@crisluengo
Copy link
Member

crisluengo commented Jul 12, 2023

@emmenlau Thanks! If DIPlib doesn't link to FFTW, it uses PocketFFT, which is linked statically and is written in C++, so cannot have name clashes with C functions in FFTW.

BTW: NumPy and SciPy both also use PocketFFT, and I've never had an issue with either of them because they both statically link it.

@crisluengo
Copy link
Member

I have finally been able to build pyFFTW (for some reason, the include directory was confusing pip, so I had to move that directory -- I don't think this affects the resulting installed package though). I cannot reproduce this issue on my M1 Mac (Python 3.11.3, MacOS 12.6.7). So this is either exclusive to Intel Macs, or related to the MacOS 13 (unlikely), or related to Conda (which I don't use).

@anntzer Just to be sure, if you don't import pyfftw, it all works as expected?

@anntzer
Copy link
Contributor Author

anntzer commented Jul 12, 2023

@anntzer Just to be sure, if you don't import pyfftw, it all works as expected?

Yes, even importing pyfftw after diplib makes the issue go away.

related to Conda (which I don't use).

Ah, good point: doing this with homebrew's python 3.11 make the problem go away. Curious...

@crisluengo
Copy link
Member

doing this with homebrew's python 3.11 make the problem go away.

The plot thickens!

I will have to find some time to install Conda and seeing if I can reproduce this issue on my end.

Are you using the Conda version of DIPlib? Could that fix the incompatibility?

Someone here claims that it's not recommended to use pip install in a Conda environment, but doesn't explain why.

@anntzer
Copy link
Contributor Author

anntzer commented Jul 12, 2023

Are you using the Conda version of DIPlib?

AFAICT it is not available for M1 macs(?)

@crisluengo
Copy link
Member

crisluengo commented Jul 20, 2023

I installed miniconda, and tried installing pyFFTW in a conda environment. Nothing I tried has worked. Homebrew's FFTW is the wrong OS version (?), and it wouldn't find Conda's FFTW.

Anyway, if you have an FFTW that uses OpenMP, it might be a problem with two different OpenMP libraries loaded. DIPlib is distributed with its own copy of libomp.dylib.

Your example code loads FFTW first, meaning it loads whatever version of OpenMP it uses. DIPlib is loaded later and crashes when trying to use a function that uses OpenMP.

If you load the packages on the other order, it's DIPlib's version of OpenMP that is loaded, and the DIPlib function works fine. But does a pyFFTW function work in this case?

@anntzer
Copy link
Contributor Author

anntzer commented Jul 20, 2023

I installed miniconda, and tried installing pyFFTW in a conda environment. Nothing I tried has worked. Homebrew's FFTW is the wrong OS version (?), and it wouldn't find Conda's FFTW.

The following works for me:

C_INCLUDE_PATH=/opt/homebrew/include LIBRARY_PATH=/opt/homebrew/lib pip install git+https://github.com/rmorshea/pyFFTW@patch-1

(Normally C_INCLUDE_PATH=... LIBRARY_PATH=... pip install git+https://github.com/pyFFTW/pyFFTW would be enough (only pyFFTW HEAD is compatible with python 3.11 hence the install from git) but the very recent release of Cython 3 broke even that so you need to install from pyFFTW/pyFFTW#359.)

If it doesn't (and you have homebrew fftw installed), can you post the error message?

Good point re: openmp, I haven't actually extensively tried running pyfftw after that (incl. trying to trigger some openmp-related code); I'll report once I get a chance to try.

@crisluengo
Copy link
Member

Thank you! That worked.

Indeed, pyFFTW now links to 6 different FFTW libraries: for single, double and long floats, and for each of those, a serial and an OpenMP version. The OpenMP versions link to GGC's OpenMP library (/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib). DIPlib links against CLang's OpenMP library (/opt/homebrew/opt/libomp/lib/libomp.dylib, which was copied into the wheel for distribution).

I'm fairly certain that this is what you are running into.

Unfortunately, it seems I'm having no trouble using both of these at the same time. I still cannot reproduce the crash!

Is it possible that these two OpenMP libraries don't interfere with each other at all, and you're seeing a different problem?

I'm on macOS 12, maybe that makes a difference?

@anntzer
Copy link
Contributor Author

anntzer commented Jul 22, 2023

(I am on macos 13.4.1.)
Indeed, building pyfftw with PYFFTW_USE_PTHREADS (pthreads instead of openmp) fixes the issue. https://pypackaging-native.github.io/key-issues/native-dependencies/blas_openmp/ looks like an interesting read; I guess there isn't really any solution except perhaps for diplib to also provide an option (not necessarily to change the default) to be built without openmp either.

@crisluengo
Copy link
Member

crisluengo commented Jul 22, 2023

You can build DIPlib without OpenMP by setting DIP_ENABLE_MULTITHREADING to off in CMake (see https://diplib.org/diplib-docs/building_cmake.html#cmake_variables). DIPlib doesn’t offer different multithreading methods like FFTW does, the point of using OpenMP is that it makes multithreading easier.

But we can’t really offer no-OpenMP wheels on PyPI. That would double the number of builds to make and complicate deployment. And I’m guessing few people would want to install a slower version of purpose.

For people willing to build it themselves, it is possible to ensure all packages use the same OpenMP version.

It does however make sense to add some warnings regarding vendored OpenMP and potential incompatibilities between packages because of that. Just because you’re the first to bring this to our attention doesn’t mean you’re the first to run into this issue — thank you for reporting!

@anntzer
Copy link
Contributor Author

anntzer commented Jul 22, 2023

Thanks for your patience and helpfulness in investigating what is a rather obscure situation!

Actually I remembered that threadpoolctl (https://pypi.org/project/threadpoolctl/) is very useful in these situations; upon further investigation I realized that I had things even more messed up in my conda environment: I had (some time ago) also installed fftw via conda; my build of pyfftw did not actually link it but (as shown by python -mthreadpoolctl -i pyfftw) it still picked up conda's llvm-openmp (which comes with conda's fftw) and I now suspect that the bad collision was between conda's llvm-openmp and diplib's vendored openmp (python -mthreadpoolctl -i diplib). Indeed, even just uninstalling conda's fftw and llvm-openmp and rebuilding pyfftw without disabling openmp leads to pyfftw using homebrew's libgomp, which does not cause problems with diplib's libomp. (https://github.com/joblib/threadpoolctl/blob/master/multiple_openmp.md is also useful.)

So I guess if you want to be extra-safe one option would be to add a dependency on threadpoolctl (which is pure python and has no extra dependencies) and check before importing the main library that there isn't already "another" libomp loaded (and I guess pyfftw could do something similar), but I'd totally also understand if you just close this as "too obscure".

@crisluengo
Copy link
Member

From the page linked above, copied here for future reference:

The only unrecoverable incompatibility we encountered happens when loading a mix of compiled extensions linked with libomp (LLVM/Clang) and libiomp (ICC), on Linux and macOS, manifested by crashes or deadlocks. It can happen even with the simplest OpenMP calls like getting the maximum number of threads that will be used in a subsequent parallel region. A possible explanation is that libomp is actually a fork of libiomp causing name colliding for instance. Using threadpoolctl may crash your program in such a setting.

Fortunately this problem is very rare: at the time of writing, all major binary distributions of Python packages for Linux use either GCC or ICC to build the Python scientific packages. Therefore this problem would only happen if some packagers decide to start shipping Python packages built with LLVM/Clang instead of GCC (this is the case for instance with conda's default channel).

@crisluengo
Copy link
Member

I made a note on the DIPlib home page, and (see the commit referenced above) on the README file that will be shown on PyPI with the next release.

I think that is all we can do for now. threadpoolctl is a useful tool, but I'm not sure it would help us. Detecting there's another OpenMP library running won't tell us if the combined use could lead to issues. Would we, only on macOS, try to detect an already loaded libiomp? (since on Linux we use libgomp, and on Windows it's whatever MS library comes with the system). And what do we do if we detect the library? There's no way to not load the OpenMP library for a DIPlib that was compiled with OpenMP...

@anntzer
Copy link
Contributor Author

anntzer commented Aug 1, 2023

I think the note is good enough; at least it gives something to google at, and I think most people running into these weird incompatibilities are somewhat likely to be building at least some of the packages themselves and more likely to be able to figure out the problem based on your writeup.
Thanks!

And what do we do if we detect the library? There's no way to not load the OpenMP library for a DIPlib that was compiled with OpenMP...

Assuming we could really pre-detect the incompatibility (which I agree seems to complex to be worth it), I guess you could just refuse to import diplib at all (and throw an ImportError with a suitable descriptive error message).

@crisluengo
Copy link
Member

If this becomes a more common issue, I will put in the effort for a better error message. It's a good idea. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug component:PyDIP About the PyDIP Python interface
Projects
None yet
Development

No branches or pull requests

4 participants