New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: distutils: make msvc + mingw-gfortran work #9431

Merged
merged 8 commits into from Sep 2, 2017

Conversation

Projects
None yet
6 participants
@ghost

ghost commented Jul 17, 2017

Enable numpy.distutils to link mingw-w64-gfortran compiled files to MSVC objects,
by putting everything gfortran to separate DLL files to avoid CRT issues.

On installation, the extra DLL files are stuffed to an extra-dll subdirectory
of the package, which is added to PATH so that Windows can load the files.
The filenames of the libraries contain a sha1 hash of the input object files,
so DLL hell should be avoided (barring sha1 hash collisions).

This is a bit of a hack, but it is limited to the "msvc/gnu95" special case.
Moreover, in addition to mingwpy (which may still need further work?),
this is the currently the only working easily available compiler combination
to build Python.org compatible binaries for Windows.

To handle static libraries, we need to add a concept of a "fake" static library
to numpy.distutils. This is just a collection of object files whose filenames are
listed in *.cobjects and *.fobjects files. This is necessary, because MSVC
cannot build linkable static libraries out of gfortran objects, so the linking
cannot be done in the build_clib stage but has to be postponed to build_ext.
(And in distutils, the only good way to share information between stages
is to store it on disk).

Limitations: fortran code that needs symbols from C code provided by the
project cannot be compiled like --- this would require using two C compilers
(gcc + msvc) and this is getting too complicated
import libraries in also
to the other direction, so it's not supported currently (will fail with undefined
symbol errors at link time). However, the typical case of wrapping Fortran
codes to Python doesn't need this.

Moreover, if linking against BLAS/LAPACK, that must be provided in
a gfortran-compatible import library. The same machinery then will
automatically build a wrapper DLL for it for MSVC if needed.

Scipy Appveyor builds & wheels enabled by this:
scipy/scipy#7616
https://ci.appveyor.com/project/scipy/scipy

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 17, 2017

First of a few cross-references to comments in the previous PR:

"""
@rgommers The other issue that I am unsure how to address is how to gracefully handle DLL dependencies . Do you have any ideas on this?"
"""

@xoviat - I guess you mean, for each scipy extension module, telling disutils which Fortran DLLs they depend on?

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 17, 2017

Comments on mechanism for specifying library order:

Approach here is to list all the scipy Fortran libraries inside numpy distutils. It would be better to specify them inside Scipy, perhaps (@pv) in setup.py. But we need to take care of ordering of libraries. Do I understand right, that this will need changes to numpy distutils?

@ghost

This comment has been minimized.

ghost commented Jul 17, 2017

@xoviat - I guess you mean, for each scipy extension module, telling disutils which Fortran DLLs they depend on?

Correct. It's not just the DLLs that we make, but the DLLs in the MinGW folder too.

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 17, 2017

Hmm. For the DLLs in the mingw folder, how about using mingwpy or one of the newer static mingw variants that @carlkl has been building? I'm worried about putting the mingw DLLs on the path, with their current names, I guess we could easily run into a DLL clash with other code linking against these DLLs.

Similarly, I wonder if there's a way to keeping the new Fortran DLL names unique, maybe with a hash of their object file contents? Just in case something else is using these DLL names?

@ghost

This comment has been minimized.

ghost commented Jul 17, 2017

There are no issues with the fortran DLLs that we make clashing. In addition, the PATH is only modified for scipy, so there should not be clashes (for the mingw DLLs) if scipy doesn't spawn subprocesses.

@ghost

This comment has been minimized.

ghost commented Jul 17, 2017

Also I don't think gfortran is called with -O3. We should at least try to run as fast as possible.

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 17, 2017

I am happy to be corrected, but I believe that we'll run into trouble if any Python module imports a DLL with the same name, either before Scipy imports (Scipy will get the wrong DLL) or after it imports (the other module will get the wrong DLL). It might not be happening now, but it's difficult to predict if it will happen in the future.

@ghost

This comment has been minimized.

ghost commented Jul 17, 2017

Yes, if it the same DLL. However, I do believe that the MinGW DLLs have versioned names.

@rgommers

This comment has been minimized.

Member

rgommers commented Jul 18, 2017

if scipy doesn't spawn subprocesses.

SciPy itself spawns only subprocesses in the test suite AFAIK.

@rgommers

This comment has been minimized.

Member

rgommers commented Jul 18, 2017

Approach here is to list all the scipy Fortran libraries inside numpy distutils. It would be better to specify them inside Scipy, perhaps (@pv) in setup.py.

Agreed that it should be in scipy. Preferably outside setup.py I would think - perhaps in a simple plain text file in a format that numpy.distutils can read reliably when it needs to?

@pv

This comment has been minimized.

Member

pv commented Jul 18, 2017

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

DLLs again - just to be clear, my understanding is that DLL names are global to the process. If package P imports DLL foo, then, if package Q imports DLL foo, it will get the foo that P imported. Hence the problem about DLL names - it's very easy to generate a clash, and for the clash to cause weird and hard to debug problems.

@carlkl

This comment has been minimized.

Contributor

carlkl commented Jul 18, 2017

import

all DLLs can be imported into a process only once, selected by the filename of the DLL. Windows DLL search order is: first Windows system libraries, then executable (https://github.com/numpy/numpy/wiki/windows-dll-notes).

Please forget to try the change the DLL search path with the PATH environment variable. This won't work in all cases. I can't recommend the approach of scipy/scipy#7613 at all.

@carlkl carlkl referenced this pull request Jul 18, 2017

Merged

Add distributor init file #7613

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

@carlkl - just to clarify again (I know you meant this) but, for Windows, once you've imported c:\a_path\foo.dll then any attempt to import something with basename foo.dll will get the contents of c:\a_path\foo.dll - for example, loading c:\another_path\foo.dll will in fact get c:\a_path\foo.dll.

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

@carlkl - can you say more about when adding the library directory to the PATH will not work? And why you can't recommend it? Did you have a chance to read https://github.com/pypa/wheel-builders/pull/2/files#diff-37a1ef08dd1b0096d769190274121258R378 ?

@njsmith

This comment has been minimized.

Member

njsmith commented Jul 18, 2017

If DLL name collisions are a concern, then Mach-O Mach-O Mangler can help you give them unique names. Basically it's the same trick that auditwheel uses patchelf for, but on Windows: give the DLLs a unique name, and then patch the binaries that refer to them to look for the name name. Example here

I'd generally recommend giving all DLLs unique names, and then using either the LoadLibrary trick or PATH manipulation to load them. The advantage of LoadLibrary is that it's very local. The advantage of manipulating PATH is that you can do it once and then the libraries are loaded on demand as needed, and you can even run native .exe's that are linked against your uniquified names (that's why it shows up in that pynativelib thing) -- but there's also more potential for making mistakes and accidentally affecting the load of libraries you didn't mean to. If you give your DLLs unique names and stick them in an otherwise empty directory then either is pretty safe AFAIK.

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

Nathaniel - what is the LoadLibrary trick?

@carlkl

This comment has been minimized.

Contributor

carlkl commented Jul 18, 2017

Adding the folder to PATH may not work:

  • The PATH is more or less the lowest place in the Windows DLL search order.
  • there are corner cases like frozen executables, virtual env, multiprocessing and so on you have to consider.
@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

@carlkl - could you be more specific about the problems you see? I guess, if there were somehow two different scipies being imported in the same process this could be a problem - is that what you mean? In that case - what approach are you suggesting?

@njsmith

This comment has been minimized.

Member

njsmith commented Jul 18, 2017

@matthew-brett: if you call LoadLibrary("c:\whatever\absolute\path\to\foo.dll"), then that file gets loaded as foo.dll (assuming there isn't already a foo.dll loaded), and now if you try to load a library that says it wants foo.dll, it uses that one. Even if c:\whatever\absolute\path\to was never on any search path.

@carlkl: the search order doesn't matter if your dll is named like gfortran-lib-3409a0bef0139.dll, where the gibberish is a hash of the file contents; as long as you can find it at all, then it will definitely be the one you were looking for.

@carlkl

This comment has been minimized.

Contributor

carlkl commented Jul 18, 2017

@matthew-brett, @njsmith

using LoadLibrary with an absolute PATH means you have to ensure to get the PATH right. I.e.: you also have to consider frozen executables as well.

@matthew-brett

This comment has been minimized.

Contributor

matthew-brett commented Jul 18, 2017

@carllk - sorry to be dumb, but could you unpack for me, what approach you are suggesting, instead of setting the PATH (or loading the library by full path with LoadLibrary)?

@njsmith

This comment has been minimized.

Member

njsmith commented Jul 18, 2017

@carlkl: frozen executables are a pretty unusual case, and handling them likely requires some sort of ad hoc integration with whatever hacks they're using. I don't think we should worry about them until after the basic functionality is working.

@carlkl

This comment has been minimized.

Contributor

carlkl commented Jul 18, 2017

@matthew-brett @njsmith ,

please test the scipy wheels (scipy/scipy#7597 (comment)). No PATH manipulation is needed, as all scipy pyd binaries are distributed in the right folders. You also need to install numpy from here: https://ci.appveyor.com/project/matthew-brett/windows-wheel-builder/build/1.0.57

@pv

This comment has been minimized.

Member

pv commented Jul 18, 2017

@carlkl

This comment has been minimized.

Contributor

carlkl commented Jul 18, 2017

  1. I patched mingw32ccompiler.py and gnu.py from my local numpy installation. Essentially I added compilerflags for mingw.
  2. For cp35 and cp36 I copied npymath.lib from the numpy-cp34 build as libnpymath.a into numpy/core/lib.

The reason for 2. is, that npymath.lib created by VS2015 is not readable by binutils. I believe this is due to the fact, that this library is a mixture of static and dynamic import library. Using a pure static import library with the name libnpymath.a ensures, that mingwpy picks up a readable library and VS picks up its own npymath.lib library.

Scipy is unpatched and build by (still unpublished) gcc-5.3 version.

In short words: the recipe is not ready due to the fact, that I still get strange build errors for 32bit builds as well as cp34 and cp36 for 64 bit.

@pv pv changed the title from distutils: make msvc + mingw-gfortran work to ENH: distutils: make msvc + mingw-gfortran work Aug 7, 2017

@ghost

This comment has been minimized.

ghost commented Aug 7, 2017

I will squash the commits.

@ghost

This comment has been minimized.

ghost commented Aug 8, 2017

/cc @rgommers

# the following code is not needed (read: breaks) when using MinGW
# in case want to link F77 compiled code with MSVC
opt.append('gcc')
runtime_lib = msvc_runtime_library()

This comment has been minimized.

@rgommers

rgommers Aug 9, 2017

Member

Question: this change is in the class for g77, so is this a cleanup that looks right or have you been using g77?

Comments:

  • the code comment above about MinGW looks like it applied to these lines and should be removed.
  • the import of msvc_runtime_library is now unused and should be removed.

This comment has been minimized.

@ghost

ghost Aug 9, 2017

These lines actually had a signfiicant impact becuase the class is inherited. It's not just a "cleanup"

This comment has been minimized.

@rgommers

rgommers Aug 9, 2017

Member

Ah right, clear.

'linker_so': ["<F90>", "-Wall", "-g"],
'archiver': ["ar", "-cr"],
'ranlib': ["ranlib"],
'linker_exe': [None, "-Wall"]

This comment has been minimized.

@rgommers

rgommers Aug 9, 2017

Member

this formatting change is for the worse, the old version was easier to read - can you undo?

This comment has been minimized.

@ghost

ghost Aug 9, 2017

I will revert this section.

@rgommers

This comment has been minimized.

Member

rgommers commented Aug 9, 2017

/cc @rgommers

I had a quick read through. Did not see anything critical that should prevent merging or break other configs. Traveling at the moment, so haven't been able to follow all the discussions on Windows builds in detail.

@ghost

This comment has been minimized.

ghost commented Aug 10, 2017

FYI.

Limitations: fortran code that needs symbols from C code provided by the
project cannot be compiled like --- this would require using two C compilers
(gcc + msvc) and this is getting too complicated, so it's not supported
(will fail with undefined symbol errors at link time). However, the typical
case of wrapping Fortran codes to Python doesn't need this.

This is actually not correct. Supporting this (which I'm not going to get into this point) would require one additional call to lib.exe in build_ext. It is something that could be implemented with moderate additional effort. See here for more details

@ghost ghost referenced this pull request Aug 17, 2017

Closed

Moving to OpenBLAS #9579

@charris

This comment has been minimized.

Member

charris commented Aug 27, 2017

Needs a release note in doc/release/1.14.0-notes.rst.

@pv

This comment has been minimized.

Member

pv commented Sep 2, 2017

Release notes added. There was a merge conflict, so I rebased and re-pushed this.

xoviat and others added some commits Jul 22, 2017

distutils: gnu: patch fcompile
This allows mingw's gfortran to work with MSVC. DLLs are
autogenerated using heuristics that should work with most
cases. In addition, a libopenblas DLL is compiled from
the static lib for use with MSVC.

All generated DLLs have randomized names so that no clashes
will occur.
distutils: handle unlinkable object files in build_clib/build_ext, no…
…t gnu

Add concept of unlinkable Fortran object files on the level of
build_clib/build_ext.

Make build_clib generate fake static libs when unlinkable object files are
present, postponing the actual linkage to build_ext.

This enables MSVC+gfortran DLL chaining to only involve those DLLs that
are actually necessary for each .pyd file, rather than linking everything
in to every file. Linking everything to everywhere has issues due to
potential symbol clashes and the fact that library build order is
unspecified.

Record shared_libs on disk instead of in system_info. This is necessary
for partial builds -- it is not guaranteed the compiler is actually called
for all of the DLL files.

Remove magic from openblas msvc detection. That this worked previously
relied on the side effect that the generated openblas DLL would be added
to shared_libs, and then being linked to all generated outputs.
@ghost

This comment has been minimized.

ghost commented Sep 2, 2017

Thanks so much for taking care of the merge conflicts.

@charris charris merged commit 447e7e4 into numpy:master Sep 2, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@charris

This comment has been minimized.

Member

charris commented Sep 2, 2017

Lets get this in. If nothing else it is a good starting point for further fixes if needed. Thanks @xoviat and the reviewers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment