Binary Extension

Note

This content was last updated October 19, 2018 (as part of the 0.9.0 release). Much of the content is tested automatically to keep it from getting stale, but some of the console code blocks are not. As a result, this material may be out of date. If anything does not seem correct, or even if the explanation is insufficient, please file an issue.

The bezier Python package has optional speedups that wrap the libbezier library (documented under ../abi/index). These are incorporated into the Python interface via Cython as a binary extension.

After the bezier Python package has been installed with these speedups, the library provides helpers to make it easier to build non-Python code that depends on libbezier.

C Headers

The C headers for libbezier are included in the installed package:

show-headers, show-lib, show-dll, macos-dylibs

import os
import textwrap

import bezier


class Path(object):
    """This class is a hack for Windows.

    It wraps a simple string but prints / repr-s it with Windows path
    separator converted to the standard *nix separator.

    This way doctests will succeed on Windows without modification.
    """

    def __init__(self, path):
        self.path = path

    def __repr__(self):
        posix_path = self.path.replace(os.path.sep, "/")
        return repr(posix_path)


def sort_key(name):
    # Sort case-insensitively and ignore leading underscores.
    return name.lower().lstrip("_")


def tree(directory, suffix=None):
    names = sorted(os.listdir(directory), key=sort_key)
    parts = []
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isdir(path):
            sub_part = tree(path, suffix=suffix)
            if sub_part is not None:
                # NOTE: We always use posix separator.
                parts.append(name + "/")
                parts.append(textwrap.indent(sub_part, "  "))
        else:
            if suffix is None or name.endswith(suffix):
                parts.append(name)

    if parts:
        return "\n".join(parts)
    else:
        return None


def print_tree(directory, suffix=None):
    if isinstance(directory, Path):
        # Make Windows act like posix.
        directory = directory.path
        separator = "/"
    else:
        separator = os.path.sep

    print(os.path.basename(directory) + separator)
    full_tree = tree(directory, suffix=suffix)
    print(textwrap.indent(full_tree, "  "))


# Monkey-patch functions to return a Path.
original_get_include = bezier.get_include
original_get_lib = bezier.get_lib


def get_include():
    return Path(original_get_include())


bezier.get_include = get_include

# macOS specific.
base_dir = os.path.dirname(original_get_include())
dylibs_directory = os.path.join(base_dir, ".dylibs")

show-headers

>>> include_directory = bezier.get_include()
>>> include_directory
'.../site-packages/bezier/include'
>>> print_tree(include_directory)
include/
  bezier/
    curve.h
    curve_intersection.h
    helpers.h
    status.h
    surface.h
    surface_intersection.h
  bezier.h

show-headers, show-lib, show-dll, macos-dylibs

# Restore the monkey-patched functions.
bezier.get_include = original_get_include

Note that this includes a catch-all bezier.h that just includes all of the headers.
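A build script that prefers not to hard-code header names could enumerate them from the include directory instead. The following is a minimal sketch, not part of the bezier package: `list_headers` is a hypothetical helper, and it takes the directory as an argument (in practice, the value returned by `bezier.get_include()`).

```python
import os


def list_headers(include_dir):
    """Return the relative paths of all ``.h`` files under ``include_dir``.

    Paths always use "/" as the separator and are returned sorted.
    """
    headers = []
    for root, _, filenames in os.walk(include_dir):
        for filename in filenames:
            if filename.endswith(".h"):
                full_path = os.path.join(root, filename)
                relative = os.path.relpath(full_path, include_dir)
                # Normalize Windows separators so the output is portable.
                headers.append(relative.replace(os.path.sep, "/"))
    return sorted(headers)
```

Taking the directory as a parameter keeps the helper usable without importing bezier, for example when inspecting an unpacked wheel.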

Static / Shared Library

On Linux and macOS, libbezier is included as a single static library (i.e. a .a file):

show-lib

>>> lib_directory = bezier.get_lib()
>>> lib_directory
'.../site-packages/bezier/lib'
>>> print_tree(lib_directory)
lib/
  libbezier.a

Note

A static library is used (rather than a shared or dynamic library) because the "final" install location of the Python package is not dependable. Even on the same machine with the same operating system, the bezier Python package can be installed in virtual environments, in different Python versions, as an egg or wheel, and so on. Given the capabilities of auditwheel and delocate discussed below, it may be possible to use a shared library. See issue 54 for more discussion.

On Windows, an import library (i.e. a .lib file) is included to specify the symbols in the Windows shared library (DLL):

show-dll

>>> lib_directory = bezier.get_lib()
>>> lib_directory
'...\\site-packages\\bezier\\lib'
>>> print_tree(lib_directory)
lib\
  bezier.lib
>>> dll_directory = bezier.get_dll()
>>> dll_directory
'...\\site-packages\\bezier\\extra-dll'
>>> print_tree(dll_directory)
extra-dll\
  bezier.dll

Extra Dependencies

When the bezier Python package is installed via pip, it will likely be installed from a Python wheel. The wheels uploaded to PyPI are pre-built, with the Fortran code compiled by GNU Fortran (gfortran). As a result, libbezier will depend on libgfortran. This can be problematic due to version conflicts, ABI incompatibility, a desire to use a different Fortran compiler (e.g. Intel's ifort) and a host of other reasons.

Some of the standard tooling for distributing wheels tries to address this. On Linux and macOS, they address it by placing a copy of libgfortran (and potentially its dependencies) in the built wheel. (On Windows, there is no standard tooling beyond that provided by distutils and setuptools.) This means that libraries that depend on libbezier should also link against these local copies of dependencies.

Linux

The command line tool auditwheel adds a bezier/.libs directory with a version of libgfortran that is used by libbezier, e.g.

$ cd .../site-packages/bezier/.libs
$ ls -1
libgfortran-ed201abd.so.3.0.0*

The bezier._speedup module depends on this local copy:

$ readelf -d _speedup.cpython-37m-x86_64-linux-gnu.so

Dynamic section at offset 0x2f9000 contains 27 entries:
  Tag        Type                         Name/Value
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/.libs]
 0x0000000000000001 (NEEDED)             Shared library: [libgfortran-ed201abd.so.3.0.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
...

Note

The runtime path (RPATH) uses $ORIGIN to specify a path relative to the directory where the extension module (.so file) is.
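To make the `$ORIGIN` mechanism concrete, the sketch below parses `readelf -d` output like the listing above and expands `$ORIGIN` relative to the extension module. Both function names are hypothetical helpers written for this illustration; the code only handles the bracketed `[value]` lines that `readelf` prints for RPATH, RUNPATH and NEEDED entries.

```python
import posixpath


def parse_dynamic_section(readelf_output):
    """Collect RPATH entries and NEEDED libraries from ``readelf -d`` output."""
    rpaths = []
    needed = []
    for line in readelf_output.splitlines():
        if "[" not in line or "]" not in line:
            continue
        # The value of interest is the text between the square brackets.
        value = line[line.index("[") + 1 : line.rindex("]")]
        if "(RPATH)" in line or "(RUNPATH)" in line:
            # Multiple runtime paths are separated by colons.
            rpaths.extend(value.split(":"))
        elif "(NEEDED)" in line:
            needed.append(value)
    return rpaths, needed


def resolve_origin(rpath, module_path):
    """Expand ``$ORIGIN`` to the directory containing ``module_path``."""
    origin = posixpath.dirname(module_path)
    expanded = rpath.replace("${ORIGIN}", origin).replace("$ORIGIN", origin)
    return posixpath.normpath(expanded)
```

For a module installed at a hypothetical path such as `/opt/venv/bezier/_speedup.so`, an RPATH of `$ORIGIN/.libs` resolves to `/opt/venv/bezier/.libs`, which is where auditwheel placed the copy of libgfortran.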

macOS

The command line tool delocate adds a bezier/.dylibs directory with copies of libgfortran, libquadmath and libgcc_s:

macos-dylibs

>>> dylibs_directory
'.../site-packages/bezier/.dylibs'
>>> print_tree(dylibs_directory)
.dylibs/
  libgcc_s.1.dylib
  libgfortran.5.dylib
  libquadmath.0.dylib

The bezier._speedup module depends on the local copy of libgfortran:

macos-extension, macos-delocated-libgfortran

import os
import subprocess

import bezier


bezier_directory = os.path.dirname(bezier.__file__)


def invoke_shell(*args):
    print("$ " + " ".join(args))
    prev_cwd = os.getcwd()
    os.chdir(bezier_directory)
    # NOTE: We print to the stdout of the doctest, rather than using
    #       subprocess.call() directly.
    output_bytes = subprocess.check_output(args).rstrip()
    print(output_bytes.decode("utf-8"))
    os.chdir(prev_cwd)

macos-extension

>>> invoke_shell("otool", "-L", "_speedup.cpython-37m-darwin.so")
$ otool -L _speedup.cpython-37m-darwin.so
_speedup.cpython-37m-darwin.so:
        @loader_path/.dylibs/libgfortran.5.dylib (...)
        /usr/lib/libSystem.B.dylib (...)

Though the Python extension module (.so file) only depends on libgfortran, it indirectly depends on libquadmath and libgcc_s:

macos-delocated-libgfortran

>>> invoke_shell("otool", "-L", ".dylibs/libgfortran.5.dylib")
$ otool -L .dylibs/libgfortran.5.dylib
.dylibs/libgfortran.5.dylib:
        /DLC/bezier/libgfortran.5.dylib (...)
        @loader_path/libquadmath.0.dylib (...)
        /usr/lib/libSystem.B.dylib (...)
        @loader_path/libgcc_s.1.dylib (...)

Note

To allow the package to be relocatable, the libgfortran dependency is relative to the @loader_path (i.e. the path where the Python extension module is loaded) instead of being an absolute path within the file system.

Notice also that delocate uses the nonexistent root /DLC for the install_name of libgfortran to avoid accidentally pointing to an existing file on the target system.
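The `@loader_path` resolution described in this note can be sketched in a few lines of Python. The helpers below are hypothetical names introduced for illustration: they extract the install names from `otool -L` output and expand `@loader_path` relative to the file that loads the library.

```python
import posixpath


def parse_otool_libraries(otool_output):
    """Return the install names listed by ``otool -L`` (header line skipped)."""
    names = []
    for line in otool_output.splitlines()[1:]:
        line = line.strip()
        if not line:
            continue
        # Drop the trailing parenthesized version information.
        name, _, _ = line.partition(" (")
        names.append(name)
    return names


def resolve_loader_path(install_name, loader_file):
    """Expand ``@loader_path`` to the directory containing ``loader_file``."""
    prefix = "@loader_path"
    if not install_name.startswith(prefix):
        # Absolute install names (e.g. system libraries) are left untouched.
        return install_name
    loader_dir = posixpath.dirname(loader_file)
    return posixpath.normpath(loader_dir + install_name[len(prefix):])
```

Because the expansion depends only on where the loading file lives, the same relative install name keeps working after the package directory is moved or copied into another environment.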

Windows

A single Windows shared library (DLL) is provided: extra-dll/bezier.dll. The Python extension module (.pyd file) depends directly on this library:

windows-extension, windows-dll

import distutils.ccompiler
import os
import subprocess

import bezier


if os.name == "nt":
    c_compiler = distutils.ccompiler.new_compiler()
    assert c_compiler.compiler_type == "msvc"
    c_compiler.initialize()

    dumpbin_exe = os.path.join(
        os.path.dirname(c_compiler.lib), "dumpbin.exe")
    assert os.path.isfile(dumpbin_exe)
else:
    # This won't matter if not on Windows.
    dumpbin_exe = None

bezier_directory = os.path.dirname(bezier.__file__)


def replace_dumpbin(value):
    if value == "dumpbin":
        return dumpbin_exe
    else:
        return value


def invoke_shell(*args):
    print("> " + " ".join(args))
    # Replace "dumpbin" with dumpbin_exe.
    cmd = tuple(map(replace_dumpbin, args))
    prev_cwd = os.getcwd()
    os.chdir(bezier_directory)
    # NOTE: We print to the stdout of the doctest, rather than using
    #       subprocess.call() directly.
    output_bytes = subprocess.check_output(cmd).rstrip()
    print(output_bytes.decode("utf-8"))
    os.chdir(prev_cwd)

windows-extension

>>> invoke_shell("dumpbin", "/dependents", "_speedup.cp37-win_amd64.pyd")
> dumpbin /dependents _speedup.cp37-win_amd64.pyd
Microsoft (R) COFF/PE Dumper Version ...
Copyright (C) Microsoft Corporation.  All rights reserved.
<BLANKLINE>
<BLANKLINE>
Dump of file _speedup.cp37-win_amd64.pyd
<BLANKLINE>
File Type: DLL
<BLANKLINE>
  Image has the following dependencies:
<BLANKLINE>
    bezier.dll
    python37.dll
    KERNEL32.dll
    VCRUNTIME140.dll
    api-ms-win-crt-stdio-l1-1-0.dll
    api-ms-win-crt-heap-l1-1-0.dll
    api-ms-win-crt-runtime-l1-1-0.dll
...

In order to ensure this DLL can be found, the bezier.__config__ module adds the extra-dll directory to os.environ["PATH"] on import (%PATH% is used on Windows as part of the DLL search path).
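The idea of extending `PATH` on import can be sketched as follows. This is an illustration of the technique only and is not the actual code of `bezier.__config__`; `prepend_to_path` is a hypothetical helper, and the `environ` parameter exists so the function can be exercised without touching the real environment.

```python
import os


def prepend_to_path(directory, environ=None):
    """Prepend ``directory`` to ``PATH`` unless it is already listed."""
    if environ is None:
        environ = os.environ
    current = environ.get("PATH", "")
    entries = current.split(os.pathsep) if current else []
    if directory not in entries:
        # Put the directory first so its DLLs win over same-named ones.
        entries.insert(0, directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

Note that starting with Python 3.8, `PATH` is no longer searched when resolving the DLL dependencies of extension modules; `os.add_dll_directory()` is the supported mechanism there.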

The libbezier DLL has no external dependencies, but does have a corresponding import library (lib/bezier.lib), which is provided to specify the symbols in the DLL.

On Windows, building Python extensions is a bit more constrained. Each official Python is built with a particular version of MSVC and Python extension modules must be built with the same compiler. This is primarily because the C runtime (provided by Microsoft) changes from Python version to Python version. To see why the same C runtime must be used, consider the following example. If an extension uses malloc from MSVCRT.dll to allocate memory for an object and the Python interpreter tries to free that memory with free from MSVCR90.dll, the two runtimes are operating on different heaps, which can corrupt memory or crash the process:

Python's linked CRT, which is msvcr90.dll for Python 2.7, msvcr100.dll for Python 3.4, and several api-ms-win-crt DLLs (forwarded to ucrtbase.dll) for Python 3.5 ... Additionally each CRT uses its own heap for malloc and free (wrapping Windows HeapAlloc and HeapFree), so allocating memory with one and freeing with another is an error.

This problem has been largely fixed in newer versions of Python but is still worth knowing, especially for the older but still prominent Python 2.7.

Unfortunately, there is no Fortran compiler provided by MSVC. The MinGW-w64 suite of tools is a port of the GNU Compiler Collection (gcc) for Windows. In particular, MinGW includes gfortran. However, mixing the two compiler families (MSVC and MinGW) can be problematic because MinGW uses a fixed version of the C runtime (MSVCRT.dll) and this dependency cannot be easily dropped or changed.

A Windows shared library (DLL) can be created after compiling each of the Fortran submodules:

$ gfortran \
>   -shared \
>   -o extra-dll/bezier.dll \
>   ${OBJ_FILES} \
>   -Wl,--output-def,bezier.def

Note

Invoking gfortran can be done from the Windows command prompt (e.g. it works just fine on AppVeyor), but it is easier to do from a shell that explicitly supports MinGW, such as MSYS2.

By default, the created shared library will depend on gcc libraries provided by MinGW:

> dumpbin /dependents .\extra-dll\bezier.dll
...
  Image has the following dependencies:

    KERNEL32.dll
    msvcrt.dll
    libgcc_s_seh-1.dll
    libgfortran-3.dll

Unlike Linux and macOS, on Windows relocating and copying any dependencies on MinGW (at either compile, link or run time) is explicitly avoided. By adding the -static flag

$ gfortran \
>   -static \
>   -shared \
>   -o extra-dll/bezier.dll \
>   ${OBJ_FILES} \
>   -Wl,--output-def,bezier.def

all the symbols used from libgfortran or libgcc_s are statically included and the resulting shared library bezier.dll has no dependency on MinGW:

windows-dll

>>> invoke_shell("dumpbin", "/dependents", "extra-dll\\bezier.dll")
> dumpbin /dependents extra-dll\bezier.dll
Microsoft (R) COFF/PE Dumper Version ...
Copyright (C) Microsoft Corporation.  All rights reserved.
<BLANKLINE>
<BLANKLINE>
Dump of file extra-dll\bezier.dll
<BLANKLINE>
File Type: DLL
<BLANKLINE>
  Image has the following dependencies:
<BLANKLINE>
    KERNEL32.dll
    msvcrt.dll
    USER32.dll
...
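The two gfortran invocations shown earlier differ only in the `-static` flag. When the link step is driven from a Python build script, the argument list could be assembled as in the sketch below; `gfortran_dll_command` is a hypothetical helper, and the output paths are simply the ones used in the commands above.

```python
def gfortran_dll_command(obj_files, static=False):
    """Build the argument list for linking ``bezier.dll`` with gfortran."""
    command = ["gfortran"]
    if static:
        # Statically include the symbols used from libgfortran / libgcc_s.
        command.append("-static")
    command.extend(["-shared", "-o", "extra-dll/bezier.dll"])
    command.extend(obj_files)
    # Ask the linker to also write the module definition file.
    command.append("-Wl,--output-def,bezier.def")
    return command
```

The returned list can be passed directly to `subprocess.check_call()`, which avoids shell quoting issues with object file paths.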

Note

Although msvcrt.dll is a dependency of bezier.dll, it is not a problem. Any values returned from Fortran (as intent(out)) will have already been allocated by the caller (e.g. the Python process). This won't necessarily be true for generic Fortran subroutines, but subroutines marked with bind(c) (i.e. marked as part of the C ABI of libbezier) will not be allowed to use allocatable or deferred-shape output variables. Any memory allocated in Fortran will be isolated within the Fortran code.

However, the dependency on msvcrt.dll can still be avoided if desired. The MinGW gfortran default "specs file" can be captured:

$ gfortran -dumpspecs > ${SPECS_FILENAME}

and modified to replace instances of -lmsvcrt with a substitute, e.g. -lmsvcr90. Then gfortran can be invoked with the flag -specs=${SPECS_FILENAME} to use the custom spec. (Some other dependencies may also indirectly depend on msvcrt.dll, such as -lmoldname. Removing dependencies is not an easy process.)
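The text substitution on the specs file could be scripted as in the following sketch. `replace_crt_library` is a hypothetical helper, and the regular expression is an assumption about how the flag appears in the dumped specs: it matches `-lmsvcrt` only as a complete flag, so longer library names that merely start with the same characters are left alone.

```python
import re


def replace_crt_library(specs_text, substitute="msvcr90"):
    """Replace each standalone ``-lmsvcrt`` flag with ``-l<substitute>``."""
    # The lookarounds reject matches that are part of a longer token.
    pattern = re.compile(r"(?<![\w-])-lmsvcrt(?![\w.-])")
    return pattern.sub("-l" + substitute, specs_text)
```

The modified text would then be written to `${SPECS_FILENAME}` and passed to gfortran via `-specs=${SPECS_FILENAME}`, as described above.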

From there, an import library must be created

> lib /def:.\bezier.def /out:.\lib\bezier.lib /machine:${ARCH}

Note

lib.exe is used from the same version of MSVC that compiled the target Python. Luckily distutils enables this without difficulty.

Source

For code that depends on libgfortran, it may be problematic to also depend on the local copy distributed with the bezier wheels.

The bezier Python package can be built from source if it is not feasible to link with these libraries, if a different Fortran compiler is required or "just because".

The Python extension module (along with libbezier) can be built from source via:

$ python setup.py build_ext
$ # OR
$ python setup.py build_ext --fcompiler=${FC}

By providing a filename via an environment variable, a "journal" can be stored of the compiler commands invoked to build the extension:

$ export BEZIER_JOURNAL=path/to/journal.txt
$ python setup.py build_ext
$ unset BEZIER_JOURNAL

For examples, see:

Building a Python Extension

To incorporate libbezier into a Python extension, whether via Cython, C, C++ or some other means, include the header and library directories:

setup-extension

import bezier

setup-extension

>>> import setuptools
>>>
>>> extension = setuptools.Extension(
...     "wrapper",
...     ["wrapper.c"],
...     include_dirs=[
...         bezier.get_include(),
...     ],
...     libraries=["bezier"],
...     library_dirs=[
...         bezier.get_lib(),
...     ],
... )
>>> extension
<setuptools.extension.Extension('wrapper') at 0x...>

Typically, depending on libbezier implies (transitive) dependence on libgfortran. See the Static / Shared Library section for more details.