enhancements fortran

DagSverreSeljebotn edited this page Apr 2, 2011 · 13 revisions

CEP 401 - Support for Fortran

The goal is to allow Fortran code to be called transparently from Cython.

Note: This document is now quite outdated. Please see

http://wiki.cython.org/kwsmith/soc09 http://fortrancython.wordpress.com/

for current thoughts and progress on this. This CEP will be updated properly over the summer some time...

Overall strategy and compiler support

The strategy is to have Cython generate a Fortran source file from the pyx/pxd sources alongside the C file; both must be compiled and linked into the resulting .so/DLL. The Fortran file contains the automatically generated bindings necessary to call the Fortran functions as if they were written in C. It targets Fortran 90 with the addition of the Fortran 2003 ISO C bindings, which provide a standardized way of bridging between Fortran and C and take care of calling conventions, Fortran name mangling, and similar details.

Any Fortran compiler with Fortran 2003 ISO C bindings support should work. Newer versions of the open source gfortran and g95 do this, as does the popular Intel Fortran compiler. Even if the end user has F77 code, a Fortran compiler is usually available that can link in the F77 code while still supporting compilation of the Cython wrapper.

The ISO C bindings are described here (chapter 5): http://www.fortranplus.co.uk/resources/john_reid_new_2003.pdf

Elementary wrapping

The features used here will be discussed later. Consider this pyx file:

cdef extern:
    module myfortranmodule:
        fortran double myfunc(double foo)

This will result in a wrapper like this (pseudo-code, as I actually do not know Fortran very well):

function cywrap_mymodule_myfunc(foo) bind(C, name="__Pyx_Fortran_1_mymodule_myfunc")
    use iso_c_binding
    use myfortranmodule, only: myfunc
    implicit none
    real(c_double), value :: foo              ! C passes the double by value
    real(c_double) :: cywrap_mymodule_myfunc  ! function result
    cywrap_mymodule_myfunc = myfunc(foo)      ! forward the call unchanged
end function

I.e. the call is simply forwarded. Cython then generates a simple call to __Pyx_Fortran_1_mymodule_myfunc which it can treat like a C function with the same signature.
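To make the generation step concrete, here is a minimal Python sketch of how such a wrapper could be rendered from a declaration. The helper name make_wrapper, the FORTRAN_TYPES table and the "prefix" argument are all hypothetical; the sketch only handles by-value scalar arguments and is not what Cython actually emits.

```python
# Hypothetical sketch: render a forwarding wrapper for one by-value function.
# The symbol naming follows the example above; it is an assumption, not
# Cython's real output.

FORTRAN_TYPES = {
    "double": "real(c_double)",
    "int": "integer(c_int)",
    "short": "integer(c_short)",
}


def make_wrapper(module, prefix, func, ret_type, args, serial=1):
    """Return Fortran source that forwards a C-callable symbol to module.func.

    args is a list of (c_type, name) pairs, all passed by value.
    """
    wrapper = f"cywrap_{prefix}_{func}"
    symbol = f"__Pyx_Fortran_{serial}_{prefix}_{func}"
    names = ", ".join(name for _, name in args)
    lines = [
        f'function {wrapper}({names}) bind(C, name="{symbol}")',
        "    use iso_c_binding",
        f"    use {module}, only: {func}",
        "    implicit none",
    ]
    for c_type, name in args:
        lines.append(f"    {FORTRAN_TYPES[c_type]}, value :: {name}")
    lines.append(f"    {FORTRAN_TYPES[ret_type]} :: {wrapper}")
    lines.append(f"    {wrapper} = {func}({names})")
    lines.append(f"end function {wrapper}")
    return "\n".join(lines)


source = make_wrapper(
    "myfortranmodule", "mymodule", "myfunc", "double", [("double", "foo")]
)
print(source)
```

The C side then only needs a prototype for the bound symbol, since the bind(C) name is fully under the generator's control.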

If the Fortran argument is declared with intent "inout" or "out", the function should be declared with a pointer argument, like this:

cdef extern:
    fortran double myfunc2(short* foo)

and the wrapper will be slightly more complicated.

A nice consequence of this way of working is that misspecification will be tolerated without corrupting the stack. For instance, if myfunc2 above is misspecified like this:

cdef extern:
    fortran double myfunc2(int foo) # should have been short*

then a) the out value of foo will not be passed back to Cython, but discarded, and b) values too large for short will silently overflow in the Fortran wrapper.
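The two effects can be modelled in a few lines of Python. This is only an illustration of the behaviour described above: fortran_myfunc2 is a made-up stand-in for the Fortran routine, and ctypes.c_short is used to imitate a conversion to a 16-bit integer that wraps without complaint.

```python
import ctypes


def fortran_myfunc2(foo):
    """Stand-in for the Fortran routine with a 16-bit inout argument.

    Returns (function result, updated foo).
    """
    return float(foo) * 2.0, foo + 1


def wrapper_for_misspecified(foo_int):
    """Model of the generated wrapper when foo is declared as a by-value int."""
    local = ctypes.c_short(foo_int).value  # wraps silently to 16 bits
    result, _discarded = fortran_myfunc2(local)  # updated foo is dropped
    return result


caller_foo = 40000  # too large for a 16-bit signed integer
result = wrapper_for_misspecified(caller_foo)
print(result, caller_foo)
```

The caller's value is untouched and the Fortran side sees a wrapped number, but nothing writes outside its own storage.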

Passing arrays

Fortran has no raw C-style pointers (its POINTER attribute is a different construct), but it has a native array type which can be passed in a variety of ways (three or four I think; explicit-shape, assumed-size and assumed-shape are among them). The ISO C bindings fully support passing contiguous arrays in various ways. This involves a call in Fortran to c_f_pointer, which converts a special (opaque) type, type(c_ptr), to a Fortran array pointer; one must provide the shape information.

Depending on the call, the shape information may be present in the signature of the function (F77 style arrays) or not (F90 assumed shape arrays). In the latter case, we must pass a struct containing both the pointer and the shape information. (Note that the struct to be passed is entirely up to us, as we generate both sides of the bridge -- generated C code passes the struct to generated Fortran code which uses it to cast to a proper Fortran array object.)
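As a sketch of what such a struct could carry, the following Python uses ctypes to lay out a descriptor and fill it from a buffer. The name ArrayDescriptor, its field names and the fixed maximum rank are assumptions made for illustration; the only real constraint is that the generated C and Fortran sides agree on the layout.

```python
import ctypes
from array import array

MAX_NDIM = 7  # Fortran 2003 allows arrays of rank up to 7


class ArrayDescriptor(ctypes.Structure):
    """Hypothetical struct passed from generated C to generated Fortran."""

    _fields_ = [
        ("data", ctypes.c_void_p),                   # start of the buffer
        ("ndim", ctypes.c_int),                      # number of dimensions
        ("shape", ctypes.c_ssize_t * MAX_NDIM),      # extent per dimension
        ("strides", ctypes.c_ssize_t * MAX_NDIM),    # byte stride per dimension
    ]


def describe(view, address):
    """Fill a descriptor from a memoryview and the buffer's base address."""
    desc = ArrayDescriptor()
    desc.data = address
    desc.ndim = view.ndim
    for i in range(view.ndim):
        desc.shape[i] = view.shape[i]
        desc.strides[i] = view.strides[i]
    return desc


storage = array("d", range(6))
address, _length = storage.buffer_info()
# Reinterpret the six doubles as a C-contiguous 2 x 3 array.
view = memoryview(storage).cast("B").cast("d", shape=[2, 3])
desc = describe(view, address)
print(desc.ndim, list(desc.shape[:2]), list(desc.strides[:2]))
```

On the Fortran side, the data pointer and the shape would be what gets handed to c_f_pointer.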

To pass non-contiguous arrays, the simplest thing is probably to pass the original, non-strided buffer together with stride information, and then do the appropriate slicing Fortran-side. If dimensions need reordering, Fortran's reshape function can be used (it copies data, but that is still easier than copying Cython-side). Result: if the Fortran function to be called operates on strided arrays and the ordering is correct, the correctly sliced array is passed; if not, the Fortran compiler will transparently copy in and copy out. It is good to push as much copying as possible to the Fortran compiler, as it can better determine how it should be done.
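The slicing step amounts to translating a 0-based offset and element stride into a 1-based, inclusive Fortran array section lower:upper:stride. A one-dimensional sketch, restricted to positive strides, with both helper names invented for this example:

```python
def fortran_section(offset, step, count):
    """Return (lower, upper, stride) for a 1-based Fortran array section.

    offset and step are in elements (0-based offset, positive step);
    count is the number of elements in the strided view (at least 1).
    """
    lower = offset + 1
    upper = lower + (count - 1) * step
    return lower, upper, step


def take_section(base, lower, upper, stride):
    """Emulate base(lower:upper:stride) on a Python list."""
    return base[lower - 1:upper:stride]


base = list(range(10))   # the original, non-strided buffer
view = base[1::3]        # the strided view Cython-side
section = fortran_section(offset=1, step=3, count=len(view))
print(section, take_section(base, *section))
```

For several dimensions the same translation would be applied per axis, keeping in mind that Fortran stores arrays in column-major order.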

Cython-side, buffer objects would be the objects which can be passed to such functions. Another possible Cython language feature could facilitate passing C arrays through a cast to a buffer.

Python buffer objects also support "indirect buffers". These could be unsupported, at least at first. But they could always be supported by copying the array in and out.


cdef extern:
    fortran void foo(int nrow=arr.shape[0], int ncol=arr.shape[1], double[[,]] arr) # F77 style
    fortran void bar(double[[,]] arr) # F90 style; pointer, dimension information and strides passed as struct
    fortran void bar(const double[[,]] arr) # "in" argument (not inout or out)

Fortran types

In the above, C types have been used. This will work in the sense that only truncation or overflow can happen; the stack will not be destroyed. But it is always good to use the exact same type in Fortran and C.

To this end a new virtual Cython module cython.fortran should be created, containing Fortran equivalents. Examples:

cimport cython
cdef cython.fortran.real(kind=8) x # REAL(8)
cdef cython.fortran.real(selected_kind=(12,200)) y # REAL(SELECTED_REAL_KIND(12,200))

These should ultimately boil down to the equivalent types in C. (The C compiler will know the exact type, while Cython will not know the size of the type.) This would likely be implemented by including a fortran.h, controlled by compiler directives, containing information about a supported set of compilers. If necessary, a program could be written to generate such a header file by probing the Fortran and C compilers.
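A rough idea of what the header generator might do for real types, written as a Python sketch. It assumes the compiler's reals are IEEE 754 single and double precision, for which Fortran reports a decimal precision and exponent range of (6, 37) and (15, 307). The function names and the typedef alias are hypothetical, and a real tool would probe the compilers rather than trust a table. Kind numbers such as 8 are compiler-specific, so they are left out here.

```python
# (decimal precision, decimal exponent range, C type), assuming IEEE 754.
C_REAL_TYPES = [
    (6, 37, "float"),
    (15, 307, "double"),
]


def c_type_for_selected_real_kind(precision, exponent_range):
    """Pick the smallest C real type meeting the requested precision and range."""
    for digits, max_range, c_type in C_REAL_TYPES:
        if precision <= digits and exponent_range <= max_range:
            return c_type
    raise ValueError("no C real type in the table satisfies the request")


def typedef_line(alias, precision, exponent_range):
    """Render one line of a hypothetical generated fortran.h."""
    c_type = c_type_for_selected_real_kind(precision, exponent_range)
    return f"typedef {c_type} {alias};"


line = typedef_line("__pyx_fortran_real_12_200", 12, 200)
print(line)
```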

Build system

One also needs to modify distutils enough to reliably build these modules. This likely involves using NumPy's additions to distutils for compiling Fortran code, and integrating that with Cython's distutils.

Optional: Parsing Fortran files

It would be nice to (optionally) be able to parse Fortran directly for function declarations, rather than having to use a pxd file. I have had a quick look at about four different Fortran parsers; all appear to have their weaknesses, but some of them could probably be tweaked enough to be useful.


myfortranmodule = cython.fortran.include("mymodule.f90") # corresponds to "cimport myfortranmodule"
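To give a feel for the minimum such parsing would involve, here is a deliberately naive Python sketch that pulls function names and argument names out of free-form Fortran source with a regular expression. It is a toy under stated assumptions (one declaration per line, no continuation lines, no subroutines, no type information), not a substitute for a real Fortran parser, and find_functions is a name made up for this example.

```python
import re

# A line that is not a comment and contains "function name(args)".
FUNCTION_RE = re.compile(
    r"^[^!\n]*?\bfunction[ \t]+(\w+)[ \t]*\(([^)]*)\)",
    re.IGNORECASE | re.MULTILINE,
)


def find_functions(source):
    """Return a list of (name, [argument names]) found in Fortran source."""
    found = []
    for match in FUNCTION_RE.finditer(source):
        name, raw_args = match.groups()
        args = [a.strip() for a in raw_args.split(",") if a.strip()]
        found.append((name, args))
    return found


example = """\
module mymodule
contains
    ! function commented_out(x)
    real(c_double) function myfunc(foo)
        real(c_double), intent(in) :: foo
        myfunc = 2.0d0 * foo
    end function myfunc

    function other(a, b) result(c)
        integer :: a, b, c
        c = a + b
    end function other
end module mymodule
"""

functions = find_functions(example)
print(functions)
```

Recovering argument types and intents, which the wrapper generation needs, is where a proper parser becomes necessary.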