# Mixing C and Python with CFFI

This notebook takes a look at the Python module [CFFI](https://cffi.readthedocs.io/en/latest/). Many tutorials already exist on various websites, and this notebook can be seen as a compilation of examples[[1](#tuto)], following the same order as the previous notebook (1- Mixing C and Python with ctypes). The code below uses the ABI-level, in-line mode of CFFI, that is, `ffi.cdef()` followed by `ffi.dlopen()`[[1](#tuto)]. This tutorial is not complete, though. Any suggestion for improvement is welcome.

The notebook is organized as follows:
 
 1. [Compilation of shared libraries and preprocessing](#Compilation)
 2. [Get stdout in the notebook](#getSTDOUT)
 3. [CFFI](#cffi)
 4. [Simple use](#simpleUse)
 5. [Immutable and mutable strings](#strings)
 6. [Pointers and malloc](#pointers)
 7. [Structures](#structures)
 8. [References](#references)
 
 
This notebook was tested on an Ubuntu distribution, with Python 3.6.


## Some imports and path definitions

In [None]:
import os
import numpy as np
import sys
import subprocess

from cffi import FFI

In [None]:
PATH_LIB = './libs/'
PATH_C = './src/'
PATH_FAKE_LIB = "utils/fake_libc_include" #from pycparser

## <a id="Compilation"></a> Compilation of C library

To be used from other languages, such as Python, C code must be compiled into a shared library[[2](#sharedLibrary)]. Whereas *ctypes* only needs the shared library (file.so), **CFFI** also needs to parse the header files. However, the header must first be preprocessed by a compiler, so that all the **#include** lines are expanded and disappear. Open both files to see the difference. To do this, we use fake headers[[3](#fakeHeaders)] and the **-E** flag, which stops the compilation after the preprocessing stage.

In [None]:
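os.makedirs(PATH_LIB, exist_ok=True)  # make sure the output directory exists

c_file = os.path.join(PATH_C, "C_to_python.c")
h_file = os.path.join(PATH_C, "C_to_python.h")
o_file = os.path.join(PATH_LIB, "C_to_python.o")
so_file = os.path.join(PATH_LIB, "C2py.so")
pre_h_file = os.path.join(PATH_LIB, "preprocessed_C_to_python.h")

# check=True raises CalledProcessError if gcc fails, instead of continuing silently
subprocess.run(["gcc", "-std=c99", "-Wall", "-fPIC", "-c", c_file, "-o", o_file], check=True)
subprocess.run(["gcc", "-shared", "-o", so_file, o_file], check=True)

# preprocess to transform the header (-E stops after the preprocessing stage)
subprocess.run(["gcc", "-E", "-I" + PATH_FAKE_LIB, h_file, "-o", pre_h_file], check=True)

print("Library compiled and header transformed!")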
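
To compare the two header files without opening them by hand, a small check can list the `#include` directives that remain in a piece of header text. This is only a sketch: the helper name `remaining_includes` and the two sample strings are made up for the illustration and are not part of the C project. It only catches the common spelling, with no space between `#` and `include`.

```python
def remaining_includes(header_text):
    """Return the lines of header_text that are still #include directives."""
    return [line.strip() for line in header_text.splitlines()
            if line.lstrip().startswith("#include")]


# Made-up samples standing in for the raw and the preprocessed header.
raw_header = '#include <stdio.h>\nint add(int a, int b);\n'
preprocessed_header = 'int add(int a, int b);\n'

print(remaining_includes(raw_header))
print(remaining_includes(preprocessed_header))
```

On the real files, the same helper could be applied to the text read from `C_to_python.h` and from `preprocessed_C_to_python.h`.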

## <a id="getSTDOUT"></a> Function to get stdout in the notebook

Several C functions (**printf** in particular) write to the standard output of the terminal (*stdout*), which the notebook does not display. A function[[4](#captureSTDOUT)] is used to capture this output and show it in the notebook. This part is unnecessary in a .py script and should be removed there.

In [None]:
import tempfile
from contextlib import contextmanager
import io
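import ctypes
import ctypes.util  # find_library lives in this submodule, which "import ctypes" alone does not load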

libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)

class FILE(ctypes.Structure):
    pass

FILE_p = ctypes.POINTER(FILE)

# These variables, defined inside the C library, are readonly.
cstdin = FILE_p.in_dll(libc, 'stdin')
cstdout = FILE_p.in_dll(libc, 'stdout')
cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int

@contextmanager
def capture_c_stdout(encoding='utf8'):
    # Flushing, it's a good practice.
    sys.stdout.flush()
    cfflush(cstdout)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.TemporaryFile(buffering=0) as temp:
        # Saving a copy of the original stdout.
        prev_sys_stdout = sys.stdout
        prev_stdout_fd = os.dup(1)
        os.close(1)

        # Duplicating the temporary file fd into the stdout fd.
        # In other words, replacing the stdout.
        os.dup2(temp.fileno(), 1)

        # Replacing sys.stdout for Python code.
        #
        # IPython Notebook version of sys.stdout is actually an
        # in-memory OutStream, so it does not have a file descriptor.
        # We need to replace sys.stdout so that interleaved Python
        # and C output gets captured in the correct order.
        #
        # We enable line_buffering to force a flush after each line.
        # And write_through to force all data to be passed through the
        # wrapper directly into the binary temporary file.
        temp_wrapper = io.TextIOWrapper(
            temp, encoding=encoding, line_buffering=True, write_through=True)
        sys.stdout = temp_wrapper

        # Disabling buffering of C stdout.
        csetbuf(cstdout, None)

        try:
            yield
        finally:
            # Must flush to clear the C library buffer.
            cfflush(cstdout)

            # Restoring stdout, even if the wrapped code raised an exception.
            os.dup2(prev_stdout_fd, 1)
            os.close(prev_stdout_fd)
            sys.stdout = prev_sys_stdout

            # Printing the captured output.
            temp_wrapper.seek(0)
            print(temp_wrapper.read(), end='')
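
The core of this function is the swap of file descriptor 1. The stripped-down sketch below shows only that mechanism, with the standard library alone. The helper name `capture_fd1` is made up for this example, and `os.write(1, ...)` stands in for a C function that prints, since it also bypasses `sys.stdout`.

```python
import os
import sys
import tempfile


def capture_fd1(func):
    """Run func() while file descriptor 1 points to a temporary file."""
    sys.stdout.flush()
    saved_fd = os.dup(1)                  # keep a copy of the real stdout
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 1)          # fd 1 now writes into the file
        try:
            func()
        finally:
            os.dup2(saved_fd, 1)          # put the real stdout back
            os.close(saved_fd)
        tmp.seek(0)
        return tmp.read().decode("utf8")


# os.write(1, ...) writes straight to the descriptor, like printf would.
captured = capture_fd1(lambda: os.write(1, b"hello from fd 1\n"))
print(repr(captured))
```

Unlike `capture_c_stdout`, this sketch does not touch the C library buffer, so it would still need the `fflush` and `setbuf` calls to work with real **printf** output.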

## <a id="cffi"></a> CFFI is magic!

### Loading Library

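We load the library into *lib_ffi*: `ffi.cdef()` declares the functions and types found in the preprocessed header, and `ffi.dlopen()` opens the shared library.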

In [None]:
ffi = FFI()
with open(os.path.join(PATH_LIB,"preprocessed_C_to_python.h"), 'r') as f:
    preprocessed_text = f.read()
    
ffi.cdef(preprocessed_text)
lib_ffi = ffi.dlopen(os.path.join(PATH_LIB,"C2py.so"))

print("The library is loaded!")

### <a id="simpleUse"></a> Simple use

Loading the library creates an object whose attributes are the C functions. It is then easy to see the effect of calling **welcome()**.

In [None]:
with capture_c_stdout():
    lib_ffi.welcome()

The FFI module reads the header to determine the types of both inputs and output. If a function is called with the wrong number of arguments, or with an argument of the wrong type, CFFI raises a Python `TypeError`.

In [None]:
with capture_c_stdout():
    a = lib_ffi.add(4,2)
    
print(a)

In [None]:
tab = [4,6.56, 10]
with capture_c_stdout():
    lib_ffi.sum(len(tab), tab)
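
The argument checking can be tried without the notebook's own library. The sketch below assumes a Unix-like system, where `ffi.dlopen(None)` gives access to the standard C library, and uses the C function `abs` as a stand-in. The names `ffi_demo` and `libc_demo` are chosen for this example only.

```python
from cffi import FFI

ffi_demo = FFI()
ffi_demo.cdef("int abs(int x);")   # declaration from the C standard library
libc_demo = ffi_demo.dlopen(None)  # assumption: Unix-like system (not Windows)

print(libc_demo.abs(-5))

# Wrong number of arguments, then wrong type of argument.
errors = []
for bad_args in [(1, 2), ("five",)]:
    try:
        libc_demo.abs(*bad_args)
    except TypeError as exc:
        errors.append(str(exc))
        print("TypeError:", exc)
```

Both bad calls are rejected on the Python side, before any C code runs.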

### <a id="strings"></a> Immutable and mutable strings

Using strings is a little trickier than using numbers. In Python, strings are immutable: they cannot be modified in place, so changing a string means creating a new object, which is not the case in C. Thus, we need to convert them before they can be used in C code. To do so, we allocate a C variable with *ffi.new()*. Two declarations are possible:
- **char \*** allocates a single char (1 byte)
- **char []** allocates an array of chars. It must be initialized with an encoded string (a *bytes* object)

Both declarations can be tested. In this example, one must choose **char []**.

In [None]:
# immutable Python string, copied into a mutable C char array
unmutable_str = "This is a test!"
mutable_str = ffi.new("char []", unmutable_str.encode('ascii'))

with capture_c_stdout():
    lib_ffi.print_text(mutable_str)
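
The difference between the two declarations can be checked from Python alone, without calling any C function. This is a minimal sketch: the names `ffi_demo`, `buf` and `single` are chosen for the example.

```python
from cffi import FFI

ffi_demo = FFI()

# "char[]" copies the bytes and appends the terminating null character.
buf = ffi_demo.new("char[]", "This is a test!".encode("ascii"))
print(len(buf))

# Unlike the Python string, the C array can be modified in place.
buf[0] = b"t"
text = ffi_demo.string(buf).decode("ascii")
print(text)

# "char *" holds exactly one char.
single = ffi_demo.new("char *", b"A")
print(single[0])
```

`ffi.string()` reads the array up to the first null character and returns a *bytes* object, hence the `decode` call.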

### <a id="pointers"></a> Pointers and malloc

#### Use a pointer input parameter as an output

Pointers are declared in the same way as strings. Again, there are two declarations (**type** can be replaced by int, double...):
- **type \*** allocates $m$ bytes, where $m$ is the size in bytes of a single element of that type
- **type [$n$]** allocates $m * n$ bytes, corresponding to $n$ elements of that type

It is also possible to declare an array with the two arguments (**type []**, $n$), as mentioned in [[5](#cffiDOC)].

Thus, whenever we create an array, we must specify its number of elements. A small wrapper can take care of this.


In [None]:
def wrapper_ffi(ffi, type_str, size):
    return ffi.new("{}[]".format(type_str), size)
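
The sizes given above can be verified with `ffi.sizeof()`. The sketch below is independent of the C library and its variable names are chosen for the example.

```python
from cffi import FFI

ffi_demo = FFI()

single = ffi_demo.new("double *")               # one double, zero-initialized
array = ffi_demo.new("double[]", 3)             # three doubles, zero-initialized
filled = ffi_demo.new("double[]", [4, 3, 2.5])  # length taken from the list

print(single[0])
print(len(array), ffi_demo.sizeof(array), ffi_demo.sizeof("double"))
print(list(filled))
```

Memory obtained with `ffi.new()` is owned by the returned Python object and is released when that object is garbage collected, so it must not be passed to a C `free`.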

Then, we can simply compute the element-wise square of an input array.

In [None]:
ina = [4,3,2.5]
outa = wrapper_ffi(ffi, "double", len(ina)) # do not have to free (allocation in Python)

with capture_c_stdout():
    lib_ffi.square_array(len(ina), ina, outa)

for i in range(len(ina)):
    print("{}² = {}".format(ina[i], outa[i]))
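
Instead of indexing the output array element by element, `ffi.unpack()` converts it to a Python list in one call. In the sketch below, a Python loop fills the array and stands in for the C function, which is an assumption made to keep the example self-contained.

```python
from cffi import FFI

ffi_demo = FFI()

inputs = [4, 3, 2.5]
out = ffi_demo.new("double[]", len(inputs))

# Stand-in for the C function: fill the output array from Python.
for i, x in enumerate(inputs):
    out[i] = x * x

# Convert the first len(inputs) elements of the C array to a Python list.
squares = ffi_demo.unpack(out, len(inputs))
print(squares)
```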

#### Malloc and free 

In this section, **init_matrix** and **free_matrix** could be called directly from the library with integers and lists. It is nicer, though, to write a small wrapper that makes the data easy to use from Python.

The first example deals with the creation of a matrix in C.

In [None]:
def init_mat(size):
    global lib_ffi
    with capture_c_stdout():
        m = lib_ffi.init_matrix(len(size), size) # m is a C allocated pointer
    
    # extraction of data
    value = [[m[i][j] for j in range(size[i])] for i in range(len(size))]
    
    return value, m

In [None]:
size = [2, 5, 2]
val, address = init_mat(size)
print("Matrix {} \nstored by {}".format(val, address))

print("\nPrinting first elements...") # May crash or print garbage, since we exceed the allocated memory (row 0 has only 2 elements)
for i in range(3):
    print(address[0][i])

In [None]:
with capture_c_stdout():
    lib_ffi.free_matrix(len(size), size, address)
    
print("Matrix {} \nstored by {}".format(val, address))

print("\nPrinting first elements...") # May crash or print garbage, since the pointer now refers to freed memory
for i in range(3):
    print(address[0][i])

The second example creates an array through a pointer passed as an input parameter. To do so, we create an **int\*\*** element.

In [None]:
def init_tab(length):
    global ffi
    global lib_ffi
    tab_handle = ffi.new("int **") # is freed when the function ends
    with capture_c_stdout():
        lib_ffi.init_tab(length, tab_handle)
    
    # Extract information of pointer tab
    value = [tab_handle[0][i] for i in range(length)]
    
    return value, tab_handle[0]


In [None]:
val, address = init_tab(3)
print("Tab {} \nstored by {}".format(val, address))

print("\nPrinting first elements...") # May crash or print garbage, since we exceed the allocated memory (only 3 elements)
for i in range(4):
    print(address[i])

In [None]:
# Free tab
with capture_c_stdout():
    lib_ffi.free_tab(address)

print("Printing first elements...") # May crash or print garbage, since the pointer now refers to freed memory
for i in range(4):
    print(address[i])
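
The malloc and free cycle can also be reproduced with the standard C library alone. The sketch below assumes a Unix-like system, where `ffi.dlopen(None)` exposes `malloc` and `free`. It resets the Python name to `ffi.NULL` after the free, so that no dangling pointer is left around. All names are chosen for the example.

```python
from cffi import FFI

ffi_demo = FFI()
ffi_demo.cdef("""
    void *malloc(size_t size);
    void free(void *ptr);
""")
libc_demo = ffi_demo.dlopen(None)  # assumption: Unix-like system (not Windows)

# Memory allocated in C: Python will never release it by itself.
raw = libc_demo.malloc(3 * ffi_demo.sizeof("int"))
tab = ffi_demo.cast("int *", raw)
for i in range(3):
    tab[i] = i * i

# Copy the values to a Python list while the memory is still valid.
values = [tab[i] for i in range(3)]

libc_demo.free(raw)
tab = ffi_demo.NULL  # drop the dangling pointer
print(values)
```

Copying the data out before the free is the same strategy as in the wrappers `init_mat` and `init_tab`.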

### <a id="structures" ></a> How to use a structure?

To define a structure, two methods are possible:
- a C function returns a pointer to that structure (which is our case). Then, there is no need to use *ffi.new()*
- Python has to create the structure. In that case, we need to define the length of all the elements in that structure [[6](#cffiDOC_2)].

In this notebook, we'll deal with the use of a C allocation.

In [None]:
with capture_c_stdout():
    test = lib_ffi.init_data(2)
if test.exists_nb_elements:
    print("Nb elements: {}".format(test.nb_elements))

if test.exists_matrix:
    print("There is a matrix!")

In [None]:
test.matrix

In [None]:
with capture_c_stdout():
    lib_ffi.free_data(test)
    
# test was passed to free_data: this read may show garbage if the structure itself was freed
print("Nb elements: {}".format(test.nb_elements))
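
The second method, where Python creates the structure, is not used in this notebook. A minimal sketch could look as follows. The type `demo_data_t` and its fields are hypothetical and do not come from the header of the C library.

```python
from cffi import FFI

ffi_demo = FFI()

# Hypothetical structure, declared only for this example.
ffi_demo.cdef("""
    typedef struct {
        int nb_elements;
        double *values;
    } demo_data_t;
""")

# The length of the inner array is fixed at allocation time.
values = ffi_demo.new("double[]", [1.0, 2.0, 3.0])

data = ffi_demo.new("demo_data_t *")
data.nb_elements = len(values)
data.values = values  # 'values' must stay alive as long as 'data' uses it

print(data.nb_elements, data.values[1])
```

Here both allocations are owned by Python, so no C function has to free them.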


## <a id="references"></a> References

[<a id="tuto">1</a>] A tremendous tutorial: https://dbader.org/blog/python-cffi

[<a id="sharedLibrary">2</a>] Shared libraries: https://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html

[<a id="fakeHeaders">3</a>] Fake headers: https://eli.thegreenplace.net/2015/on-parsing-c-type-declarations-and-fake-headers/

[<a id="captureSTDOUT">4</a>] How to print stdout in a notebook? https://stackoverflow.com/questions/35745541/how-to-get-printed-output-from-ctypes-c-functions-into-jupyter-ipython-notebook

[<a id="cffiDOC">5</a>] CFFI documentation, working with pointers, structures and arrays: https://cffi.readthedocs.io/en/latest/using.html#working-with-pointers-structures-and-arrays

[<a id="cffiDOC_2">6</a>] CFFI documentation, using the ffi/lib objects: https://cffi.readthedocs.io/en/latest/using.html