# Efficient Python analysis with dynamic C++ and just-in-time compilation

cppyy combines the convenience of Python with the efficiency of C++ implementations. Its dynamic C++ bindings are powered by the C++ interpreter [cling](https://github.com/root-project/cling), which makes it convenient to call efficient C++ code from Python.

In [1]:
import cppyy
import numpy as np

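## Just-in-time compilation of C++ functions

`cppyy.cppdef` hands C++ source code to cling, which compiles it at runtime; the resulting function is then available as `cppyy.gbl.smallest_diff`.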

In [2]:
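cppyy.cppdef('''
#include <cmath>    // std::abs
#include <cstddef>  // std::size_t
#include <limits>   // std::numeric_limits

float smallest_diff(float* v1, float* v2, std::size_t size) {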
    float min_diff = std::numeric_limits<float>::max();
    for (std::size_t i1 = 0; i1 < size; i1++) {
        for (std::size_t i2 = 0; i2 < size; i2++) {
            float diff = std::abs(v1[i1] - v2[i2]);
            if (diff < min_diff) {
                min_diff = diff;
            }
        }
    }
    return min_diff;
}
''');

As example inputs, we generate two numpy arrays with random numbers.

In [3]:
size = 100
v1 = np.random.randn(size).astype(np.float32)
v2 = np.random.randn(size).astype(np.float32)

And next we benchmark the runtime:

In [4]:
%%timeit
cppyy.gbl.smallest_diff(v1, v2, size)

3.56 μs ± 6.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)


How does the C++ kernel compare to a pure Python implementation?

In [5]:
def smallest_diff(x1, x2):
    min_diff = float('inf')
    for e1 in x1:
        for e2 in x2:
            diff = abs(e1 - e2)
            if diff < min_diff:
                min_diff = diff
    return min_diff

**The Python implementation is slower by a factor of roughly 280 (989 μs versus 3.56 μs per call)!**

In [6]:
%%timeit
smallest_diff(v1, v2)

989 μs ± 21.7 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
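
Outside IPython, the `%%timeit` magic is not available. A minimal sketch of a comparable measurement with the standard-library `timeit` module is shown here. It assumes plain Python lists of floats as stand-in inputs instead of the float32 NumPy arrays, and the names `x1`, `x2` and `loops` are illustrative, so the absolute numbers will differ from the ones above.

```python
import random
import statistics
import timeit


def smallest_diff(x1, x2):
    """Smallest absolute difference between any element of x1 and any element of x2."""
    min_diff = float('inf')
    for e1 in x1:
        for e2 in x2:
            diff = abs(e1 - e2)
            if diff < min_diff:
                min_diff = diff
    return min_diff


# Stand-in inputs (assumption): plain Python floats rather than float32 arrays.
random.seed(0)
x1 = [random.gauss(0.0, 1.0) for _ in range(100)]
x2 = [random.gauss(0.0, 1.0) for _ in range(100)]

# 7 runs of `loops` calls each; timeit.repeat returns the total seconds per run.
loops = 100
totals = timeit.repeat(lambda: smallest_diff(x1, x2), number=loops, repeat=7)
per_call = [t / loops for t in totals]

print(f"{statistics.mean(per_call) * 1e6:.1f} μs ± "
      f"{statistics.stdev(per_call) * 1e6:.1f} μs per loop")
```

The same pattern should work for the cppyy functions by swapping the lambda body for the corresponding `cppyy.gbl` call.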


## Loading of precompiled functions

Further performance improvements can be expected from precompiling the functionality and loading the resulting library into cppyy.

In [7]:
%%writefile diff_small.hxx

#pragma once

#include <cstddef>
#include <cmath>
float optimized_smallest_diff(float* v1, float* v2, std::size_t size);

Writing diff_small.hxx


In [8]:
%%writefile diff_small.cxx

#include "diff_small.hxx"

#include <limits>  // std::numeric_limits

float optimized_smallest_diff(float* v1, float* v2, std::size_t size) {
    float min_diff = std::numeric_limits<float>::max();
    for (std::size_t i1 = 0; i1 < size; i1++) {
        for (std::size_t i2 = 0; i2 < size; i2++) {
            float diff = std::abs(v1[i1] - v2[i2]);
            if (diff < min_diff) {
                min_diff = diff;
            }
        }
    }
    return min_diff;
}


Writing diff_small.cxx


In [9]:
!g++ -Ofast -shared -fPIC -o libanalysis.so diff_small.cxx
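
In a plain Python script the `!g++` shell escape is not available. The following sketch, which is not part of the original notebook, runs a comparable compiler command through `subprocess` and skips the build when the library is already newer than its source. The helper names `needs_rebuild` and `build_command` are hypothetical, and `-fPIC` is added on the assumption that the platform requires position-independent code for shared libraries.

```python
import shutil
import subprocess
from pathlib import Path


def needs_rebuild(source: Path, library: Path) -> bool:
    """Return True if the library is missing or older than its source."""
    if not library.exists():
        return True
    return library.stat().st_mtime < source.stat().st_mtime


def build_command(source: Path, library: Path) -> list:
    """Assemble the g++ invocation for a shared library."""
    return ["g++", "-Ofast", "-shared", "-fPIC", "-o", str(library), str(source)]


source = Path("diff_small.cxx")
library = Path("libanalysis.so")

# Only attempt the build when the source and a compiler are actually present.
if source.exists() and shutil.which("g++") and needs_rebuild(source, library):
    subprocess.run(build_command(source, library), check=True)
```

With `check=True`, a failed compilation raises `subprocess.CalledProcessError` instead of passing silently.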

You can now include the header interactively and load the compiled functionality from the shared library.

In [10]:
cppyy.cppdef('#include "diff_small.hxx"')
cppyy.load_library('libanalysis.so')

True

In [12]:
cppyy.gbl.__dict__

mappingproxy({'__module__': 'cppyy._cpython_cppyy',
              '__dict__': <attribute '__dict__' of '' objects>,
              '__weakref__': <attribute '__weakref__' of '' objects>,
              '__doc__': None,
              '__init__': <cppyy.CPPOverload at 0x7e140c4dd480>,
              'std': <namespace cppyy.gbl.std at 0x2bb3f60>,
              'int8_t': cppyy.gbl.int8_t,
              'uint8_t': cppyy.gbl.uint8_t,
              'CppyyLegacy': <namespace cppyy.gbl.CppyyLegacy at 0x3bc3060>,
              'smallest_diff': <cppyy.CPPOverload at 0x7e140c4a6f80>})

**The loaded library improves the runtime further!**

While this makes little difference for a small example like this one, larger code bases benefit: code passed to `cppdef` is compiled at runtime every time it is executed, whereas a precompiled library only needs to be loaded.

In [13]:
%%timeit

cppyy.gbl.optimized_smallest_diff(v1, v2, size)

1.45 μs ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [14]:
cppyy.gbl.__dict__

mappingproxy({'__module__': 'cppyy._cpython_cppyy',
              '__dict__': <attribute '__dict__' of '' objects>,
              '__weakref__': <attribute '__weakref__' of '' objects>,
              '__doc__': None,
              '__init__': <cppyy.CPPOverload at 0x7e140c4dd480>,
              'std': <namespace cppyy.gbl.std at 0x2bb3f60>,
              'int8_t': cppyy.gbl.int8_t,
              'uint8_t': cppyy.gbl.uint8_t,
              'CppyyLegacy': <namespace cppyy.gbl.CppyyLegacy at 0x3bc3060>,
              'smallest_diff': <cppyy.CPPOverload at 0x7e140c4a6f80>,
              'optimized_smallest_diff': <cppyy.CPPOverload at 0x7e140572ce00>})

Finally, we can show that all implementations come to the same result:

In [17]:
print('Cppyy:', cppyy.gbl.smallest_diff(v1, v2, size))
print('Native Python:', smallest_diff(v1, v2))
print('Cppyy (loaded):', cppyy.gbl.optimized_smallest_diff(v1, v2, size))

Cppyy: 0.00012922286987304688
Native Python: 0.00012922287
Cppyy (loaded): 0.00012922286987304688
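
The printed values differ in their number of digits although they represent the same number. A likely explanation, assuming the native Python loop returns a `np.float32` while cppyy converts the C++ `float` into a double-precision Python `float`, is that both hold the same single-precision value and only the printing differs. The following standard-library sketch checks this for the numbers above; `as_float32` is an illustrative helper, not a cppyy or NumPy function.

```python
import struct


def as_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


short_repr = 0.00012922287          # as printed for the native Python result
long_repr = 0.00012922286987304688  # as printed for the cppyy results

# Rounding the short form to single precision recovers the long form.
print(as_float32(short_repr) == long_repr)
# The long form is itself exactly representable in single precision.
print(as_float32(long_repr) == long_repr)
```

If both comparisons hold, the three implementations agree bit for bit at single precision.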
