Skip to content

Commit

Permalink
more Numba example code
Browse files Browse the repository at this point in the history
  • Loading branch information
wlav committed Jun 30, 2022
1 parent 27aa43f commit 8846a59
Showing 1 changed file with 92 additions and 6 deletions.
98 changes: 92 additions & 6 deletions doc/source/numba.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ and lowering by exposing type details through C++ reflection at runtime.

JIT-compiling traces of mixed Python/bound C++ code reduces (or even removes)
the overhead of boxing/unboxing native data into their Python proxies and vice
veras.
versa.
It can also remove temporaries, especially for template expressions.
Thus, there can be a significant speedup for mixed code, beyond the compilation
of Python code itself.
Expand All @@ -44,8 +44,8 @@ you can load the extensions explicitly::
After that, Numba is able to trace ``cppyy`` bound code.
Note that the ``numba_ext`` module will register only the proxy base classes,
to keep overheads to a minimum.
The pre-registerd base classes will, lazily and automatically, register
further type information for the concrete classess and overloads on actual
The pre-registered base classes will, lazily and automatically, register
further type information for the concrete classes and overloads on actual
use in Numba traces, not from other uses.


Expand Down Expand Up @@ -93,7 +93,93 @@ Instances of C++ classes can be passed into Numba traces.
They can be returned from functions called _within_ the trace, but can not yet
be returned _from_ the trace.
Their public data is accessible (read-only) if of builtin type and their public
methods can be called.
Overload selection among methods works.
methods can be called, for which overload selection works.
Example:

.. code-block:: python
>>> import cppyy
>>> import numba
>>> import numpy as np
>>>
>>> cppyy.cppdef("""\
... class MyData {
... public:
... MyData(int i, int j) : fField1(i), fField2(j) {}
...
... public:
... int get_field1() { return fField1; }
... int get_field2() { return fField2; }
...
... MyData copy() { return *this; }
...
... public:
... int fField1;
... int fField2;
... };""")
True
>>> @numba.jit(nopython=True)
>>> def tsdf(a, d):
... total = type(a[0])(0)
... for i in range(len(a)):
... total += a[i] + d.fField1 + d.fField2
... return total
...
>>> d = cppyy.gbl.MyData(5, 6)
>>> a = np.array(range(10), dtype=np.int32)
>>> print(tsdf(a, d))
155
>>> # example of method calls
>>> @numba.jit(nopython=True)
>>> def tsdm(a, d):
... total = type(a[0])(0)
... for i in range(len(a)):
... total += a[i] + d.get_field1() + d.get_field2()
... return total
...
>>> print(tsdm(a, d))
155
>>> # example of object return by-value
>>> @numba.jit(nopython=True)
>>> def tsdcm(a, d):
... total = type(a[0])(0)
... for i in range(len(a)):
... total += a[i] + d.copy().fField1 + d.get_field2()
... return total
...
>>> print(tsdcm(a, d))
155
>>>
.. _Numbal: https://numba.pydata.org/
`Performance`
-------------

The main overhead of JITing Numba traces is in Numba itself; optimization of
the IR and assembly by the backend plays a much smaller role.
The use of bound C++ does not change that, since its introspection by and
large relies on the same mechanisms as that of Python code.
For example, it takes the same amount of wall clock time to JIT a trace using
Numba's included math functions (from module ``math`` or ``numpy``) as one
using C++ bound ones whether from the standard library or templated versions
from e.g. Eigen.
Use of very complex template expressions may change this balance, but in
principle, wherever it makes sense in the first place to use Numba JITing, it
is also fine, performance-wise, to use ``cppyy`` bound C++ inside the trace.

A second important overhead is in unboxing Python proxies of C++ objects,
in particular when passed as an argument to a Numba-JITed function.
The main costs are in the lookup (types are matched at every invocation) and
to a lesser extent the subsequent copying of the instance data.
Thus, functions that take a C++ object as an argument will require more time
spent in the function body for JITing to be worth it than functions that do
not.

The current implementation invokes C++ callables through function pointers
and accesses data through offsets calculations from the object's base
address.
A future implementation will be able to inline C++ into the Numba trace if
code is available in headers files or was JITed.


.. _Numba: https://numba.pydata.org/

0 comments on commit 8846a59

Please sign in to comment.