External libraries in Python
====

NumPy is a Python module that interfaces to fast precompiled code.  Python is fast for development, but if you write Python like C/C++ you may be very disappointed in the performance, especially for large loops.

In [1]:
import numpy as np

In [2]:
n=100000

The *magic* **%timeit** (**%%timeit**) reports the execution time for one line (cell) of code. The options `-r 1 -n 1` used below request a single run of a single loop; see <br>
https://ipython.readthedocs.io/en/stable/interactive/magics.html for a discussion of the parameters used.
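
Outside IPython, the standard-library `timeit` module gives a similar measurement. The sketch below is only an illustration: `slow_sum` is a hypothetical workload that does not appear elsewhere in this notebook, and `number=1` is meant to mirror `-r 1 -n 1` approximately.

```python
import timeit

def slow_sum(n=100_000):
    """Hypothetical workload: sum of squares computed in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total

# Roughly comparable to `%timeit -r 1 -n 1 slow_sum()`: one run, one loop.
elapsed = timeit.timeit(slow_sum, number=1)
print(f"{elapsed * 1e3:.2f} ms for a single run")
```

A single run like this is noisy, so it is best read as an order-of-magnitude estimate.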

Below we create a numpy array and iterate over this array by index to perform some basic operations.

In [3]:
%%timeit -r 1 -n 1
v=np.random.random(n)  # create array w/ random numbers in [0,1)
for i in range(n):
    v[i]=np.sqrt(v[i])

140 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


Next, we perform the same calculation using only calls to NumPy.  The loops are offloaded to fast compiled code in the NumPy libraries.

In [4]:
%%timeit -r 1 -n 1
v=np.random.random(n) 
v=np.sqrt(v)

1.64 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
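
Before trusting the speedup, it is worth confirming that the two approaches give the same numbers. The following is a minimal sketch: the array size of 1000 and the seed are arbitrary choices made here for a quick, reproducible check.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility
v = rng.random(1000)                 # samples from the half-open interval [0, 1)

# Element-by-element loop, as in the first timing cell
looped = v.copy()
for i in range(len(looped)):
    looped[i] = np.sqrt(looped[i])

# Single vectorized call, as in the second timing cell
vectorized = np.sqrt(v)

print(np.allclose(looped, vectorized))
```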


Compare Python to a C++ function, using ROOT to compile the code and create our interface
----

In [6]:
import ROOT as R

Welcome to JupyROOT 6.22/00


Define a Python function to calculate a series with slow convergence:<br>
$\ln(2) = \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots$
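
To see how slow the convergence is, recall that for an alternating series with decreasing terms the error after $n$ terms is at most the first omitted term, here $\frac{1}{n+1}$. The sketch below checks that bound numerically; `partial_sum` is a helper name introduced only for this illustration.

```python
import math

def partial_sum(n):
    """Sum of the first n terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    total = 0.0
    for i in range(1, n + 1):
        total += (1.0 if i % 2 else -1.0) / i
    return total

for n in (10, 1000, 100_000):
    err = abs(partial_sum(n) - math.log(2))
    print(f"n={n:>7}  error={err:.2e}  bound={1.0 / (n + 1):.2e}")
```

Each additional digit of accuracy costs roughly ten times more terms, which is what makes this series a convenient long-running test.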

In [7]:
# Python version of a function with a long loop
def longcalc(ncalcs=1000*1000):
    val=0.0
    for i in range(1,ncalcs+1):
        val += 1.0/i * (2.0*(i%2)-1)
    return val

In [8]:
# C/C++ version
!cat longcalc.cpp

#include <iostream>

using namespace std;

// A simple function that takes a while to calculate!
// We'll calculate the slowly converging series 1 - 1/2 + 1/3 - 1/4 + ...
// The series converges to ln(2)
double longcalc(unsigned long ncalcs=1000*1000){
  double val=0.0;
  for (unsigned long i=1; i<=ncalcs; i++){
    val += 1.0/i * (2.0*(i%2)-1);
  }
  return val;
}


Below we use a feature of ROOT which compiles our C++ function and builds a shared library. ROOT then builds the Python<-->C++ interface and loads the shared library, allowing the C++ function to be called like a Python function in the ROOT namespace.  So easy!

In [9]:
R.gROOT.ProcessLine(".L longcalc.cpp+")

0

In [10]:
!ls longcalc*

longcalc.cpp			   longcalc_cpp.d
longcalc_cpp_ACLiC_dict_rdict.pcm  longcalc_cpp.so


Compare results of the Python and C++ functions:

In [11]:
print("time for running the python function once")
%timeit -r 1 -n 1 longcalc()
print("time for running the C++ function once")
%timeit -r 1 -n 1 R.longcalc()
print()
print("Python result",longcalc())
print("   C++ result",R.longcalc())

time for running the python function once
117 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
time for running the C++ function once
4.29 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

Python result 0.6931466805602525
   C++ result 0.6931466805602525


The default use of **%timeit** below (no parameters given) runs the code multiple times to collect statistics on the mean runtime and its standard deviation.  It will take a few seconds to run each test.

In [12]:
print("time for running the python function many times")
%timeit longcalc()
print("time for running the C++ function many times")
%timeit R.longcalc()

time for running the python function many times
118 ms ± 2.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
time for running the C++ function many times
1.36 ms ± 7.58 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
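
The repeated-run statistics can be approximated with the standard library as well. This is a sketch under stated assumptions, not a reproduction of how IPython works internally: it uses `timeit.Timer.autorange()` to choose a loop count and seven repeats to imitate the default. The series function is redefined here with a smaller default `ncalcs` so the demo finishes quickly.

```python
import statistics
import timeit

def longcalc(ncalcs=100_000):
    """Same alternating series as above, with a smaller default for a quick demo."""
    val = 0.0
    for i in range(1, ncalcs + 1):
        val += 1.0 / i * (2.0 * (i % 2) - 1)
    return val

timer = timeit.Timer(longcalc)
loops, _ = timer.autorange()                    # loop count giving at least 0.2 s per run
totals = timer.repeat(repeat=7, number=loops)   # seven runs of `loops` loops each
per_loop = [t / loops for t in totals]

mean_ms = statistics.mean(per_loop) * 1e3
std_ms = statistics.stdev(per_loop) * 1e3
print(f"{mean_ms:.3f} ms ± {std_ms:.3f} ms per loop (7 runs, {loops} loops each)")
```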


**Notice the difference in speed!**  <br> Be careful about coding long calculations in Python!
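
The same series can also be written with NumPy array operations, in the spirit of the first section. The function name `longcalc_np` is introduced here for illustration only. This version trades memory for speed by building the whole index array at once, and its result may differ from the loop in the last few digits because NumPy sums the terms in a different order.

```python
import math
import numpy as np

def longcalc_np(ncalcs=1000 * 1000):
    """Vectorized sketch of the alternating series 1 - 1/2 + 1/3 - ..."""
    i = np.arange(1, ncalcs + 1)
    signs = 2.0 * (i % 2) - 1.0      # +1 for odd i, -1 for even i
    return float(np.sum(signs / i))

result = longcalc_np()
print(result, math.log(2))
```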

Another speedup option
----

Here we will test the Numba JIT compiler.

In [13]:
from numba import jit

In [14]:
# Numba-compiled version of the Python function with a long loop
@jit(nopython=True)
def longcalc_nb(ncalcs=1000*1000):
    val=0.0
    for i in range(1,ncalcs+1):
        val += 1.0/i * (2.0*(i%2)-1)
    return val

In [15]:
print("time for running the python function once with numba jit")
%timeit -r 1 -n 1 longcalc_nb()

time for running the python function once with numba jit
118 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


That wasn't so great. Let's try again:

In [21]:
print("time for running the python function once with numba jit")
%timeit -r 1 -n 1 longcalc_nb()

time for running the python function once with numba jit
1.71 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)


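Recall that JIT = just in time! The first call includes the time needed to compile the function, so only later calls run at compiled speed.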
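
The warm-up effect can be made explicit by timing the first and second calls separately. The sketch below is a minimal illustration: the `timed` helper is hypothetical, and the `ImportError` fallback is an assumption added so the cell still runs (without any speedup) where Numba is not installed.

```python
import math
import time

try:
    from numba import njit          # njit is shorthand for jit(nopython=True)
except ImportError:                 # assumption: fall back to plain Python without Numba
    def njit(func):
        return func

@njit
def longcalc_nb(ncalcs):
    val = 0.0
    for i in range(1, ncalcs + 1):
        val += 1.0 / i * (2.0 * (i % 2) - 1)
    return val

def timed(ncalcs):
    """Return the result and the wall-clock time of one call."""
    start = time.perf_counter()
    result = longcalc_nb(ncalcs)
    return result, time.perf_counter() - start

_, first = timed(1000 * 1000)        # includes compilation when Numba is installed
result, second = timed(1000 * 1000)  # reuses the already compiled code
print(f"first call: {first * 1e3:.2f} ms, second call: {second * 1e3:.2f} ms")
```

When benchmarking JIT-compiled code, it is common to call the function once before timing so that compilation is excluded from the measurement.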