## Cython

Cython is an optimizing static compiler for both the Python language and the extended Cython language. It makes writing C extensions for Python nearly as easy as writing Python itself, which gives you the combined power of C/C++ and Python: the code you write can call back and forth between Python and C.

### Installation

Many Python distributions, including Anaconda, ship with Cython preinstalled. To check, or to install it yourself, run `pip install Cython` in the terminal of your instance. Cython also requires a C compiler. On our AWS instances, `sudo apt-get install build-essential` installs everything the C compiler needs. With the package and compiler installed, you're ready to use Cython.
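A quick way to confirm both prerequisites from Python is sketched below. It assumes the compiler is reachable on `PATH` under one of the names `cc`, `gcc`, or `clang`; the helper name `check_cython_setup` is made up for this illustration and is not part of Cython.

```python
import importlib.util
import shutil


def check_cython_setup(compilers=("cc", "gcc", "clang")):
    """Report whether Cython is importable and which C compiler, if any, is on PATH."""
    # find_spec looks the package up without importing it
    has_cython = importlib.util.find_spec("Cython") is not None
    # shutil.which returns the full path of an executable, or None if it is not found
    paths = (shutil.which(name) for name in compilers)
    compiler = next((path for path in paths if path), None)
    return {"cython": has_cython, "compiler": compiler}


print(check_cython_setup())
```

If `cython` is `False` or `compiler` is `None`, go back to the install commands above before continuing.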

### Example

Now that Cython is set up, this example shows how to use it. In Jupyter notebooks, we can mark a cell as Cython by starting it with the `%%cython` cell magic. To see how Cython interacts with both Python and C, use `%%cython --annotate`, which renders an annotated view of the code that highlights the lines relying on Python rather than plain C.

In [1]:
# %load /home/ubuntu/Notebooks/wsetup.py
## Packages
import itertools as it
import numpy as np
import pandas as pd
import matplotlib as mpl
import seaborn as sns
import IPython.display as ipd
import matplotlib.pyplot as plt

## Special Items
idx = pd.IndexSlice
digits = 3
pd.options.display.chop_threshold = 10**-(digits+1)
#pd.options.display.precision = 3
pd.options.display.float_format = lambda x: '{0:.{1}f}'.format(x,digits)
pd.options.display.show_dimensions = True
        
## Matplotlib Options
%matplotlib inline
#%matplotlib notebook
plt.style.use("classic")
plt.style.use("seaborn-darkgrid")
#plt.style.use("bmh")
plt.rc("figure", figsize=(5,3))

## Functions
def display(X):
    if isinstance(X,np.ndarray) or isinstance(X,pd.Series):
        ipd.display(pd.DataFrame(X))
    else:
        ipd.display(X)

In [2]:
#load Cython extension into jupyter
%load_ext Cython 

In [3]:
%%cython 

cdef int a=0
for i in range(10):
    a += i 
print(a)

45


In [4]:
%%cython --annotate

def fib(n):
    cdef int a, b
    a, b = 0, 1
    while b < n:
        print(b)
        a, b = b, a + b

In [5]:
fib(95)

1
1
2
3
5
8
13
21
34
55
89


Declaring types with `cdef` means the code no longer has to work out what kind of object each variable holds while the function runs. This is called static typing. In plain Python, the type of every object is looked up at run time, each time it is used. With Cython, you can declare a variable's type up front. This generally produces faster code, but it can make the code less readable, and sometimes it is unnecessary. For example, the Fibonacci sequence code above is not improved by using Cython or by declaring the types of its variables. The following example, however, shows how Cython can speed up computation.
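One consequence of static typing worth keeping in mind: a variable declared `cdef int` is a fixed-width C integer, so unlike a Python `int` it cannot grow arbitrarily large. The sketch below uses the standard-library `ctypes` module (not Cython itself) to mimic a C `int`; it is only an illustration of fixed-width storage, not of how Cython compiles code.

```python
import ctypes

# Width of a C int on this platform (32 bits on most systems)
bits = 8 * ctypes.sizeof(ctypes.c_int)
c_int_max = 2 ** (bits - 1) - 1

# A Python int simply keeps growing
py_value = c_int_max + 1
print(py_value)

# A C int cannot hold that value; ctypes does no overflow checking,
# so the stored value wraps around to the most negative C int
c_value = ctypes.c_int(c_int_max + 1).value
print(c_value)
```

In Cython, arithmetic on a `cdef int` can overflow in a similar way for large enough values, so a wider type such as `long long` may be a better choice when that is a risk.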


In [6]:
# Python code:
def f(x):
    return x**2-x

def integrate_f(a,b,N):
    s = 0 
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

%timeit integrate_f(1,10,100)

10000 loops, best of 3: 38.4 µs per loop


In [7]:
%%cython 

def cf(x):
    return x**2-x

def integrate_cf(a,b,N):
    s = 0 
    dx = (b-a)/N
    for i in range(N):
        s += cf(a+i*dx)
    return s * dx

In [8]:
%timeit integrate_cf(1,10,100)

10000 loops, best of 3: 25.7 µs per loop


In [9]:
%%cython

def cf2(x):
    return x**2-x

def integrate_cf2(a,b,N):
    cdef int i
    cdef double s, dx
    s = 0 
    dx = (b-a)/N
    for i in range(N):
        s += cf2(a+i*dx)
    return s * dx

In [10]:
%timeit integrate_cf2(1,10,100)

10000 loops, best of 3: 21.4 µs per loop


Moving from pure Python to the same code compiled with Cython cut the run time by approximately 33% (38.4 µs to 25.7 µs). Declaring the types of `i`, `s`, and `dx` cut it by another 17% (25.7 µs to 21.4 µs). Overall, the run time fell by 44% from the original Python function to the Cython version with typed variables. Declaring the types of the variables in a function lets Cython store them as plain C variables instead of Python objects, which accounts for the improved performance. Along with improving performance, Cython also lets you use external C libraries, C functions, and C-level types like those demonstrated above when those features are needed.
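The percentages can be reproduced from the `%timeit` outputs above. This is a small worked calculation; the helper name `percent_reduction` is introduced here only for illustration.

```python
def percent_reduction(before, after):
    """Percentage by which the run time dropped going from `before` to `after`."""
    return 100 * (before - after) / before


# Timings in microseconds, copied from the %timeit outputs above
t_python, t_cython, t_typed = 38.4, 25.7, 21.4

print(round(percent_reduction(t_python, t_cython)))  # 33
print(round(percent_reduction(t_cython, t_typed)))   # 17
print(round(percent_reduction(t_python, t_typed)))   # 44

# The same result expressed as a speedup factor
print(round(t_python / t_typed, 2))                  # 1.79
```

Note that the two step-wise reductions do not simply add up to the overall figure, because the second percentage is measured against the already reduced Cython time.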

### Profiling 
To optimize your code, you need to know where it spends its time. With `cProfile` and `pstats` we can measure how long each function call takes in both Python and Cython, and use that information to tune the code for the best performance.

In [11]:
import pstats, cProfile

def recip_square(i):
    return 1./i**2

def approx_pi(n=10000000):
    val = 0.
    for k in range(1,n+1):
        val += recip_square(k)
    return (6 * val)**.5

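# Profile the call once and save the raw statistics to a file.
# (The file name "approx_pi.prof" is an arbitrary choice.)
cProfile.runctx("approx_pi()", globals(), locals(), "approx_pi.prof")

# runctx returns None, so load the saved statistics from the file instead
s = pstats.Stats("approx_pi.prof")
s.strip_dirs().sort_stats("time").print_stats()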

         10000004 function calls in 8.545 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
 10000000    5.586    0.000    5.586    0.000 <ipython-input-11-95fef7c8bd01>:3(recip_square)
        1    2.959    2.959    8.545    8.545 <ipython-input-11-95fef7c8bd01>:6(approx_pi)
        1    0.000    0.000    8.545    8.545 <string>:1(<module>)
        1    0.000    0.000    8.545    8.545 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



In [12]:
%%cython 
import pstats, cProfile

def recip_square(int i):
    return 1./i**2

def pi_approx(n=10000000):
    cdef double val = 0.
    cdef int k
    for k in range(1,n+1):
        val += recip_square(k)
    return (6 * val)**.5

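# Profile the call once and save the raw statistics to a file.
# (The file name "pi_approx.prof" is an arbitrary choice.)
cProfile.runctx("pi_approx()", globals(), locals(), "pi_approx.prof")

# runctx returns None, so load the saved statistics from the file instead
s2 = pstats.Stats("pi_approx.prof")
s2.strip_dirs().sort_stats("time").print_stats()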

         4 function calls in 0.829 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.828    0.828 <string>:1(<module>)
        1    0.828    0.828    0.828    0.828 {_cython_magic_99d99f96a94948124a67b8531a6a40c3.pi_approx}
        1    0.000    0.000    0.829    0.829 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



### Parallel Random Forests

One goal we had for this presentation was to find a way to speed up random forests. Luckily, scikit-learn's `RandomForestClassifier` class has a parameter that sets how many cores a random forest may use for fitting and predicting. This parameter, `n_jobs`, defaults to a single job. If more cores are available, you can set it to as many as you would like; setting it to `-1` uses all the cores available on the device. Our AWS micro instances have only one core, which is plenty for the digits data example below. However, if you have larger data and need faster computing times, you can increase the number of cores on the instance.
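To see how many cores `n_jobs=-1` would have to work with, you can ask Python directly. This is a rough sketch using only the standard library; scikit-learn delegates the actual count to joblib, which may report a smaller number (for example, inside a container with a CPU limit).

```python
import os

# Logical cores visible to Python; os.cpu_count() may return None
# when the count cannot be determined, so fall back to 1
n_cores = os.cpu_count() or 1
print(f"n_jobs=-1 would use up to {n_cores} core(s) on this machine")
```

On a single-core instance this prints 1, in which case raising `n_jobs` gives no benefit.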

In [31]:
import sklearn
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
digits = datasets.load_digits()
Xtrain, Xtest, Ytrain, Ytest = train_test_split(digits.data, digits.target, random_state=15)

model = RandomForestClassifier(n_estimators=100, n_jobs= -1)
model.fit(Xtrain, Ytrain)
ypred = model.predict(Xtest)

In [32]:
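from sklearn import metrics
# classification_report expects (y_true, y_pred). With the arguments in this
# order, the precision and recall columns in the report are swapped and
# support counts the predicted labels.
print(metrics.classification_report(ypred,Ytest))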

             precision    recall  f1-score   support

          0       0.95      0.97      0.96        36
          1       1.00      0.98      0.99        52
          2       1.00      1.00      1.00        48
          3       0.97      0.97      0.97        34
          4       0.94      0.96      0.95        51
          5       1.00      1.00      1.00        46
          6       0.98      1.00      0.99        44
          7       1.00      0.94      0.97        50
          8       0.95      0.92      0.94        39
          9       0.94      0.98      0.96        50

avg / total       0.97      0.97      0.97       450



### Assignment

For each of the following four Codewars challenges, complete the challenge on Codewars, then write your function and time it in your homework folder. Once the function has been timed, run it in a cell with the `%%cython` magic and time it again to calculate the speedup between the two versions. Then, using `pstats` and `cProfile`, profile your solution and try adding static typing to improve its performance.

The challenges are:

1. Basic Encryption (6 kyu)
2. World Bits War (6 kyu)
3. Gap in Primes (5 kyu)
4. k-Primes (5 kyu)
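
Outside a notebook cell, the timing and profiling steps can be sketched with the standard library alone. The function `count_primes` below is a hypothetical stand-in for a challenge solution, not one of the actual Codewars answers; the repeat counts and the input size are arbitrary choices.

```python
import cProfile
import io
import pstats
import timeit


def count_primes(limit):
    """Hypothetical stand-in for a challenge solution: count primes below `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in 2..sqrt(n) divides it
        if all(n % d for d in range(2, int(n**0.5) + 1)):
            count += 1
    return count


# Time the function: best of 3 repeats, 10 calls each, reported per call
best = min(timeit.repeat(lambda: count_primes(2000), repeat=3, number=10)) / 10
print(f"best of 3: {best * 1e6:.1f} µs per call")

# Profile one call and print the five most expensive entries by internal time
profiler = cProfile.Profile()
profiler.enable()
count_primes(2000)
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).strip_dirs().sort_stats("time").print_stats(5)
print(report.getvalue())
```

Passing the `Profile` object straight to `pstats.Stats` avoids writing a statistics file; the entries with the largest `tottime` are the first candidates for static typing.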

Note: This is just an introduction to Cython. If you would like a better understanding and more examples, visit https://www.cython.org or check out this cool YouTube video from one of the co-creators of Cython: https://www.youtube.com/watch?v=a8LsdodGoWQ.