Speeding up Python with *Cython*
=====================

For today's notebook I am going to work through Cython to learn the ins and outs of the language, how it
differs from Python, and how to integrate it with existing Python code.  

I will likely code up the kNN model, or a similar model, using Cython in this notebook.  

Normally you would set up a `setup.py` that compiles the Cython (`.pyx`) files into C extension modules. In a
notebook, however, we can start by loading the Cython extension instead. 

In [1]:
%load_ext Cython
# Note, this does require that Cython be installed (pip install Cython)

To mark a cell for compilation, add the following magic at the top of the cell:

```python
%%cython
```

If you want to see the annotated output (which lines compiled down to plain C and which still interact with the
Python runtime), add the `-a` option, like so: 

```python
%%cython -a
```

In [3]:
%%cython -a

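# Named `add` rather than `sum` so the built-in `sum` is not shadowed in later cells
def add(int a, int b):
    # `result` is a C int, so the addition below compiles to plain C
    cdef int result
    
    result = a + b
    
    return result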

## Comparison

To start, we will compare an iterative (non-recursive) Fibonacci function written in Python against the same
function written in Cython.  This gives a rough baseline for the speedup we can get by working with one over the
other.  

In [4]:
def py_fib(n):
    prev = 1
    current = 1
    if n < 2:
        return current
    
    i = 2
    while i < n:
        prev, current = current, current + prev
        i += 1
        
    return current

In [5]:
%%cython

def c_fib(int n):
    cdef int prev, current, i
    
    prev = 1
    current = 1
    if n < 2:
        return current

    i = 2
    while i < n:
        prev, current = current, current + prev
        i += 1
        
    return current

In [6]:
from timeit import default_timer as timer

run_count = 1000000

start_py = timer()
py_fib(run_count)
end_py = timer()

print('Python Run: ', end_py - start_py)

start_c = timer()
c_fib(run_count)
end_c = timer()

print('Cython Run: ', end_c - start_c)

Python Run:  8.681652272438022
Cython Run:  0.0003150769805273512
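
A single timed call like the one above can be noisy, since it also picks up whatever else the machine is doing. As a rough sketch of a steadier measurement, the standard-library `timeit` module can repeat the call and keep the best time. The helper `best_time` and the smaller `n` are my own choices for illustration, not part of the original benchmark, and only the pure-Python function is timed here so the block runs without a compiler.

```python
import timeit


def py_fib(n):
    """Iterative Fibonacci, same logic as the pure-Python cell above."""
    prev = 1
    current = 1
    if n < 2:
        return current

    i = 2
    while i < n:
        prev, current = current, current + prev
        i += 1

    return current


def best_time(func, arg, repeat=5, number=10):
    """Best per-call time in seconds over `repeat` batches of `number` calls."""
    batches = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(batches) / number


# A smaller n than in the cell above keeps the repeated runs quick (arbitrary choice).
py_seconds = best_time(py_fib, 10_000)
print(f"Python best of 5: {py_seconds:.6f} s per call")
```

Passing `c_fib` to the same helper in the notebook would give a comparable number for the compiled version. Taking the minimum of several batches is a common convention because slower runs usually reflect interference rather than the code itself.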


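Comparing the two runs above (8.68 s versus 0.32 ms), the Cython implementation is roughly **$2.8 \times 10^4$ times faster**.
Apart from the `%%cython` cell magic, only two lines differ:

```python
fib(n) -> fib(int n)
       -> cdef int prev, current, i
```

The comparison is not quite like-for-like, though.  Python integers are arbitrary precision, so `py_fib(1000000)` spends
most of its time adding numbers that grow to roughly 209,000 decimal digits, while a C `int` is fixed-width (typically
32 bits) and silently overflows from about `n = 47` onward.  The value `c_fib` returns for this `n` is therefore not the
real Fibonacci number, and the speedup for code that stays within C integer range will be different.  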
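
Because `cdef int` gives `c_fib` fixed-width C integers while `py_fib` uses Python's arbitrary-precision ints, the two functions stop agreeing once the values get large. The sketch below emulates that in plain Python so the effect can be seen without compiling anything. The helpers `wrap_int32` and `fib_int32` are hypothetical names introduced here, and the sketch assumes a 32-bit `int` whose overflow wraps around in two's complement; C does not actually guarantee that behaviour for signed integers, so the compiled function may differ.

```python
def py_fib(n):
    """Iterative Fibonacci with Python's arbitrary-precision ints."""
    prev = 1
    current = 1
    if n < 2:
        return current

    i = 2
    while i < n:
        prev, current = current, current + prev
        i += 1

    return current


def wrap_int32(value):
    """Reduce a Python int to the signed 32-bit range (assumed two's complement wrap)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def fib_int32(n):
    """Same loop as py_fib, but every addition wraps like a 32-bit signed int."""
    prev = 1
    current = 1
    if n < 2:
        return current

    i = 2
    while i < n:
        prev, current = current, wrap_int32(current + prev)
        i += 1

    return current


# n = 46 still fits in 32 bits; n = 47 is the first value that does not.
for n in (46, 47):
    print(n, py_fib(n), fib_int32(n))
```

Under these assumptions the two functions agree at `n = 46` (1836311903) and diverge at `n = 47`, where the exact value is 2971215073 and the wrapped value is -1323752223.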

## Cython features

Cython lets you define your own types, classes and structs.  These let you work with common constructs found in C
while still presenting easy-to-use Python objects to the outside.  There are some caveats to using them, though I am
only familiar with a few so far.  

### cdef

The first and easiest to understand is the `cdef` keyword.  It is used any time you want to declare something at the
C level, such as a typed variable, a struct or a function.  A `cdef` variable lives only on the C side and is not
visible from Python, so whenever a value moves between it and a Python object, Cython has to convert it across the
C/Python boundary.  

In [7]:
%%cython -a

cdef int x

x = 10
y = 20
z = x * y

In the annotated output above, line 4 (`x = 10`) compiles to a single C assignment.  On line 6, however, the result
`z` is a Python object, so Cython first converts `x` to a Python int (`__Pyx_PyInt_From_int`) and then calls Python's
generic multiply (`PyNumber_Multiply`).  

### structs

You can define C-style structs using a `cdef struct` definition, as shown below.  

In [44]:
%%cython -a

cdef struct Task:
    char *task
    int complete
    int time
    
cdef Task test_task

test_task.complete = 0
# A char* field takes bytes; a plain str literal may be rejected depending on language_level
test_task.task = b'Testing this'
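
The `cdef` variable in the cell above is not visible from Python, so its fields cannot be printed from another cell. As a rough stand-in for experimenting with the same layout, the standard-library `ctypes` module can describe an equivalent struct. This is `ctypes`, not Cython, and is used here only for illustration; the class below is my own mirror of the `Task` struct.

```python
import ctypes


class Task(ctypes.Structure):
    # Field order and types mirror the `cdef struct Task` above.
    _fields_ = [
        ("task", ctypes.c_char_p),   # char *task
        ("complete", ctypes.c_int),  # int complete
        ("time", ctypes.c_int),      # int time
    ]


test_task = Task()                # ctypes zero-initializes the fields
test_task.complete = 0
test_task.task = b"Testing this"  # a char* field takes bytes, not str

print(test_task.task, test_task.complete, test_task.time)
```

Reading `test_task.task` back returns a `bytes` object, which is a reminder that a `char *` carries raw bytes and has to be decoded explicitly if a Python `str` is wanted.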

My next task for this week, hopefully on Wednesday, is to add support for `.pyx` files to the project and have those
files compile with the command `python setup.py build_ext --inplace`.  After that, I will implement a version of kNN
in Cython.  

I'm very excited about Cython and the power that it offers.  