# Cython 

### tl;dr

* Use Cython to speed up simple Python code 
* The easiest speed-up is static typing (declaring data types)
* Use `cdef` for variables within Cython function scopes
* Use `cpdef` for Cython functions that must also be callable from Python
* Jupyter magic to load: `%load_ext Cython` and `%%cython`

[Cython](http://docs.cython.org/en/latest/) is a superset of Python that compiles to C/C++ and is used to speed up and optimize Python code. Source code gets translated into optimized C/C++ code and compiled as Python extension modules. This allows for both very fast program execution and tight integration with external C libraries, while keeping productivity up in a native Python programming style. Popular libraries like pandas and scikit-learn owe much of their quick run-time to performance-critical code written in Cython, and NumPy, which is mostly plain C, uses it in places too.

Cython programming can seem a bit imposing, and there is a lot you can do with it, including using C libraries and wrapping functions. The main gist here is that you can see huge performance gains with just one change to your code: specifying types. Python is a dynamically typed programming language, meaning a variable can be rebound to a value of a different type on the fly.

In [1]:
x = 5 #int 
x = 5.425 #float 
x = [1, 2, 3, 4, 5] #list 
x = 'hello' #string

A statically typed language like C, C++, or Java makes you declare a variable as a specific type and keep it. This avoids type lookups at run time, which take time. Under the hood, the compiler allocates the right amount of memory for that type up front. In Python, every value is a full object whose type has to be checked each time it is used, which is a big part of why Python is slower.
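
To make the cost of dynamic typing concrete, here is a small pure-Python sketch (the helper name `type_history` is just for illustration). It binds one name to the same four values as in the cell above and records the type Python finds at run time, then shows that even a small int is a full object, larger than a typical 4-byte C `int`.

```python
import sys

def type_history(values):
    """Bind x to each value in turn and record the type found at run time."""
    seen = []
    for x in values:
        seen.append(type(x).__name__)
    return seen

history = type_history([5, 5.425, [1, 2, 3, 4, 5], 'hello'])
print(history)

# A Python int carries a type pointer and a reference count in addition to
# its value, so it takes more memory than a 4-byte C int.
int_size = sys.getsizeof(5)
print('a Python int takes {} bytes'.format(int_size))
```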

#### In C++ 
```c++
int x = 5;  //specify the type upfront 
x = "hello"; //compile error: a string cannot be assigned to an int
```

Cython wants something similar:

#### In Cython
```python
cdef float x = 5.0
```

If you keep x as a float, you will be significantly rewarded. That's about it in terms of syntax! Whenever you declare a variable, write `cdef` followed by the type and then the name. Here are the main Cython types:

```python
    cdef int x, y, z
    cdef char *s = b'hello'    # C strings take bytes literals
    cdef float f = 5.2         # single precision
    cdef double g = 40.5       # double precision
    cdef list li = [1, 2, 3]
    cdef dict d = {'first': 21, 'second': 42}
    cdef object inst = obj()   # obj stands in for any Python class
```
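
If the difference between `float` and `double` is unfamiliar, the standard library can illustrate it without compiling anything. This sketch uses `ctypes` and `struct` (not Cython itself) to show the size of the matching C types on your machine and what happens to 5.2 when it is squeezed into single precision.

```python
import ctypes
import struct

# Size in bytes of the C types behind Cython's int, float, and double.
sizes = {
    'int': ctypes.sizeof(ctypes.c_int),
    'float': ctypes.sizeof(ctypes.c_float),
    'double': ctypes.sizeof(ctypes.c_double),
}
print(sizes)

# Round-trip 5.2 through a 4-byte float and an 8-byte double.
as_float = struct.unpack('f', struct.pack('f', 5.2))[0]
as_double = struct.unpack('d', struct.pack('d', 5.2))[0]
print(as_float)   # close to 5.2, but not equal to it
print(as_double)
```

A Python `float` is a C `double` under the hood, which is why the double round-trip is lossless while the single-precision one is not.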

Additionally, there are different ways to declare functions in Cython:

**def** - regular Python function, called through the normal Python calling machinery.

**cdef** - C-only functions. You can't call these from pure Python code, since Cython generates no Python wrapper for them.

**cpdef** - C and Python. Cython creates a C function plus a Python wrapper for it.

In general, declare variables within Cython function scopes with `cdef`, and declare functions that go back to interact with pure Python with `cpdef`.

## Cython in Jupyter 

[Loading Cython into Jupyter](http://docs.cython.org/en/latest/src/quickstart/build.html) is pretty simple. You need to run `%load_ext Cython` in its own cell. 

In [2]:
%load_ext Cython

Any cell that contains Cython code, including `cimport` statements for Cython-related modules, must start with `%%cython`.

In [3]:
%%cython
cimport numpy as np

When you define a function, start the cell with `%%cython`, or with `%%cython -a` to get a nice annotated view of the translation to C.

Now let's transform this basic Python function into an optimized Cython function.

In [15]:
def test(x):
    '''useless function to increment y x times'''
    y = 0
    for i in range(x):
        y += i
    return y

Looking at this function, the main thing we can change is the data types: we can specify them. If we are explicit about our usage, we know that x, i, and y will always be ints. We can declare types for all of these, and even for the return type of the function, to optimize.

In [13]:
%%cython -a 
cpdef int cy_test(int x):
    '''useless function to increment y x times'''
    cdef int y = 0
    cdef int i
    for i in range(x):
        y += i
    return y
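
One caveat with `cdef int`: a C `int` has a fixed width (32 bits on most platforms), while a Python int grows as needed, so the typed version can overflow for large inputs. The sketch below does not run Cython. It uses `ctypes.c_int32` to emulate what a 32-bit accumulator would typically hold, and the helper names are made up for illustration. Signed overflow is formally undefined in C, so treat the wrapped value as what you would commonly see rather than a guarantee.

```python
import ctypes

def py_sum(x):
    """Pure-Python version: y is an arbitrary-precision int."""
    y = 0
    for i in range(x):
        y += i
    return y

def wrapped_sum(x):
    """Emulate storing the same result in a 32-bit signed C int."""
    return ctypes.c_int32(py_sum(x)).value

print(py_sum(50), wrapped_sum(50))        # small inputs agree
print(py_sum(65537), wrapped_sum(65537))  # the 32-bit value has wrapped around
```

If you expect large values, declaring `y` as `long long` is a common way to get a 64-bit accumulator.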

### Benchmark Testing 

Let's benchmark the regular Python function against the Cython one.

In [23]:
import timeit

cy = timeit.timeit('cy_test(50)', setup = "from __main__ import cy_test", number=10000)
py = timeit.timeit('test(50)', setup = "from __main__ import test", number=10000)

print('Cython takes {}s for 10,000 calls'.format(cy))
print('Python takes {}s for 10,000 calls'.format(py))
print('Cython is {}x faster'.format(py/cy))

Cython takes 0.0012902705854642704s for 10,000 calls
Python takes 0.04030693111551287s for 10,000 calls
Cython is 31.239130434806793x faster


There is some variance, but Cython is 25-50 times faster! All by just declaring static types for three variables and the return value, and compiling the function. Not bad at all.

The place to optimize is wherever variables are used most heavily. In our case, that's the for loop. Look for places in your code where Python has to keep verifying the type of some variable, either in loops or in programs that scale out. For example, if you have a heavily trafficked website, a crawlbot, or an analysis of tick prices from stocks, you should consider adding typing information for some serious performance improvements.
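
Before adding types everywhere, it can help to measure where the time actually goes. This is a minimal sketch using the standard library's `cProfile` on a toy workload. The functions `hot_loop`, `cheap_setup`, and `main` are invented for this example.

```python
import cProfile
import io
import pstats

def hot_loop(n):
    """Tight numeric loop: the kind of code that benefits from static types."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def cheap_setup():
    """Runs once and does very little work."""
    return [1, 2, 3]

def main():
    cheap_setup()
    return hot_loop(200000)

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by time spent inside each function itself and show the top entries.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('tottime').print_stats(5)
report = buffer.getvalue()
print(report)
```

The function at the top of the report is the first candidate for a typed Cython rewrite.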

## Compiling from .pyx files 

The second way to use Cython is directly from source files. The abridged version is as follows: 

1. Create Cython functions in `.pyx` files
2. Create a `setup.py` file
3. Compile with `python setup.py build_ext --inplace`
4. Import and use your modules as you would otherwise! 

First you will write your Cython functions in `.pyx` files. I created [cy_test.pyx](cy_test.pyx) with the same example: 
```python 
def cy_test(int x):
    '''useless function to increment y x times'''
    cdef int y = 0
    cdef int i
    for i in range(x):
        y += i
    return y
```
    
Next I created the `setup.py` file with the following boilerplate code. 

```python
from setuptools import setup  # distutils was removed in Python 3.12
from Cython.Build import cythonize 

setup(ext_modules = cythonize('cy_test.pyx'))
```

The argument to `cythonize` will change along with your modules.

Finally, in the terminal, run the command `python setup.py build_ext --inplace`. You should get the following log output on Windows:

```
Compiling cy_test.pyx because it changed.
[1/1] Cythonizing cy_test.pyx
running build_ext
building 'cy_test' extension
creating build
creating build\temp.win-amd64-3.6
creating build\temp.win-amd64-3.6\Release
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\mohit\Anaconda3\include -IC:\Users\mohit\Anaconda3\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\ATLMFC\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.16299.0\winrt" /Tccy_test.c /Fobuild\temp.win-amd64-3.6\Release\cy_test.obj
cy_test.c
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:C:\Users\mohit\Anaconda3\libs /LIBPATH:C:\Users\mohit\Anaconda3\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\ATLMFC\lib\x64" "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.16299.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.16299.0\um\x64" /EXPORT:PyInit_cy_test build\temp.win-amd64-3.6\Release\cy_test.obj /OUT:C:\Users\mohit\Documents\projects\tutorials\cython\cy_test.cp36-win_amd64.pyd /IMPLIB:build\temp.win-amd64-3.6\Release\cy_test.cp36-win_amd64.lib
   Creating library build\temp.win-amd64-3.6\Release\cy_test.cp36-win_amd64.lib and object build\temp.win-amd64-3.6\Release\cy_test.cp36-win_amd64.exp
Generating code
Finished generating code
```

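Depending on your machine, you should get a `cy_test.c` file (the generated C source) and a compiled extension module such as `cy_test.cp36-win_amd64.pyd` (a `.so` file on Linux and macOS) as output. The compiled extension is your optimized module!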
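
The exact file name of the compiled module depends on your platform and Python version. A quick way to see which suffixes your interpreter accepts for extension modules is to ask the standard library:

```python
import importlib.machinery

# All file suffixes this interpreter recognizes for compiled extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES
print(suffixes)
```

A file named `cy_test` plus one of these suffixes, sitting on the import path, is what `import cy_test` will pick up.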

From here you can import and use your modules as normal. 

In [24]:
import cy_test

cy_test.cy_test(5)

10
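
If your code may run where the extension has not been built, one common pattern is to fall back to a pure-Python version. This is a sketch, assuming the compiled module is named `cy_test` as above. The fallback function `py_test` and the alias `fast_test` are added here for illustration.

```python
def py_test(x):
    '''pure-Python fallback with the same behavior as cy_test'''
    y = 0
    for i in range(x):
        y += i
    return y

try:
    # Use the compiled extension when it is available.
    from cy_test import cy_test as fast_test
except ImportError:
    # Otherwise fall back to the slower pure-Python version.
    fast_test = py_test

print(fast_test(5))
```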

Cython can get quite a bit more complicated; hell, you're wrapping Python in pure C! But that's for another time.

### More Cython Resources: 

[Sentdex has a great tutorial](https://pythonprogramming.net/introduction-and-basics-cython-tutorial/#cdef-declarations) where he goes more in depth on compilation and how/why the static typing change works so well.

[Chris LeBlanc](https://www.youtube.com/watch?v=L-FyCT02gqc) goes deeper on benchmarks with a great Cython example from a PyCon talk.
