# Intro to Cython


## Overview

### What You'll Learn
1. How to use Cython to integrate Python and C code!


### Prerequisites
None, besides a decent understanding of C and Python.


## Background

### What is Cython?

As you probably guessed, Cython is essentially Python extended with features from C.

Cython allows us to use several aspects of C within Python code, including
 - Data types
 - Functions
 - Modules
 
Most Python code is compatible with Cython, with the exception of a few cases which will be covered later.

### How does Cython work?

1. Code written in Cython (a ".pyx" file) gets translated to C code
2. This C code gets compiled
3. The compiled code gets bundled into a Python extension module
4. This extension module can be imported into and used in your existing Python code

### How does Cython help optimize Python code?

Python (specifically CPython, the default implementation) is a comparatively slow language for 3 main reasons:

1. Global Interpreter Lock (GIL)

Simply, this is "a mutex (or a lock) that allows only one thread to hold the control of the Python interpreter."
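
One way to observe the GIL is to run the same CPU-bound work first sequentially and then in two threads. This is a minimal sketch, not a benchmark: the function name count_down and the workload size N are made up for illustration. On a standard CPython build the two timings are usually close, because only one thread can execute Python bytecode at a time; builds without the GIL may behave differently.

```python
import threading
import time

def count_down(n):
    # Pure-Python, CPU-bound loop: the running thread holds the GIL
    while n > 0:
        n -= 1

N = 2_000_000  # hypothetical workload size, chosen to keep the demo short

# Run the work twice, one call after the other
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same two calls in two threads
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f} s, two threads: {threaded:.3f} s")
```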

2. Python is interpreted and not compiled

Python compiles source code to bytecode, a low-level set of instructions that the interpreter executes one at a time. Unfortunately, the interpreter has to do a lot of extra work at run time for every instruction, work that a compiled language does once, ahead of time, when it produces machine code. An analogy:

> "If you can talk in your native language to someone, that would generally work faster than having an interpreter having to translate your language into some other language for the listener to understand."
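
To make the bytecode idea concrete, the standard-library dis module can list the instructions the interpreter steps through. This is a small sketch; the function add is a made-up example, and the exact instruction names vary between Python versions.

```python
import dis

def add(a, b):
    return a + b

# Print the bytecode instructions the interpreter executes for add()
dis.dis(add)

# Collect just the instruction names for inspection
opnames = [instruction.opname for instruction in dis.Bytecode(add)]
print(opnames)
```

Even this one-line function becomes several instructions (load the operands, apply the operator, return), each of which the interpreter handles in turn.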

3. It's a dynamically typed language

In short, this means that the concept of object "types" exists, but a variable's type can change.

In [None]:
a = 1
print(type(a))
a = "foo"
print(type(a))

This is costly because Python has to check, and possibly convert, types at run time every time a variable is read, written to, or referenced. Since Python is so flexible, it's difficult to optimize. While statically typed languages such as Java and C/C++ give the compiler enough information to optimize for performance, Python emphasizes flexibility, sacrificing some level of performance in the process.
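
The run-time type checks described above can be seen in a short sketch. The function add is a made-up example: the same line of code means different things depending on the types it receives, so Python has to look up the right operation on every call. The size comparison assumes CPython, where an int is a full object rather than a bare machine value.

```python
import ctypes
import sys

def add(a, b):
    # The meaning of "+" is looked up from the operand types
    # each time this line runs
    return a + b

print(add(1, 2))          # integer addition
print(add("foo", "bar"))  # string concatenation
print(add([1], [2]))      # list concatenation

# A Python int carries its type and reference count along with its value,
# so it takes more memory than a plain C int
print(sys.getsizeof(1), "bytes for a Python int")
print(ctypes.sizeof(ctypes.c_int), "bytes for a C int")
```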

Cython addresses two of these three aspects of Python. It combines C typing and the compiled nature of C with Python, making it capable of improving your code's performance in certain situations by up to __84x__ if the Cython code is well optimized. (Source: https://notes-on-cython.readthedocs.io/en/latest/std_dev.html)

These are abridged explanations of the concepts. More information isn't necessary to effectively use Cython, but if you are curious to hear the in-depth explanations, there are sources at the bottom of the page!

## Using Cython

### First Program On Your Own PC

(If you just want to know how Cython works in Jupyter and don't care to try it on your own PC, this part can be skipped. The next section also shows how to implement Cython directly in your code.)

The first step is to create a virtual environment to run your code in and then activate it.

On the Linux command line:

In [None]:
##### WARNING #####
# THIS SECTION SHOULD NOT RUN IN JUPYTER! These need to be used in the command line.
###################
python3 -m venv cython_env

source cython_env/bin/activate

ls

mkdir my_first_cython

ls

# this might need to be "pip3 install Cython"
pip install Cython

Create a file whose name ends in ".pyx".

Write a print statement in it. This can be anything you want!

Create another file called "setup.py".

Write the following code:

In [None]:
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("[your_file_name].pyx"))

Compile your code with:

In [None]:
python setup.py build_ext --inplace

You should be able to see new files that you didn't create yourself in the directory, including "[your_file_name].c" and the compiled extension module (a ".so" file on Linux).

Start Python by typing "python" into the command line. Then import your code with "import [your_file_name]"

You should see your print statement!

This is a very basic example of how to use Cython! For a simple print statement, the difference in run time will barely be noticeable, but for more time-consuming and/or complex code, the difference can be much bigger.

### Time Comparison (In Jupyter!)

#### Regular Python Code

Factorials are a great example of time differences. Here we have regular Python code to compute a factorial:

In [None]:
def factorial(n):
    fact = 1
    for x in range(2, n+1):
        fact = fact * x
    return fact

In [None]:
factorial(5)

In [None]:
%timeit factorial(100000)
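
The %timeit command is an IPython magic, so it only works in Jupyter or IPython. Outside a notebook, a rough equivalent is the standard-library timeit module. This is a minimal sketch; the input size (1000) and the number of runs (100) are arbitrary values chosen to keep it quick.

```python
import timeit

def factorial(n):
    fact = 1
    for x in range(2, n + 1):
        fact = fact * x
    return fact

# Time 100 calls and report the average time per call
runs = 100
total_seconds = timeit.timeit(lambda: factorial(1000), number=runs)
print(f"{total_seconds / runs * 1e6:.1f} µs per call")
```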

#### Regular Python Code but in Cython

Here is the exact same code, except compiled with Cython:

In [None]:
%load_ext Cython

In [None]:
%%cython
def factorial_cy(n):
    fact = 1
    for x in range(2, n+1):
        fact = fact * x
    return fact

In [None]:
%timeit factorial_cy(100000)

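More than likely, there was an extremely small time difference. Without type declarations, the compiled code still works with ordinary Python objects, so most of the overhead remains.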

#### Python Code with C Data Types

Here is the same code again, only this time, we have given the variables static C data types, something required in C but not in Python:

In [None]:
%%cython
def factorial_cy2(n):
    cdef int fact = 1
    cdef int x
    for x in range(2, n+1):
        fact = fact * x
    return fact

In [None]:
%timeit factorial_cy2(100000)

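In this example, you should see a *drastically* different amount of time. Hopefully your output reads something like "90.7 µs," which is 90.7 microseconds. As you can see, simply using C data types in your Python code makes a massive difference in run time. Keep in mind that a C int has a fixed size (typically 32 bits), so for n greater than 12 the product overflows and factorial_cy2 no longer returns the true factorial. The timing shows how fast the typed loop runs, not a correct result for 100000.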
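
The effect of a fixed-size int on factorial_cy2 can be previewed in plain Python. This is a sketch under two assumptions: that the C int is 32 bits wide, and that overflow wraps around (which is what usually happens in practice, although C does not guarantee it). The helper name factorial_int32 is made up, and ctypes.c_int32 is used only to emulate the wrap-around.

```python
import ctypes
import math

def factorial_int32(n):
    # Same loop as factorial_cy2, but each product is squeezed into a
    # signed 32-bit integer, the way a 32-bit C int would store it
    fact = 1
    for x in range(2, n + 1):
        fact = ctypes.c_int32(fact * x).value
    return fact

# Compare the wrapped result with the true factorial
for n in (5, 12, 13, 34):
    print(n, factorial_int32(n), math.factorial(n))
```

Up to n = 12 the two columns agree. From n = 13 they differ, and from n = 34 onward the wrapped value is 0, because the true factorial is then a multiple of 2^32.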

C-typed variables are always declared with this format in Cython: cdef, then the type, then the name. They can still be passed to Python's print and format functions as usual, as shown below:

In [None]:
%%cython

# To declare a string
cdef str my_word = "word"

# To declare numbers
cdef int num1 = 1
cdef double num2 = 2.22
cdef float num3 = 3.03

print("Num1 is {}, Num2 is {}, and Num3 is {}. My string is '{}'."
      .format(num1, num2, num3, my_word))
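
When the cell above runs, num3 may print with extra digits instead of exactly 3.03. A likely reason is that a C float usually holds 32 bits, while a C double (like a Python float) holds 64. The sketch below uses ctypes to emulate both widths in plain Python; the variable names are made up for illustration.

```python
import ctypes

# C double: same precision as a Python float
as_double = ctypes.c_double(3.03).value
# C float: the value is rounded to 32 bits
as_float = ctypes.c_float(3.03).value

print(as_double)
print(as_float)
print(as_float == 3.03)
```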

Note that variables declared with cdef are only visible inside the cell that defines them. Outside of Jupyter notebook, this means that statically typed variables in a Cython module are not directly available in Python scope.

### Cython and C Functions

There are 3 different types of Cython functions:

 - def: Regular Python functions (these can still be used with Cython)
 - cdef: Pure C functions
 - cpdef: Hybrid C and Python functions
 
Even using Cython, everything about regular Python functions (def) still applies, but the rules for the other two types of functions vary slightly.

#### Cdef functions

 - Can be called from both Cython and C
 - These cannot be defined inside other functions
 - Arguments, return values, and local variables can be given C types (anything left untyped is treated as a Python object)
 - Variables/Functions declared using cdef are __not__ accessible by the Python scope
 - Cython optimizes the code automatically within these functions
 
Cdef function example:

In [None]:
%%cython

cdef my_function(x, y, z):
    return x + y - z

print(my_function(50, 30, 10)) # should give 70
print(my_function(50.3, 30.9, 10.5)) # should give about 70.7, possibly with many decimal places

In [None]:
%%cython

cdef int my_new_function(x, y, z):
    return x + y - z

print(my_new_function(50, 30, 10)) # should give 70
print(my_new_function(50.3, 30.9, 10.5)) # should give 70 still

As you can see, with Cython you are able to declare a function so that it returns a specific data type. The code itself is identical except for the return type declaration, and the return values differ solely because of this.

You are also able to declare data types for input values in the function:

In [None]:
%%cython

cdef int my_other_function(int x, int y):
    return x + y

print(my_other_function(10, 20)) # should give 30

As you can see, this runs without an error.

#### Cpdef functions (hybrid)

 - Internally, both a cdef and a def function get created
 - The def function acts as a wrapper in Python, calling the cdef function
 - It is best to use these when Cython functions are called from Python
 
(More information on wrappers is linked at the bottom)

(Note that you cannot access the cdef function from a different Jupyter cell, but the below functions can be accessed from other cells, since they are hybrid.)

Here is a cpdef function example:

In [None]:
%%cython

cpdef my_cp_function(x, y, z):
    return x + y - z

print(my_cp_function(50, 30, 10))

In [None]:
# as you can see, this can be accessed externally
print(my_cp_function(60, 30, 10)) 

### That's an Intro to Cython!

If you would like to learn more about Cython or the other topics mentioned, here are some good resources:

__Why Python is Slow:__

 - GIL: https://realpython.com/python-gil/

 - Interpreted vs compiled: https://towardsdatascience.com/how-does-python-work-6f21fd197888

 - Dynamically vs. Statically typed languages: https://pythonconquerstheuniverse.wordpress.com/2009/10/03/static-vs-dynamic-typing-of-programming-languages/

__Wrappers:__

 - https://wiki.python.org/moin/FunctionWrappers

__Advanced Cython:__
 - https://share.cocalc.com/share/c4cb4a9830136f7bdc07b11c803665cc99b3d899/advanced-cython.html?viewer=share
 - http://www.discoversdk.com/blog/cython-advanced-topics
 
__Similar to Cython:__

 - Numba: https://numba.pydata.org/

 - Numba vs. Cython: http://stephanhoyer.com/2015/04/09/numba-vs-cython-how-to-choose/