import numpy as np

# jupyter nbconvert --to slides Slides.ipynb --post serve

# Extending Python Using Cython

Daning Huang

## Instructions for Installation

### Using PyPI (Preferred for Linux/Mac)
1. Set up the `pip` tool (https://pip.pypa.io/en/stable/installing/)
2. In shell, run
>`[sudo] pip install Cython`

To test PyPI-based Cython:
1. Enter the `test` folder and run `make`.
2. If correct, you will see
>`My answer is 42!`

### Using Anaconda (Preferred for Windows)
1. Download the Anaconda package [Python 2.7 64-bit] (https://www.continuum.io/downloads)
2. Download the Microsoft Visual C++ Compiler for Python (https://www.microsoft.com/en-us/download/details.aspx?id=44266)

To test Anaconda:
1. Open and run the file ./test/Test_Cython.ipynb.
2. If correct, you will see
>`My answer is 42!`

## First, let's consider several scenarios...

### TensorFlow - Best of both worlds

Combining the flexibility of Python with the efficiency of other languages.

Access existing code from legacy, low-level or high-performance libraries and applications.

### OpenRAVE - Breaking performance bottleneck

Using a C/C++ extension to accelerate the critical part of a Python program.

[img: results]

### Fluid-Structural-Thermal-Interaction - 1+1+1>3

Python as a glue language

[gif: results]

## Outline for today
1. Introducing Cython
2. A hands-on example
3. Mechanisms behind Cython
4. A practical example: Structural optimization
5. Concluding remarks

## Cython: C-Extensions for Python
### Nearly-automatic conversion from Python to C
### Native integration with C/C++ code

Python: **.py** -(CPython Compiler)- **byte code/.pyc** -(CPython Interpreter)- **Machine code**

Cython: **.pyx** -(Cython)- **.c** -(C Compiler)- **Machine code**

C: **.c** -(C Compiler)- **Machine code**

"分かりますか？" $\rightarrow$ "Do you understand?" $\rightarrow$ "识得唔识得噶？"
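The first stage of the CPython pipeline can be inspected from Python itself. The following is a minimal sketch using the standard `dis` module; `add_one` is a hypothetical function used only for illustration, and the exact instructions printed depend on the Python version.

```python
import dis

def add_one(x):
    # A trivial function whose bytecode we want to look at
    return x + 1

# CPython has already compiled the function body to bytecode
bytecode = add_one.__code__.co_code
print(type(bytecode), len(bytecode))

# Print a human-readable listing of the bytecode instructions
dis.dis(add_one)
```

Cython skips this bytecode stage: it translates the source to C, which a C compiler then turns into machine code.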

## A Hands-on Example
### Problem
- Input: $N$
- Goal: Find the first $N$ prime numbers
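As a reference point, here is a minimal pure-Python sketch of the problem using trial division. It is not necessarily the code in `example_1`; the function name `first_primes` is an assumption made for illustration.

```python
def first_primes(n):
    """Return a list with the first n prime numbers (trial division)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        is_prime = True
        for p in primes:
            # No need to test divisors larger than sqrt(candidate)
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes

print(first_primes(10))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

The nested loops over plain integers are the kind of code where adding C types tends to pay off.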

Let's switch to the first example.

- Anaconda users: example_1/example_1.ipynb
- PyPI users: example_1/example_1.py

### Recap
- 2x faster without changing the code (just compiling it with Cython)
- 20x faster by adding some C types (a few lines!)

## The Mechanisms Behind Cython

### PyObject vs. raw buffer
[img: loop in python]

[img: loop in c]
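The difference can be made visible from plain Python. The sketch below, which uses only the standard library, compares a list of boxed float objects with a contiguous buffer of C doubles from the `array` module; the variable names are illustrative, and the object size quoted in the comment is what 64-bit CPython typically reports.

```python
import sys
from array import array

n = 1000

# A list stores pointers; each element is a separate PyObject on the heap
boxed = [float(i) for i in range(n)]

# An array stores the raw C doubles back to back in one buffer
raw = array("d", boxed)

per_object = sys.getsizeof(boxed[0])  # typically 24 bytes on 64-bit CPython
per_item = raw.itemsize               # size of one C double

print("bytes per boxed float:", per_object)
print("bytes per raw double: ", per_item)
print("raw buffer size:      ", memoryview(raw).nbytes)
```

A loop over `boxed` has to unbox every element, while C code can walk the raw buffer directly.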

### The Global Interpreter Lock
[img: GIL]

## A Practical Example - Part I
### Problem
- Input: Distribution of beam thickness $h(x)$ and transverse loading $p(x)$.
- Goal: Find the beam deflection

[img: A beam with deflection]

### Approach - Finite element
#### The beam element
$$
\begin{bmatrix}
K_{11} & K_{12} & K_{13} & K_{14} \\
K_{21} & K_{22} & K_{23} & K_{24} \\
K_{31} & K_{32} & K_{33} & K_{34} \\
K_{41} & K_{42} & K_{43} & K_{44}
\end{bmatrix}
\begin{bmatrix}
w_1 \\ \phi_1 \\ w_2 \\ \phi_2
\end{bmatrix}=
\begin{bmatrix}
F_1 \\ M_1 \\ F_2 \\ M_2
\end{bmatrix}
$$

[img: A beam element]
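The slide does not spell out the entries $K_{ij}$. As a sketch, assuming a standard Euler-Bernoulli beam element with constant bending stiffness `EI` over an element of length `length` (both hypothetical parameter names), the element matrix could be built as follows.

```python
def beam_element_stiffness(EI, length):
    """4x4 stiffness matrix for DOFs (w1, phi1, w2, phi2).

    Assumes an Euler-Bernoulli beam element with constant EI.
    """
    L = length
    c = EI / L**3
    return [
        [12 * c,     6 * L * c,     -12 * c,     6 * L * c],
        [6 * L * c,  4 * L * L * c, -6 * L * c,  2 * L * L * c],
        [-12 * c,    -6 * L * c,    12 * c,      -6 * L * c],
        [6 * L * c,  2 * L * L * c, -6 * L * c,  4 * L * L * c],
    ]

Ke = beam_element_stiffness(EI=2.0, length=0.5)

# A rigid translation (w1 = w2 = 1, no rotation) should produce no forces
rigid = [1.0, 0.0, 1.0, 0.0]
forces = [sum(Ke[i][j] * rigid[j] for j in range(4)) for i in range(4)]
print(forces)
```

With a varying thickness $h(x)$, `EI` would differ from element to element; how it is evaluated per element is left open here.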

#### Assembling the global matrix

[img: The mesh of the beam]

[img: The shape of global matrix]
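The assembly step can be sketched as below, assuming a 1D mesh in which element `e` connects nodes `e` and `e + 1` and every node carries two degrees of freedom (deflection and rotation). The function name `assemble` and the all-ones placeholder element matrices are assumptions chosen only to make the overlap pattern visible.

```python
def assemble(element_matrices):
    """Add 4x4 element matrices into the global matrix of a 1D beam mesh."""
    n_elem = len(element_matrices)
    n_dof = 2 * (n_elem + 1)  # two DOFs per node
    K = [[0.0] * n_dof for _ in range(n_dof)]
    for e, Ke in enumerate(element_matrices):
        offset = 2 * e  # first global DOF of element e
        for i in range(4):
            for j in range(4):
                K[offset + i][offset + j] += Ke[i][j]
    return K

# Placeholder element matrices (all ones) to show where entries overlap
ones = [[1.0] * 4 for _ in range(4)]
K = assemble([ones, ones])

for row in K:
    print(row)
```

Entries shared by two neighboring elements add up, and entries far from the diagonal stay zero, which gives the banded shape.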

#### Solving the linear system
$$
\mathbf{K}\mathbf{x} = \mathbf{F}
$$
where $\mathbf{K}$ is *symmetric* and *banded*.

1. Cholesky Decomposition:
$$\mathbf{K}=\mathbf{L}\mathbf{L}^T$$
2. Forward substitution:
$$\mathbf{L}\mathbf{y} = \mathbf{F}$$
3. Back substitution:
$$\mathbf{L}^T\mathbf{x} = \mathbf{y}$$
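The three steps can be sketched in pure Python as follows. This is a dense version that ignores the banded structure for brevity, so it is not the optimized solver; the function names `cholesky` and `solve` are assumptions for illustration.

```python
import math

def cholesky(K):
    """Return lower-triangular L with K = L L^T (K symmetric positive definite)."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L

def solve(K, F):
    """Solve K x = F via Cholesky factorization and two triangular solves."""
    n = len(K)
    L = cholesky(K)
    # Solve L y = F, working from the first row down
    y = [0.0] * n
    for i in range(n):
        y[i] = (F[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Solve L^T x = y, working from the last row up
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

K = [[4.0, 2.0], [2.0, 3.0]]
F = [2.0, 1.0]
x = solve(K, F)
print(x)
```

A banded implementation would restrict the inner sums to the bandwidth, which is where most of the savings come from.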

### Implementation
Let's switch to the code.

- Anaconda users: example_2/example_2.ipynb
- PyPI users: example_2/example_2.py