## Lab 4: Cython

### Pure Python code for Fibonacci

In [1]:
def fib_py(n):
    a,b = 0,1
    for i in range(n):
        a,b = a+b,a
    return a

### Cython

In [2]:
%load_ext cython

In [3]:
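%%cython
def fib_cy(int n):
    # NOTE: a and b are C ints (typically 32-bit), so the result
    # overflows for n > 46 and no longer matches fib_py.
    cdef int a, b, i
    a, b = 0, 1
    for i in range(n):
        a, b = a + b, a
    return a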

### C/C++
```c
int fib(int n)
{
	int tmp, i,a=0,b=1;
	for (i=0; i<n; i++){
		tmp = a;
		a += b;
		b = tmp;
	}
	return a;
}
```


In [4]:
%timeit -n 100 fib_py(3000)

330 µs ± 9.57 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [5]:
%timeit -n 10000 fib_cy(3000)

1.01 µs ± 60 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
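
One caveat when reading these timings: `fib_cy` works on fixed-width C ints, while `fib_py` uses arbitrary-precision Python ints, so for `n = 3000` the two functions do not compute the same value. The sketch below emulates the wraparound in pure Python, assuming a 32-bit signed `int` (the usual case, but platform dependent). The helper name `fib_int32` is hypothetical and is not part of the lab code.

```python
import ctypes


def fib_py(n):
    """Exact Fibonacci using Python's arbitrary-precision ints."""
    a, b = 0, 1
    for _ in range(n):
        a, b = a + b, a
    return a


def fib_int32(n):
    """Same loop, but every sum is wrapped to a signed 32-bit value.

    Assumption: this mimics a 32-bit C int; ctypes.c_int32 wraps silently.
    """
    a, b = 0, 1
    for _ in range(n):
        a, b = ctypes.c_int32(a + b).value, a
    return a


# Largest n (below 100) for which the 32-bit result still equals the exact one.
last_ok = max(n for n in range(100) if fib_py(n) == fib_int32(n))
print(last_ok)  # 46

# The exact result for n = 3000 needs far more than 32 bits.
print(fib_py(3000).bit_length())
```

The speedup measured above is therefore partly due to the typed version doing much less work per addition, not only to the removal of interpreter overhead.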


In [7]:
!./myfib 3000 10000

/bin/bash: ./myfib: cannot execute binary file


### Why is Python slow?

In [None]:
def foo(a, b):
    return a + b

In [None]:
from dis import dis
dis(foo)
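
As a small sketch, the opcode names can also be collected programmatically with `dis.get_instructions`, which makes it easy to spot the instruction that performs the addition. The exact name depends on the interpreter version: it is `BINARY_ADD` up to CPython 3.10 and `BINARY_OP` from 3.11 on, so the filter below only assumes the `BINARY_` prefix.

```python
import dis


def foo(a, b):
    return a + b


# One opcode name per bytecode instruction of foo.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
print(opnames)

# The `+` compiles to a single BINARY_* instruction, whatever the operand types.
binary_ops = [name for name in opnames if name.startswith("BINARY_")]
print(binary_ops)
```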


### What's `BINARY_ADD`?

```c
/* Python/ceval.c */
TARGET(BINARY_ADD) {
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *sum;
    /* NOTE(haypo): Please don't try to micro-optimize int+int on
       CPython using bytecode, it is simply worthless.
       See http://bugs.python.org/issue21955 and
       http://bugs.python.org/issue10044 for the discussion. In short,
       no patch shown any impact on a realistic benchmark, only a minor
       speedup on microbenchmarks. */
    if (PyUnicode_CheckExact(left) &&
            PyUnicode_CheckExact(right)) {
        sum = unicode_concatenate(left, right, f, next_instr);
        /* unicode_concatenate consumed the ref to left */
    }
    else {
        sum = PyNumber_Add(left, right); // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        Py_DECREF(left);
    }
    Py_DECREF(right);
    SET_TOP(sum);
    if (sum == NULL)
        goto error;
    DISPATCH();
}
```

### What's `PyNumber_Add(left, right)`?

```c
/* Objects/abstract.c */
PyObject *
PyNumber_Add(PyObject *v, PyObject *w)
{
    PyObject *result = binary_op1(v, w, NB_SLOT(nb_add)); // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    if (result == Py_NotImplemented) {
        PySequenceMethods *m = v->ob_type->tp_as_sequence;
        Py_DECREF(result);
        if (m && m->sq_concat) {
            return (*m->sq_concat)(v, w);
        }
        result = binop_type_error(v, w, "+");
    }
    return result;
}
```

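### What's `NB_BINOP()`?

`binary_op1` uses this macro to fetch the `nb_add` function pointer from each operand's `tp_as_number` table; `NB_SLOT(nb_add)` is the byte offset of that slot within the table.

```c
/* Objects/abstract.c */
#define NB_BINOP(nb_methods, slot) \
        (*(binaryfunc*)(& ((char*)nb_methods)[slot]))
```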

### What's the addition function for two integers?

```c
/* Objects/longobject.c */
static PyObject *
long_add(PyLongObject *a, PyLongObject *b)
{
    PyLongObject *z;

    CHECK_BINOP(a, b);

    if (Py_ABS(Py_SIZE(a)) <= 1 && Py_ABS(Py_SIZE(b)) <= 1) {
        return PyLong_FromLong(MEDIUM_VALUE(a) + MEDIUM_VALUE(b)); // <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
    }
    if (Py_SIZE(a) < 0) {
        if (Py_SIZE(b) < 0) {
            z = x_add(a, b);
            if (z != NULL) {
                /* x_add received at least one multiple-digit int,
                   and thus z must be a multiple-digit int.
                   That also means z is not an element of
                   small_ints, so negating it in-place is safe. */
                assert(Py_REFCNT(z) == 1);
                Py_SIZE(z) = -(Py_SIZE(z));
            }
        }
        else
            z = x_sub(b, a);
    }
    else {
        if (Py_SIZE(b) < 0)
            z = x_sub(a, b);
        else
            z = x_add(a, b);
    }
    return (PyObject *)z;
}
```

### What's `MEDIUM_VALUE()`?

```c
/* Objects/longobject.c */
#define MEDIUM_VALUE(x) (assert(-1 <= Py_SIZE(x) && Py_SIZE(x) <= 1),  \
     Py_SIZE(x) < 0 ? -(sdigit)(x)->ob_digit[0] :  \
         (Py_SIZE(x) == 0 ? (sdigit)0 :  \
          (sdigit)(x)->ob_digit[0]))
```
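
To make the "digit" idea concrete, here is a rough pure-Python model of how an int's magnitude is split into the digits that `ob_digit` holds. It relies on `sys.int_info.bits_per_digit` (30 on most 64-bit builds, 15 on some others); the helpers `to_digits` and `from_digits` are hypothetical names used only for illustration, and the real in-memory layout differs in detail between CPython versions.

```python
import sys

BITS = sys.int_info.bits_per_digit  # width of one internal digit


def to_digits(value):
    """Split abs(value) into base-2**BITS digits, least significant first."""
    value = abs(value)
    mask = (1 << BITS) - 1
    digits = []
    while value:
        digits.append(value & mask)
        value >>= BITS
    return digits


def from_digits(digits):
    """Inverse of to_digits for non-negative values."""
    return sum(d << (BITS * i) for i, d in enumerate(digits))


small = 12345      # fits in one digit: eligible for the MEDIUM_VALUE fast path
big = 2 ** 100     # needs several digits: handled by x_add / x_sub

print(len(to_digits(small)), len(to_digits(big)))
```

Zero has no digits at all in this model, which corresponds to the `Py_SIZE(x) == 0` branch of the macro.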

### Why so much code for such a simple operation?

* Polymorphism -- the same bytecode handles `foo('a', 'b')` or any other types that support `+`.
* It works for user-defined types, too, through the `__add__` and `__radd__` magic methods.
* For ints, it must handle arbitrary-precision values: single-digit operands take the fast path, and larger ones go through `x_add` or `x_sub`.
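
The dispatch described in these points can be modelled in a few lines of Python. This is a simplified sketch, not the interpreter's actual algorithm: it ignores the rule that gives a subclass's reflected method priority, and the function `add` and class `Metres` are hypothetical names introduced for the example.

```python
def add(left, right):
    """Simplified model of what `left + right` does."""
    # Methods are looked up on the type, not the instance.
    forward = getattr(type(left), "__add__", None)
    if forward is not None:
        result = forward(left, right)
        if result is not NotImplemented:
            return result
    # Fall back to the right operand's reflected method (only for mixed types).
    reflected = getattr(type(right), "__radd__", None)
    if reflected is not None and type(left) is not type(right):
        result = reflected(right, left)
        if result is not NotImplemented:
            return result
    raise TypeError(
        f"unsupported operand type(s) for +: "
        f"{type(left).__name__!r} and {type(right).__name__!r}"
    )


class Metres:
    """Hypothetical user-defined type supporting + with numbers."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented

    def __radd__(self, other):
        return self.__add__(other)


print(add(1, 2))                  # 3
print(add("a", "b"))              # ab
print(add(2, Metres(3)).value)    # 5
```

In the last call `int.__add__` returns `NotImplemented`, so the model falls through to `Metres.__radd__`. Every one of these checks runs on each `+`, which is the overhead that static typing removes.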


### What is Cython?

- Cython is a Python-like language that
    - improves Python's performance
    - wraps external libraries (C, C++)


- The `cython` command translates the code to C, which is then compiled into a Python extension module.

### Why is it better?

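- Static typing: Python resolves types at run time on every operation, while Cython's `cdef` declarations let the generated C operate directly on machine types.


- Compiler optimization: the generated C is compiled ahead of time, so the C compiler can optimize loops that the interpreter would execute one bytecode at a time.


- Performance gains are most significant in CPU-bound programs, notably in tight Python loops. By contrast, I/O-bound programs are not expected to benefit much from a Cython implementation.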


In [None]:
class Point_2d:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Return a new point; the operands are left unchanged.
        newx = self.x + other.x
        newy = self.y + other.y
        return Point_2d(newx, newy)

    def __repr__(self):
        return f'(x,y) : ({self.x},{self.y})'

In [None]:
a = Point_2d(1,2)
b = Point_2d(3,4)

In [None]:
c = a+b
print(c.x,c.y)
print(c)