# Comparing the speed of python, cython and numpy

first, we define two random vectors

In [1]:
import numpy as np
%time
L = 1000000
a=np.random.uniform(low=-1,high=1,size=L)
b=np.random.uniform(low=-1,high=1,size=L)

CPU times: user 1 µs, sys: 0 ns, total: 1 µs
Wall time: 6.91 µs


In [2]:
np.shape(a),np.shape(b)

((1000000,), (1000000,))

In [3]:
a[:5],b[:5]

(array([-0.35308728,  0.61969266, -0.5129263 , -0.86307921,  0.43900191]),
 array([-0.49663334,  0.34527971,  0.74819351, -0.50109067, -0.62784724]))

Check time for performing the dot product between lists rather than ndarrays.

## Computing a dot product using pure python

In [4]:
def python_dot(a,b):
    sum=0
    for i in xrange(len(a)):
        sum+=a[i]*b[i]
    return sum
    

In [5]:
%time python_dot(a,b)

CPU times: user 922 ms, sys: 76.2 ms, total: 998 ms
Wall time: 939 ms


-685.74473134544064

## Computing the dot product using the dot method in numpy

In [6]:
%time np.dot(a,b)

CPU times: user 1.25 ms, sys: 382 µs, total: 1.64 ms
Wall time: 1.1 ms


-685.74473134543825

## Computing the dot product using cython

In [7]:
%load_ext cythonmagic

The Cython magic has been move to the Cython package, hence 
`%load_ext cythonmagic` is deprecated; Please use `%load_ext Cython` instead.
You need Cython version >=0.21 to use the Cython magic


In [8]:
%%cython
cimport numpy as np  # makes numpy available to cython
# The following line defines a and b as numpy arrays, cython knows how to deal with those.
def cython_dot(np.ndarray[np.float64_t, ndim=1] a,
                np.ndarray[np.float64_t, ndim=1] b):
    cdef double sum
    cdef long i
    sum=0
    for i in xrange(a.shape[0]):
        sum=sum+a[i]*b[i]
    return sum


ERROR: Cell magic `%%cython` not found.


In [9]:
%%time
cython_dot(a,b)

CPU times: user 1.67 ms, sys: 14 µs, total: 1.69 ms
Wall time: 1.71 ms


147.0301582897001

In [9]:
!pwd

/Users/yoavfreund/Bigdata/UCSD_BigData/notebooks/streaming_statistics
