In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

## Vectorization is Fast!

This short notebook demonstrates that working with <tt>NumPy</tt> arrays is much faster than working with *Python* lists.

In [2]:
import numpy as np

We begin by defining two <tt>NumPy</tt> arrays `a` and `b` that are each filled with $1\,000\,000$ random numbers drawn uniformly from the interval $[0, 1)$.

In [3]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

Next, we compute the dot product of `a` and `b`.  Mathematically, this is defined as follows:
$$ \textbf{a} \cdot \textbf{b} = \sum\limits_{i=1}^n \textbf{a}[i] \cdot \textbf{b}[i], $$
where $n$ is the common length of `a` and `b`.

In [4]:
%%time 
a @ b

CPU times: user 8.58 ms, sys: 1.28 ms, total: 9.86 ms
Wall time: 5.38 ms


249897.2071615944
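
As a sanity check of the definition, the following minimal sketch evaluates the sum by hand on two tiny vectors and compares it with the `@` operator and `np.dot`. The names `u`, `v`, `by_definition`, `by_operator`, and `by_function` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# Dot product via the definition: the sum of the element-wise products.
by_definition = sum(u[i] * v[i] for i in range(len(u)))

# The same value via the matrix-multiplication operator and via np.dot.
by_operator = u @ v
by_function = np.dot(u, v)

print(by_definition, by_operator, by_function)  # 32.0 32.0 32.0
```

Here $1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 32$, so all three expressions agree.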

To compare this with the time needed when `a` and `b` are stored as lists instead, we convert both arrays to ordinary *Python* lists.

In [5]:
la = list(a)
lb = list(b)
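
One detail is worth keeping in mind: `list(...)` applied to a <tt>NumPy</tt> array keeps the elements as <tt>NumPy</tt> scalars, whereas the array method `tolist()` converts them to built-in *Python* floats. The sketch below illustrates the difference on a small array; the names `arr`, `as_list`, and `as_native` are illustrative only.

```python
import numpy as np

arr = np.random.rand(5)

as_list = list(arr)        # elements remain NumPy scalars
as_native = arr.tolist()   # elements become built-in Python floats

print(type(as_list[0]))    # <class 'numpy.float64'>
print(type(as_native[0]))  # <class 'float'>
```

Arithmetic on built-in floats tends to be cheaper than arithmetic on <tt>NumPy</tt> scalars, so the choice of conversion can affect the timing of a list-based loop.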

Next, we compute the dot product of `a` and `b` using these lists.

In [6]:
%%time
total = 0  # named `total` so that the built-in function `sum` is not shadowed
for i in range(len(la)):
    total += la[i] * lb[i]

CPU times: user 480 ms, sys: 1.92 ms, total: 482 ms
Wall time: 260 ms
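
A single `%%time` measurement can be noisy. As a rough cross-check, the sketch below repeats both computations with the standard `timeit` module and reports the best run of each. It is only an illustration under some assumptions: the helper `dot_lists`, the variable names, the seed, and the smaller array size of $100\,000$ (chosen to keep the run short) are not part of the original notebook, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only
x = rng.random(100_000)
y = rng.random(100_000)
lx = x.tolist()  # plain Python lists of built-in floats
ly = y.tolist()

def dot_lists(p, q):
    """Dot product of two equal-length Python lists using an explicit loop."""
    total = 0.0
    for s, t in zip(p, q):
        total += s * t
    return total

# Each measurement executes the statement 10 times; we keep the best of 5 measurements.
t_numpy = min(timeit.repeat(lambda: x @ y, number=10, repeat=5))
t_lists = min(timeit.repeat(lambda: dot_lists(lx, ly), number=10, repeat=5))

print(f"NumPy: {t_numpy / 10 * 1e3:.3f} ms per call")
print(f"lists: {t_lists / 10 * 1e3:.3f} ms per call")
```

Taking the minimum over several repetitions reduces the influence of other processes running on the same machine.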


We notice that <tt>NumPy</tt> is much faster than the list-based computation: in the `%%time` measurements above, the wall time is about 5 ms compared to 260 ms, a factor of roughly 50.  Similar observations can be made when a function is applied to all elements of an array.  For big arrays, that is usually much faster than applying the function element by element in a *Python* loop.

In [7]:
import math

In [8]:
%%time
for i in range(len(a)):
    b[i] = math.sin(a[i])  # apply math.sin to the i-th element of a, not to the index i

CPU times: user 242 ms, sys: 3.32 ms, total: 245 ms
Wall time: 252 ms


In [9]:
%%time
b = np.sin(a)

CPU times: user 5 ms, sys: 3.24 ms, total: 8.24 ms
Wall time: 4.21 ms
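
To confirm that the element-by-element loop and the vectorized call compute the same values, one can compare the results with `np.allclose`. The following is a minimal sketch on a small array; the names `x`, `looped`, and `vectorized` are illustrative and not part of the notebook.

```python
import math
import numpy as np

x = np.random.rand(1000)

# Element-by-element application of the scalar function math.sin.
looped = np.empty_like(x)
for i in range(len(x)):
    looped[i] = math.sin(x[i])

# Vectorized application of the universal function np.sin to the whole array.
vectorized = np.sin(x)

print(np.allclose(looped, vectorized))  # True
```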
