In [1]:
%autosave 0
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

Autosave disabled


## Vectorization is Fast!

This short notebook demonstrates that numerical computations on <tt>NumPy</tt> arrays are much faster than the same computations on *Python* lists.

In [2]:
import numpy as np

We begin by defining two <tt>NumPy</tt> arrays `a` and `b` that are each filled with a million random numbers.

In [3]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

Next, we compute the <em style="color:blue;">dot product</em> of `a` and `b`.  Mathematically, this is defined as follows:
$$ \textbf{a} \cdot \textbf{b} = \sum\limits_{i=1}^n \textbf{a}[i] \cdot \textbf{b}[i], $$
where $n$ is the common length of `a` and `b`.

In [4]:
%%time 
a @ b

CPU times: user 29.1 ms, sys: 3.63 ms, total: 32.8 ms
Wall time: 31 ms


250058.67784253042
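As a small sanity check of the definition above, the following sketch computes the dot product of two short vectors twice: once as an explicit sum of products and once with the `@` operator. The vectors `u` and `v` are hypothetical values chosen only so that the result is easy to verify by hand; they are not part of the original notebook.

```python
import numpy as np

# Hypothetical example vectors, small enough to check by hand.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# Explicit sum of the element-wise products, following the definition.
explicit = 0.0
for i in range(len(u)):
    explicit += u[i] * v[i]

# For one-dimensional arrays, the @ operator computes the same value.
vectorized = u @ v

# Both values equal 1*4 + 2*5 + 3*6 = 32.
print(explicit, vectorized)
```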

To compare this with the time needed when `a` and `b` are stored as lists, we convert both arrays to ordinary *Python* lists.

In [5]:
la = list(a)
lb = list(b)

Next, we compute the <em style="color:blue;">dot product</em> of `a` and `b` using these lists.

In [6]:
%%time
total = 0  # named `total` so that the built-in function `sum` is not shadowed
for i in range(len(la)):
    total += la[i] * lb[i]

CPU times: user 485 ms, sys: 4.29 ms, total: 489 ms
Wall time: 280 ms
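Timing aside, it is worth confirming that the list-based and the array-based computation produce the same value. The sketch below does this on a smaller pair of arrays. The names `x`, `y`, `lx`, `ly`, the seed, and the array length are assumptions made for illustration. It uses `tolist()`, which yields native *Python* floats, whereas `list(a)` yields a list of <tt>NumPy</tt> scalars.

```python
import math
import numpy as np

# Hypothetical test data; the seed and the length are arbitrary choices.
rng = np.random.default_rng(0)
x = rng.random(1000)
y = rng.random(1000)

# tolist() converts the arrays to lists of native Python floats.
lx = x.tolist()
ly = y.tolist()

# Pure Python dot product: sum the pairwise products of the two lists.
list_result = math.fsum(p * q for p, q in zip(lx, ly))

# NumPy dot product of the same data.
numpy_result = float(x @ y)

# The two results may differ in the last few bits because the summation
# order differs, so they are compared with a relative tolerance.
agree = math.isclose(list_result, numpy_result, rel_tol=1e-9)
print(agree)
```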


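We notice that the <tt>NumPy</tt>-based computation is much faster than the list-based computation: about 31 ms versus 280 ms of wall time in the run above, roughly a factor of nine.  Similar observations can be made when a function is applied to all elements of an array.  For big arrays, using the vectorized functions offered by <tt>NumPy</tt> is usually much faster than calling the corresponding scalar function on each element in a *Python* loop.  The following cells compare a loop that calls `math.sin` with a single call to `np.sin`.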

In [7]:
import math

In [8]:
%%time
# Apply math.sin element by element; the results overwrite the contents of b.
for i, x in enumerate(a):
    b[i] = math.sin(x)

CPU times: user 276 ms, sys: 5.5 ms, total: 281 ms
Wall time: 303 ms


In [9]:
%%time
b = np.sin(a)

CPU times: user 10.4 ms, sys: 2.42 ms, total: 12.8 ms
Wall time: 12.5 ms
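A single `%%time` measurement can be noisy. As a rough sketch of a more repeatable comparison, the code below wraps both variants in functions, checks that they return the same values, and times each with the standard `timeit` module. The function names `sin_loop` and `sin_vectorized`, the array length, and the number of repetitions are assumptions made for this illustration, and the measured times will vary from machine to machine.

```python
import math
import timeit
import numpy as np

# Hypothetical test data; the seed and the length are arbitrary choices.
rng = np.random.default_rng(0)
x = rng.random(100_000)

def sin_loop(values):
    """Apply math.sin to every element in an explicit Python loop."""
    out = np.empty_like(values)
    for i, v in enumerate(values):
        out[i] = math.sin(v)
    return out

def sin_vectorized(values):
    """Apply the vectorized np.sin to the whole array at once."""
    return np.sin(values)

# Both variants should agree up to floating-point rounding.
same = np.allclose(sin_loop(x), sin_vectorized(x))

# Take the best of three runs for each variant to reduce timing noise.
loop_time = min(timeit.repeat(lambda: sin_loop(x), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: sin_vectorized(x), number=1, repeat=3))

print(f"results agree: {same}")
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```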
