In [1]:
%autosave 0
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

Autosave disabled


## Vectorization is Fast!

This short notebook demonstrates that working with <tt>NumPy</tt> arrays is much faster than working with *Python* lists.

In [2]:
import numpy as np

We begin by defining two <tt>NumPy</tt> arrays `a` and `b` that are each filled with a million random numbers.

In [3]:
a = np.random.rand(1000000)
b = np.random.rand(1000000)

Next, we compute the <em style="color:blue;">dot product</em> of `a` and `b`.  Mathematically, this is defined as follows:
$$ \textbf{a} \cdot \textbf{b} = \sum\limits_{i=1}^n \textbf{a}[i] \cdot \textbf{b}[i], $$
where $n$ is the dimension of `a` and `b`.
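As a small check of the definition, the following sketch computes the dot product of two three-element vectors in three ways. The names `u`, `v`, and the values are chosen only for this illustration. Note that the formula counts from $1$ to $n$, while Python indices run from $0$ to $n-1$.

```python
import numpy as np

# Small illustrative vectors (values chosen for this example only).
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

# Sum of elementwise products, following the definition directly.
by_definition = sum(u[i] * v[i] for i in range(len(u)))

# The same value via the @ operator and via np.dot.
by_operator = u @ v
by_dot = np.dot(u, v)

print(by_definition, by_operator, by_dot)  # 1*4 + 2*5 + 3*6 = 32.0 in all three cases
```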

In [5]:
%%time 
a @ b

CPU times: user 1.94 ms, sys: 541 µs, total: 2.48 ms
Wall time: 785 µs


249632.90179701775

To compare this with the time needed when `a` and `b` are stored as lists, we convert `a` and `b` to ordinary *Python* lists.

In [6]:
la = list(a)
lb = list(b)

Next, we compute the <em style="color:blue;">dot product</em> of `a` and `b` using these lists.

In [7]:
%%time
total = 0.0  # avoid the name `sum`, which would shadow the built-in function
for i in range(len(la)):
    total += la[i] * lb[i]

CPU times: user 220 ms, sys: 1.39 ms, total: 221 ms
Wall time: 220 ms


The <tt>NumPy</tt>-based computation is much faster than the list-based computation: in the runs above, the wall time drops from 220 ms to 785 µs, a speedup of roughly 280.  Similar observations can be made when a function is applied to all elements of an array.  For big arrays, using the vectorized functions offered by <tt>NumPy</tt> is usually much faster than applying the function to the elements one at a time in a *Python* loop.
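A single `%%time` measurement can be noisy. One possible way to get a steadier comparison is to repeat each computation with the standard `timeit` module and keep the best run, as in the sketch below. The names `x`, `y`, `dot_list`, the seed, and the array size of 100,000 are assumptions made for this example. The sketch uses `tolist()`, which produces native Python floats, whereas `list(a)` keeps the elements as NumPy scalar objects.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
x = rng.random(100_000)
y = rng.random(100_000)
lx, ly = x.tolist(), y.tolist()  # lists of native Python floats

def dot_list(p, q):
    """Dot product of two equally long Python lists."""
    return sum(s * t for s, t in zip(p, q))

# Each timing is the best of 5 repetitions of 10 calls, divided by 10
# to give the time per call in seconds.
t_numpy = min(timeit.repeat(lambda: x @ y, number=10, repeat=5)) / 10
t_list = min(timeit.repeat(lambda: dot_list(lx, ly), number=10, repeat=5)) / 10

print(f"NumPy: {t_numpy * 1e6:.1f} µs per call, list: {t_list * 1e3:.2f} ms per call")
```

The absolute numbers depend on the machine and the NumPy build, so only the ratio between the two timings is meaningful.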

In [8]:
import math

In [9]:
%%time
for i, x in enumerate(a):
    b[i] = math.sin(x)

CPU times: user 256 ms, sys: 2.1 ms, total: 258 ms
Wall time: 257 ms


In [10]:
%%time
b = np.sin(a)

CPU times: user 7.72 ms, sys: 11.8 ms, total: 19.5 ms
Wall time: 9.78 ms
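Speed only matters if both approaches yield the same values. The sketch below checks this on a smaller array by comparing an element-by-element loop using `math.sin` with the vectorized `np.sin`. The names `x`, `looped`, and `vectorized` and the array size are assumptions for this illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
x = rng.random(1_000)

# Apply math.sin to one element at a time, then collect the results in an array.
looped = np.array([math.sin(v) for v in x])

# Apply np.sin to the whole array in a single call.
vectorized = np.sin(x)

# Compare the two results up to floating-point tolerance.
print(np.allclose(looped, vectorized))
```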
