In [None]:
# So far we have covered the basics of NumPy; now we turn to computation on NumPy arrays
# Computation on NumPy arrays can be very fast or very slow; the key to making it fast is to use vectorized operations, implemented through NumPy's universal functions (ufuncs)
# This section motivates the need for ufuncs, which make repeated calculations on array elements much more efficient
# We will also go over the most common and useful arithmetic ufuncs available in the NumPy package

In [2]:
# Python's relatively slow performance usually shows up when many small operations are repeated
# Ex. Looping over an array to operate on each element
# Ex. Given an array of values, we'd like to compute the reciprocal (1 divided by the value) of each
# One straightforward approach looks like this:
import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.random.randint(1, 10, size=5)
compute_reciprocals(values)

array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

In [6]:
# The code above looks natural to someone coming from Java or C
# But measuring its execution time on a large input shows that it's quite slow
big_array = np.random.randint(1, 100, size=1000000)
%timeit compute_reciprocals(big_array)

10 loops, best of 3: 142 ms per loop


In [None]:
# As we can see, it takes well over a hundred milliseconds to compute a million operations and store the results
# The bottleneck is not the operations themselves, but the type-checking and function dispatch that CPython must do on each cycle of the loop
# Each time a reciprocal is computed, Python first examines the object's type and then does a dynamic lookup of the correct function to use for that type
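
In [None]:
# A minimal sketch of the per-element overhead described above
# The five values are hard-coded here (an assumption, chosen to match the small example) so the snippet runs on its own
# It only inspects the objects involved in one loop iteration; it does not measure anything

```python
import numpy as np

values = np.array([6, 1, 4, 4, 8])

# Indexing a single element does not hand back a raw machine integer:
# it builds a new NumPy scalar object that Python then has to inspect.
elem = values[0]
print(type(elem))                    # a NumPy integer scalar type (exact width is platform dependent)
print(isinstance(elem, np.integer))  # True

# The division is resolved dynamically for this pair of operand types,
# and the result is wrapped in yet another object.
result = 1.0 / elem
print(type(result))                  # a NumPy floating-point scalar type
```

# Every pass through the explicit loop repeats this object creation and lookup, which is where the time goes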

In [7]:
# For many types of operations, NumPy provides a convenient interface into statically typed, compiled routines that avoid this per-element overhead
# This is known as a vectorized operation: we perform the operation on the array as a whole, and it is applied to each element
# The vectorized approach pushes the loop into the compiled layer that underlies NumPy
# Let's compare the two results below
print(compute_reciprocals(values))
print(1.0 / values)

[0.16666667 1.         0.25       0.25       0.125     ]
[0.16666667 1.         0.25       0.25       0.125     ]


In [8]:
%timeit (1.0 / big_array)

100 loops, best of 3: 2.96 ms per loop
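
In [None]:
# The %timeit magic only works inside IPython/Jupyter
# As a rough sketch, the same comparison can be run as a plain script with the standard-library timeit module
# The name `sample` and the smaller size of 100,000 are assumptions made to keep the run short; absolute timings will vary by machine

```python
import timeit

import numpy as np


def compute_reciprocals(values):
    # Same explicit Python loop as in the earlier cell
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output


np.random.seed(0)
sample = np.random.randint(1, 100, size=100_000)

# Check that both approaches produce the same numbers before timing them
loop_result = compute_reciprocals(sample)
vectorized_result = 1.0 / sample
same = np.allclose(loop_result, vectorized_result)
print(same)

# Take the best of three single runs for each approach
loop_time = min(timeit.repeat(lambda: compute_reciprocals(sample), number=1, repeat=3))
vectorized_time = min(timeit.repeat(lambda: 1.0 / sample, number=1, repeat=3))
print(f"loop: {loop_time * 1e3:.2f} ms, vectorized: {vectorized_time * 1e3:.2f} ms")
```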


In [10]:
# As we can see, the vectorized operation is MUCH faster than the Python loop we wrote (about 3 ms versus 142 ms)
# Vectorized operations in NumPy are implemented via ufuncs
# The main purpose of ufuncs is to quickly execute repeated operations on values in NumPy arrays
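# Above we operated on a scalar and an array, but we can also operate between two arrays
# Note: the all-zero result below is integer division of two integer arrays (Python 2 behavior); on Python 3, `/` is true division and returns floats
np.arange(5) / np.arange(1, 6)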

array([0, 0, 0, 0, 0])
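
In [None]:
# A short sketch of the ufuncs behind the two division operators, assuming Python 3 and a reasonably recent NumPy
# The names a, b, true_div and floor_div are illustrative only
# np.true_divide and np.floor_divide are the ufuncs that `/` and `//` call on arrays

```python
import numpy as np

a = np.arange(5)     # [0 1 2 3 4]
b = np.arange(1, 6)  # [1 2 3 4 5]

# Element-wise true division: what `a / b` does on Python 3
true_div = np.true_divide(a, b)
print(true_div)   # [0.  0.5  0.66666667  0.75  0.8] up to print formatting

# Element-wise floor division: what `a // b` does
floor_div = np.floor_divide(a, b)
print(floor_div)  # [0 0 0 0 0]

# The operators and the explicit ufunc calls agree
print(np.array_equal(a / b, true_div), np.array_equal(a // b, floor_div))
```

# The floor-division result matches the all-zero output shown above, since each numerator is smaller than its denominator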