# Efficient NumPy

In [1]:
import numpy as np

## Best practices

### Avoid loops

Element-by-element Python loops are slow compared with NumPy's vectorized operations:

In [2]:
def square_loop(a):
    """Calculate square of an array in loop. We assume 1D array here."""

    result = np.zeros_like(a)
    for i in range(a.shape[0]):
        result[i] = a[i]*a[i]
    return result

In [3]:
large_arr = np.random.randint(100, size=(100000,))

In [4]:
%timeit -n 10 -r 3 square_loop(large_arr)

22.4 ms ± 459 µs per loop (mean ± std. dev. of 3 runs, 10 loops each)


In [5]:
%timeit -n 10 -r 3 np.square(large_arr)

56.2 µs ± 8.85 µs per loop (mean ± std. dev. of 3 runs, 10 loops each)
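Before trusting a speed-up, it is worth checking that the loop and the vectorized call return the same values. A minimal sketch of such a check follows; `square_loop` is redefined so the block runs on its own, and `check_arr` is a hypothetical small test array rather than the array timed above.

```python
import numpy as np

def square_loop(a):
    """Square a 1D array element by element in a Python loop."""
    result = np.zeros_like(a)
    for i in range(a.shape[0]):
        result[i] = a[i] * a[i]
    return result

# Small test array (hypothetical name) so the check runs quickly.
check_arr = np.arange(1000)

loop_result = square_loop(check_arr)
vectorized_result = np.square(check_arr)

# Both approaches should agree element-wise.
same = np.array_equal(loop_result, vectorized_result)
print(same)
```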


### Use broadcasting

Broadcasting provides an efficient way to operate on arrays of different dimensionality, and it is usually more readable and concise than an explicit loop. For example, adding a `1D` array `b` to each row of a `2D` array `a` with a loop looks like this:

In [6]:
def row_loop(a, b):
    """Add a vector to each row of a matrix in a Python loop."""

    result = np.zeros_like(a)
    for i in range(a.shape[0]):
        result[i] = a[i] + b
    return result

In [7]:
large_arr = np.random.randint(100, size=(1000,1000))
large_b = np.random.randint(100, size=(1000,))

In [8]:
%timeit -n 10 -r 3 row_loop(large_arr, large_b)

3.2 ms ± 473 µs per loop (mean ± std. dev. of 3 runs, 10 loops each)


Broadcasting is about `1.6X` faster in this run (3.2 ms vs. 1.96 ms):

In [9]:
%timeit -n 10 -r 3 large_arr + large_b

1.96 ms ± 351 µs per loop (mean ± std. dev. of 3 runs, 10 loops each)
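To see what broadcasting does with the shapes, it can help to inspect them on a small case. The sketch below uses illustrative shapes `(3, 4)` and `(4,)` (not the arrays timed above): NumPy aligns shapes starting from the trailing dimension, so the vector behaves like a row that is repeated for every row of the matrix.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # shape (3, 4)
b = np.array([10, 20, 30, 40])    # shape (4,)

# Shape of the broadcast result, without computing the sum.
bc_shape = np.broadcast(a, b).shape

# Broadcasting adds b to every row of a.
result = a + b

# The same result built by explicitly repeating b three times.
explicit = a + np.tile(b, (3, 1))

print(bc_shape)
print(np.array_equal(result, explicit))
```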


In-place addition with broadcasting is even faster:

In [10]:
%timeit -n 10 -r 3 np.add(large_arr, large_b, out=large_arr)

837 µs ± 193 µs per loop (mean ± std. dev. of 3 runs, 10 loops each)
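The saving most likely comes from not allocating a new result array. A small sketch of what `out=` does, using illustrative arrays `a` and `b`: the call writes into the given array and returns that same object. Note that when `out=` points at an input, every call changes that input, so repeated calls (as in a `%timeit` run) keep accumulating.

```python
import numpy as np

a = np.ones((3, 4), dtype=np.int64)
b = np.arange(4)

# Write the sum back into a; no new array is allocated for the result.
returned = np.add(a, b, out=a)
same_object = returned is a

# The augmented assignment is the in-place shorthand; it adds b a second time.
a += b

print(same_object)
print(a[0])
```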


Broadcasting also makes it possible to build patterned arrays in a single line (you may leverage this in one of the problems in Homework #2):

In [11]:
np.arange(10) + np.expand_dims(np.arange(10), axis=-1)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12],
       [ 4,  5,  6,  7,  8,  9, 10, 11, 12, 13],
       [ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14],
       [ 6,  7,  8,  9, 10, 11, 12, 13, 14, 15],
       [ 7,  8,  9, 10, 11, 12, 13, 14, 15, 16],
       [ 8,  9, 10, 11, 12, 13, 14, 15, 16, 17],
       [ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]])
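The same addition table can be written in a couple of equivalent ways. The sketch below uses a smaller size `n = 5` (chosen only for illustration): a column of shape `(n, 1)` combined with a row of shape `(n,)` broadcasts to `(n, n)`, and `np.add.outer` expresses the same idea explicitly.

```python
import numpy as np

n = 5
row = np.arange(n)                  # shape (n,)
col = np.arange(n)[:, np.newaxis]   # shape (n, 1), same as expand_dims(..., axis=-1)

# (n,) combined with (n, 1) broadcasts to (n, n).
table = row + col

# Outer sum: entry [i, j] is i + j.
outer = np.add.outer(np.arange(n), np.arange(n))

print(table.shape)
print(np.array_equal(table, outer))
```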

### Beware!

In-place operations are prone to bugs when the result container has the wrong shape:

In [12]:
A = np.random.randint(10, size=(10,10))
B = np.random.randint(10, size=(10,))

In [13]:
A

array([[6, 5, 8, 0, 3, 4, 1, 8, 3, 2],
       [4, 4, 4, 5, 8, 4, 0, 8, 7, 7],
       [9, 9, 6, 7, 9, 7, 2, 0, 8, 2],
       [2, 5, 2, 7, 3, 5, 0, 0, 3, 9],
       [8, 6, 5, 9, 2, 7, 7, 1, 3, 8],
       [5, 3, 0, 1, 7, 4, 9, 0, 3, 9],
       [0, 2, 8, 8, 5, 9, 9, 2, 5, 4],
       [8, 7, 8, 2, 4, 2, 6, 8, 7, 7],
       [6, 9, 2, 8, 6, 3, 6, 2, 4, 1],
       [5, 3, 3, 0, 7, 7, 5, 4, 3, 1]])

In [14]:
B

array([7, 6, 2, 7, 9, 2, 0, 3, 4, 7])

In [15]:
A+B

array([[13, 11, 10,  7, 12,  6,  1, 11,  7,  9],
       [11, 10,  6, 12, 17,  6,  0, 11, 11, 14],
       [16, 15,  8, 14, 18,  9,  2,  3, 12,  9],
       [ 9, 11,  4, 14, 12,  7,  0,  3,  7, 16],
       [15, 12,  7, 16, 11,  9,  7,  4,  7, 15],
       [12,  9,  2,  8, 16,  6,  9,  3,  7, 16],
       [ 7,  8, 10, 15, 14, 11,  9,  5,  9, 11],
       [15, 13, 10,  9, 13,  4,  6, 11, 11, 14],
       [13, 15,  4, 15, 15,  5,  6,  5,  8,  8],
       [12,  9,  5,  7, 16,  9,  5,  7,  7,  8]])

In [16]:
np.add(A, B)

array([[13, 11, 10,  7, 12,  6,  1, 11,  7,  9],
       [11, 10,  6, 12, 17,  6,  0, 11, 11, 14],
       [16, 15,  8, 14, 18,  9,  2,  3, 12,  9],
       [ 9, 11,  4, 14, 12,  7,  0,  3,  7, 16],
       [15, 12,  7, 16, 11,  9,  7,  4,  7, 15],
       [12,  9,  2,  8, 16,  6,  9,  3,  7, 16],
       [ 7,  8, 10, 15, 14, 11,  9,  5,  9, 11],
       [15, 13, 10,  9, 13,  4,  6, 11, 11, 14],
       [13, 15,  4, 15, 15,  5,  6,  5,  8,  8],
       [12,  9,  5,  7, 16,  9,  5,  7,  7,  8]])

This one works, because `A` already has the broadcast result shape `(10, 10)`:

In [17]:
np.multiply(A, B, out=A)

array([[42, 30, 16,  0, 27,  8,  0, 24, 12, 14],
       [28, 24,  8, 35, 72,  8,  0, 24, 28, 49],
       [63, 54, 12, 49, 81, 14,  0,  0, 32, 14],
       [14, 30,  4, 49, 27, 10,  0,  0, 12, 63],
       [56, 36, 10, 63, 18, 14,  0,  3, 12, 56],
       [35, 18,  0,  7, 63,  8,  0,  0, 12, 63],
       [ 0, 12, 16, 56, 45, 18,  0,  6, 20, 28],
       [56, 42, 16, 14, 36,  4,  0, 24, 28, 49],
       [42, 54,  4, 56, 54,  6,  0,  6, 16,  7],
       [35, 18,  6,  0, 63, 14,  0, 12, 12,  7]])

This one will not, because the `(10, 10)` result does not fit into `B` of shape `(10,)` (although broadcasting itself is fine for addition):

In [18]:
np.add(A, B, out=B)

ValueError: ignored
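One way to guard against this kind of error is to compare the broadcast result shape with the output container before writing into it. The sketch below uses stand-in arrays with the same shapes as `A` and `B` (the values are illustrative), and the `failed` flag and the separately allocated `out` array are assumptions made for the example.

```python
import numpy as np

A = np.ones((10, 10), dtype=np.int64)
B = np.arange(10)

# The sum of a (10, 10) and a (10,) array has shape (10, 10).
result_shape = np.broadcast(A, B).shape

# Writing that result into B fails: a (10,) container cannot hold it.
try:
    np.add(A, B, out=B)
    failed = False
except ValueError:
    failed = True

# A container with the broadcast shape works.
out = np.empty(result_shape, dtype=np.result_type(A, B))
np.add(A, B, out=out)

print(failed)
print(out.shape)
```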