# Why NumPy?

[NumPy](https://numpy.org/) is a Python library for numerical computing. It specializes in fast math on large arrays of values, which is otherwise very slow in pure Python.

This notebook demonstrates that pure Python is slow even for simple math, and that equivalent operations on NumPy arrays are much faster. It also demonstrates the importance of interacting with NumPy on NumPy's terms: accidentally introducing native Python types and functionality into NumPy operations can dramatically slow down your program.

In [2]:
# By convention, NumPy is usually imported as "np"
import numpy as np
from random import randint

In [3]:
# Create a native Python list of 1,000,000 random integers.
a = [randint(-1000, 1000) for _ in range(1000000)]

# Additionally, convert the list of 1,000,000 integers to a NumPy array.
a_arr = np.array(a)

In [4]:
# Examine the first 10 values in the Python list
a[:10]

[809, -616, -455, -4, -653, -914, -618, -576, 171, -249]

In [5]:
# Examine the first 10 values in the NumPy array.
#
# Note that we can use familiar Python operators on NumPy
# arrays: indexing and slicing work the same as they do on
# Python lists.
#
# Also note that NumPy arrays differentiate themselves from
# Python lists by surrounding their display output with "array()".
a_arr[:10]

array([ 809, -616, -455,   -4, -653, -914, -618, -576,  171, -249])
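
Besides the `array()` wrapper in its display, an array differs from a list in that all of its elements share one data type. The sketch below is illustrative only: the names `small` and `mixed` are not part of this notebook, and the exact integer dtype depends on your platform and NumPy version.

```python
import numpy as np

# A tiny stand-in for a_arr (hypothetical name, not used elsewhere in the notebook).
small = np.array([809, -616, -455])
print(type(small))    # <class 'numpy.ndarray'>
print(small.dtype)    # a single integer dtype shared by every element
print(small.shape)    # (3,)

# Mixing ints and floats does not produce a mixed array:
# NumPy picks one dtype that can hold every value.
mixed = np.array([1, 2.5])
print(mixed.dtype)    # float64
```

Because every element has the same type and size, NumPy can store the values in one contiguous block of memory rather than as separate Python objects.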

In [6]:
# Example 1: Write a function that increments every value in a Python list by 1.
#
# The algorithm is simple:
#   1. Access element i of the list.
#   2. Increment that element by 1.
#   3. Save the result back in element i of the list.
def increment(lst):
  for i in range(len(lst)):
    lst[i] += 1

# Time the increment() function on the Python list. With -n 10, %timeit calls
# it 10 times per run and repeats for 7 runs by default, so the list is
# incremented 70 times in total.
#
# Running times vary. In Colab, it should be between 100-200 ms per call.
#
# Consider what it takes to run this algorithm. For every element in the list,
# Python must:
#   - Determine the type of the element
#   - Determine the type of the value being added to the element
#   - Determine the correct operation to perform (integer addition).
%timeit -n 10 increment(a)

55.6 ms ± 794 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [7]:
# Time the increment() function on the NumPy array (10 calls per run, 7 runs).
# This will increment the contents of the array 70 times.
#
# Running times vary. In Colab, it should be between 200-400 ms per call.
#
# Why is this slower? Consider what has to happen. For every element in the
# NumPy array, Python must:
#   - Pull the i-th element from the NumPy array and package it as a new
#     Python object (a NumPy integer scalar)
#   - Determine the type of the element
#   - Determine the type of the value being added to the element
#   - Determine the correct operation to perform (integer addition)
#   - Unpack the integer value from the result and copy it into the NumPy array
#
# This requires many more steps than the list alone!
%timeit -n 10 increment(a_arr)

188 ms ± 880 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
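
The extra packaging step described above can be observed directly. The following is a minimal sketch using small stand-in data (`values` and `values_arr` are hypothetical names, not the notebook's `a` and `a_arr`); it only inspects the type of object that each container hands back on indexing.

```python
import numpy as np

values = [809, -616, -455]       # small stand-in for the list `a`
values_arr = np.array(values)    # small stand-in for the array `a_arr`

list_elem = values[0]       # the list returns the int object it already holds
arr_elem = values_arr[0]    # the array wraps the raw value in a new scalar object

print(type(list_elem))      # <class 'int'>
print(type(arr_elem))       # a NumPy integer scalar type such as numpy.int64

# Same numeric value, different kind of object.
print(list_elem == arr_elem)    # True
```

Creating one such wrapper object per element, a million times per call, is a large part of why the Python loop runs slower on the array than on the list.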


In [8]:
# Example 2: Write a function that increments every value in a NumPy array by 1.
#
# In this function, there is no loop to write. NumPy overloads most
# Python operators and implements them in highly efficient compiled code that
# runs outside the Python interpreter.
#
# NumPy interprets "arr += 1" as "add 1 to every element of the array."
def increment_arr(arr):
  arr += 1

# Time the increment_arr() function on the NumPy array (10 calls per run, 7 runs).
# This will increment the contents of the array 70 times.
#
# Running times vary. In Colab, it should be around 400 µs (microseconds).
#
# This is _significantly faster_ than math with native Python objects.
%timeit -n 10 increment_arr(a_arr)

352 µs ± 36.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
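
`%timeit` is an IPython magic, so it only works inside a notebook or IPython shell. If you want to repeat the comparison in a plain Python script, one possible approach uses the standard-library `timeit` module. This is a sketch with a smaller data set (100,000 elements, an arbitrary choice) and hypothetical variable names; the timings it prints will differ from the ones recorded in this notebook.

```python
import timeit
import numpy as np

n = 100_000                    # smaller than the notebook's 1,000,000 to keep it quick
data = list(range(n))          # native Python list
data_arr = np.arange(n)        # NumPy array holding the same values

def increment(lst):
    # Python-level loop: one interpreter round trip per element.
    for i in range(len(lst)):
        lst[i] += 1

def increment_arr(arr):
    # Vectorized: a single call into NumPy's compiled code.
    arr += 1

calls = 5
loop_time = timeit.timeit(lambda: increment(data), number=calls) / calls
vec_time = timeit.timeit(lambda: increment_arr(data_arr), number=calls) / calls

print(f"Python loop on list: {loop_time * 1e3:.2f} ms per call")
print(f"NumPy += on array:   {vec_time * 1e3:.3f} ms per call")
```

Both containers are modified in place, so after the timing runs each one has been incremented `calls` times.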


In [9]:
# A quick demonstration of math with NumPy arrays is below.
#
# You cannot do element-wise math on a Python list with arithmetic operators.
# This is because:
#   - Lists are general-purpose and can contain a mix of different values.
#   - The "+" operator is defined as list concatenation instead of addition when
#     used with Python lists.
[1, 2, 3] + 1

TypeError: can only concatenate list (not "int") to list

In [10]:
# However, you can concatenate lists using the + operator:
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]
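
To do element-wise math on plain lists, you have to write the loop yourself. A list comprehension is the usual way to do that; the sketch below uses hypothetical variable names and shows both "add 1 to each element" and "add two lists pairwise".

```python
values = [1, 2, 3]

# Add 1 to every element: the loop is explicit and runs in the interpreter.
incremented = [x + 1 for x in values]
print(incremented)    # [2, 3, 4]

# Add two lists pairwise with zip().
pairwise = [x + y for x, y in zip([1, 2, 3], [4, 5, 6])]
print(pairwise)       # [5, 7, 9]
```

This works, but it is the same per-element Python loop that was slow in Example 1.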

In [11]:
# In contrast, NumPy arrays overload the addition (+) and addition with
# assignment (+=) operators. NumPy interprets the line of code below as
# "add 1 to every element of the array".
np.array([1, 2, 3]) + 1

array([2, 3, 4])

In [12]:
# What do you think this will do, and why?
np.array([1, 2, 3]) + np.array([4, 5, 6])

array([5, 7, 9])
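
When both operands are arrays of the same shape, `+` adds them element by element instead of concatenating them. The sketch below (variable names are illustrative) shows that the operator gives the same result as calling `np.add` directly, and that arrays whose lengths do not match raise an error instead of being silently combined.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

total = x + y                # operator form
same_total = np.add(x, y)    # equivalent function form
print(total)                 # [5 7 9]

# Arrays of length 3 and length 2 cannot be added element by element.
mismatch_raised = False
try:
    x + np.array([1, 2])
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```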