<center><b>DIGHUM101</b></center>
<center>3-4: NumPy</center>

---

Adapted from Stanford CS 124 NumPy tutorial

Based on the `CS 124: Jupyter and Python Tutorial` created by 
`Krishna Patel (Winter 2020)`, and updated by `Bryan Kim (Winter 2021)` and 
`Dilara Soylu (Winter 2022)`.

Some examples based on the 
[CS 231n Python Numpy Tutorial (with Jupyter and Colab)](https://cs231n.github.io/python-numpy-tutorial/) by Justin Johnson. 

<a id='overview'></a>
## Overview

In this tutorial, we will walk you through some `NumPy` examples.
`NumPy` is a very popular `Python` library used for matrix operations and linear
algebra.
The purpose of this notebook is to give a basic introduction to `NumPy` for 
students who haven't used it before, and an easy review for those who have.
Learning `NumPy` is well worth the effort, as you will be using it constantly if
you choose to take further ML/AI courses.

<a id='contents'></a>
## Contents

1. [`NumPy` Exercises](#numpy_exercises)
   * [Part 1. Basic `NumPy`](#basic_numpy)
   * [Part 2. Indexing and Slicing](#indexing_and_slicing)
   * [Part 3. Array Math and Functions](#array_math_and_functions)
   * [Part 4. Vectorization](#vectorization)
   * [Part 5. Broadcasting](#broadcasting)
2. [Next Steps](#next_steps)

<a id='numpy_exercises'></a>
## `NumPy` Exercises

<a id='basic_numpy'></a>

### Part 1. Basic `NumPy`

In [1]:
# Let's import numpy (aliasing the import as np is traditional)
import numpy as np

The basic building blocks of `NumPy` are arrays, which are represented with the
`np.ndarray` type.
Arrays represent multi-dimensional matrices (often also referred to as tensors).

We can easily create a 1-D array (a vector) by calling `np.array()` and passing
in a `Python` list with the data that should go into the array.

**A 1-D array is often referred to as a *vector*.**
**Meanwhile, a 2-D array is a *matrix*, and a 3-D array (or higher) is frequently called a *tensor*.**

In [2]:
# This is an array (of type np.ndarray)
a = np.array([1, 2, 3, 4, 5])

print("a is an: {}".format(type(a)))
a

a is an: <class 'numpy.ndarray'>


array([1, 2, 3, 4, 5])

A 1-D array has one dimension with shape (n,), so it often looks like a single row. In practice, though, a 1-D array is simply a vector; it doesn’t strictly have rows or columns until it gains extra dimensions.
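
The distinction can be checked directly. The sketch below is only an illustration; the names `vec`, `row`, and `col` are hypothetical and are not used elsewhere in this notebook. It compares a 1-D array with two 2-D arrays that hold the same values:

```python
import numpy as np

vec = np.array([1, 2, 3])          # 1-D: shape (3,)
row = np.array([[1, 2, 3]])        # 2-D with one row: shape (1, 3)
col = np.array([[1], [2], [3]])    # 2-D with one column: shape (3, 1)

# ndim is the number of dimensions, shape is the length of each dimension
for name, arr in [("vec", vec), ("row", row), ("col", col)]:
    print("{}: ndim = {}, shape = {}".format(name, arr.ndim, arr.shape))
```

Only `row` and `col` have a row/column orientation; `vec` has a single dimension of length 3.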

Arrays can have different numbers of dimensions, different shapes, and contain
elements of different types.
Very frequently you'll want to check these properties (especially shape), 
which you can do like this:

In [3]:
# a is an array containing integers (in this case, 64-bit integers)
print("Data type of a: {}".format(a.dtype))

# a is a 1-dimensional array of length 5
print("Shape of a: {}".format(a.shape))

Data type of a: int64
Shape of a: (5,)


In [4]:
# Note that b contains floating point numbers, not integers
b = np.array([5.0, 1.0])

print("Data type of b: {}".format(b.dtype))
print("Shape of b: {}".format(b.shape))
b

Data type of b: float64
Shape of b: (2,)


array([5., 1.])

As we noted above, you can initialize a 1-D array with a `Python` list.
However, most of the time we're interested in higher-dimensional arrays like 
2-D arrays (matrices).
You can initialize them using nested `Python` lists:

In [5]:
# A 2-D array
c = np.array([[1, 2],
              [3, 4],
              [5, 6]])

print("Shape of c: {}".format(c.shape))
c

Shape of c: (3, 2)


array([[1, 2],
       [3, 4],
       [5, 6]])

Note that the length of the __outer__ list is the first dimension (in this
case of length 3), while the lengths of the __inner__ lists are the second
dimension (in this case of length 2).

It's easy to get your dimensions/shapes mixed up if you get the order confused.
Just remember that the number of (horizontal) rows is always the first
dimension, and the number of (vertical) columns is the second dimension.

In addition to the basic `np.array`, `NumPy` provides a bunch of other convenient
methods to create different types of arrays/matrices (to save you from
typing out the data by hand, or having to use loops or `Python` list
comprehensions). 
Some useful ones include:

In [6]:
# To create an all-zero array with the given shape (3, 4)
zeros = np.zeros((3, 4))
zeros

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [7]:
# To create an all-ones array
ones = np.ones((2, 2))
ones

array([[1., 1.],
       [1., 1.]])

In [9]:
# To create an uninitialized array (junk values). This is useful if you know
# you're going to manually fill/overwrite the entire array anyways (it saves
# the time NumPy would have spent to set every entry to a particular value)
# NOTE: Do NOT confuse this with np.zeros(). "empty" does not mean all zeroes,
# it just means we don't care what is in it. It could be all zeros, it could
# be all ones, it could be anything at all.
empty = np.empty((2, 2))
empty

array([[1., 1.],
       [1., 1.]])

In [10]:
# To create an array filled with a single value
filled = np.full((3, 3), 5)
filled

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])

In [11]:
# To create an identity matrix
identity = np.identity(3)
identity

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [13]:
# To create an array filled with random values sampled uniformly
# from [0.0, 1.0): values can be as small as 0.0 but are always less than 1.0
random = np.random.random((2, 2))
random

array([[0.51414433, 0.39914552],
       [0.69190239, 0.35594345]])

Finally, once we have an array that we've initialized with data, we can also
reshape it without changing its data using the array's `reshape` method!

Check out this example:

In [14]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print("a.shape = {}".format(a.shape))
a

a.shape = (2, 3)


array([[1, 2, 3],
       [4, 5, 6]])

In [15]:
a_reshaped = a.reshape((3, 2))
print("a_reshaped.shape = {}".format(a_reshaped.shape))
a_reshaped

a_reshaped.shape = (3, 2)


array([[1, 2],
       [3, 4],
       [5, 6]])

In [16]:
a_reshaped = a.reshape((6,))
print("a_reshaped.shape = {}".format(a_reshaped.shape))
a_reshaped

a_reshaped.shape = (6,)


array([1, 2, 3, 4, 5, 6])

In [19]:
a_2d = a_reshaped.reshape((2, 3))
print("a_2d.shape = {}".format(a_2d.shape))
a_2d

a_2d.shape = (2, 3)


array([[1, 2, 3],
       [4, 5, 6]])

In [17]:
# Note what happens if we try to reshape to a shape with a different number
# of elements:
a_reshaped = a.reshape((7,))

ValueError: cannot reshape array of size 6 into shape (7,)
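
A reshape only succeeds when the total number of elements stays the same. As a small sketch (the names `m`, `flat`, `tall`, and `reshape_failed` are illustrative, not part of the original notebook), you can check the element count with `.size` and catch the error instead of letting it stop the notebook:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

# size is the total number of elements: 2 * 3 = 6
print("m.size = {}".format(m.size))

# Any target shape whose dimensions multiply to 6 works
flat = m.reshape((6,))
tall = m.reshape((3, 2))

# A shape with a different element count raises a ValueError
try:
    m.reshape((7,))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("Reshape failed: {}".format(err))
```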

<a id='indexing_and_slicing'></a>
### Part 2. Indexing and Slicing

Arrays can be initialized with lists, as we saw, and in many ways they behave
a lot like `Python` lists!
You can access specific elements by their index (starting with zero, just like 
in `Python` lists), and modify them:

In [20]:
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

In [21]:
b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

In [22]:
# You can access elements in an array like a Python list by indexing:
print("a[0] = {}".format(a[0]))
print("a[3] = {}".format(a[3]))

a[0] = 1.0
a[3] = 4.0


In [23]:
# You can index into higher-dimensional arrays the same way.

# NOTE: If you had nested python lists instead of a NumPy array, you'd
# need to do something like b[0][1] instead of b[0, 1], so it's a little
# different, but the idea is the same. The b[0][1] syntax will also work
# for NumPy arrays.
print("b[0, 0] = {}".format(b[0, 0]))
print("b[2, 2] = {}".format(b[2, 2]))
print("b[1, 2] = {}".format(b[1, 2]))

b[0, 0] = 1
b[2, 2] = 9
b[1, 2] = 6


In [24]:
# You can also modify elements just like in a Python list
print("a before:\n {}\n".format(a))
a[2] = 9.0
print("a after:\n {}\n".format(a))

a before:
 [1. 2. 3. 4. 5.]

a after:
 [1. 2. 9. 4. 5.]



`NumPy` also supports more complex forms of indexing (like slicing), which
also behaves similarly to `Python` lists:

In [25]:
# This is just a Python list for comparison
example_list = [1, 2, 3, 4, 5, 6]

In [26]:
# Recall how we can slice a Python list

# This gives a slice containing every element starting from position/index 1
# (inclusive) up to but EXCLUDING the element at position/index 3
print("example_list[1:3] = {}".format(example_list[1:3]))

# You can also omit the start and end index and it will use the start/end of the
# list instead
print("example_list[:3] = {}".format(example_list[:3]))
print("example_list[1:] = {}".format(example_list[1:]))
print("example_list[:] = {}".format(example_list[:]))

example_list[1:3] = [2, 3]
example_list[:3] = [1, 2, 3]
example_list[1:] = [2, 3, 4, 5, 6]
example_list[:] = [1, 2, 3, 4, 5, 6]


In [27]:
# NumPy arrays use the same slicing syntax

a = np.array([1, 2, 3, 4, 5, 6])

In [28]:
print("a[1:3] = {}".format(a[1:3]))
print("a[:3] = {}".format(a[:3]))
print("a[1:] = {}".format(a[1:]))
print("a[:] = {}".format(a[:]))

a[1:3] = [2 3]
a[:3] = [1 2 3]
a[1:] = [2 3 4 5 6]
a[:] = [1 2 3 4 5 6]


In [29]:
# index with a list
a[[1,3,4]]

array([2, 4, 5])

We can modify slices just like how we modified elements using indexing:

In [30]:
print("a before:\n {}\n".format(a))
a[1:3] = [8, 9]
print("a after:\n {}\n".format(a))

a before:
 [1 2 3 4 5 6]

a after:
 [1 8 9 4 5 6]
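
One difference from `Python` lists is worth knowing here: slicing a list makes a copy, while a basic `NumPy` slice is a view onto the same data, so writing through the slice also changes the original array. A minimal sketch (all variable names are illustrative):

```python
import numpy as np

py_list = [1, 2, 3, 4]
list_slice = py_list[1:3]      # a new list (a copy)
list_slice[0] = 99
print(py_list)                 # the original list is unchanged

arr = np.array([1, 2, 3, 4])
arr_slice = arr[1:3]           # a view onto arr's data
arr_slice[0] = 99
print(arr)                     # the original array changed too

# Use .copy() when you need an independent array
safe = arr[1:3].copy()
safe[1] = -1
print(arr)                     # not affected by the change to safe
```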



We can also slice multi-dimensional arrays.
Try to figure out what each of the expressions below will give before running 
them, to check your intuition:

In [31]:
b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

In [32]:
print("b[:, :] =>\n {}\n".format(b[:, :]))
print("b[1, :] =>\n {}\n".format(b[1, :]))

# Note that selecting a single column returns a 1-D array, which is printed horizontally
print("b[:, 0] =>\n {}\n".format(b[:, 0]))
print("b[1, :2] =>\n {}\n".format(b[1, :2]))

b[:, :] =>
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

b[1, :] =>
 [4 5 6]

b[:, 0] =>
 [1 4 7]

b[1, :2] =>
 [4 5]



In [33]:
# Modifying a slice of a 2-D array

print("b before:\n {}\n".format(b))
b[:, 0] = 0
print("b after:\n {}\n".format(b))

b before:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]

b after:
 [[0 2 3]
 [0 5 6]
 [0 8 9]]



It's important to think carefully about the shapes/dimensions of the slices
that you extract.
Depending on how you slice, your result could have fewer dimensions than the 
original array, or the same number.

Think about what shapes you would expect these slices to be, then run the cell 
below to double-check:

Do the results make sense to you? 
If not, can you figure out why the shapes came out as they did?

In [34]:
print("shape(b[:, :]) = {}".format(b[:, :].shape))
print("shape(b[1, :]) = {}".format(b[1, :].shape))
print("shape(b[1:2, :]) = {}".format(b[1:2, :].shape))
print("shape(b[:, 0]) = {}".format(b[:, 0].shape))
print("shape(b[:, 0:1]) = {}".format(b[:, 0:1].shape))

shape(b[:, :]) = (3, 3)
shape(b[1, :]) = (3,)
shape(b[1:2, :]) = (1, 3)
shape(b[:, 0]) = (3,)
shape(b[:, 0:1]) = (3, 1)


<a id='array_math_and_functions'></a>
### Part 3. Array Math and Functions

We covered creating arrays and reading/writing elements in them, but the main
reason we use `NumPy` in the first place is to do math with arrays (linear
algebra).
For the most part, array/vector math in `NumPy` is straightforward and intuitive.
You can add, subtract, multiply, etc. `NumPy` arrays as if they were 
`Python` numbers, and `NumPy` will take care of everything for you!

For example:

In [37]:
# 1-D Example
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

In [None]:
# Element-wise addition
print("Element-wise addition (a + b):", a + b)

Element-wise addition (a + b): [5 7 9]


Element-wise addition means that each element in a is added to its counterpart in b.

So we added 1 + 4 -> 5, 2 + 5 -> 7, 3 + 6 -> 9

Think of it like this

[1 | 2 | 3]

[4 | 5 | 6]

\-\-\-\-\-\-\-\-\-

[5 | 7 | 9]

In [39]:
# Element-wise subtraction
print("Element-wise subtraction (a - b):", a - b)

Element-wise subtraction (a - b): [-3 -3 -3]


In [None]:
# Element-wise subtraction other way around
print("Element-wise subtraction (b - a):", b - a)

Element-wise subtraction (b - a): [3 3 3]


In [None]:
# Element-wise multiplication
print("Element-wise multiplication (a * b):", a * b)
# Here, 1*4 -> 4, 2*5 -> 10, 3*6 -> 18

Element-wise multiplication (a * b): [ 4 10 18]


In [42]:
# Element-wise division
print("Element-wise division (a / b):", a / b)

Element-wise division (a / b): [0.25 0.4  0.5 ]


In [43]:
# Element-wise division other way around
print("Element-wise division (b / a):", b / a)

Element-wise division (b / a): [4.  2.5 2. ]


The same principles of element-wise operations apply to matrices (2-D arrays):

In [46]:
a = np.array([[1,2],
              [3,4]])

In [47]:
b = np.array([[3,3],
              [4,4]])


Note that for all of the element-wise operations below, the two
arrays being added/subtracted/multiplied etc. must have the same shape!
(We will talk about an exception to this rule in Part 5, Broadcasting.)

Also note that for many matrix operations, there are two ways of
writing them in `NumPy`. 
Either you can call a dedicated function, or you can use the standard 
"+, -, *, /" operators on `NumPy` arrays as if they were just numbers.

In [48]:
# To take an element-wise sum
print("a + b =>\n {}\n".format(a + b))
print("np.add(a, b) =>\n {}\n".format(np.add(a, b)))

# To take an element-wise difference
print("b - a =>\n {}\n".format(b - a))
print("np.subtract(b, a) =>\n {}\n".format(np.subtract(b, a)))

# To take an element-wise product
print("a * b =>\n {}\n".format(a * b))
print("np.multiply(a, b) =>\n {}\n".format(np.multiply(a, b)))

# To take an element-wise quotient
print("a / b =>\n {}\n".format(a / b))
print("np.divide(a, b) =>\n {}\n".format(np.divide(a, b)))

a + b =>
 [[4 5]
 [7 8]]

np.add(a, b) =>
 [[4 5]
 [7 8]]

b - a =>
 [[2 1]
 [1 0]]

np.subtract(b, a) =>
 [[2 1]
 [1 0]]

a * b =>
 [[ 3  6]
 [12 16]]

np.multiply(a, b) =>
 [[ 3  6]
 [12 16]]

a / b =>
 [[0.33333333 0.66666667]
 [0.75       1.        ]]

np.divide(a, b) =>
 [[0.33333333 0.66666667]
 [0.75       1.        ]]



In [49]:
# Note what happens when the shapes don't match

a = np.array([1, 2, 3])
b = np.array([1, 2])

a + b

ValueError: operands could not be broadcast together with shapes (3,) (2,) 

`NumPy` also provides a bunch of super-useful functions to compute mathematical
functions of NumPy arrays, or do other common operations.
Some commonly used ones that you might find useful include:

In [72]:
a = np.array([1, 2, 3])

b = np.array([[1, 2, 3],
              [4, 5, 6]])

c = np.array([2, 3, 4])

d = np.array([[3, 4],
              [5, 6],
              [7, 8]])

In [51]:
# Sum all the elements in an array
np.sum(a)

np.int64(6)

In [None]:
# Sum elements along an axis/dimension
np.sum(b, axis=1)

array([ 6, 15])

In [56]:
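# Let's study this a bit more
# (a new name is used so that c, defined above, stays available for later cells)
row_sums = np.sum(b, axis=1)
print(row_sums.shape)
row_sums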

(2,)


array([ 6, 15])

In [57]:
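# What if we sum along axis=0?
# (a new name is used so that d, defined above, stays available for later cells)
col_sums = np.sum(b, axis=0)
print(col_sums.shape)
col_sums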

(3,)


array([5, 7, 9])

`np.sum(b, axis=1)` sums the values in each row and returns a 1-D array.

`np.sum(b, axis=0)` sums the values in each column and also returns a 1-D array.

The main difference is in the shape.

Our `NumPy` array `b` has the shape **(2, 3)**.

This means that if we sum each row, we get a new array with the shape **(2,)**.

And if we sum each column, our new array has the shape **(3,)**.

> In other words, for a 2D array with shape (m, n), axis=0 sums columns (results in a shape (n,)), while axis=1 sums rows (results in a shape (m,)). 
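
One way to remember this is that the axis you name is the one that disappears from the shape. The sketch below checks that on a hypothetical (2, 3) array `m`. It also uses the `keepdims=True` argument of `np.sum`, which the text above does not cover; it keeps the summed axis as a dimension of length 1:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

col_sums = np.sum(m, axis=0)       # axis 0 (length 2) is summed away
row_sums = np.sum(m, axis=1)       # axis 1 (length 3) is summed away
print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)

# keepdims=True keeps the summed axis as a dimension of length 1
row_sums_2d = np.sum(m, axis=1, keepdims=True)
print(row_sums_2d.shape)
```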



In [58]:
# Take (natural) log element-wise
np.log(a)

array([0.        , 0.69314718, 1.09861229])

### What are Logarithms?

A **logarithm** is the inverse operation of **exponentiation**. In simple terms, it answers the question: "To what power must a certain base be raised to produce a given number?"

For example, in the equation:

`b^x = y`

The **logarithm** of `y` to the base `b` is `x`, written as:

`log_b(y) = x`

Here:
- `b` is the **base** of the logarithm,
- `y` is the number we want to find the logarithm of, and
- `x` is the power (or exponent) that the base must be raised to, in order to get `y`.

#### Examples:
- `log₂(8) = 3` because `2^3 = 8`
- `log₁₀(1000) = 3` because `10^3 = 1000`

### Natural Logarithm

The **natural logarithm**, denoted as `ln(x)`, is a specific type of logarithm where the base is the constant **e** (approximately 2.718), also known as Euler's number. The constant e and the exponential function `e^x` are particularly useful in fields such as mathematics, physics, and computer science for modeling natural growth (or decline) processes and other phenomena that change continuously over time.

### np.log()

`np.log()` computes the natural logarithm (base e). For other bases, `NumPy` provides dedicated functions such as `np.log2()` (base 2) and `np.log10()` (base 10).

The natural logarithm is the default log calculation in machine learning. This is partly because of its role in modeling such phenomena, and partly because the derivative of `ln(x)` is `1/x`, which makes it easy to use in backpropagation, an integral part of training neural networks.
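
As a small sketch of how these functions relate (the array `x` and the name `log3_of_81` are illustrative): `np.exp` undoes `np.log`, and a base without a dedicated function can be handled with the change-of-base formula `log_b(y) = ln(y) / ln(b)`.

```python
import numpy as np

x = np.array([1.0, 8.0, 1000.0])

print(np.log(x))             # natural log (base e)
print(np.log2(x))            # base 2
print(np.log10(x))           # base 10

# exp is the inverse of the natural log, so this recovers x
print(np.exp(np.log(x)))

# Change of base: log base 3 of 81 is 4 because 3^4 = 81
log3_of_81 = np.log(81.0) / np.log(3.0)
print(log3_of_81)
```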


In [60]:
# log base 2 element-wise
np.log2(a)

array([0.       , 1.       , 1.5849625])

In [61]:
# Take exponential (e^x) element-wise
np.exp(a)

array([ 2.71828183,  7.3890561 , 20.08553692])

In [62]:
# Take square root element-wise
np.sqrt(a)

array([1.        , 1.41421356, 1.73205081])

In [63]:
# Get maximum element in an array
print("np.max(a) = {}".format(np.max(a)))
print("np.max(b) = {}".format(np.max(b)))
print("np.max(b, axis=1) = {}".format(np.max(b, axis=1)))

np.max(a) = 3
np.max(b) = 6
np.max(b, axis=1) = [3 6]


In [64]:
# Get index of maximum element in an array
np.argmax(a)

np.int64(2)

In [65]:
# You can do this with 2-D arrays as well
print("np.argmax(b) = {}".format(np.argmax(b)))
print("np.argmax(b, axis=0) = {}".format(np.argmax(b, axis=0)))
print("np.argmax(b, axis=1) = {}".format(np.argmax(b, axis=1)))

np.argmax(b) = 5
np.argmax(b, axis=0) = [1 1 1]
np.argmax(b, axis=1) = [2 2]


In [69]:
# Let's take a look at the index of the maximum element in a 2-D array
# this is the same b, repeated from above

b = np.array([[1, 2, 3],
              [4, 5, 6]])

max_value = np.max(b)
print("Maximum value in b: {}".format(max_value))

idx = np.argmax(b)
print("Index of maximum element in b: {}".format(idx))

# The index is in 1-D, so we can convert it to 2-D coordinates
row, col = np.unravel_index(idx, b.shape)
print(row, col) 

Maximum value in b: 6
Index of maximum element in b: 5
1 2


In [70]:
# Get average of all elements in array
np.mean(a)

np.float64(2.0)

In [73]:
# Take a dot product between two arrays
# In this case, two vectors (1-D arrays)
np.dot(a, c)

np.int64(20)

### Dot Product in Matrices

The **dot product** (also known as the **scalar product**) in the context of matrices is an operation that takes two vectors or matrices and produces a scalar (a single number) or another matrix, depending on the type of objects involved.

#### Dot Product of Two Vectors

For two vectors **A** and **B**, the dot product is calculated by multiplying corresponding elements and then summing the results. If **A** and **B** are vectors of the same length, say with shape (n,), then:

A = [a₁, a₂, ..., aₙ]

B = [b₁, b₂, ..., bₙ]

The **dot product** A ⋅ B is defined as:

A ⋅ B = a₁ ⋅ b₁ + a₂ ⋅ b₂ + ... + aₙ ⋅ bₙ

This results in a scalar (single number), which represents a measure of the *similarity* or *correlation* between the two vectors.
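
The definition can be written out step by step and compared with `np.dot`. This is a sketch using two hypothetical vectors `u` and `v`, not arrays from the cells above:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

# Multiply corresponding elements, then sum the products
by_hand = 0
for u_i, v_i in zip(u, v):
    by_hand += u_i * v_i

# The same computation using array operations
elementwise_then_sum = np.sum(u * v)
with_dot = np.dot(u, v)

# All three are 1*4 + 2*5 + 3*6 = 32
print(by_hand, elementwise_then_sum, with_dot)
```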

### Our example

a = [1, 2, 3]

c = [2, 3, 4]

a ⋅ c = 1x2 + 2x3 + 3x4

a ⋅ c = 2 + 6 + 12 = 20




The same dot-product logic can be applied to matrix multiplication, which is different from element-wise multiplication.

In [None]:
# Remember the element-wise multiplication?
# Let's multiply a and c element-wise
print("Element-wise multiplication of a and c: {}".format(a * c))


Element-wise multiplication of a and c: [ 2  6 12]


In [74]:
# Matrix-multiply two arrays
print("b @ d =>\n {}\n".format(b @ d))
print("np.matmul(b, d) =>\n {}\n".format(np.matmul(b, d)))

b @ d =>
 [[34 40]
 [79 94]]

np.matmul(b, d) =>
 [[34 40]
 [79 94]]



Matrix multiplication (also known as the dot product for matrices) takes the dot product of each row of the first matrix with each column of the second matrix, and places each result in the corresponding position of a new matrix.



In [77]:
# Same as above, just a reminder here

a = np.array([1, 2, 3])

b = np.array([[1, 2, 3],
              [4, 5, 6]])

c = np.array([2, 3, 4])

d = np.array([[3, 4],
              [5, 6],
              [7, 8]])

When we do a matrix multiplication, denoted as `@` we are taking the first row of b and getting its dot product with the first column of d.

```
result = [[m, n], 
          [x, y]]
```

first row of b = [1, 2, 3]

first column of d = [3, 5, 7]

m = 1x3 + 2x5 + 3x7 = 3 + 10 + 21 = 34


The shapes of `b` (2, 3) and `d` (3, 2) are different, but matrix multiplication is still valid here because the number of columns in `b` (which is 3) matches the number of rows in `d` (which is also 3). The result has as many rows as `b` and as many columns as `d`, so its shape is (2, 2).
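
To make the row-by-column rule concrete, here is a sketch that fills in every entry of the result with an explicit dot product and compares it with `@`. The arrays repeat `b` and `d` from above; the loop and the names `n_rows`, `inner`, `n_cols`, and `result` are illustrative only:

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)
d = np.array([[3, 4],
              [5, 6],
              [7, 8]])         # shape (3, 2)

n_rows, inner = b.shape
inner_d, n_cols = d.shape
assert inner == inner_d        # columns of b must match rows of d

# Entry (i, j) is the dot product of row i of b with column j of d
result = np.zeros((n_rows, n_cols), dtype=b.dtype)
for i in range(n_rows):
    for j in range(n_cols):
        result[i, j] = np.dot(b[i, :], d[:, j])

print(result)
print(np.array_equal(result, b @ d))
```

In practice you would always use `@` or `np.matmul`; the loops are only there to show what is being computed.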

In [78]:
# Get the norm (magnitude) of a matrix/vector
# You can specify different types of norm, but by default it does the
# 2-norm, which is the only vector norm that we'll see in this class.
np.linalg.norm(a)

np.float64(3.7416573867739413)

The norm of a matrix or a vector is a measure of its "size" or "magnitude." It provides a scalar value that quantifies how large the matrix or vector is. For a vector, the norm gives a measure of its length in the vector space.

This will come in handy when we are doing comparisons between vectors.
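
As a sketch of what the default 2-norm computes (the names `v`, `by_hand`, and `unit` are illustrative), it is the square root of the sum of the squared elements. Dividing a vector by its norm rescales it to length 1, which is one common preparation step before comparing vectors:

```python
import numpy as np

v = np.array([1, 2, 3])

# The 2-norm is the square root of the sum of squared elements
by_hand = np.sqrt(np.sum(v ** 2))
with_norm = np.linalg.norm(v)
print(by_hand, with_norm)

# Dividing a vector by its norm gives a vector of length 1
unit = v / with_norm
print(np.linalg.norm(unit))
```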

__NOTE:__ In general, for most `NumPy` functions that operate on an entire 
array, you can also specify a dimension or dimensions to apply it along
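
For instance, a short sketch with a hypothetical array `scores` (not defined elsewhere in this notebook), using the `axis` argument of three functions introduced above:

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

col_means = np.mean(scores, axis=0)          # mean of each column
row_maxes = np.max(scores, axis=1)           # max of each row
row_norms = np.linalg.norm(scores, axis=1)   # 2-norm of each row

print(col_means)
print(row_maxes)
print(row_norms)
```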

__NOTE:__ There are also other packages that build off of `NumPy` or
use `NumPy` arrays.
We will very briefly encounter a few examples later in the class like `SciPy`
and `PyTorch`. 
For the most part, these packages tend to work very similarly to the built-in 
`NumPy` functions above.

<a id='vectorization'></a>
### Part 4. Vectorization

Vectorization is the process of converting operations on scalar values (individual numbers) into operations on vectors (arrays of numbers). It allows you to perform computations on entire arrays or matrices at once, rather than using explicit loops to process each element one by one. This results in more efficient, faster, and parallelizable operations.

In [91]:
arrays = np.random.random((1000, 1000))  # a 2-D array with 1000 rows and 1000 columns
other_array = np.random.random(1000)  # a 1-D array with 1000 elements

In [97]:
%%time

# Compute the max inner (dot) product with explicit Python loops
max_i = 0
max_val = 0
for i in range(1000):
    idp = 0
    for j in range(1000):
        idp += arrays[i][j] * other_array[j]
    if idp > max_val:
        max_val = idp
        max_i = i
print(max_i, max_val)
        


765 267.21449646836396
CPU times: user 701 ms, sys: 7.85 ms, total: 709 ms
Wall time: 713 ms


In [96]:
%%time
# let's use numpy's dot product

idps = np.dot(arrays, other_array)
print(np.argmax(idps), np.max(idps))


765 267.2144964683641
CPU times: user 3.44 ms, sys: 361 μs, total: 3.8 ms
Wall time: 919 μs
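
The same pattern applies to simpler loops. As a minimal sketch (the array `values` and both totals are illustrative names), here is a loop that sums squares next to its vectorized equivalent. The two agree up to floating-point rounding, which is also why the two maxima printed above differ slightly in their last digits:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator so the run is repeatable
values = rng.random(1000)

# Loop version: process one element at a time
loop_total = 0.0
for value in values:
    loop_total += value * value

# Vectorized version: one expression over the whole array
vectorized_total = np.sum(values ** 2)

print(np.isclose(loop_total, vectorized_total))
```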


<a id='broadcasting'></a>
### Part 5. Broadcasting

A final topic that may be useful to you is broadcasting. 
It's one of the most useful and powerful features of `NumPy`, because it lets 
you write matrix/array operations in a natural way without having to be too 
specific about what you want `NumPy` to do.

In most cases, `NumPy` can use broadcasting to infer what you wanted to do
and do it. 
Let's look at an example:

In [99]:
# Create a 2-D array of ones with shape (1, 2)
a = np.ones((1,2))
a

array([[1., 1.]])

In [100]:
# Create a single-element 1-D array
b = np.array([4])
b

array([4])

Now recall that earlier, when we tried to add, multiply, subtract, etc. two
arrays, they had to have the same shape! 
Let's see what happens when we do some of these things with a and b:

In [101]:
a + b

array([[5., 5.]])

In [102]:
a - b

array([[-3., -3.]])

In [103]:
a * b

array([[4., 4.]])

In [104]:
a / b

array([[0.25, 0.25]])

Wait, what happened?? 
We just said that the shapes had to match, but `a` and `b` definitely don't 
have the same shape...

This is where broadcasting comes into play. Although `a` and `b` don't have the
same shape, they have __compatible__ shapes, so `NumPy` was able to guess
what we actually wanted to do and __broadcast__ behind the scenes to make the
operation work.

What `NumPy` did behind the scenes in this case is see that `b` and `a` don't 
have the same shape, but also realize that maybe we meant to "re-use" the value 
in `b` for every value in `a`. In other words, it "broadcast" `b` from its
original shape of (1,) to `a`'s shape of (1, 2) by duplicating the element
in `b`.

Once it did that, it could simply do the element-wise operation as usual!

This makes our life a lot easier, as we didn't have to explicitly write out
that duplication and reshaping ourselves.
Of course, that assumes that this behavior is actually what we intended.

In this case, it seems to make good sense: if we tell it to subtract the
array `[4]` from `[[1, 1]]`, it seems reasonable that what we are really asking
for is for it to subtract 4 from each 1 in the first array.

Most of the time, `NumPy` assumes that whenever we do an operation on two arrays
with different shapes, we would like things to be broadcast if possible to
make the operation work.

It will then try to expand the smaller array to match the size of the bigger
one by copying/repeating the data in the smaller array.
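
You can make this expansion explicit with `np.broadcast_to`. The sketch below repeats `a` and `b` from the cells above; the name `b_stretched` is illustrative:

```python
import numpy as np

a = np.ones((1, 2))
b = np.array([4])

# Explicitly stretch b to a's shape; broadcasting does this implicitly
b_stretched = np.broadcast_to(b, a.shape)
print(b_stretched)                     # [[4 4]]

# The implicit and explicit versions give the same result
print(np.array_equal(a - b, a - b_stretched))
```

`np.broadcast_to` returns a read-only view, so no data is physically copied. The "duplication" is best treated as a mental model of what the operation computes.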

Let's try a slightly more complicated example involving 2-D arrays:

In [105]:
# Create a 2-D array of shape (4, 3)
a = np.ones((4,3))
a

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [106]:
# Create an array of shape (4,)
# np.arange(n) creates a vector out of increasing
# integers from 0 to n (exclusive)
b = np.arange(4)

# Reshape to (4, 1)
b = b.reshape((4, 1))
b

array([[0],
       [1],
       [2],
       [3]])

In [107]:
# Multiply them together
b * a

array([[0., 0., 0.],
       [1., 1., 1.],
       [2., 2., 2.],
       [3., 3., 3.]])

Did it do what you expected?
Do you see how broadcasting was applied to create the result?
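
As a rough guide, `NumPy` compares the two shapes starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below applies that rule to the arrays from this example; `broadcast_failed` is an illustrative name:

```python
import numpy as np

a = np.ones((4, 3))
b = np.arange(4).reshape((4, 1))

# (4, 1) and (4, 3): trailing dims 1 and 3 are compatible (one is 1),
# leading dims 4 and 4 are equal, so the result has shape (4, 3)
print((b * a).shape)

# A shape-(4,) array does not work: its only dimension (4) is compared
# with a's trailing dimension (3), and neither of them is 1
try:
    np.arange(4) * a
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("Broadcast failed: {}".format(err))

# A shape-(3,) array does line up with a's trailing dimension
print((np.arange(3) * a).shape)
```

This is why `b` was reshaped to (4, 1) before multiplying: the extra dimension of length 1 tells `NumPy` which axis to stretch.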

Broadcasting also works with scalars. In general, most operations you can do with two scalars, you can do with an array and a scalar and the operation will be broadcast to every element in the array.

This is true for the obvious operations:

In [108]:
a = np.arange(5)
a

array([0, 1, 2, 3, 4])

In [109]:
a + 5

array([5, 6, 7, 8, 9])

In [110]:
a - 1

array([-1,  0,  1,  2,  3])

In [111]:
a * 10

array([ 0, 10, 20, 30, 40])

In [112]:
a / 2

array([0. , 0.5, 1. , 1.5, 2. ])

In [113]:
a ** 2

array([ 0,  1,  4,  9, 16])

It's also true in a lot of ways you might not think of right away but that can be very useful! For example:

In [114]:
a = np.random.random((5, 5))
a

array([[0.3344394 , 0.76879531, 0.70794573, 0.80873786, 0.1740348 ],
       [0.44664426, 0.24944975, 0.81068993, 0.05889276, 0.21912507],
       [0.15208953, 0.11096977, 0.37765491, 0.70219078, 0.3076976 ],
       [0.74590303, 0.06757166, 0.26456117, 0.08814405, 0.36674206],
       [0.01793822, 0.69819903, 0.91136803, 0.60924758, 0.11222834]])

In [115]:
a > 0.5

array([[False,  True,  True,  True, False],
       [False, False,  True, False, False],
       [False, False, False,  True, False],
       [ True, False, False, False, False],
       [False,  True,  True,  True, False]])

Can you think of a way this might be useful?
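
One possible answer, sketched with a small hypothetical array `values`: a boolean array like this can be used as a mask to count, select, or overwrite the entries that satisfy a condition.

```python
import numpy as np

values = np.array([[0.2, 0.9],
                   [0.7, 0.1]])

mask = values > 0.5

# Count how many entries pass the test (True counts as 1)
print(np.sum(mask))          # 2

# Select only the entries where the mask is True
print(values[mask])          # [0.9 0.7]

# Overwrite the entries that fail the test (~ flips True and False)
clipped = values.copy()
clipped[~mask] = 0.0
print(clipped)
```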

The details and different cases for broadcasting can be tricky, even
for experienced `NumPy` users!
So you're certainly not expected to use or need it heavily in this class. 
However, knowing the basics of how it works may make your life a little easier 
and your code a little simpler on some of the homeworks.

If you ever find yourself in a confusing situation involving broadcasting, we
definitely recommend checking out the `NumPy` documentation on
[Broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html).


<a id='next_steps'></a>
## Next Steps

That's it! 
That's all the basic `NumPy` knowledge you need for this course
(possibly you may not even need all of the tools we have introduced here).

If you are interested in learning more about `NumPy`, we recommend:
* [The NumPy User Guide](https://numpy.org/doc/stable/user/index.html)
* [CS 231n NumPy Tutorial](https://cs231n.github.io/python-numpy-tutorial/#numpy),
  which we based much of the content here on, with more advanced topics
