Note: this notebook's cells were taken and adapted from tutorial 1 of the [cs236781 course](https://github.com/vistalab-technion/cs236781-tutorials).

## Numpy

Numpy is a core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

We'll refer to such n-dimensional arrays as **tensors** in accordance with the deep learning terminology.

Although we'll mainly use **PyTorch** tensors for implementing our Deep Learning systems, it's still important to be proficient with `numpy`, since:
1. The concepts are very similar, so understanding numpy ndarrays will help you understand PyTorch tensors.
1. You'll find that you need to switch between the two when working with real DL systems.

To use Numpy, we first need to import the `numpy` package:

In [1]:
import numpy as np

### Arrays

A numpy array represents an n-dimensional grid of values, all of the same type, and is indexed by a tuple of nonnegative integers.

- The **rank** of the array is the number of dimensions it has
- The **shape** of an array is a tuple of integers giving the number of elements along each dimension

We can initialize numpy arrays from nested Python lists, and access elements using square brackets:

In [2]:
a = np.array([1, 2, 3])  # Create a rank 1 array
a

array([1, 2, 3])

In [3]:
a[0]

1

Two very important properties of any numpy array are its `shape` and `dtype`.

In [4]:
def print_arr(arr, pre_text=''):
    print(f'shape={arr.shape} dtype={arr.dtype}:')
    print(f'{pre_text}{arr}\n')

In [5]:
print_arr(a)

shape=(3,) dtype=int64:
[1 2 3]



In [6]:
a[0] = 5                 # Change an element of the array
a

array([5, 2, 3])

In [7]:
b = np.array([[1,2,3],[4,5,6.7]])   # Create a rank 2 array
print_arr(b)

shape=(2, 3) dtype=float64:
[[1.  2.  3. ]
 [4.  5.  6.7]]



In [8]:
b[0, 0], b[0, 1], b[1, 0]

(1.0, 2.0, 4.0)

Numpy also provides many functions to create arrays:

In [9]:
np.zeros((2, 2))  # Create an array of all zeros

array([[0., 0.],
       [0., 0.]])

In [10]:
np.ones((1, 10))   # Create an array of all ones

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

In [11]:
np.full((3, 3), 7.2) # Create a constant array

array([[7.2, 7.2, 7.2],
       [7.2, 7.2, 7.2],
       [7.2, 7.2, 7.2]])

In [12]:
np.eye(4, dtype=int) # Create an identity matrix of integers

array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])

In [13]:
t = np.random.random((4,4,3)) # Create a 3d-array filled with U[0,1] random values
t

array([[[0.52431218, 0.51922641, 0.13759393],
        [0.10720176, 0.1436768 , 0.83489569],
        [0.01032326, 0.50795203, 0.92343308],
        [0.76584871, 0.09869159, 0.66746649]],

       [[0.87913343, 0.06919086, 0.53511182],
        [0.09236758, 0.13314812, 0.59323823],
        [0.45020689, 0.3829606 , 0.32453746],
        [0.04433514, 0.92629976, 0.85813478]],

       [[0.30689412, 0.44209892, 0.19028759],
        [0.43541267, 0.03037649, 0.50366464],
        [0.48038406, 0.57595227, 0.49307728],
        [0.33884244, 0.51389638, 0.22773131]],

       [[0.44911893, 0.3871002 , 0.29651687],
        [0.6266539 , 0.75653455, 0.49302381],
        [0.55672347, 0.42744981, 0.55952159],
        [0.5904866 , 0.63451432, 0.42021219]]])

In [14]:
t[1,1,2]

0.5932382334851755

#### Array rank

In `numpy` **rank** means **number of dimensions**.
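As a quick sketch of this definition: the rank is exposed as the `ndim` attribute and equals the length of the `shape` tuple. The variable name `grid` is only illustrative.

```python
import numpy as np

# A 2 x 3 grid of values built from nested lists
grid = np.array([[1, 2, 3], [4, 5, 6]])

print(grid.ndim)                     # 2: the rank (number of dimensions)
print(grid.shape)                    # (2, 3): elements along each dimension
print(grid.ndim == len(grid.shape))  # True
```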

**rank-0** arrays are scalars.

In [15]:
a0 = np.array(17)
print_arr(a0)

shape=() dtype=int64:
17



In [16]:
# Get the scalar as a native Python number (an int here)
a0.item()

17

**rank-1** arrays of length `n` have a shape of `(n,)`. 

In [17]:
# A rank-1 array
a1 = np.array([1,2,3])

print_arr(a1)

shape=(3,) dtype=int64:
[1 2 3]



In [18]:
# A rank-1 array holding a single element
print_arr(np.array([3.14]))

shape=(1,) dtype=float64:
[3.14]



**rank-2** arrays have a shape of `(n,m)`. 

In [19]:
a2 = np.array([[1,2,3], [4,5,6]])

print_arr(a2)

shape=(2, 3) dtype=int64:
[[1 2 3]
 [4 5 6]]



A column vector is also rank-2!

In [20]:
a_col = a1.reshape(-1, 1)

print_arr(a_col)

shape=(3, 1) dtype=int64:
[[1]
 [2]
 [3]]



And a row vector is also rank-2:

In [21]:
a_row = a1.reshape(1, -1)

print_arr(a_row)

shape=(1, 3) dtype=int64:
[[1 2 3]]



**rank-k** arrays have a shape of `(n1,...,nk)`. 

In [22]:
print_arr(np.zeros((2,3,4)))

shape=(2, 3, 4) dtype=float64:
[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]

 [[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]



In [23]:
print_arr(np.ones((2,2,2,2)))

shape=(2, 2, 2, 2) dtype=float64:
[[[[1. 1.]
   [1. 1.]]

  [[1. 1.]
   [1. 1.]]]


 [[[1. 1.]
   [1. 1.]]

  [[1. 1.]
   [1. 1.]]]]



### Array math

#### Elementwise operations
Basic mathematical functions **operate elementwise** on arrays, and are available both as operator overloads and as functions in the numpy module:

In [24]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise basic math
print(x + y)

[[ 6.  8.]
 [10. 12.]]


In [25]:
print(x - y)

[[-4. -4.]
 [-4. -4.]]


In [26]:
print(x * y)

[[ 5. 12.]
 [21. 32.]]


In [27]:
print(x / y)

[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [28]:
# Elementwise functions
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]


In [29]:
print(np.exp(x))

[[ 2.71828183  7.3890561 ]
 [20.08553692 54.59815003]]


In [30]:
print(np.log(x))

[[0.         0.69314718]
 [1.09861229 1.38629436]]


There are of course many more elementwise operations implemented by `numpy`.

#### Inner and outer products

Unlike MATLAB, `*` is elementwise multiplication, not matrix multiplication (as we saw above).

We can instead use the `dot()` function to:
- compute inner or outer products of vectors,
- multiply a vector by a matrix, and to
- multiply matrices, and more generally n-d tensors.

The `dot()` function is available both as a function in the numpy module and as an instance
method of array objects.

In [31]:
v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

219
219
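The list above also mentions matrix-vector and matrix-matrix products. The following minimal sketch shows both with `dot()`, using small illustrative matrices `A` and `B` that are not part of the original notebook. It also uses the `@` operator, which for 2-D arrays gives the same result as `dot()`.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

# Matrix-vector product: shapes (2, 2) and (2,) give (2,)
Av = A.dot(v)
print(Av)        # [29 67]

# Matrix-matrix product: shapes (2, 2) and (2, 2) give (2, 2)
AB = np.dot(A, B)
print(AB)

# For 2-D arrays the @ operator matches dot()
print(np.array_equal(A @ B, AB))  # True
```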


Rank-1 arrays are somewhat special in that `numpy` can treat them as either column or row vectors.
Arrays of different rank have different semantics when used in vector-vector or vector-matrix products, so always make sure you know what shapes you're working with:

In [32]:
print_arr(a1, 'a1\t\t')

# Inner products, but the output dimensions differ
print_arr(np.dot(a1, a1), 'a1 * a1 =\t')

print_arr(np.dot(a_row, a1), 'a_row * a1 =\t')

print_arr(np.dot(a1, a_col), 'a1 * a_col =\t')

print_arr(np.dot(a_row, a_col), 'a_row * a_col =\t')

# Outer product
print_arr(np.dot(a_col, a_row), 'a_col * a_row =\n')

shape=(3,) dtype=int64:
a1		[1 2 3]

shape=() dtype=int64:
a1 * a1 =	14

shape=(1,) dtype=int64:
a_row * a1 =	[14]

shape=(1,) dtype=int64:
a1 * a_col =	[14]

shape=(1, 1) dtype=int64:
a_row * a_col =	[[14]]

shape=(3, 3) dtype=int64:
a_col * a_row =
[[1 2 3]
 [2 4 6]
 [3 6 9]]



#### Non-elementwise operations

Numpy provides many useful functions for performing computations on arrays.

In [33]:
x = np.array([[1,2,3],[3,4,5]])
print_arr(x)

shape=(2, 3) dtype=int64:
[[1 2 3]
 [3 4 5]]



In [34]:
print(np.sum(x))  # Compute sum of all elements
print(np.mean(x, axis=0))  # Compute mean of each column
print(np.prod(x, axis=1) ) # Compute product of each row

18
[2. 3. 4.]
[ 6 60]


You can find the full list of mathematical functions provided by numpy in the [documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

### Array indexing

Numpy offers several ways to index into arrays.

**Slicing**

Similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you must specify **a slice for each dimension** of the array:

In [35]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

print_arr(a)

shape=(3, 4) dtype=int64:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]



In [36]:
b = a[:2, 1:3]

print_arr(b)

shape=(2, 2) dtype=int64:
[[2 3]
 [6 7]]



A slice of an array is a **view** into the same in-memory data, so modifying it will modify the original array.

In [37]:
# Changing a view
b[0, 0] = 77777

# ...modifies original
a

array([[    1, 77777,     3,     4],
       [    5,     6,     7,     8],
       [    9,    10,    11,    12]])
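When the original array must stay untouched, one option is to take an explicit copy of the slice. The sketch below uses `np.shares_memory` to check whether two arrays overlap in memory. The names `view` and `independent` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]                 # slicing returns a view
independent = a[:2, 1:3].copy()   # explicit copy with its own memory

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False

# Writing to the copy leaves the original unchanged
independent[0, 0] = 77777
print(a[0, 1])                    # 2
```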

You can also mix integer indexing with slice indexing.
However, doing so will yield an array of **lower rank** than the original array.


In [38]:
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

There are several ways of accessing the data in the middle row of the array:
- Mixing integer indexing with slices yields an array of lower rank
- Using only slices yields an array of the same rank as the original array

In [39]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
row_r3 = a[[1], :]  # Rank 2 copy of the second row of a (integer array indexing)

print_arr(row_r1)
print_arr(row_r2)
print_arr(row_r3)

shape=(4,) dtype=int64:
[5 6 7 8]

shape=(1, 4) dtype=int64:
[[5 6 7 8]]

shape=(1, 4) dtype=int64:
[[5 6 7 8]]



In [40]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]

print_arr(col_r1)
print_arr(col_r2)

shape=(3,) dtype=int64:
[ 2  6 10]

shape=(3, 1) dtype=int64:
[[ 2]
 [ 6]
 [10]]



**Integer array indexing** 

- When you slice, the resulting array view will always be a subarray of the original array.
- Integer array indexing allows you to construct arbitrary arrays using the data from another array.


In [41]:
a = np.array([[1,2], [3, 4], [5, 6]])
print_arr(a)

shape=(3, 2) dtype=int64:
[[1 2]
 [3 4]
 [5 6]]



In [42]:
# An example of integer array indexing.
# The returned array will have shape (3,)
print_arr(a[ [0, 1, 2], [0, 1, 0] ])

shape=(3,) dtype=int64:
[1 4 5]



In [43]:
# The above example of integer array indexing is equivalent to this:
print_arr(np.array([a[0, 0], a[1, 1], a[2, 0]]))

shape=(3,) dtype=int64:
[1 4 5]



In [44]:
# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))

[2 2]
[2 2]
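Unlike slicing, reading with integer array indexing returns a new array rather than a view. A small sketch of the difference, with the illustrative name `picked`:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Integer array indexing builds a new array from the selected elements
picked = a[[0, 1, 2], [0, 1, 0]]

# Modifying the result does not affect the source array
picked[0] = -1
print(a[0, 0])                       # 1
print(np.shares_memory(a, picked))   # False
```

Assigning through the index expression itself (for example `a[rows, cols] = value`) still writes into `a`.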


One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

In [45]:
# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
a

array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [46]:
# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
a[np.arange(4), b]

array([ 1,  6,  7, 11])

In [47]:
# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 1000
a

array([[1001,    2,    3],
       [   4,    5, 1006],
       [1007,    8,    9],
       [  10, 1011,   12]])

**Boolean array indexing**

This type of indexing is used to select the elements of an array that satisfy some condition
(similar to MATLAB's logical indexing).

In [48]:
a = np.array([[1,2], [3, 4], [5, 6]])
print_arr(a)

shape=(3, 2) dtype=int64:
[[1 2]
 [3 4]
 [5 6]]



In [49]:
bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

bool_idx

array([[False, False],
       [ True,  True],
       [ True,  True]])

In [50]:
# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
a[a>2]

array([3, 4, 5, 6])
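Boolean masks can also be combined and used on the left-hand side of an assignment. The following sketch assumes the same small array as above. Note that conditions are combined with `&` and `|` (with parentheses) rather than Python's `and` and `or`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Elements that are both greater than 2 and even
mask = (a > 2) & (a % 2 == 0)
selected = a[mask]
print(selected)      # [4 6]

# Assigning through the mask modifies a in place
a[mask] = 0
print(a)
```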

For brevity we have left out a lot of details about numpy array indexing; if you want to know more you should read the [documentation](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).

### Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [51]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

x.dtype, y.dtype, z.dtype

(dtype('int64'), dtype('float64'), dtype('int64'))
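An existing array can also be converted to another datatype with `astype`, which returns a new array. The short sketch below shows a conversion and the type promotion that happens when integer and float arrays are mixed. The variable names are illustrative.

```python
import numpy as np

x = np.array([1.7, -2.3, 3.9])

# Converting float to int truncates toward zero; x itself is unchanged
xi = x.astype(np.int64)
print(xi)              # values 1, -2, 3
print(x.dtype)         # float64

# Mixing integer and float arrays promotes the result to float
mixed = np.array([1, 2]) + np.array([0.5, 0.5])
print(mixed.dtype)     # float64
```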

You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).

### Changing and adding dimensions

You can **transpose** dimensions within an array using arbitrary axis permutations.

In [52]:
a = np.ones((3, 5))
print_arr(a.transpose()) # also a.T

shape=(5, 3) dtype=float64:
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]



In [53]:
a = np.ones((2, 4, 6))
a[1,2,3] = 777

print_arr(a.transpose(1,0,2))

shape=(4, 2, 6) dtype=float64:
[[[  1.   1.   1.   1.   1.   1.]
  [  1.   1.   1.   1.   1.   1.]]

 [[  1.   1.   1.   1.   1.   1.]
  [  1.   1.   1.   1.   1.   1.]]

 [[  1.   1.   1.   1.   1.   1.]
  [  1.   1.   1. 777.   1.   1.]]

 [[  1.   1.   1.   1.   1.   1.]
  [  1.   1.   1.   1.   1.   1.]]]



Note that an element `[x,y,z]` moves to position `[y,x,z]` after a transpose with this permutation (1,0,2).
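This index mapping can be checked directly. The sketch below rebuilds the same array and looks up the marked element at its new position. It also shows that the transposed array is a view of the same data.

```python
import numpy as np

a = np.ones((2, 4, 6))
a[1, 2, 3] = 777

t = a.transpose(1, 0, 2)

print(t.shape)       # (4, 2, 6)
print(t[2, 1, 3])    # 777.0: element [1,2,3] is now at [2,1,3]

# transpose returns a view, so no data is copied
print(np.shares_memory(a, t))  # True
```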

Another important feature is **reshaping** an array into different dimensions.

In [54]:
a = np.ones((3, 6))
print_arr(np.reshape(a, (2, 9)))

shape=(2, 9) dtype=float64:
[[1. 1. 1. 1. 1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1. 1. 1. 1. 1.]]



When reshaping, we need to make sure to preserve the same number of elements.
Use `-1` in one of the dimensions to tell numpy to "figure it out".
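A brief sketch of the `-1` placeholder, including what happens when the element count cannot be preserved. The flag `reshape_failed` is only there to make the outcome visible.

```python
import numpy as np

a = np.arange(12)                 # 12 elements: 0..11

# numpy infers the missing dimension from the total element count
print(a.reshape(3, -1).shape)     # (3, 4)
print(a.reshape(-1, 2, 3).shape)  # (2, 2, 3)
print(a.reshape(-1).shape)        # (12,)

# 12 is not divisible by 5, so no valid size exists for the -1 dimension
try:
    a.reshape(5, -1)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)             # True
```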

You can also combine multiple arrays with **concatenation** along an arbitrary axis.

In [55]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
print_arr(a)
print_arr(b)

shape=(2, 2) dtype=int64:
[[1 2]
 [3 4]]

shape=(1, 2) dtype=int64:
[[5 6]]



In [56]:
print_arr(np.concatenate((a, b), axis=0))

shape=(3, 2) dtype=int64:
[[1 2]
 [3 4]
 [5 6]]



In [57]:
print_arr(np.concatenate((a, b.T), axis=1))

shape=(2, 3) dtype=int64:
[[1 2 5]
 [3 4 6]]



In [58]:
print_arr(np.concatenate((a, b), axis=None))

shape=(6,) dtype=int64:
[1 2 3 4 5 6]
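Adding a dimension of size 1 was done earlier with `reshape`. As an alternative sketch (these helpers are not used elsewhere in this notebook), `np.newaxis` and `np.expand_dims` do the same thing, and `squeeze` removes size-1 dimensions again.

```python
import numpy as np

v = np.array([1, 2, 3])             # shape (3,)

col = v[:, np.newaxis]              # shape (3, 1), a column vector
row = v[np.newaxis, :]              # shape (1, 3), a row vector
row2 = np.expand_dims(v, axis=0)    # same as row, shape (1, 3)

print(col.shape, row.shape, row2.shape)

# squeeze drops all dimensions of size 1
back = col.squeeze()
print(back.shape)                   # (3,)
```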



### Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of **different shapes** when performing arithmetic operations.

Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose that we want to add a constant vector to each row of a matrix.

In [59]:
# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

print_arr(x,'x=\n')
print_arr(v, '\nv=')

shape=(4, 3) dtype=int64:
x=
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

shape=(3,) dtype=int64:

v=[1 0 1]



**Naïve approach**: Use a loop.

In [60]:
# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

y

array([[ 2,  2,  4],
       [ 5,  5,  7],
       [ 8,  8, 10],
       [11, 11, 13]])

This works; however, explicit loops in Python are **slow**.

**Naïve approach 2**: Adding the vector `v` to each row of the matrix `x` is equivalent to forming a matrix `vv` by stacking multiple copies of `v` vertically, then performing elementwise summation of `x` and `vv`.

We could implement this approach like this:

In [61]:
vv = np.tile(v, (4, 1))  # Stack 4 copies of v on top of each other
vv

array([[1, 0, 1],
       [1, 0, 1],
       [1, 0, 1],
       [1, 0, 1]])

In [62]:
y = x + vv  # Add x and vv elementwise
y

array([[ 2,  2,  4],
       [ 5,  5,  7],
       [ 8,  8, 10],
       [11, 11, 13]])

Nice, but a new array was allocated and memory was copied.

**Numpy broadcasting** allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting:

In [63]:
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])

# Add v to each row of x using broadcasting
y = x + v  

print(f'shapes: {x.shape=}, {v.shape=}\n')
y

shapes: x.shape=(4, 3), v.shape=(3,)



array([[ 2,  2,  4],
       [ 5,  5,  7],
       [ 8,  8, 10],
       [11, 11, 13]])

The line `y = x + v` works even though `x` has shape `(4, 3)` and `v` has shape `(3,)` due to broadcasting; this line works **as if** v actually had shape `(4, 3)`, where each row was a copy of `v`, and the sum was performed elementwise.

Broadcasting two arrays together follows these rules:

1. All input arrays with ndim smaller than the input array of largest ndim, have **1’s prepended to their shapes**.
1. The size in each dimension of the **output shape** is the maximum of all the input sizes in that dimension.
1. An input can be used in the calculation if its size in a particular **dimension either matches** the output size in that dimension, **or has value exactly 1**.
1. If an input has a dimension size of 1 in its shape, the **first data entry in that dimension will be used for all calculations** along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).

In our example:
- `x` has shape `(4,3)`
- `v` has shape `(3,)`.

Following the Broadcasting logic, we can say the following is equivalent to what happened:
1. `v` has fewer dims than `x`, so a dimension of `1` is **prepended** -> `v` is now `(1, 3)`.
1. Output shape will be `(max(1,4), max(3,3)) = (4,3)`.
1. Dim 1 of `v` matches exactly (3); dim 0 is exactly 1, so the first data entry (row 0) is reused whenever any row is accessed. This is effectively like converting `v` from `(1,3)` to `(4,3)` by replicating.
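The steps above can be inspected with a small sketch. `np.broadcast` reports the output shape, and `np.broadcast_to` returns a read-only view of `v` with the broadcast shape whose stride along the new dimension is 0, so no data is replicated. The last part shows shapes that violate the rules. The flag `incompatible` is illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
v = np.array([1, 0, 1])

# Output shape that broadcasting x against v would produce
out_shape = np.broadcast(x, v).shape
print(out_shape)          # (4, 3)

# v viewed as if it had shape (4, 3); rows are not physically copied
vb = np.broadcast_to(v, x.shape)
print(vb.shape)           # (4, 3)
print(vb.strides[0])      # 0: no stepping along the broadcast dimension

# Shapes (4, 3) and (2,): trailing sizes 3 and 2 differ and neither is 1
try:
    x + np.array([1, 0])
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)       # True
```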

Broadcasting is incredibly useful and necessary for writing **vectorized** code,
i.e. code that avoids explicit Python loops, which are very slow.
Instead, this approach leverages the underlying C implementation of numpy.

For more on broadcasting, see the [documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) or this [explanation](http://wiki.scipy.org/EricsBroadcastingDoc).

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the [documentation](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

Here are some applications of broadcasting:

In [64]:
# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
print_arr(v)
print_arr(w)

shape=(3,) dtype=int64:
[1 2 3]

shape=(2,) dtype=int64:
[4 5]



In [65]:
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:

# (3,1) * (2,) -> (3,1) * (1, 2) -> (3, 2) * (3, 2)
np.reshape(v, (3, 1)) * w

array([[ 4,  5],
       [ 8, 10],
       [12, 15]])

In [66]:
# Multiply a matrix by a constant:
x = np.ones((2,3))

# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3).

# (2,3) * () -> (2,3) * (1,1) -> (2,3) * (2,3)
x * 2

array([[2., 2., 2.],
       [2., 2., 2.]])
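Another application, sketched here with illustrative data, is centering the columns of a matrix or normalizing its rows. Passing `keepdims=True` keeps the reduced dimension with size 1, so the result broadcasts against the original array along the intended axis.

```python
import numpy as np

x = np.array([[1., 2., 3.], [4., 5., 6.]])

# Subtract the per-column mean: (2, 3) - (3,) -> (2, 3)
col_mean = x.mean(axis=0)
centered = x - col_mean
print(centered.mean(axis=0))   # each column now has mean 0

# Divide each row by its sum: (2, 3) / (2, 1) -> (2, 3)
row_sum = x.sum(axis=1, keepdims=True)
normalized = x / row_sum
print(normalized.sum(axis=1))  # each row now sums to 1
```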

Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.

This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the [numpy reference](http://docs.scipy.org/doc/numpy/reference/) to find out much more about numpy.