# NUMPY

**Attribution**:

- Parts of this notebook come from the following [Google Colab file](https://colab.research.google.com/github/ffund/ml-notebooks/blob/master/notebooks/1-python-numpy-tutorial.ipynb), which is based on the following sources:
    -   Parts of this notebook are adapted from a [tutorial from CS231N at Stanford University](https://cs231n.github.io/python-numpy-tutorial/), which is shared under the [MIT license](https://opensource.org/licenses/MIT).
    -   Parts of this notebook are adapted from Jake VanderPlas’s [Whirlwind Tour of Python](https://colab.research.google.com/github/jakevdp/WhirlwindTourOfPython/blob/master/Index.ipynb), which is shared under the [Creative Commons CC0 Public Domain Dedication license](https://github.com/jakevdp/WhirlwindTourOfPython/blob/master/LICENSE).
    -   The visualizations in this notebook are from [A Visual Intro to NumPy](http://jalammar.github.io/visual-numpy/) by Jay Alammar, which is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
    -   Parts of this notebook (and some images) about numpy broadcasting are adapted from Sebastian Raschka’s [STATS451](https://github.com/rasbt/stat451-machine-learning-fs20) materials.

- Some content also comes from Wikipedia, such as the [NumPy page](https://en.wikipedia.org/wiki/NumPy).
- Exercises come from various websites, including Kaggle.


## Introduction

Using Python for scientific work or data science is made much easier by three very popular libraries: "numpy", "pandas" and "matplotlib". They are designed to work well together, and objects from one library can almost always be reused in the other two.

## What is Numpy?

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

From 1995 to 2006, two libraries were mainly used to deal with arrays and matrices: "Numeric" and "Numarray". In early 2005, Travis Oliphant set out to unify the community around a single array package and ported Numarray's features to Numeric, releasing the result as **NumPy 1.0 in 2006** (Wikipedia).

To use Numpy, we first need to import the `numpy` package. By convention, we import it using the alias `np`. Then, when we want to use modules or functions in this library, we preface them with `np.`

In [None]:
import numpy as np

## Arrays and array construction

A numpy array is a grid of values, all of the same type, and is indexed by a tuple of non-negative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

### 1d Array

We can create a `numpy` array by passing a Python list to `np.array()`.

In [None]:
a = np.array([1, 2, 3]) # Create a rank 1 array
print("Type:", type(a))
print("Elements:", a[0], a[1], a[2])
print("Shape:", a.shape)
print("Number of dimensions:", len(a.shape))
print(a)

If we take the first dimension to be the vertical one, we can think of this array as a column.
![](files/create-numpy-array-1.png)

### Arrays with multiple dimensions

To create a numpy array with more dimensions, we can pass nested lists, like this:

In [None]:
b = np.array([[1,2],[3,4]]) # Create a rank 2 array
print("Shape:", b.shape)
print("Number of dimensions:", len(b.shape))
print(b)

![](files/numpy-array-create-2d.png)

In [None]:
c = np.array([[[1,2],[3,4]],[[5,6],[7,8]]]) # Create a rank 3 array
print("Shape:", c.shape)
print("Number of dimensions:", len(c.shape))
print(c)

![](files/numpy-3d-array.png)

## One-dimensional array with values already set

There are often cases when we want numpy to initialize the values of the array for us. numpy provides functions like `np.ones()`, `np.zeros()`, and `np.random.random()` for these cases. We just pass them the number of elements we want to generate:

In [None]:
print("np.ones(3):", np.ones(3))
print("np.zeros(3):", np.zeros(3))
print("np.random.random(3):", np.random.random(3))

![](files/create-numpy-array-ones-zeros-random.png)

## What is a tuple?

A tuple is an (immutable) ordered list of values. A tuple is in many ways similar to a list; one of the most important differences is that tuples can be used as keys in dictionaries and as elements of sets, while lists cannot. Here is a trivial example:

In [None]:
d = {(x, x + 1): x for x in range(5)} # Create a dictionary with tuple keys
t = (3, 4) # Create a tuple
print(d)
print(type(t))
print(d[t])
print(d[(1, 2)])

In [None]:
t = (3, 4)
# t[0] = 1 # this statement raises an error! 'tuple' object does not support item assignment

## Multi-dimensional array with values already set

We can also use these functions to produce multi-dimensional arrays, as long as we pass them a list or a tuple describing the dimensions of the matrix we want to create. Sometimes, we need an array of a specific shape with "placeholder" values that we plan to fill in with the result of a computation; the `zeros` or `ones` functions are handy for this. Random matrices are useful, among other things, for testing functions.

In [None]:
print("np.ones((3, 2)):\n", np.ones((3, 2)))
print("np.zeros((3, 2)):\n", np.zeros((3, 2)))
print("np.random.random((3, 2)):\n", np.random.random((3, 2)))

![](files/numpy-matrix-ones-zeros-random.png)

In [None]:
print("np.ones((4, 3, 2)):\n", np.ones((4, 3, 2)))
print("np.zeros((4, 3, 2)):\n", np.zeros((4, 3, 2)))
print("np.random.random((4, 3, 2)):\n", np.random.random((4, 3, 2)))

![](files/numpy-3d-array-creation.png)

You can also create a matrix filled with a constant, or an identity matrix.

In [None]:
d = np.full((2,2), 7) # Create a constant array
print(d)

In [None]:
e = np.eye(5) # Create a 5x5 identity matrix
print(e)

## Exercise

Create an array that looks like this:

```python
      [[[ 78,  87],
        [ 17,  71]],

       [[ 98,  89],
        [ 92,  28]],

       [[112, 211],
        [ 45,  54]]]
```


In [None]:
# Code here!



## Exercise

Create an array that looks like this:

```python
[[[12 12 12 12 12]
  [12 12 12 12 12]
  [12 12 12 12 12]]

 [[12 12 12 12 12]
  [12 12 12 12 12]
  [12 12 12 12 12]]]
```

In [None]:
# Code here!


## Exercise

Create an array that looks like this. **Find a short way to generate it** (do not copy/paste or type all the data!).

```python
      [[10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9., 10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9., 10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9., 10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9., 10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9., 10.,  9.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9., 10.,  9.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9., 10.,  9.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9., 10.,  9.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9., 10.,  9.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9., 10.,  9.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9., 10.,  9.],
       [ 9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9.,  9., 10.]]
```

In [None]:
# Code here!


## Exercise

Create an array that looks like this. **Find a short way to generate it** (do not copy/paste or type all the data!).

```python
      [[[[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]],

        [[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]],

        [[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]]],


       [[[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]],

        [[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]],

        [[99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99],
         [99, 99, 99, 99, 99]]]]
```

In [None]:
# Code here!


## Creating sequences of numbers


Numpy also has two useful functions for creating sequences of numbers: `arange()` and `linspace()`.

- The `arange()` function accepts three arguments, which define the start value, stop value of a half-open interval, and step size. (The default step size, if not explicitly specified, is 1; the default start value, if not explicitly specified, is 0.)

In [None]:
f = np.arange(10,50,5) # Create an array of values starting at 10 in increments of 5
print(f)

Note that the sequence ends at 45, not 50: the stop value itself is not included in the interval.
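
The defaults mentioned above (start of 0, step of 1) can be seen in a short sketch. The variable names `first_five`, `two_to_five` and `countdown` are only illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

# Only the stop value is given: start defaults to 0 and step defaults to 1
first_five = np.arange(5)
print(first_five)       # values 0, 1, 2, 3, 4

# Start and stop are given: step still defaults to 1
two_to_five = np.arange(2, 6)
print(two_to_five)      # values 2, 3, 4, 5

# A negative step counts down; the stop value is still excluded
countdown = np.arange(10, 0, -3)
print(countdown)        # values 10, 7, 4, 1
```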

- The `linspace()` function is similar, but we can specify the number of values instead of the step size, and it will create a sequence of evenly spaced values.

In [None]:
g = np.linspace(0, 1, num=5)
print(g)

## Exercise

Use `np.arange()` to create this array:

```python
[104, 121, 138, 155, 172, 189, 206]
```

In [None]:
# Code here!


## Exercise

Use `np.linspace()` to create this array:

```python
[112.  , 112.75, 113.5 , 114.25, 115.  , 115.75, 116.5 , 117.25,
       118.  ]
```

In [None]:
# Code here!


### Stacking arrays

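Sometimes, we may want to construct an array from existing arrays by “stacking” them, either vertically or horizontally. We can use `vstack()` and `hstack()`, respectively. The related `column_stack()` behaves like `hstack()` for 2D inputs but turns 1D arrays into the columns of a 2D array; `row_stack()` is an older alias of `vstack()` that is deprecated in recent NumPy releases.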

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.vstack((a,b))

In [None]:
a = np.array([[7], [8], [9]])
b = np.array([[4], [5], [6]])
np.hstack((a,b))
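
As a small illustration of how `hstack()` and `column_stack()` differ on 1D inputs, here is a minimal sketch. The names `a1`, `b1`, `joined` and `columns` are chosen for this example only.

```python
import numpy as np

a1 = np.array([1, 2, 3])
b1 = np.array([4, 5, 6])

# hstack joins 1-D arrays end to end: the result is still 1-D
joined = np.hstack((a1, b1))
print(joined, joined.shape)      # shape (6,)

# column_stack turns each 1-D array into a column of a 2-D array
columns = np.column_stack((a1, b1))
print(columns, columns.shape)    # shape (3, 2)
```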

### Array indexing

Numpy offers several ways to index into arrays.

We can index and slice numpy arrays in all the ways we can slice Python lists:

In [None]:
data = np.array([1, 2, 3])
print(data)
print(data[0])
print(data[1])
print(data[0:2])
print(data[1:])

![](files/numpy-array-slice.png)

And you can index and slice numpy arrays in multiple dimensions. If slicing an array with more than one dimension, you should specify a slice for each dimension:

In [None]:
sep = "\n------------"
data = np.array([[1, 2], [3, 4], [5, 6]])
print("data \n", data, sep)
print("data[0, 1] \n", data[0,1], sep)
print("data[1:3] \n", data[1:3], sep)
print("data[0:2, 0] \n", data[0:2,0], sep)

![](files/numpy-matrix-indexing.png)

In [None]:
sep = "\n--------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print("a: \n", a, sep)
# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2
b = a[:2, 1:3]
print("b : \n", b)

A slice of an array is a view into the same data, so modifying it will modify the original array.

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print("a: \n", a, sep)
b = a[:2, 1:3]
print("b: \n", b, sep)
print("Let's set the value of b [0, 0] to 1337", sep)
b[0, 0] = 1337 # b[0, 0] is the same piece of data as a[0, 1]
print("b: \n", b, sep)
print("a: \n", a, sep)
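
If the original array should stay untouched, one option is to take an explicit copy of the slice. The sketch below assumes the same array `a` as above; `b_copy` is an illustrative name, and `np.shares_memory()` is used only to check whether two arrays share data.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

b_copy = a[:2, 1:3].copy()   # independent copy of the slice
b_copy[0, 0] = 1337          # changes only the copy

print(b_copy)
print(a)                                  # a is unchanged
print(np.shares_memory(a, b_copy))        # False: the copy owns its data
print(np.shares_memory(a, a[:2, 1:3]))    # True: a plain slice is a view
```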

## Exercise

Slice the "z" array so it outputs:
```python
[[ 7, 11],
 [19, 23]]
```

In [None]:
z = np.array([[[1,2,3,4], [5,6,7,8], [9,10,11,12]], [[13,14,15,16], [17,18,19,20], [21,22,23,24]]])
print("z array: \n", z)
# Code here



In [None]:
# Solution

z = np.array([[[1,2,3,4], [5,6,7,8], [9,10,11,12]], [[13,14,15,16], [17,18,19,20], [21,22,23,24]]])
z[:, 1:3, 2]

### Mixing integer indexing and slice indexing

You can also mix integer indexing with slice indexing. Doing so yields an array of lower rank than the original, while using only slices yields an array of the same rank. For example, there are two ways of accessing the data in the middle row of the array:

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

row_r1 = a[1, :]    # Rank 1 view of the second row of a  
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, "  shape:", row_r1.shape)
print(row_r2, "shape:", row_r2.shape)

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print("shape:", col_r1.shape)
print(col_r1, sep)
print("shape:", col_r2.shape)
print(col_r2, sep)

### Slice array indexing and integer array indexing

When you index into numpy arrays using slicing, the resulting array view will always be a subarray of the original array. In contrast, integer array indexing allows you to construct arbitrary arrays using the data from another array. Here is an example:

#### Integer array indexing

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

a[[0, 1, 2]] # Select rows 0, 1 and 2, in their original order

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

a[[1, 0, 2]] # Let's swap rows

In [None]:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

a[[1, 0, 2], [3, 1, 0]] # Pair row and column indices: picks a[1, 3], a[0, 1] and a[2, 0]

In [None]:
# Another example

sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print("a:\n", a, sep)

# An example of integer array indexing (creates a new array)
b = a[[0, 1, 2], [0, 1, 0]]
print("b:\n", b, sep)

In [None]:
# Using that syntax above is equivalent to the following one (creates a new array)
c = np.array([a[0, 0], a[1, 1], a[2, 0]])
print("c:\n", c, sep)

print("Let's modify b and c")
b[0] = 789
c[0] = 456

print("b:\n", b, sep)
print("c:\n", c, sep)

print("a:\n", a, sep) # a has not been modified

In [None]:
# When using integer array indexing, you can reuse the same
# element from the source array:
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(a, sep)

print(a[[0, 0], [1, 1]])

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))

One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

In [None]:
# Create a new array from which we will select elements
sep = "\n-------------------"
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print("a:\n", a, sep)

# Create an array of indices

b = np.arange(3) # [0 1 2]

print("b:\n", b, sep)

c = np.array([0, 2, 0])
print("c:\n", c, sep)

# Select one element from each row of a: b holds the row indices
# and c holds the column index to pick in each row
print(a[b, c])

# Mutate one element from each row of a using the same index pairs
a[b, c] += 100
print("a:\n", a, sep)

### Boolean array indexing

Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition. Here is an example:

In [None]:
sep = "\n-------------------"
a = np.array([[1,2], [3, 4], [5, 6]])

# Find the elements of a that are bigger than 2;
# a > 2 returns a numpy array of Booleans of the same
# shape as a, where each slot tells
# whether that element of a is > 2.

print(a > 2, sep)

print(a[a > 2], sep)
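
Conditions can also be combined, and a Boolean mask can be used to assign values. The following is a minimal sketch, not part of the original tutorial; `mask` and `clipped` are illustrative names.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Combine conditions with & (and) or | (or);
# each condition needs its own parentheses
mask = (a > 2) & (a % 2 == 0)
print(a[mask])       # the elements that are both > 2 and even: 4 and 6

# A mask can also appear on the left-hand side of an assignment
clipped = a.copy()
clipped[clipped > 4] = 4   # replace every value above 4 with 4
print(clipped)
```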

When working with numpy arrays, it’s often helpful to get the indices (not only the values) of array elements that meet certain conditions. There are a few numpy functions that you’ll definitely want to remember:

-   [`argmax`](https://numpy.org/doc/stable/reference/generated/numpy.argmax.html) (get index of maximum element in array)
-   [`argmin`](https://numpy.org/doc/stable/reference/generated/numpy.argmin.html) (get index of minimum element in array)
-   [`argsort`](https://numpy.org/doc/stable/reference/generated/numpy.argsort.html) (get sorted list of indices, by element value, in ascending order)
-   [`where`](https://numpy.org/doc/stable/reference/generated/numpy.where.html) (get indices of elements that meet some condition)

In [None]:
sep = "\n-------------------"
a = np.array([1, 8, 9, -3, 2, 4, 7, 9])

print("a:\n", a, sep)

# Get the index of the maximum element in a
# (this array has two elements with the maximum value but 
# only one index is returned)
print("np.argmax(a) ->", np.argmax(a), sep)

# Get the index of the minimum element in a
print("np.argmin(a)->", np.argmin(a), sep)

# Get sorted list of indices in ascending order
print("np.argsort(a) ->", np.argsort(a), sep)

# Get sorted list of indices in descending order
# [::-1] is a special slicing index that returns the reversed list
print("np.argsort(a)[::-1] ->", np.argsort(a)[::-1], sep)

# Get indices of elements that meet some condition
# this returns a tuple, the list of indices is the first entry
# so we use [0] to get it
print("np.where(a > 5) ->", np.where(a > 5), sep)
print("np.where(a > 5)[0] ->", np.where(a > 5)[0], sep)

# Get indices of elements that meet some condition
# this example shows how to get the index of *all* the max values
print(np.where(a >= a[np.argmax(a)])[0])

For brevity we have left out a lot of details about numpy array indexing; if you want to know more you should read the documentation.

### Datatypes

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes that you can use to construct arrays. Numpy tries to guess a datatype when you create an array, but functions that construct arrays usually also include an optional argument to explicitly specify the datatype. Here is an example:

In [None]:
x = np.array([1, 2])  # Let numpy choose the datatype
y = np.array([1.0, 2.0])  # Let numpy choose the datatype
z = np.array([1, 2], dtype=np.int64)  # Force a particular datatype

print(x.dtype, y.dtype, z.dtype)

You can also cast an array to a different datatype using the function `astype()`.

In [None]:
x = np.array([1, 2])
print(x)
x = x.astype(np.float64)
print(x)

You can read all about numpy datatypes in the [documentation](http://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html).

### Array math

What makes working with `numpy` so powerful and convenient is that it comes with many *vectorized* math functions for computation over elements of an array. These functions are highly optimized and are *very* fast - much, much faster than using an explicit `for` loop.

For example, let’s create a large array of random values and then sum it both ways. We’ll use a `%%time` *cell magic* to time them.

In [None]:
a = np.random.random(100_000_000)

In [None]:
%%time
x = np.sum(a)
print(x)

In [None]:
%%time
x = 0 
for element in a:
    x = x + element
print(x)

Look at the “Wall Time” in the output - note how much faster the vectorized version of the operation is! This type of fast computation is a major enabler of machine learning, which requires a *lot* of computation.

Whenever possible, we will try to use these vectorized operations.

Some mathematical functions are available both as operator overloads and as functions in the numpy module.

For example, you can perform an elementwise sum on two arrays using either the `+` operator or the `add()` function.

![](files/numpy-arrays-adding-1.png)

![](files/numpy-matrix-arithmetic.png)

In [None]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the same array
print(x + y)
print(np.add(x, y))

And this works for other operations as well, not only addition:

![](files/numpy-array-subtract-multiply-divide.png)

In [None]:
# Elementwise difference; both produce the same array
print(x - y)
print(np.subtract(x, y))

In [None]:
# Elementwise product; both produce the same array
print(x * y)
print(np.multiply(x, y))

In [None]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

In [None]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

The operator `*` is elementwise multiplication, not matrix multiplication. We instead use the `dot()` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. `dot()` is available both as a function in the numpy module and as an instance method of array objects. We can also use the `@` operator, which does the same thing in these cases.

![](files/numpy-matrix-dot-product-1.png)

In [None]:
a = np.array([1, 2, 3])
b = np.array([[1, 10], [100, 1_000], [10_000, 100_000]])
print(a.dot(b))
print(a @ b)
# print(a * b) will yield an error

In [None]:
v = np.array([9, 10])
w = np.array([11, 12])

# Inner product of vectors; produce 219
print(v.dot(w)) # Method 1
print(v @ w) # Method 2
print(np.dot(v, w)) # Method 3

In [None]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9, 10])
w = np.array([11, 12])

# Matrix / vector product; all three produce the same rank 1 array
print(x.dot(v)) # Method 1
print(np.dot(x, v)) # Method 2
print(x @ v) # Method 3

In [None]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9, 10])
w = np.array([11, 12])

# Matrix / matrix product; all three produce the same rank 2 array

print(x.dot(y)) # Method 1
print(np.dot(x, y)) # Method 2
print(x @ y) # Method 3

Besides the functions that overload operators, Numpy also provides many useful functions for performing computations on arrays, such as `min()`, `max()`, `sum()`, and others:

![](files/numpy-matrix-aggregation-1.png)

In [None]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(np.max(x))
print(np.min(x))
print(np.sum(x))

Not only can we aggregate all the values in a matrix using these functions, but we can also aggregate across the rows or columns by using the `axis` parameter:

![](files/numpy-matrix-aggregation-4.png)

In [None]:
x = np.array([[1, 2], [5, 3], [4, 6]])

print(np.max(x, axis=0))  # Compute max of each column; prints "[5 6]"
print(np.max(x, axis=1))  # Compute max of each row; prints "[2 5 6]"

You can find the full list of mathematical functions provided by numpy in the [documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

### Reshaping and transposing arrays

Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the `T` attribute of an array object.

![](files/numpy-transpose.png)

In [None]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(x)
print("transpose\n", x.T)

In [None]:
v = np.array([[1,2,3]])
print(v)
print("transpose\n", v.T)
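
Note that `v` above is a 2D array with a single row. For a true 1D array, `.T` has no visible effect, as this small sketch shows (`flat` and `row` are illustrative names):

```python
import numpy as np

flat = np.array([1, 2, 3])        # 1-D array, shape (3,)
print(flat.T.shape)               # (3,): transposing a 1-D array changes nothing

row = flat.reshape(1, 3)          # 2-D row, shape (1, 3)
print(row.T.shape)                # (3, 1): the row becomes a column
```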

In more advanced use cases, you may find yourself needing to change the dimensions of a certain matrix. This is often the case in machine learning applications where a certain model expects a certain shape for the inputs that is different from your dataset. numpy's `reshape()` method is useful in these cases.

![](files/numpy-reshape.png)
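
As a minimal sketch of `reshape()`, the same six values can be arranged in different shapes, as long as the total number of elements stays the same. The names `data`, `two_by_three`, `three_by_two` and `error_message` are illustrative.

```python
import numpy as np

data = np.arange(1, 7)                 # six values: 1 to 6

two_by_three = data.reshape(2, 3)      # 2 rows, 3 columns
three_by_two = data.reshape(3, 2)      # 3 rows, 2 columns
print(two_by_three)
print(three_by_two)

# The total number of elements must not change
error_message = None
try:
    data.reshape(4, 2)                 # 8 slots for 6 values
except ValueError as err:
    error_message = str(err)
print("ValueError:", error_message)
```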

A common task in this class will be to convert a 1D array to a 2D array, and vice versa. We can use `reshape()` for this.
For example, suppose we had this 2D array, but we need to pass it to a function that expects a 1D array.

In [None]:
w = np.array([[1],[2],[3]])
print(w)
w.shape

We can remove the “unnecessary” extra dimension with

In [None]:
y = w.reshape(-1,)
print(y)
y.shape

Note that we can pass -1 as one dimension and numpy will infer the correct size based on our matrix size!

There’s also a `squeeze()` function that removes *all* of the “unnecessary” dimensions (dimensions that have size 1) from an array:

In [None]:
z = w.squeeze()
print(z)
z.shape

To go from a 1D to 2D array, we can just add in another dimension of size 1:

In [None]:
y.reshape((-1,1))

### Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations.

For example: in basic linear algebra, we can only add (and perform similar element-wise operations) two matrices that have the same dimension. In numpy, if we want to add two matrices that have different dimensions, numpy will implicitly “extend” the dimension of one matrix to match the other so that we can perform the operation.

So these operations will work, instead of returning an error:

![](files/broadcasting-1.png)

![](files/broadcasting-2.png)

Broadcasting two arrays together follows these rules:

**Rule 1**: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.

For example, in the following cell, "a" will be implicitly extended to shape (1,3):

In [None]:
a = np.array([1,2,3])         # has shape (3,): one dimension
b = np.array([[4], [5], [6]]) # has shape (3,1): two dimensions
c = a + b                     # will have shape (3,3) (two dimensions)
print(c)

**Rule 2**: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

For example, in the following cell "a" will be implicitly extended to shape (3,2):

In [None]:
a = np.array([[1],[2],[3]])         # has shape (3,1)
b = np.array([[4,5], [6,7], [8,9]]) # has shape (3,2)
c = a + b                           # will have shape (3,2) 
print(c)

**Rule 3**: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
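
A minimal sketch of Rule 3, using two 1D arrays whose sizes cannot be reconciled. The `try`/`except` is only there so that the cell prints the error instead of stopping; `error_message` is an illustrative name.

```python
import numpy as np

a = np.array([1, 2, 3])   # shape (3,)
b = np.array([4, 5])      # shape (2,)

error_message = None
try:
    a + b                 # sizes 3 and 2 disagree, and neither is 1
except ValueError as err:
    error_message = str(err)

print("ValueError:", error_message)
```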

For more detail, you can read the explanation from the [documentation](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html).

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the [documentation](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).

Here are a few visual examples involving broadcasting.

![](files/numpy-array-broadcast.png)

Note that these arrays are compatible in each dimension if they have either the same size in that dimension, or if one array has size 1 in that dimension.

![](files/numpy-matrix-broadcast.png)
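
To check which shape a broadcast would produce without building any arrays, one option is `np.broadcast_shapes()`. This sketch assumes NumPy 1.20 or later, where that function is available; the names `shape_a` and `shape_b` are illustrative.

```python
import numpy as np

# In every dimension the sizes are equal, or one of them is 1
shape_a = np.broadcast_shapes((3, 1), (1, 2))
print(shape_a)    # (3, 2)

# The shorter shape is padded on the left first: (3,) is treated as (1, 3)
shape_b = np.broadcast_shapes((2, 3), (3,))
print(shape_b)    # (2, 3)
```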

And here are some more practical applications:

In [None]:
# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:

print(np.reshape(v, (3, 1)) * w)

In [None]:
# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:

print(x + v)

In [None]:
# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:

print((x.T + w).T)

In [None]:
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))

In [None]:
# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
print(x * 2)

Broadcasting typically makes your code more concise and faster, so you should strive to use it where possible.

This brief overview has touched on many of the important things that you need to know about numpy, but is far from complete. Check out the [numpy reference](http://docs.scipy.org/doc/numpy/reference/) to find out much more about numpy.

# Exercises

## Exercise (easy)

Find the right function and convert the variable "a", which is a 1-D array, to a 3-D array.

In [None]:
a = np.array([x for x in range(27)])

# Code here!

## Exercise (easy)

Convert all the elements of a numpy array from float to integer datatype.

In [None]:
a = np.array([[2.5, 3.8, 1.5],
              [4.7, 2.9, 1.56]])

# Code here!

## Exercise (easy)

Convert a binary numpy array (containing only 0s and 1s) to a boolean numpy array.

In [None]:
a = np.array([[1, 0, 0],
              [1, 1, 1],
              [0, 0, 0]])

# Code here!

## Exercise (easy / medium)

Use `np.arange()` to generate a sequence of numbers from 0 to 100 (both included) in steps of 2, for example: 0, 2, 4 ...

In [None]:
# Code here!

## Exercise (easy / medium)

Given 2 numpy arrays, extract the indices at which their elements match. What function are you going to use?

In [None]:
a = np.array([1,2,3,4,5])
b = np.array([1,3,2,4,5])

# Code here!

## Exercise (easy)

Generate a sequence of 12 equally spaced numbers in the range 0 to 1342. Then convert it to integers.

In [None]:
# Code here!


## Exercise (easy / medium)

Use the appropriate numpy function to generate this matrix (there are 9 rows and 9 columns): 
```python
[[1, 1, 1, 1, 1, 1, 1, 1, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 2, 2, 2, 2, 2, 2, 2, 1],
 [1, 1, 1, 1, 1, 1, 1, 1, 1]]
```

In [None]:
# Code here!


## Exercise (easy)

Get all items greater than 7 from the array "a".

In [None]:
a = np.array([2, 6, 1, 9, 10, 3, 27])

# Code here!

## Exercise (easy / medium)
Swap the first two columns (indices 0 and 1) of the array a.

**hint**: use slices and integer indexing!

Expected output:

```python
[[1, 0, 2],
 [4, 3, 5],
 [7, 6, 8]]
```

In [None]:
a = np.arange(9).reshape(3,3)
print(a)

# Code here!


## Exercise (easy / medium)
Reverse the rows of the 2D array "a".

**hint**: use slices and... the little trick to reverse elements in a list.

Expected output:

```python
[[6, 7, 8],
 [3, 4, 5],
 [0, 1, 2]]
```

In [None]:
a = np.arange(9).reshape(3,3)
print(a)

# Code here!