In [None]:
from notebook_checker import start_checks
# Start automatic globals checks
%start_checks

# Numpy arrays

For the next assignment we'll be working with *Numpy* arrays a lot again, and using the functionalities of Numpy effectively to solve certain parts of the assignment will be key. In the previous module we mainly used matrix multiplications to perform certain calculations for entire vectors / matrices in a single operation. For this module we won't use matrix multiplications, but instead learn about other useful vector operations in Numpy.

## Arrays vs. Lists

We'll start by getting a little more familiar with the very basics of Numpy. The central data structure in Numpy is the *array*. Arrays are similar to Python lists, but there are a few important differences. Let's start with some similarities. By now you know how to create a basic list of numbers in Python. Converting this list to a Numpy array is trivial using [np.array](https://numpy.org/doc/stable/reference/generated/numpy.array.html)

In [None]:
my_list = [1, 2, 3, 4]

my_list

In [None]:
import numpy as np

my_array = np.array(my_list)

my_array

As you can see, these two structures look very similar, except one is a list and the other is an array.

Accessing and modifying elements works exactly the same for both

In [None]:
print(f'The first element of my_list is {my_list[0]}')

my_list[2] = 9

my_list

In [None]:
print(f'The first element of my_array is {my_array[0]}')

my_array[2] = 9

my_array

This type of simple array is called a *1-D array*, which is also called a *vector*. Here 1-D refers to the fact that the array is one-dimensional.

This array is still different from a list, so we'll need to use different methods for certain operations. To get the length of a list, we can use the familiar `len()` function, but an array has a size *attribute*, which we can access using `.size`

In [None]:
print(f'The length of my_list is {len(my_list)}')

print(f'The number of elements in my_array is {my_array.size}')

### Assignment 1: Odd ones

Complete the function `odd_ones()`. The input for this function is a 1-D array `arr` and the function should return a modified version of `arr`, where the values at all the odd indices are replaced by the value 1. Take a look at some example outputs below

    >>> odd_ones(np.array([2, 3, 4, 5, 6]))
    array([2, 1, 4, 1, 6])
    
    >>> odd_ones(np.array([0, 0, 0, 0, 0, 0, 0, 0]))
    array([0, 1, 0, 1, 0, 1, 0, 1])

Write a loop over the size of the array and change all the appropriate indices of `arr` to 1. *Hint:* You can use a modulo operation to determine if an index is odd, or use the optional arguments of the [range](https://docs.python.org/3.8/library/functions.html#func-range) function.
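To make the hint a bit more concrete, here is a small sketch of both options using plain Python integers only. The names `n`, `odd_with_modulo` and `odd_with_step` are just made up for this illustration, and applying the idea to the array is still up to you.

```python
# Two ways to find the odd numbers below n (illustrative names only)
n = 8

# Option 1: the modulo operator gives the remainder after dividing by 2
odd_with_modulo = []
for i in range(n):
    if i % 2 == 1:
        odd_with_modulo.append(i)

# Option 2: range(start, stop, step) can skip the even numbers directly
odd_with_step = list(range(1, n, 2))

print(odd_with_modulo)  # [1, 3, 5, 7]
print(odd_with_step)    # [1, 3, 5, 7]
```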


In [None]:
def odd_ones(arr):
    # YOUR CODE HERE
    
    return arr


In [None]:
np.testing.assert_equal(odd_ones(np.array([2, 3, 4, 5, 6])),
                        np.array([2, 1, 4, 1, 6]), "Example failed\n")
np.testing.assert_equal(odd_ones(np.array([0, 0, 0, 0, 0, 0, 0, 0])),
                        np.array([0, 1, 0, 1, 0, 1, 0, 1]), "Example failed\n")
try:
    odd_ones([2, 3, 4, 5, 6])
    print("Error: The function should use the size of the array!")
except AttributeError:
    print("All tests passed!")

## 2-D arrays and shape

The differences between lists and arrays become a bit more apparent when comparing a *2-D array*, also called a *matrix*, with a list of lists structure

In [None]:
list_of_lists = [[1, 2, 3], [4, 5, 6]]

list_of_lists

In [None]:
matrix = np.array(list_of_lists)

matrix

Now `len()` will only give the length of the *outer* list, while `.size` will still give the total number of elements.

In [None]:
print(f'The length of list_of_lists is {len(list_of_lists)}')

print(f'The number of elements in matrix is {matrix.size}')

As the matrix is a 2-D array, it now has two different axes. To see the size of each axis separately, you can use the `.shape` attribute.

In [None]:
print(f'The shape of matrix is {matrix.shape}')

In general, when retrieving the size of multi-dimensional arrays *you should always use `.shape` so you can operate on each axis separately!*

The shape is just a tuple, which you can index to get the first or second element:

In [None]:
print(f'The matrix is {matrix.shape[0]} by {matrix.shape[1]}')

We can get the transpose of any matrix by accessing the `.T` attribute. The shape of a transpose is then also reversed, of course

In [None]:
transpose = matrix.T

display(transpose)
print(f'The shape of transpose is {transpose.shape}')

## Creating arrays and data types

Creating new arrays is straightforward if you know the specific shape you need. You can create an array with all zeros using [np.zeros](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html) or an array with all ones using [np.ones](https://numpy.org/doc/stable/reference/generated/numpy.ones.html). The only required argument is the shape of the array you want to create, which, just like `.shape`, is given as a *tuple*

In [None]:
zeros_mat = np.zeros((5, 3))

zeros_mat

In the output above you can see all of the zeros are written as `0.` . This is because they are floating point numbers, so they each have a decimal point. Here we can see another key difference with lists, which is the fact that Numpy arrays only ever contain data of a *single* type.

The data type of Numpy arrays can be accessed using the `.dtype` attribute

In [None]:
print(f'The type of all numbers in zeros_mat is {zeros_mat.dtype}')

The data type here is `float64`, which means a floating point number consisting of 64 bits. This is the default data type for arrays created with functions like `np.zeros` and `np.ones`.
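As a quick illustration of the single-type rule, the sketch below converts a list that mixes integers and a float. The name `mixed` is only used for this example.

```python
import numpy as np

# A list mixing ints and a float still becomes an array with one single type
mixed = np.array([1, 2.5, 3])

print(mixed.dtype)  # float64, the ints were converted to floats
print(mixed[0])     # 1.0
```

Numpy picks a type that can represent all of the values, so here the integers are stored as floats.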

We can change the data type of any array we're creating by using the optional `dtype` argument

In [None]:
ones_mat = np.ones((4, 6), dtype=int)

display(ones_mat)
print(f'The type of all numbers in ones_mat is {ones_mat.dtype}')

The code above has created a new matrix containing only integers, which are 64 bit on most platforms. Note that we *cannot* mix types in arrays, even though this is possible in lists. Storing a floating point number in an integer array will cause the decimal part to be lost, as the float will automatically be cast to an int

In [None]:
ones_mat[0, 0] = 3.14159

ones_mat

This might seem like a downside, but that type restriction also allows Numpy code to be much faster and require less memory than when working just with Python lists.
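To get a feeling for the memory side of this, you can inspect how many bytes an array uses. The sketch below relies on the `.itemsize` and `.nbytes` attributes, which are not covered elsewhere in this notebook but are standard ndarray attributes.

```python
import numpy as np

zeros_mat = np.zeros((5, 3))

# Because all elements share one type, each uses the same fixed number of bytes
print(zeros_mat.itemsize)  # 8, the bytes used by one float64
print(zeros_mat.nbytes)    # 120, which is 5 * 3 * 8 bytes for all the data
```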

Another such apparent limitation is that Numpy arrays should be regular, i.e. all inner arrays must be of the same size. A shape of `(2, 3)` means there are 2 rows, each with exactly 3 elements. Lists of lists can contain an uneven number of elements, but trying to create an array from such a list will not give you a regular matrix: older versions of Numpy produce a warning and a strange array with an `object` data type, while Numpy 1.24 and newer raise a `ValueError` instead

In [None]:
uneven_list = [[1, 2.6, 3, 4.2, 5], [6, 7], [8.5, 9.9, 10.0]]
print(uneven_list)

np.array(uneven_list)

The specifics of what exactly happens when you create such an array are beyond the scope of this introduction, but it should at least be clear enough this is not exactly what you might expect to get.
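If you are curious what such an irregular array looks like, one way to inspect it is to ask for the `object` data type explicitly, as in the sketch below. The name `ragged` is made up for this example, and this is only meant to show why you want to avoid these arrays.

```python
import numpy as np

uneven_list = [[1, 2.6, 3, 4.2, 5], [6, 7], [8.5, 9.9, 10.0]]

# With dtype=object every inner list is stored as one single element
ragged = np.array(uneven_list, dtype=object)

print(ragged.shape)  # (3,)
print(ragged.dtype)  # object
print(ragged[1])     # [6, 7], still a regular Python list
```

The result is a 1-D array of three Python lists, not a matrix, so none of the axis-based operations from this notebook will work on it as you might hope.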

The key thing to remember from this is to always create arrays with **regular shapes and data types**.

### Assignment 2: Shape zeros

Complete the function `shape_zeros()`. The input for this function is a 2-D array `arr` and the function should return a new array of the same shape, containing only zeros of integer type. Take a look at some example outputs below

    >>> shape_zeros(np.array([[1, 2, 3], [4, 5, 6]]))
    array([[0, 0, 0],
           [0, 0, 0]])

    >>> shape_zeros(np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]))
    array([[0, 0],
           [0, 0],
           [0, 0],
           [0, 0]])
    
Make sure not to modify the input array, but instead create a new array of zeros of the correct type. *Hint:* This assignment is intended to only be one or two lines, try to find the relevant functions from the sections above.

In [None]:
def shape_zeros(arr):
    # YOUR CODE HERE


In [None]:
np.testing.assert_equal(shape_zeros(np.array([[1, 2, 3], [4, 5, 6]])),
                        np.array([[0, 0, 0], [0, 0, 0]]), "Example failed\n")
np.testing.assert_equal(shape_zeros(np.array([[1, 1], [2, 2], [3, 3], [4, 4]])),
                        np.array([[0, 0], [0, 0], [0, 0], [0, 0]]), "Example failed\n")

matrix_copy = np.array(matrix)
np.testing.assert_equal(shape_zeros(matrix_copy).dtype, np.int64, "The array should be of type int64\n")
np.testing.assert_equal(matrix_copy, matrix, "The function shape_zeros should not modify the input array\n")
print("All tests passed!")

## Array indexing

To complete the comparison between 2-D arrays and lists of lists, let's take a look at retrieving and assigning individual elements. While indexing elements is identical for 1-D arrays and lists, with 2-D arrays this works a little differently. Using a list of lists, you need to index the outer and the inner list separately, but with an array you can index all the axes in a single operation. The basic difference is you always use a single pair of `[]` brackets to index any array, and separate the different axes with a comma. This is easiest to see in an example

In [None]:
print(f'The first element of list_of_lists is {list_of_lists[0][0]}')

list_of_lists[1][1] = 9

list_of_lists

In [None]:
print(f'The first element of matrix is {matrix[0, 0]}')

matrix[1, 1] = 9

matrix

While this might only seem like a small change allowing you to write fewer brackets, it will actually turn out to be very useful when combined with more advanced methods like indexing arrays and boolean masking. We'll come back to those in later sections.

## N-D arrays

All the operations we discussed so far don't just work for 2-D arrays, but also work for 3-D or 4-D arrays, which are sometimes called *tensors*. The general term is *N-D array*, meaning *N* dimensional, so the number of dimensions is variable. This is also the proper name for a Numpy array, an `ndarray`.

Creating a 2-D array just means repeating a 1-D array for each row in the matrix. In the same way, creating a 3-D array just requires repeating a 2-D array for each matrix in the tensor. The shape is then a tuple of length 3, where the first value indicates the size of the first axis, which is the number of times the inner matrix is repeated.
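The same nesting can be written out by hand with lists. The sketch below builds a small 3-D array from three 2 by 2 matrices; the name `tensor` and the values are only chosen for illustration.

```python
import numpy as np

# Three 2x2 matrices, repeated along the first axis
tensor = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[9, 10], [11, 12]],
])

print(tensor.shape)  # (3, 2, 2)
print(tensor[1])     # the second inner matrix, containing 5, 6, 7 and 8
```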

In [None]:
three_array = np.ones((3, 4, 5))

three_array

Indexing works exactly the same as for 2-D arrays

In [None]:
three_array[1, 2, 3] = 9

three_array

The first index indicates the location on the first axis, the second index on the second axis, and so on. Modify the example above to make sure you understand the indexed locations, before moving on to the next part.

This same extension applies to 4-D and higher dimensional arrays. They just become more difficult to visualize or print, so pay close attention to the placement of the brackets here

In [None]:
four_array = np.zeros((3, 2, 5, 6))

four_array[2, 0, 4, 5] = 7

four_array[0, 1, 2, 3] = 5

four_array

### Assignment 3: Checkerboard

Complete the function `checkerboard()`. The input for this function is a 2-D array `arr` and the function should return a new array of the same shape, containing the original input value when the row and column index are both even or both odd, and a zero value otherwise. The function should not modify the input array `arr` at all, but the new returned array should be of the same data type as the input array. Take a look at some example outputs below

    >>> checkerboard(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
    array([[1, 0, 3],
           [0, 5, 0],
           [7, 0, 9]])

    >>> checkerboard(np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]))
    array([[1., 0.],
           [0., 2.],
           [3., 0.],
           [0., 4.]])

**Note:** This assignment is harder than the previous two, and requires elements from each of the previous sections. The most important strategy here is to try and build each step of the program incrementally, instead of writing the whole function at once and hoping it passes the tests. The easiest option for this is to just create a new code cell and start writing small bits of code to solve some part of the problem using the provided sample inputs. Make sure to test each of these pieces of code and experiment with different inputs, before including them in the main checkerboard function!

After writing a part of the code and including it in your checkerboard function, test the function with both provided sample inputs and check that the outputs match what you would expect. Below are some hints to help you get started with each of the separate parts:

*Part 1 hint:* What would be a sensible first part to try and solve here? Creating a new array with the same shape and same data type is quite close to a problem you solved before, and would also give you an array to fill in during later parts. Once you've added this part to the function, test it with each of the provided sample inputs. What output would you expect this function to give for each sample input?

*Part 2 hint:* As you'll need to work with both row and column indices, you'll also need two loops, one for each index. Can you determine how many steps you would need for the first loop and for the second loop? Start by just trying to determine the relevant number of steps for each, without writing the actual loops just yet. Next try to use these loops to copy *all* the values from the input array to the new array. What output would you expect from this function for each of the sample inputs?

*Part 3 hint:* The last required part is determining if the column and row indices are both even or both odd and changing the array value accordingly. This is also similar to an assignment you did earlier, so you could use the same modulo operation to determine if an index is even or odd. To make the problem simpler, you could start with only changing the even rows for example. What output would you then expect from that function for each sample input?


In [None]:
def checkerboard(arr):
    # YOUR CODE HERE


# Using Numpy effectively

Up to this point we've covered the very basics of Numpy, which was creating and modifying multidimensional arrays to store numerical data of the same type together. However, almost everything we've covered up to this point could have also been accomplished using (list of) lists, just the syntax was a little different. It is possible to use Numpy in this way, and it is important that you know how to, but in general you actually want to try and avoid treating Numpy arrays exactly like lists.

**The intended way to use Numpy is to not work with for-loops and simple indices, but to try and write operations to work over whole vectors or matrices in a single step.** This does not mean your code shouldn't have for-loops at all, but certain steps become a lot easier when defined for a whole matrix instead of doing a computation for every element separately. Besides being easier to write, Numpy code that modifies a whole vector will usually be quite a bit faster than some equivalent code using an explicit for-loop.
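As a minimal sketch of the difference, the example below computes the same total twice, once with an explicit loop and once with a single Numpy call. It uses `np.sum`, which is introduced properly in the *Axis* section further down, and the variable names are only for illustration.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Loop version: handle every element separately
total_loop = 0.0
for i in range(values.size):
    total_loop = total_loop + values[i]

# Vectorized version: one operation on the whole vector
total_vec = np.sum(values)

print(total_loop)  # 10.0
print(total_vec)   # 10.0
```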

So, if you want to use Numpy effectively, it is important to try and write code that operates on vectors instead of just using for-loops. This style of programming is also called *vectorized code*, and there are several different ways you can achieve this. You've already used one method extensively in the previous module, namely using matrix multiplication to compute some result for a whole matrix of data in a single multiplication. The next part of this introduction will cover some other common Numpy methods you can use to write vectorized code.

As it can take some getting used to writing this style of code, we'll also include some practical tips on writing vectorized code. The `checkerboard()` function you just wrote was completely unvectorized, as the steps for each cell were repeated using two for-loops, but we already came across our first useful tip


### Numpy Tip 1:  Test every step separately using simple inputs

Build every step of any Numpy function incrementally, testing to make sure each part works before moving on to the next part. Writing vectorized code can take some getting used to and can sometimes produce unexpected results, which can be hard to debug. Even more than with regular Python, it is important to separately check every line does what you expect. Start by writing some simple example inputs for your function which you can use to test each part. These example inputs should be small enough that you can check the results by hand, as testing directly on a large data matrix is a lot harder to verify.


## Axis

One of the easiest ways to vectorize code is to use Numpy's own built-in functions. These functions are always written to be used on whole matrices at once. A simple example is Numpy's `np.sum()` function, which does what you might expect based on its name


In [None]:
matrix = np.array([[1, 2, 3], [4, 5, 6]])
display(matrix)

np.sum(matrix)

However, you might not want to sum all elements at once, but instead compute a separate sum for each column or for each row. For this you can use the optional `axis` argument, which works for most Numpy functions. If you are working with a function, but only want to perform the operation in one direction, the function will usually support this optional `axis` argument.

The axis argument specifies which axis you want the function to operate on. Each dimension of an ndarray has its own axis. The first position you index is the first axis (`axis=0`), the second position is the second axis (`axis=1`), and so on.

For a 2-D array the first axis would be the row index. So, if you want to sum over all the rows, which gives one sum for each column, you'd use the optional `axis=0` argument

In [None]:
np.sum(matrix, axis=0)

Writing this code using for-loops would not only be slower to compute, but also require quite a bit more code than just this sum function!

Summing over the columns for each row is just as easy, just change the axis for the sum to the column axis instead

In [None]:
np.sum(matrix, axis=1)

Note that both sum results become a 1-D array when summing a 2-D array, as the axis that is being summed over also ends up getting removed. If you want to keep the summed dimension, there is an optional `keepdims` argument too

In [None]:
np.sum(matrix, axis=1, keepdims=True)

Remember that all Numpy functions have been specifically created to work with ndarrays, so a lot of them support optional arguments like `axis` and `keepdims`. If a Numpy function does something close to what you want, you can usually modify the output with a few of these optional arguments. Check the documentation of a function to see all these arguments, like for `np.sum` [here](https://numpy.org/doc/stable/reference/generated/numpy.sum.html).
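The same arguments work for arrays with more than two axes. The sketch below only looks at the resulting shapes for a 3-D array of ones; the name `cube` is made up for this example.

```python
import numpy as np

cube = np.ones((3, 4, 5))

# The axis that is summed over disappears from the shape
print(np.sum(cube, axis=0).shape)  # (4, 5)
print(np.sum(cube, axis=1).shape)  # (3, 5)
print(np.sum(cube, axis=2).shape)  # (3, 4)

# With keepdims=True the summed axis is kept with size 1
print(np.sum(cube, axis=1, keepdims=True).shape)  # (3, 1, 5)
```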

For some operations figuring out which axis to use can be tricky. A simple solution is to use tip 1 and just create a small test matrix to try out what happens when you use `axis=1` or `axis=0`. If you're unsure about what the result should look like, then you should also try and reason about what axis you want to operate on, which is where tip 2 comes in

### Numpy Tip 2:  List each axis of every input and output

When working with functions that use vectors and matrices as input and/or output, a good first step is to always list what data each dimension of every ndarray contains. This can make it a lot easier to see which dimensions should match in size and how data from these different ndarrays could be combined.

Let's take a look at a quick example from the previous module, the `linear_model(X, theta)` function

    linear_model(X, theta)
        Input
            X: data matrix, shape (m, n)
                m - number of samples
                n - number of input features
                
            theta: parameter vector, shape (n, 1)
                n - number of input features
        
        Output
            y_hat: vector of predictions, shape (m, 1)
                m - number of samples

Write out an overview like this *before* you start writing the actual code, so you have a clear picture of all the data you'll be working with.
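Once you have such an overview, you can check it against small arrays using `.shape`. The sketch below assumes the linear model is simply the matrix product of `X` and `theta`, which may differ from your own implementation in the previous module, and uses made-up sizes for `m` and `n`.

```python
import numpy as np

m, n = 4, 2              # 4 samples and 2 input features (made-up sizes)
X = np.ones((m, n))      # data matrix, shape (m, n)
theta = np.ones((n, 1))  # parameter vector, shape (n, 1)

# Assumption: the model is just the matrix product of X and theta
y_hat = X @ theta

print(X.shape, theta.shape, y_hat.shape)  # (4, 2) (2, 1) (4, 1)
```

The inner sizes `n` match and disappear, leaving the `(m, 1)` shape listed for the output.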


## Elementwise operations and broadcasting

Most arithmetic operations that work between two numbers in Python, like `+` or `*`, work on ndarrays in Numpy too. The simplest case is when two ndarrays of the same shape are combined with an operation. In that case the operation is performed elementwise, so the first element of one array is combined with the first element of the other array, and so on

In [None]:
matrix_1 = np.array([[1, 2, 3], [4, 5, 6]])
matrix_2 = np.array([[3, 1, 5], [7, 3, 4]])

display(matrix_1)
print('+')
display(matrix_2)

addition_matrix = matrix_1 + matrix_2

print('=')
display(addition_matrix)

In [None]:
display(matrix_1)
print('*')
display(matrix_2)

multiplication_matrix = matrix_1 * matrix_2

print('=')
display(multiplication_matrix)

These elementwise computations can obviously be used to quickly combine results from whole matrices. However, these operations don't just work when the shapes of the inputs match. Generally Numpy will repeat the smaller input to fit the shape of the larger input of the operation. The easiest case for this is a single number, which then gets repeated for every element in the other ndarray of the operation

In [None]:
display(matrix_1)
print('+ 1\n=')

add_one = matrix_1 + 1

display(add_one)

In [None]:
display(matrix_2)
print('* 3\n=')

times_three = matrix_2 * 3

display(times_three)

This doesn't just work with single numbers either, but between vectors and matrices too. The general method by which smaller inputs of operations are repeated to match the shape of the larger input is called [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html). When broadcasting, each axis of an operation is matched separately. The basic rules for the matching are as follows

1. If the axes are of equal size, the operation is performed elementwise.
2. If the smaller axis is of size 1, then the value for this axis gets repeated on to the larger axis.

These rules also describe what happens when you multiply a matrix by a single number, as you can just interpret that scalar number as a (1, 1) ndarray, so the number gets repeated across both axes of the matrix for the elementwise multiplication.
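The two rules can be checked by just looking at the shape of the result. The sketch below uses arrays of ones with made-up names (`base`, `row`, `column`), as only the shapes matter here.

```python
import numpy as np

base = np.ones((2, 3))
row = np.ones((1, 3))     # size 1 on the first axis: repeated for every row
column = np.ones((2, 1))  # size 1 on the second axis: repeated for every column

print((base + row).shape)     # (2, 3)
print((base + column).shape)  # (2, 3)
print((row + column).shape)   # (2, 3), here both inputs get repeated
print((base * 3).shape)       # (2, 3), the scalar is repeated across both axes
```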

Let's take a look at some examples on how this works for matrices and vectors too


In [None]:
row_vector = np.array([[3, 1, 7]])

display(matrix_1)
print('+')
display(row_vector)

addition_matrix = matrix_1 + row_vector

print('=')
display(addition_matrix)

In [None]:
column_vector = np.array([[2], [6]])

display(matrix_1)
print('*')
display(column_vector)

multiplication_matrix = matrix_1 * column_vector

print('=')
display(multiplication_matrix)

If you're not sure how the broadcasting would work in a specific case, apply tip 1 and make a small example input, as that is the easiest way to test the broadcasting rules. The general idea here is that Numpy will always try to broadcast the smaller input to make the operation work elementwise. If you do try to do operations with unmatched axes you'll get a `ValueError` saying "operands could not be broadcast together".
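A small sketch of such a failing case is shown below, with made-up names. The second axis of `wrong_size` is 2, which is neither equal to 3 nor equal to 1, so neither broadcasting rule applies.

```python
import numpy as np

base = np.ones((2, 3))
wrong_size = np.ones((2, 2))  # second axis has size 2, not 3 or 1

try:
    base + wrong_size
    broadcast_failed = False
except ValueError as error:
    # Numpy reports that the two shapes cannot be combined
    broadcast_failed = True
    print(error)

print(broadcast_failed)  # True
```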

Broadcasting does not just work for arithmetic operations, but for comparison operations like `<` and `==` too

In [None]:
display(matrix_2)
print('< 4\n=')

less_than_four = matrix_2 < 4

display(less_than_four)

In [None]:
display(matrix_1)
print('== 5\n=')

equal_five = matrix_1 == 5

display(equal_five)

The results of these broadcasted comparison operators are then often used in combination with functions like [np.any](https://numpy.org/doc/stable/reference/generated/numpy.any.html) or [np.all](https://numpy.org/doc/stable/reference/generated/numpy.all.html) to check if a condition holds for a whole matrix. The `np.any` function performs a logical `or` between all elements in the matrix, and `np.all` performs a logical `and`. Note that both also support the optional `axis` argument, in the same way that `np.sum` does

In [None]:
print(np.any(equal_five))

display(np.all(less_than_four, axis=0))

The other common way these broadcasted comparisons often get used is in combination with masking, which we'll cover in the next section.

## Advanced indexing

The last Numpy feature we'll introduce is the set of different options for indexing multiple elements in ndarrays.

The basic slicing operations used on lists can be performed on any axis of a Numpy ndarray. Using an integer index for the first axis and a complete slice for the second selects the whole first row of the matrix


In [None]:
index_matrix = np.array([[1, 2, 3], [4, 5, 6]])
display(index_matrix)

index_matrix[0, :]

The same can of course also be used to select out a specific column, when slicing over all the rows. Note that this would be a lot more complicated to do when indexing a list of lists, as you'd need to construct this new column list separately

In [None]:
index_matrix[:, 1]

In both cases the result of the slice is a 1-D array, while the original ndarray was 2-D. This is because the axis indexed with a simple integer is no longer needed for the result, while the axis with the slice remains.
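You can verify this by comparing shapes. In the sketch below, with the made-up name `example_matrix`, an integer index removes its axis, while a slice of length 1 selects the same row but keeps the axis.

```python
import numpy as np

example_matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(example_matrix[0, :].shape)    # (3,), the integer index removes the first axis
print(example_matrix[:, 1].shape)    # (2,), the integer index removes the second axis
print(example_matrix[0:1, :].shape)  # (1, 3), a slice of length 1 keeps the axis
```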

If we include a slice on both axes, then the result would still be a 2-D array. You can also use start and/or end points for the slice of an axis, just like with lists

In [None]:
index_matrix[1:, :]

Assigning to an entire slice in one step is also possible, using an ndarray of the right size

In [None]:
index_matrix[0, :] = np.array([7, 8, 9])

display(index_matrix)

The right-hand side of the assignment doesn't even have to be an exact match for the shape of the slice, as long as it can be broadcast to the right shape. In that case the right-hand side will then be repeated so the assignment fits the shape of the left-hand side, according to those same broadcasting rules covered in the previous section

In [None]:
index_matrix[:, 1] = 3

display(index_matrix)

Instead of using a slice, you can even use an ndarray of integers, which will each be treated as an index and the result combined like a slice. This allows you to select any specific combination of rows or columns in a single step

In [None]:
index_matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 1, 2]])
display(index_matrix)

integer_index = np.array([0, 2])
display(integer_index)

display(index_matrix[:, integer_index])


The other way to select any specific row or column is to use boolean array indexing. This requires an ndarray the same size as the axis, containing `True` if a row is to be included and `False` otherwise

In [None]:
display(index_matrix)

boolean_index = np.array([True, False, True, True])
display(boolean_index)

display(index_matrix[boolean_index, :])

This technique is also called *boolean masking*. It is often used in combination with the broadcasted comparisons from the previous section, where you first create a large ndarray of booleans using a comparison and then use this boolean array to select out any relevant elements for which the comparison was `True`.
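A minimal sketch of this combination is shown below, using a made-up matrix called `scores`. The comparison creates the boolean mask, and the mask is then used both to select values and to assign to them.

```python
import numpy as np

scores = np.array([[1, 8, 3], [9, 5, 6]])

# Broadcasted comparison: a boolean array with the same shape as scores
mask = scores > 4

# Selecting with the mask gives the matching elements as a 1-D array
selected = scores[mask]
print(selected)  # [8 9 5 6]

# Assigning through the mask changes only the matching elements
scores[mask] = 0
print(scores.tolist())  # [[1, 0, 3], [0, 0, 0]]
```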

**Note:** When you start combining advanced indexing methods across several axes at once, the rules become a little more complicated. We won't need that for this module, so we won't introduce them, but if you're interested, you can find the full description [here](https://numpy.org/doc/stable/user/basics.indexing.html).


### Assignment 4: Odd ones (vectorized)

Next we'll practice a bit with using all these Numpy techniques to write actual vectorized code. Start by writing the same `odd_ones()` function from before, but this time don't use any for loops! As a reminder, the function should replace all the odd indices with the value 1 and only needs to work for 1-D arrays, like so

    >>> odd_ones(np.array([2, 3, 4, 5, 6]))
    array([2, 1, 4, 1, 6])
    
    >>> odd_ones(np.array([0, 0, 0, 0, 0, 0, 0, 0]))
    array([0, 1, 0, 1, 0, 1, 0, 1])

When writing vectorized code, it is important to start from the point that would normally be a for-loop, as that is the critical part to vectorize. In the case of the `odd_ones()` function, it was the assignment of each value that was inside the loop, so that would be the part we will need to vectorize.

One of the techniques we've just covered is advanced indexing and assignment of several values, which seems like a good candidate to try here. Knowing that this multi-element assignment will be the final step, you can start to think about ways to construct an integer or boolean array index that would allow you to only modify the values of the odd indices.

Given that this advanced indexing will have to be based on the actual index positions, you'll probably need an array containing *all* the indices and then filter or modify them accordingly. Numpy has a `range` equivalent function for arrays, called [np.arange](https://numpy.org/doc/stable/reference/generated/numpy.arange.html), but you could also use the regular `range` function and convert the result into an array using [np.array](https://numpy.org/doc/stable/reference/generated/numpy.array.html).
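As a short sketch of what these two options produce, the example below creates the same array of indices in both ways, with made-up variable names. The optional start, stop and step arguments work just like they do for `range`.

```python
import numpy as np

from_arange = np.arange(5)
from_range = np.array(range(5))
with_step = np.arange(2, 10, 3)

print(from_arange)  # [0 1 2 3 4]
print(from_range)   # [0 1 2 3 4]
print(with_step)    # [2 5 8]
```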

Once you've created this array of indices, you can apply some broadcasted modulo and broadcasted equals operations to construct the actual odd positions index array. Note that you'll need to use quite a few of the covered techniques for this already, so remember to use tip 1, and build and test each step separately. In addition, we've just covered another generally useful tip here

### Numpy Tip 3: Work backwards from the vectorized point

When writing vectorized code you want to start working from the specific point you're trying to vectorize. If you know what ndarrays you need for that vectorized step, then you can start thinking about how to construct these. That way you're working backwards from the point you want to vectorize and figuring out what operations you need to get there.

In the example above we started with the conclusion that the assignment part would need to be vectorized, as it was inside a for-loop in the old version of the function. From there we worked backwards to the boolean indexing that could be used to perform that assignment, and ended up at just creating an integer array with all the indices so we could use that to create the boolean array required for the indexing. Creating this integer index array then became the first actual step when writing the code.


In [None]:
def odd_ones(arr):
    # YOUR CODE HERE
    
    return arr

In [None]:
np.testing.assert_equal(odd_ones(np.array([2, 3, 4, 5, 6])),
                        np.array([2, 1, 4, 1, 6]), "Example failed\n")
np.testing.assert_equal(odd_ones(np.array([0, 0, 0, 0, 0, 0, 0, 0])),
                        np.array([0, 1, 0, 1, 0, 1, 0, 1]), "Example failed\n")
print("All tests passed!")

## One-hot encoding

For the last part of this assignment we'll build a function to make a one-hot encoding matrix. One-hot encodings are data representations that are used quite often in machine learning, so they will be useful to learn about. Writing a function to create such a matrix will of course also give you some more practice with creating vectorized code.

Any categorical value, meaning values consisting of a distinct and finitely limited set of options, can be represented with a one-hot encoding. As a simple example, let's say we're classifying animals and have a vector of 5 labels

$$l = \left[\begin{array}{c} dog \\ cat \\ cat \\ rabbit \\ dog \end{array} \right]$$

We could encode these labels as integer values, by simply assigning a number to each label

$$c = \left[\begin{array}{c} 2 \\ 1 \\ 1 \\ 3 \\ 2 \end{array} \right]$$

However, what is often done in machine learning instead, is to represent these same labels in a one-hot matrix

$$ B = \left[\begin{array}{ccc}
0 & 1 & 0\\
1 & 0 & 0\\ 
1 & 0 & 0\\
0 & 0 & 1\\ 
0 & 1 & 0\\
\end{array} \right]$$

In this matrix $B$, the rows still correspond to each sample from the labels vector, but the columns now correspond to each of the different labels. All the values for a row are $0$, except for the column corresponding to the label of that sample, which will be $1$. So there is always exactly one value in a row that is *on* (or *hot*), and all the others are *off* (or *cold*), which is why it is called a *one-hot encoding*.
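This property is easy to check with the `axis` argument from before. The sketch below simply types in the matrix $B$ from the example by hand; it does not show how to construct it.

```python
import numpy as np

# The one-hot matrix B from the example, typed in by hand
B = np.array([[0, 1, 0],
              [1, 0, 0],
              [1, 0, 0],
              [0, 0, 1],
              [0, 1, 0]])

print(np.sum(B, axis=1))  # [1 1 1 1 1], exactly one hot value for every sample
print(np.sum(B, axis=0))  # [2 2 1], how often each label occurs
```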

There are quite a few algorithms for which this type of encoding ends up being useful, but we won't go into all of those now and just write a function to create the encoding. This will be the function `categorical_to_one_hot()`, which takes a vector of labels as input and returns a one-hot encoded version of those labels as the output.

A useful first step here would be to create a vector which indicates which one-hot column corresponds to which label. For this you can use [np.unique](https://numpy.org/doc/stable/reference/generated/numpy.unique.html), which removes any duplicates from an ndarray, to create a vector of unique labels.
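As a quick illustration of this first step only, the sketch below applies `np.unique` to the animal labels from the example. The variable names `labels` and `unique_labels` are just chosen for this illustration.

```python
import numpy as np

# The label vector from the animal example
labels = np.array(['dog', 'cat', 'cat', 'rabbit', 'dog'])

# np.unique removes the duplicates and returns the remaining values sorted
unique_labels = np.unique(labels)

print(unique_labels)       # ['cat' 'dog' 'rabbit']
print(unique_labels.size)  # 3
```

Note that the result is sorted, so the position of a label in this vector does not depend on where it first appears in the input. Here `'cat'` ends up at index 0 and `'dog'` at index 1, even though `'dog'` is the first label in the input.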

This will be the first assignment where you write vectorized code for a matrix, instead of just a vector. Applying tip 2, so writing out what data each axis of the inputs and outputs contains, might help to get you started here. It can still be difficult to vectorize creating a whole matrix at once, so there is one last tip we'll add here.

### Numpy Tip 4: Try to vectorize a single row/column first

Instead of trying to vectorize creating a whole matrix, start by working on vectorizing just one row or column of that matrix. Once you have the vectorized code for that row/column, from there you can look at ways to expand this operation to vectorize the whole matrix in a single step. Note that this might not always be possible, but in that case you can still just repeat your vectorized code in a loop for each row or column.

Writing vectorized code does not mean not using any for-loops at all, but just trying to minimize how many loops you write. If you cannot find a way to vectorize something, writing a loop is a good solution. Starting with a vectorized row and repeating this row solution in a loop still removes one loop from the traditional double for-loop to fill a matrix.
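To show this tip on a problem other than the assignment, here is a minimal sketch that builds a small multiplication table. The example itself and all names in it (`n`, `factors`, `table_loop`, `table_full`) are made up for this illustration. The last step assumes you are comfortable with `reshape` and broadcasting.

```python
import numpy as np

n = 4
factors = np.arange(1, n + 1)  # [1 2 3 4]

# Step 1: vectorize a single row first.
# The row for the number 3 is just 3 times all the factors.
single_row = 3 * factors       # [ 3  6  9 12]

# Step 2: repeat the vectorized row in exactly one loop over the rows
table_loop = np.zeros((n, n), dtype=int)
for i in range(n):
    table_loop[i, :] = (i + 1) * factors

# Step 3 (only if possible): expand to the whole matrix in a single step.
# A column of shape (n, 1) times a vector of shape (n,) broadcasts to (n, n).
table_full = factors.reshape(n, 1) * factors

print(np.array_equal(table_loop, table_full))  # True
```

The version with one loop in step 2 is already a big improvement over a double for-loop, and it is the kind of solution the assignments below ask for.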

Whether you choose to start from the rows or columns will usually depend on the specific matrix you're trying to vectorize. Sometimes there is an obvious choice and sometimes either option might work. For the `categorical_to_one_hot()` function the latter is the case, so you can start vectorizing from either the rows or the columns.


### Assignment 5a: One hot (rows method)

Write the function `categorical_to_one_hot()` which vectorizes filling the rows of a one-hot matrix. You should repeat this vectorized step in a for-loop, meaning your solution should have exactly one loop over all the rows. You'll need to use a lot of the tips and techniques covered in this introduction, so build up your function step by step.


In [None]:
def categorical_to_one_hot(cat_vec):
    # YOUR CODE HERE


In [None]:
np.testing.assert_equal(categorical_to_one_hot(np.array([2, 1, 1, 3, 2])),
                        np.array([[0, 1, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]), "Example failed\n")
np.testing.assert_equal(categorical_to_one_hot(np.array(['dog', 'cat', 'cat', 'rabbit', 'dog'])),
                        np.array([[0, 1, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]), "Example failed\n")
print("All tests passed!")

### Assignment 5b: One hot (columns method)


Write the function `categorical_to_one_hot()` which vectorizes filling the columns of a one-hot matrix. You should repeat this vectorized step in a for-loop, meaning your solution should have exactly one loop over all the columns. You can probably reuse a lot of your code from 5a here, with only the column vectorization being a little different.


In [None]:
def categorical_to_one_hot(cat_vec):
    # YOUR CODE HERE


In [None]:
np.testing.assert_equal(categorical_to_one_hot(np.array([2, 1, 1, 3, 2])),
                        np.array([[0, 1, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]), "Example failed\n")
np.testing.assert_equal(categorical_to_one_hot(np.array(['dog', 'cat', 'cat', 'rabbit', 'dog'])),
                        np.array([[0, 1, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]), "Example failed\n")
print("All tests passed!")

# Overview

To conclude this introduction we'll do a quick recap. We started with the basics of Numpy, which was creating and modifying *multi-dimensional arrays* to store numerical data of the same type together. Then we covered *vectorization*, which does not use for-loops and simple indices, but instead writes operations that work over whole vectors or matrices in a single step. We covered several different techniques you can use in Numpy to write vectorized code and gave some practical tips for this too.


### Vectorization techniques

1. Use **built-in functions** that work on `ndarrays` and support *optional arguments* like `axis`
2. Use *element-wise* operations and **broadcasting** to do arithmetic and comparison operations on `ndarrays`
3. Use **advanced indexing** to access and modify *multiple elements* with slicing, index arrays and **boolean masking**
4. Use **matrix multiplication** for operations that require *repeated multiplication and summations*
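As a compact reminder, the sketch below shows one small instance of each of the four techniques on the same matrix. The matrix `data` and the vector `weights` are made up for this recap, and the `@` operator is used here for matrix multiplication, which might be written differently elsewhere in the course material.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])

# 1. Built-in function with the axis argument: mean of every column
col_means = data.mean(axis=0)   # [2.5 3.5 4.5]

# 2. Element-wise operation with broadcasting:
#    the (3,) vector of means is subtracted from every row of the (2, 3) matrix
centered = data - col_means

# 3. Boolean masking: set every value above 4 to 4, without a loop
clipped = data.copy()
clipped[clipped > 4] = 4        # [[1 2 3], [4 4 4]]

# 4. Matrix multiplication: a weighted sum of every row in a single step
weights = np.array([1, 0, 2])
weighted_sums = data @ weights  # [ 7 16]
```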


### Numpy tips

1. Test every step separately using simple inputs
2. List each axis of every input and output
3. Work backwards from the vectorized point
4. Try to vectorize a single row/column first

