### Overview
The goal of these exercises is to challenge you a little and make you think. The questions are designed to let you play around and try a few things before getting to the answer. The solutions can be obtained by writing code very similar to what has been explored in these notebooks, but may require an extra step of logic.

The solutions will be released after the lecture on Thursday.

Let's start with importing libraries and the lists and arrays that you will be working on throughout this notebook.

In [None]:
import numpy as np

list1 = [-10, -5, -2, 0, 2, 5, 10]
mat = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

## Exercise 1

The Rectified Linear Unit function $\operatorname{ReLU}(x) = \max\{0,x\}$ is a non-linearity which is essential to most neural networks.

**1. a)** Write a function that takes a floating point number $x$ and returns $\operatorname{ReLU}(x)$. Test your function on each element of `list1`.

Hint: 
The ReLU function can be reformulated as 
> $\operatorname{ReLU}(x) =  \begin{cases}
x & \mathrm{if} \, x \geq 0,\\ 
0 & \mathrm{if} \, x < 0.
\end{cases}$

In [2]:
# Write your answer here

def ReLU(x):
    if x>=0:
        return x
    elif x<0:
        return 0

for i in list1:
    print(i, '->', ReLU(i))

-10 -> 0
-5 -> 0
-2 -> 0
0 -> 0
2 -> 2
5 -> 5
10 -> 10
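
The loop above calls the function once per element. As a possible alternative, the sketch below applies ReLU to the whole list in one call using `np.maximum`. The name `relu_vec` is made up for this illustration and is not part of the exercise.

```python
import numpy as np

list1 = [-10, -5, -2, 0, 2, 5, 10]

def relu_vec(x):
    """Element-wise ReLU for a scalar, list or array (hypothetical helper)."""
    # np.maximum compares every element with 0 and keeps the larger value
    return np.maximum(0, x)

print(relu_vec(list1))
```

This version returns a NumPy array when given a list, so it is convenient for arrays but is not a drop-in replacement where a plain Python number is expected.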


**1. b)  [Optional]** An extension of ReLU is the Exponential Linear Unit. 

> $\operatorname{ELU}(x) =  \begin{cases}
x &  \mathrm{if} \, x \geq 0,\\ 
\alpha(\mathrm{e}^x-1) & \mathrm{if} \, x < 0. \end{cases}$

Write a function that returns $\operatorname{ELU}(x)$ for a given $\alpha$. Test your function on each element of `list1`.

Hint:
You might need `np.exp`.

In [3]:
# Write your answer here

alpha = 0.1

def ELU(x, alpha):
    if x>=0:
        return x
    elif x<0:
        return alpha*(np.exp(x)-1)

for i in list1:
    print(i, '->', ELU(i, alpha))

-10 -> -0.09999546000702375
-5 -> -0.09932620530009145
-2 -> -0.08646647167633874
0 -> 0
2 -> 2
5 -> 5
10 -> 10
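
ELU can also be evaluated on a whole array at once. The following is a minimal sketch using `np.where`; the name `elu_vec` and the default `alpha=0.1` are assumptions chosen to match the cell above. Because `np.where` computes both branches for every element, the argument of `np.exp` is clipped at 0 so that large positive inputs cannot overflow.

```python
import numpy as np

def elu_vec(x, alpha=0.1):
    """Element-wise ELU for a scalar, list or array (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Only non-positive inputs matter for this branch, so clip at 0
    # before exponentiating to keep np.exp from overflowing
    negative_branch = alpha * (np.exp(np.minimum(x, 0.0)) - 1.0)
    # Pick x where x >= 0, and the exponential branch elsewhere
    return np.where(x >= 0, x, negative_branch)

print(elu_vec([-10, -5, -2, 0, 2, 5, 10]))
```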


## Exercise 2
**2. a)** Take the matrix `mat` that we used in the 'Numpy and Arrays' notebook. Use numpy to switch the second and third rows of the matrix.

In [4]:
# Write your answer here

mat2 = mat.copy()
mat2[1] = mat[2]
mat2[2] = mat[1]
print(mat)
print(mat2)

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[0 1 2]
 [6 7 8]
 [3 4 5]]
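
Another way to get the same result is to index the matrix with a list of row numbers. This is only a sketch of an alternative, not the intended solution; the names `mat_swapped` and `mat_inplace` are made up here.

```python
import numpy as np

mat = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

# Indexing with a list of row numbers returns a new array (a copy)
# with the rows in the requested order; mat itself is unchanged
mat_swapped = mat[[0, 2, 1]]
print(mat_swapped)

# In-place variant on a copy: the right-hand side is evaluated first,
# so the two rows are exchanged without a temporary variable
mat_inplace = mat.copy()
mat_inplace[[1, 2]] = mat_inplace[[2, 1]]
print(mat_inplace)
```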


**2. b)** Create a function that can swap any two rows of any given matrix. Include checks that `idx1` and `idx2` are both valid row indices for the given matrix.

In [5]:
# Write your answer here

def swap_rows(matrix, idx1, idx2): #This is where the function is defined
    mat2 = matrix.copy()
    height = matrix.shape[0] #Number of rows. Valid row indices run from 0 to height-1
    print('Height =', height)
    
    #Both indices refer to rows, so both are checked against the height (not the width)
    if not -height <= idx1 < height:
        print('Error, index 1 is not a valid row index')
        return
    if not -height <= idx2 < height:
        print('Error, index 2 is not a valid row index')
        return
    
    mat2[idx1] = matrix[idx2]
    mat2[idx2] = matrix[idx1]
    return mat2

In [6]:
#This cell is where we are going to call the function
swap_rows(mat, 1, 2)

Height = 3


array([[0, 1, 2],
       [6, 7, 8],
       [3, 4, 5]])
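
The function above prints a message and returns `None` when an index is invalid, which is easy to miss later in a notebook. One possible alternative is to raise an exception instead. The sketch below assumes that behaviour is wanted; `swap_rows_strict` is a hypothetical name.

```python
import numpy as np

def swap_rows_strict(matrix, idx1, idx2):
    """Return a copy of matrix with rows idx1 and idx2 swapped (hypothetical variant)."""
    height = matrix.shape[0]
    for idx in (idx1, idx2):
        # This variant only accepts row indices from 0 to height - 1
        if not 0 <= idx < height:
            raise IndexError(f"row index {idx} is out of range for {height} rows")
    swapped = matrix.copy()
    # Read both rows from the original, write them back in the other order
    swapped[[idx1, idx2]] = matrix[[idx2, idx1]]
    return swapped

mat = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])
print(swap_rows_strict(mat, 1, 2))
```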

**2. c)** Print the top right 2x2 square of the matrix `mat`.

In [7]:
# Write your answer here

mat[:2,1:3]

array([[1, 2],
       [4, 5]])
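
The slice `mat[:2,1:3]` relies on the matrix having exactly three columns. A small sketch of a width-independent version uses a negative start index for the columns; the 3x4 array `wide` is a made-up example.

```python
import numpy as np

wide = np.arange(12).reshape(3, 4)  # hypothetical 3x4 example matrix

# First two rows, last two columns, whatever the number of columns
top_right = wide[:2, -2:]
print(top_right)
```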

## Exercise 3
In artificial intelligence, the data sets that you work on will sometimes be missing values. Dealing with these gaps is one of the tasks you will have to complete before you can start to implement your model. Many off-the-shelf methods won't run if missing values exist, or will just delete any row where even one value is missing.

This exercise will cover some of the simplest ways to handle this problem.

**3. a)** Load the data set `missing.csv` and use `np.where` to find where the data set is missing values.

**3. b) i)** Replace the missing values with a "don't care" value. Usually, this means filling in all missing values with `-1`.

In [8]:
# Write your answer here
values = np.genfromtxt('Data/missing.csv', delimiter = ',', dtype = '|U') 
#Note that if we load this as int, numpy automatically sends  missing values to -1
print(values)

[['House Price' ' Number of Bedooms' ' Number of Bathrooms'
  ' Square Meters']
 ['80000' '1' '1' '25']
 ['220000' '3' '1' '48']
 ['625000' '6' '3' '']
 ['' '4' '2' '57']
 ['455500' '4' ' ' '70']
 ['350000' ' ' '2' '60']]
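
To see the positions that `np.where` reports, the sketch below works on a small stand-in array copied from three rows of the output above rather than on the file itself. Stripping spaces first with `np.char.strip` is an assumption made here so that `' '` and `''` are treated alike.

```python
import numpy as np

# Small stand-in for the loaded string array (three rows from the output above)
values = np.array([['625000', '6', '3', ''],
                   ['', '4', '2', '57'],
                   ['455500', '4', ' ', '70']])

# Strip surrounding spaces so that ' ' and '' both count as missing
is_missing = np.char.strip(values) == ''

# np.where on a boolean array returns one index array per dimension
rows, cols = np.where(is_missing)
for r, c in zip(rows, cols):
    print('missing at row', r, 'column', c)
```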


In [9]:
# There was a small error when loading the data set, so missing cells show up in two forms: ' ' and ''.
# Only one of the two replacements below should be needed. Lots of data sets have similar errors,
# however, so it's good to see how to deal with them when they crop up.
values[np.where(values == ' ')] = '-1'
values[np.where(values == '')] = '-1'
print(values)

[['House Price' ' Number of Bedooms' ' Number of Bathrooms'
  ' Square Meters']
 ['80000' '1' '1' '25']
 ['220000' '3' '1' '48']
 ['625000' '6' '3' '-1']
 ['-1' '4' '2' '57']
 ['455500' '4' '-1' '70']
 ['350000' '-1' '2' '60']]


**3. b) ii)** Replace each missing value with a random value. You should generate the random value using numpy, and generate a new random value for each cell to fill in. The random value should be larger than the smallest value in the column, and smaller than the largest value in the column.

Hint: You might need `np.random.randint(low, high)`

In [10]:
# Write your answers here
values = np.genfromtxt('Data/missing.csv', delimiter = ',', dtype = int) 
#Note that if we load this as int, numpy automatically sends  missing values to -1

values = values[1:] #Remove the top row with feature names

#print(values) #You can comment out lines of code that could be useful later.

In [11]:
#Define a function to get the min and max from a given column.

def column_range(arr):
    lo = np.min(arr) #Named lo and hi so that Python's built-in min and max are not overwritten
    hi = np.max(arr)
    return lo, hi

In [12]:
#Now write a for loop over all four columns. Use the above function to get the min and the max, then fill in the missing
#values.

for i in range(4):
    a = values[:,i] # Remember here a is viewing the data. So if we change a, we change 'values' as well.
    lo, hi = column_range(a) # a still contains the -1 placeholders here, so lo is -1 and a fill value of -1 is possible
    # print(lo, hi)   
    a[np.where(a==-1)] = np.random.randint(lo, hi) #This is the line where a is changed. One draw is shared by the column

print(values)

[[ 80000      1      1     25]
 [220000      3      1     48]
 [625000      6      3     59]
 [140864      4      2     57]
 [455500      4      1     70]
 [350000     -1      2     60]]
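
In the output above, the bedrooms column still contains `-1`. The `-1` placeholders are part of the column when `column_range` is called, so the lower bound passed to `np.random.randint` is `-1`. A minimal sketch of one way to meet the exercise more closely is given below: it computes the range from the known values only and draws a separate value for each missing cell. The inline array is a stand-in for the loaded file, and the names `missing`, `known`, `lo` and `hi` are made up for the illustration.

```python
import numpy as np

# Stand-in for the int-loaded data, with -1 marking the missing cells
values = np.array([[ 80000,  1,  1, 25],
                   [220000,  3,  1, 48],
                   [625000,  6,  3, -1],
                   [    -1,  4,  2, 57],
                   [455500,  4, -1, 70],
                   [350000, -1,  2, 60]])

for j in range(values.shape[1]):
    col = values[:, j]            # a view, so writing to col changes values
    missing = col == -1
    known = col[~missing]         # ignore the placeholders when taking min and max
    lo, hi = known.min(), known.max()
    # randint excludes the upper bound; starting at lo + 1 keeps every draw
    # strictly above the column minimum. size gives one draw per missing cell.
    col[missing] = np.random.randint(lo + 1, hi, size=missing.sum())

print(values)
```

This sketch assumes every column has at least two whole numbers of room between its smallest and largest known value; otherwise `np.random.randint` has no valid value to draw and raises an error.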


**3. d)** Write a function which takes a filename as input. The function needs to load the data set from the filename. It should then check for any missing values and replace them with the mean of the column. Test the function with `missing.csv`.

In [13]:
#This time the function will load the data and process missing values

def impute_values(csv_file): 
    values = np.genfromtxt(csv_file, delimiter = ',', dtype = int) #missing values are -1 now, not ' ' as we use type int
    
    values = values[1:] #Remove the top row with feature names
    #print(values)

    for i in range(values.shape[1]):
        a = values[:,i]
        missing = (a == -1)
        #Take the mean over the known values only, so the -1 placeholders don't drag it down.
        #Assigning into an int array truncates the mean to a whole number.
        a[missing] = np.mean(a[~missing])
    return values

In [14]:
#In this cell, we will call the function.

csv_file = 'Data/missing.csv'

impute_values(csv_file)

array([[ 80000,      1,      1,     25],
       [220000,      3,      1,     48],
       [625000,      6,      3,     52],
       [346100,      4,      2,     57],
       [455500,      4,      1,     70],
       [350000,      3,      2,     60]])
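
A possible alternative avoids the `-1` placeholder altogether: with the default float dtype, `np.genfromtxt` reads empty fields as `NaN`, and `np.nanmean` ignores them when averaging. The sketch below uses a small inline CSV through `io.StringIO` instead of `missing.csv`, and its column names and numbers are invented for the illustration.

```python
import io
import numpy as np

# Inline stand-in for a CSV file; the empty fields are the missing values
csv_text = """price,bedrooms,sqm
80000,1,25
,3,48
625000,,70
"""

# With the default float dtype, empty fields are read as NaN
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',', skip_header=1)

# Column means that ignore the NaN entries
col_means = np.nanmean(data, axis=0)

# Fill each NaN with the mean of its own column
rows, cols = np.where(np.isnan(data))
data[rows, cols] = col_means[cols]
print(data)
```

Keeping the data as floats also means the imputed means are not truncated to whole numbers.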