# Lesson 2 Class Exercises: NumPy Part 2

For these exercises we will use a rather famous set of data from [iris flowers](https://en.wikipedia.org/wiki/Iris_flower_data_set). The dataset will be provided to you. Here's a peek at what it looks like:

```
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5.0,3.6,1.4,0.2,setosa
```
Notice that the first four columns are numeric: sepal length, sepal width, petal length, and petal width. The last column is a string.

But first, let's import NumPy.

In [None]:
import numpy as np

## Exercise 1: Data Import
Import the Iris dataset and list the dimensions of the numpy matrix after importing to ensure it's the correct size. 

Hints 
+ Use the `genfromtxt` function of NumPy
+ Remember that all elements of a NumPy array must have the same data type.
+ If you give the wrong arguments you may be reading in the file as a 1D array of tuples rather than a 2D array of values.
+ This file has a header row.
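The hint about tuples versus values can be illustrated with a small sketch. It assumes an in-memory stand-in for the file (`csv_text`, built from the first three rows shown above) rather than the real `iris.csv`, and the variable names are illustrative only.

```python
import io
import numpy as np

# Stand-in for the CSV file: the header row plus three data rows.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
"""

# dtype=None asks genfromtxt to guess a type per column. Because the
# columns are mixed (floats and a string), the result is a 1D structured
# array with one record per row.
records = np.genfromtxt(io.StringIO(csv_text), delimiter=',', dtype=None,
                        skip_header=1, encoding='utf-8')
print(records.shape)  # (3,)

# Reading only the four numeric columns as floats gives a 2D array.
values = np.genfromtxt(io.StringIO(csv_text), delimiter=',', dtype=float,
                       skip_header=1, usecols=[0, 1, 2, 3], encoding='utf-8')
print(values.shape)  # (3, 4)
```

Checking `.shape` right after the import is a quick way to tell which of the two cases you ended up with.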

In [None]:
iris = np.genfromtxt('../data/iris.csv', delimiter=',', dtype='float', skip_header=1, usecols=[0,1,2,3])
iris.shape

## Exercise 2: Indexing & Slicing
### Task 1

Print the top 5 lines of the iris matrix.

In [None]:
print(iris[0:5])

Retrieve the first element of the first row

In [None]:
iris[0,0]

Retrieve the first column of the first five rows of the matrix

In [None]:
iris[0:5, 0:1]

Retrieve the last 2 cells of the last row and the last 2 cells of the second-to-last row of the iris matrix (i.e. the 4 cells in the bottom-right corner of the matrix)

In [None]:
iris[-2:,-2:]

Identify all values in the matrix that are greater than 4 and those that are not

In [None]:
iris > 4

Retrieve all values in the matrix that are greater than 4

In [None]:
iris[iris > 4]

### Task 3
Retrieve all rows that have at least one value greater than 4

In [None]:
# count the values greater than 4 in each row and keep rows with at least one
iris[np.sum(iris > 4, axis=1) >= 1,]
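As a rough illustration of the "at least one value per row" condition, the sketch below uses a small made-up array (`demo`) instead of the iris data. It shows `np.any` along `axis=1` as one possible way to build the row mask, and checks that counting matches with `np.sum` gives the same mask.

```python
import numpy as np

# Made-up 3 x 3 array for illustration only.
demo = np.array([[1.0, 2.0, 3.0],
                 [5.0, 1.0, 1.0],
                 [6.0, 7.0, 2.0]])

# True for each row that contains at least one value greater than 4.
row_mask = np.any(demo > 4, axis=1)
print(row_mask.tolist())  # [False, True, True]

# Equivalent mask built by counting the matches in each row.
count_mask = np.sum(demo > 4, axis=1) >= 1
print(np.array_equal(row_mask, count_mask))  # True

# Boolean indexing on the first axis keeps whole rows.
print(demo[row_mask].shape)  # (2, 3)
```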

### Task 4

Write code to create a new 5 _x_ 5, 2D array that looks like the following:
```python
    [[ 1. 2. 3. 4. 5.]
     [ 1. 2. 3. 4. 5.]
     [ 1. 2. 3. 4. 5.]
     [ 1. 2. 3. 4. 5.]
     [ 1. 2. 3. 4. 5.]]
```   
Try to create this matrix using these functions:
- `np.arange()`
- `np.zeros()`
- `np.reshape()`

In [None]:
x = np.arange(1,6)
y = np.zeros(5).reshape(5,1)
arr = x + y
print(arr)
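The addition in this solution relies on broadcasting: a shape `(5,)` array added to a shape `(5, 1)` array is stretched to shape `(5, 5)`. The sketch below prints the shapes involved and compares the result with `np.tile`, which is shown only as an alternative and is not one of the functions suggested in the task.

```python
import numpy as np

row = np.arange(1, 6)               # shape (5,): the values 1..5
column = np.zeros(5).reshape(5, 1)  # shape (5, 1): a column of zeros

# Broadcasting repeats `row` down the rows and `column` across the columns.
grid = row + column
print(row.shape, column.shape, grid.shape)  # (5,) (5, 1) (5, 5)

# Alternative for comparison: repeat the row five times explicitly.
tiled = np.tile(np.arange(1.0, 6.0), (5, 1))
print(np.array_equal(grid, tiled))  # True
```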

Replace all elements in the array that have a value of 4 with the value 0:
```python
    [[ 1. 2. 3. 0. 5.]
     [ 1. 2. 3. 0. 5.]
     [ 1. 2. 3. 0. 5.]
     [ 1. 2. 3. 0. 5.]
     [ 1. 2. 3. 0. 5.]]
```   

In [None]:
arr[arr == 4] = 0
print(arr)

Extract the inner matrix (no border rows and columns).  Hint: you can use negative indexing.

In [None]:
arr[1:-1,1:-1] 

Write code to create a new 5 _x_ 5, 2D array with 1 on the border and 0 inside, without using for loops. Output example:
```python
    [[ 1. 1. 1. 1. 1.]
     [ 1. 0. 0. 0. 1.]
     [ 1. 0. 0. 0. 1.]
     [ 1. 0. 0. 0. 1.]
     [ 1. 1. 1. 1. 1.]]
```    
Hint:
- Use the `np.ones()` function
- Use indexing

In [None]:
x = np.ones((5,5)) 
print("Original array:") 
print(x) 
print("1 on the border and 0 inside in the array") 
x[1:-1,1:-1] = 0 
print(x)

## Exercise 3: Putting Things Together (NumPy Parts 1 & 2)

### Task 1
Calculate and print the mean, median and standard deviation of the sepal length

In [None]:
print(np.mean(iris[:,0]))
print(np.median(iris[:,0]))
print(np.std(iris[:,0]))

### Task 2
Normalize the sepal length column so that all values are between 0 and 1, with the minimum value mapped to 0 and the maximum mapped to 1. Print the results.

In [None]:
sl_max = iris[:,0].max()
sl_min = iris[:,0].min()
sl_norm = (iris[:,0] - sl_min)/(sl_max - sl_min)
print(sl_norm)
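The same min-max formula, `(x - min) / (max - min)`, can be applied to every column at once by taking the minimum and maximum along `axis=0`. The sketch below uses a small made-up array (`data`); it assumes no column is constant, since that would make the denominator zero.

```python
import numpy as np

# Made-up 3 x 2 array for illustration only.
data = np.array([[2.0, 10.0],
                 [4.0, 20.0],
                 [6.0, 40.0]])

col_min = data.min(axis=0)  # per-column minimum, shape (2,)
col_max = data.max(axis=0)  # per-column maximum, shape (2,)

# Broadcasting applies each column's own min and max to that column.
scaled = (data - col_min) / (col_max - col_min)
print(scaled[:, 0].tolist())  # [0.0, 0.5, 1.0]
```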

### Task 3
Find the Pearson's correlation coefficient between sepal length and petal length.

In [None]:
# column 0 is sepal length, column 2 is petal length
np.corrcoef(iris[:,0], iris[:,2])
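Note that `np.corrcoef` returns a correlation matrix rather than a single number. For two 1D inputs the matrix is 2 x 2, and the coefficient between them is the off-diagonal entry. A minimal sketch with made-up, perfectly correlated data:

```python
import numpy as np

# Made-up data: b is exactly twice a, so the correlation should be 1.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 4.0, 6.0, 8.0])

matrix = np.corrcoef(a, b)
print(matrix.shape)  # (2, 2)

# The diagonal holds each variable's correlation with itself;
# the off-diagonal entry is the coefficient between a and b.
r = matrix[0, 1]
print(round(r, 6))  # 1.0
```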

### Task 4
Get a list of unique values from the sepal length column

In [None]:
np.unique(iris[:,0])

### Task 5

Retrieve 5 random rows from the iris matrix

In [None]:
low = 0
high = iris.shape[0]
iris[np.random.randint(low, high, 5),]

### Task 6

Retrieve 5 random values from the iris matrix and store them in a new NumPy array


In [None]:
low = 0
high = iris.shape[0]
indexes = np.random.randint(low, high, 5)  # 5 random row indexes
selection = np.zeros(5)
for i, j in enumerate(indexes):
    # pick one random value from the randomly chosen row j
    selection[i] = np.random.choice(iris[j])
selection
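A loop-free alternative, offered only as a sketch, is to flatten the matrix and draw from all of its cells at once. It assumes a made-up array (`demo`) in place of the iris data and uses NumPy's `default_rng` generator; the seed is arbitrary and only makes the example repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, for repeatability

# Made-up 5 x 4 array holding the values 0..19.
demo = np.arange(20.0).reshape(5, 4)

# ravel() gives a 1D view of every cell; replace=False means
# no cell is picked twice.
picked = rng.choice(demo.ravel(), size=5, replace=False)
print(picked.shape)  # (5,)
```

Unlike the row-by-row loop, this draws every cell with equal probability and never returns the same cell twice.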

## Exercise 4: Transpose

Transpose the iris array, save it into a new variable, and print its shape to confirm

In [None]:
irisT = np.transpose(iris)
irisT.shape

## Exercise 5:  Resize

Use the `resize` function to expand the iris dataset by 5 new rows and save the result in a new variable. Print its dimensions to confirm

In [None]:
iris2 = np.resize(iris, (155, 4))
iris2.shape

Print the last five rows of the matrix

In [None]:
iris2[-5:,]
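When the requested shape is larger than the original, `np.resize` fills the extra space with repeated copies of the original data, read in flattened order. That is why the last five rows printed here repeat the first five rows of the dataset. A small sketch with a made-up array:

```python
import numpy as np

# Made-up 2 x 2 array for illustration only.
small = np.array([[1, 2],
                  [3, 4]])

# Growing to 3 rows: the extra row is filled by starting over
# from the beginning of the original data.
bigger = np.resize(small, (3, 2))
print(bigger.tolist())  # [[1, 2], [3, 4], [1, 2]]
```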

## Exercise 6: Putting Things Together

Fill the last 5 rows of the newly resized array with random numbers between 10 and 20, and print the last 5 rows of the matrix to confirm

In [None]:
for i in range(150, 155, 1):
    iris2[i, ] = (np.random.rand(4) * 10) + 10
iris2[-5:,]
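The row-by-row loop can also be written as a single slice assignment. The sketch below is one possible variant, assuming a made-up array (`demo`) and the `default_rng` generator with an arbitrary seed; `uniform(10, 20)` draws from the half-open interval from 10 up to, but not including, 20.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, for repeatability

# Made-up 8 x 4 array of zeros standing in for the resized matrix.
demo = np.zeros((8, 4))

# Fill the last five rows in one step with a 5 x 4 block of random values.
demo[-5:] = rng.uniform(10, 20, size=(5, 4))
print(demo[-5:].shape)  # (5, 4)
```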

## Exercise 7: Append

Use the `append` function to add five new rows of random numbers between 30 and 40. Save the new matrix in a new variable. Print the last 5 rows to confirm

In [None]:
iris3 = np.append(iris, np.random.rand(5, 4) * 10 + 30, axis=0)
iris3[-5:,]
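Two details of `np.append` are easy to miss: it returns a new array instead of modifying its input, and without the `axis` argument it flattens both arrays first. A brief sketch with made-up arrays:

```python
import numpy as np

# Made-up arrays for illustration only.
base = np.zeros((2, 3))
extra = np.ones((1, 3))

# With axis=0 the new row is stacked under the existing rows.
stacked = np.append(base, extra, axis=0)
print(stacked.shape)  # (3, 3)

# Without axis, both inputs are flattened before being joined.
flat = np.append(base, extra)
print(flat.shape)  # (9,)

# The original array is left unchanged.
print(base.shape)  # (2, 3)
```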

## Exercise 8: Insert

Insert a row of random numbers between 40 and 50 in the second row position of the iris matrix and store the result in a new variable. Print the first 5 rows to confirm

In [None]:
# index 1 is the second row position (indexing starts at 0)
iris5 = np.insert(iris, 1, np.random.rand(4) * 10 + 40, axis=0)
iris5[0:5,]

## Exercise 9: Delete

Delete the row just added and print the first 5 rows to confirm

In [None]:
# remove the row that was inserted at index 1
iris5 = np.delete(iris5, 1, axis=0)
iris5[0:5,]

## Exercise 10: Putting Things Together

Remove all rows where all the values in the row are greater than 2 and save the result in a new array. Print the shape to confirm (you should lose 23 rows). Hint: this is a trick question (you won't use the `delete` function to do this).

In [None]:
iris5 = iris[np.invert((np.sum(iris > 2, axis=1) == 4)),]
iris5.shape
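The same filtering idea can be sketched with `np.all` and the `~` operator, which negates a boolean mask. The example uses a small made-up array (`demo`) rather than the iris data and is one possible formulation, not the only one.

```python
import numpy as np

# Made-up 4 x 3 array for illustration only.
demo = np.array([[3.0, 3.0, 3.0],
                 [1.0, 5.0, 5.0],
                 [2.5, 4.0, 9.0],
                 [0.5, 0.5, 0.5]])

# True for rows in which every value is greater than 2.
all_above = np.all(demo > 2, axis=1)
print(all_above.tolist())  # [True, False, True, False]

# ~ flips the mask, so only the other rows are kept.
kept = demo[~all_above]
print(kept.shape)  # (2, 3)
```

Filtering with a boolean mask builds the new array directly, which is why `delete` is not needed here.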