# Problem set: Jupyter, pyplot and numpy
## Fisher’s Iris Data Set

Fisher's Iris data set is a multivariate data set introduced by the British statistician and biologist [Ronald Fisher](https://en.wikipedia.org/wiki/Ronald_Fisher).
This data set was used as an example of [linear discriminant analysis](https://en.wikipedia.org/wiki/Linear_discriminant_analysis) in his 1936 paper *The use of multiple measurements in taxonomic problems*.

#### *The data set consists of 50 samples from each of three species of Iris*
* *Iris setosa*
* *Iris virginica*
* *Iris versicolor*

#### *Four features were measured from each sample*
* *The length and width of the sepals, in centimetres (cm)*
* *The length and width of the petals, in centimetres (cm)*

Based on the combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other. More information can be found [here](https://en.wikipedia.org/wiki/Iris_flower_data_set).
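To make the idea of a linear discriminant concrete, here is a minimal two-class sketch in numpy. It is not Fisher's original calculation and does not use the Iris file: the two classes are synthetic, drawn around made-up means, and the names `fisher_direction`, `class_a` and `class_b` are introduced only for this illustration. The direction is taken proportional to the inverse within-class scatter times the difference of the class means, and the midpoint between the projected means is used as a simple decision threshold.

```python
import numpy as np

def fisher_direction(class_a, class_b):
    """Return a unit vector proportional to S_w^{-1} (mean_a - mean_b)."""
    mean_a = class_a.mean(axis=0)
    mean_b = class_b.mean(axis=0)
    # Within-class scatter: sum of each class's scatter about its own mean.
    centred_a = class_a - mean_a
    centred_b = class_b - mean_b
    scatter_within = centred_a.T @ centred_a + centred_b.T @ centred_b
    w = np.linalg.solve(scatter_within, mean_a - mean_b)
    return w / np.linalg.norm(w)

# Synthetic stand-in data (assumption): 50 samples x 4 features per class,
# scattered around made-up means. These are NOT the real Iris measurements.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[5.0, 3.4, 1.5, 0.2], scale=0.2, size=(50, 4))
class_b = rng.normal(loc=[5.9, 2.8, 4.3, 1.3], scale=0.2, size=(50, 4))

w = fisher_direction(class_a, class_b)

# Project every sample onto w and split at the midpoint of the projected means.
threshold = 0.5 * (class_a.mean(axis=0) + class_b.mean(axis=0)) @ w
predicted_a = class_a @ w > threshold   # True means "assigned to class a"
predicted_b = class_b @ w > threshold
```

Each four-feature sample is reduced to a single number, `sample @ w`, and the classes are told apart by comparing that number with `threshold`.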

In [1]:
# Importing libraries
import numpy as np
import matplotlib.pyplot as plt

# Adapted from https://stackoverflow.com/questions/332289/how-do-you-change-the-size-of-figures-drawn-with-matplotlib
# Changing default plot size
plt.rcParams['figure.figsize'] = (16.0, 8.0)
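
As a quick check of the setting above, the following sketch creates an empty figure and reads its size back. The `Agg` backend is selected here only so the snippet also runs outside a notebook; that choice is an assumption of this example and is not needed inside Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no display is required
import matplotlib.pyplot as plt

# Same default as in the cell above: 16 inches wide, 8 inches tall.
plt.rcParams['figure.figsize'] = (16.0, 8.0)

# Figures created after the change pick up the new default size.
fig, ax = plt.subplots()
width, height = fig.get_size_inches()
plt.close(fig)
```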

### *Problem #1 - Get and load the data*
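
One possible alternative to reading the file once per variable is to read it a single time and slice the resulting array. The sketch below assumes a layout of one header row, four measurement columns and a species column, with the species stored in consecutive blocks in the order setosa, versicolor, virginica. It uses a small in-memory stand-in instead of `Fishers-Iris-Data.csv`; the header names and `rows_per_species` are hypothetical.

```python
import io
import numpy as np

# Hypothetical stand-in for the CSV file: a header row, four measurement
# columns and a species column, with two rows per species instead of 50.
sample_csv = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
    "6.4,3.2,4.5,1.5,versicolor\n"
    "6.3,3.3,6.0,2.5,virginica\n"
    "5.8,2.7,5.1,1.9,virginica\n"
)
rows_per_species = 2  # would be 50 for the full data set

# Read the four numeric columns once; the species column is left out.
iris = np.genfromtxt(sample_csv, delimiter=',', skip_header=1, usecols=(0, 1, 2, 3))

# Columns: transposing gives one row per feature, which unpacks directly.
sepal_length, sepal_width, petal_length, petal_width = iris.T

# Rows: each species occupies one consecutive block.
setosa = iris[:rows_per_species]
versicolor = iris[rows_per_species:2 * rows_per_species]
virginica = iris[2 * rows_per_species:]
```

The slices are views of `iris`, so no data is copied and the file is parsed only once.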

In [2]:
# Read the CSV file into numpy arrays, specifying which columns to use
# Adapted from https://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy
irisData = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=(0,1,2,3))

sepalLength = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=0)
sepalWidth = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=1)

petalLength = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=2)
petalWidth = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=3)

# Each species occupies 50 consecutive rows: skip the header row plus the rows of the preceding species
setosa = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=1, usecols=(0,1,2,3), max_rows=50)
versicolor = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=51, usecols=(0,1,2,3), max_rows=50)
virginica = np.genfromtxt('Fishers-Iris-Data.csv', delimiter=',', skip_header=101, usecols=(0,1,2,3), max_rows=50)

# The names below display the individual arrays read from the CSV file
# Uncomment one of them to inspect its contents (comment out irisData to show no output)
irisData
#sepalLength
#sepalWidth
#petalLength
#petalWidth
#setosa
#versicolor
#virginica

array([[ 5.1,  3.5,  1.4,  0.2],
       [ 4.9,  3. ,  1.4,  0.2],
       [ 4.7,  3.2,  1.3,  0.2],
       [ 4.6,  3.1,  1.5,  0.2],
       [ 5. ,  3.6,  1.4,  0.2],
       [ 5.4,  3.9,  1.7,  0.4],
       [ 4.6,  3.4,  1.4,  0.3],
       [ 5. ,  3.4,  1.5,  0.2],
       [ 4.4,  2.9,  1.4,  0.2],
       [ 4.9,  3.1,  1.5,  0.1],
       [ 5.4,  3.7,  1.5,  0.2],
       [ 4.8,  3.4,  1.6,  0.2],
       [ 4.8,  3. ,  1.4,  0.1],
       [ 4.3,  3. ,  1.1,  0.1],
       [ 5.8,  4. ,  1.2,  0.2],
       [ 5.7,  4.4,  1.5,  0.4],
       [ 5.4,  3.9,  1.3,  0.4],
       [ 5.1,  3.5,  1.4,  0.3],
       [ 5.7,  3.8,  1.7,  0.3],
       [ 5.1,  3.8,  1.5,  0.3],
       [ 5.4,  3.4,  1.7,  0.2],
       [ 5.1,  3.7,  1.5,  0.4],
       [ 4.6,  3.6,  1. ,  0.2],
       [ 5.1,  3.3,  1.7,  0.5],
       [ 4.8,  3.4,  1.9,  0.2],
       [ 5. ,  3. ,  1.6,  0.2],
       [ 5. ,  3.4,  1.6,  0.4],
       [ 5.2,  3.5,  1.5,  0.2],
       [ 5.2,  3.4,  1.4,  0.2],
       [ 4.7,  3.2,  1.6,  0.2],
       [ 4