# Introduction

This notebook covers the Iris flower data set (also known as Fisher's Iris data set). The aim is to classify iris flowers into one of three species (setosa, versicolor or virginica) using the length and width of their sepals and petals. It is a multivariate data set, meaning that each instance has more than one attribute; here each instance has four. The data set contains three classes of fifty instances each, for a total of 150. The three classes are *Iris-setosa*, *Iris-versicolor* and *Iris-virginica*, and the four attributes are *sepal_length*, *sepal_width*, *petal_length* and *petal_width*. Using a combination of these four features, Fisher developed a linear discriminant model to distinguish the species from each other.
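
As a rough illustration of the linear discriminant idea mentioned above, the sketch below fits scikit-learn's `LinearDiscriminantAnalysis` to the four features. It assumes the copy of the data bundled with scikit-learn (`load_iris`) rather than a CSV file, and it is a modern stand-in for Fisher's analysis, not a reproduction of it. The sample flower at the end is an illustrative input only.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Assumption: use the copy of the data set bundled with scikit-learn
iris = load_iris()
X, y = iris.data, iris.target  # X has shape (150, 4); y holds 0, 1, 2

# Fit a linear discriminant model on all four measurements
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)

# Accuracy on the data the model was fitted to (an optimistic estimate)
training_accuracy = lda.score(X, y)
print(f"Training accuracy: {training_accuracy:.3f}")

# Classify one flower with setosa-like measurements (values in cm)
sample_flower = [[5.0, 3.4, 1.5, 0.2]]
predicted_species = iris.target_names[lda.predict(sample_flower)[0]]
print("Predicted species:", predicted_species)
```

Because the accuracy is measured on the training data itself, it overstates how well the model would do on unseen flowers; a held-out validation split gives a fairer estimate.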

The *Iris-setosa* class is linearly separable from both *Iris-versicolor* and *Iris-virginica*, but *Iris-versicolor* and *Iris-virginica* are NOT linearly separable from each other. Without the species labels that Ronald Fisher used, these two species are hard to tell apart from the measurements alone, which makes this data set a good example of the difference between supervised and unsupervised techniques in data mining. The two are defined as follows:
-  [Supervised Learning](https://en.wikipedia.org/wiki/Supervised_learning)<br>
    *"Supervised learning is the Data mining task of inferring a function from labeled training data. The training data consist of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow for the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a “reasonable” way."*
-  [Unsupervised Learning](https://en.wikipedia.org/wiki/Unsupervised_learning)<br>
    *"In Data mining, the problem of unsupervised learning is that of trying to find hidden structure in unlabeled data. Since the examples given to the learner are unlabeled, there is no error or reward signal to evaluate a potential solution."*
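
The separability claim and the supervised/unsupervised contrast above can be checked with a short sketch. It assumes scikit-learn's bundled copy of the data (`load_iris`), and the helper `linear_training_accuracy` is a hypothetical name introduced here. A linear SVM with a large penalty `C` serves as an approximate separability test (a training accuracy of 1.0 means a separating hyperplane was found), while k-means stands in for an unsupervised technique that never sees the labels.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target  # 0 = setosa, 1 = versicolor, 2 = virginica


def linear_training_accuracy(features, labels):
    """Fit a linear SVM with a large penalty and return its training accuracy."""
    classifier = SVC(kernel="linear", C=1000.0)
    classifier.fit(features, labels)
    return classifier.score(features, labels)


# Supervised: the labels are available, so we can ask whether
# a single hyperplane separates the classes
setosa_vs_rest = linear_training_accuracy(X, (y == 0).astype(int))
not_setosa = y != 0
versicolor_vs_virginica = linear_training_accuracy(X[not_setosa], y[not_setosa])
print("setosa vs rest:", setosa_vs_rest)
print("versicolor vs virginica:", versicolor_vs_virginica)

# Unsupervised: k-means only sees the measurements, never the labels
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
agreement = adjusted_rand_score(y, clusters)
print("Adjusted Rand index:", round(agreement, 3))
```

If the claim holds, the first accuracy should be 1.0 and the second should fall short of it. The adjusted Rand index (1.0 would mean the clusters match the species exactly) indicates how far clustering gets without the labels.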

Below is an image displaying the supervised learning network architecture required to represent the classification function:

![Image](https://www.neuraldesigner.com/images/learning/iris_flowers_neural_network_graph.png)

The architecture is broken down into the following sections:
-  Inputs<br>The inputs section contains information about the input variables of the neural network. These input variables are the four attributes defined above.
-  Scaling layer<br>This layer contains information about how the input variables are scaled. We will use the minimum and maximum method for scaling.
-  Neural network<br>The neural network must have four inputs and three output neurons, one for each class.
-  Probabilistic layer<br>The probabilistic layer allows the outputs to be interpreted as probabilities.
-  Outputs<br>The outputs from this neural network are the probability of each class.
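
The sections listed above can be approximated with scikit-learn. This is a minimal sketch, not the network shown in the image: it assumes the bundled `load_iris` data, uses `MinMaxScaler` for the scaling layer, and uses `MLPClassifier` for the network, whose multi-class output is a softmax and so plays the role of the probabilistic layer. The single hidden layer of 6 neurons, `max_iter` and `random_state` are arbitrary choices made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

iris = load_iris()
X, y = iris.data, iris.target  # four inputs, three classes

network = make_pipeline(
    MinMaxScaler(),  # scaling layer: rescale each input to the range [0, 1]
    MLPClassifier(
        hidden_layer_sizes=(6,),  # assumption: hidden layer size chosen arbitrarily
        max_iter=2000,
        random_state=0,
    ),
)
network.fit(X, y)

# Outputs: one probability per class for each flower
probabilities = network.predict_proba(X[:5])
print(probabilities.round(3))
```

Each row of `probabilities` has three entries, one per species, and the entries in a row sum to 1, so the largest entry can be read as the predicted class.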

# Implementation

In this notebook I will be using a series of libraries, all of which are imported below. As these libraries are not the core focus of this notebook they will not be covered here, but you can find out more about them at the following links:
-  [pandas](https://pandas.pydata.org/)
-  [matplotlib](https://matplotlib.org/)
-  [sklearn](http://scikit-learn.org/stable/)

##### Loading the libraries

In [2]:
import pandas
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

##### Loading the data set with specified names.

In [3]:
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)

##### Breakdown of the data

In [7]:
# shape of the data set
print(dataset.shape)

(150, 5)


In [10]:
# class distribution of the data set
print(dataset.groupby('class').size())

class
Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
dtype: int64


In [6]:
# view the first 20 rows of the data set
print(dataset.head(20))

    sepal-length  sepal-width  petal-length  petal-width        class
0            5.1          3.5           1.4          0.2  Iris-setosa
1            4.9          3.0           1.4          0.2  Iris-setosa
2            4.7          3.2           1.3          0.2  Iris-setosa
3            4.6          3.1           1.5          0.2  Iris-setosa
4            5.0          3.6           1.4          0.2  Iris-setosa
5            5.4          3.9           1.7          0.4  Iris-setosa
6            4.6          3.4           1.4          0.3  Iris-setosa
7            5.0          3.4           1.5          0.2  Iris-setosa
8            4.4          2.9           1.4          0.2  Iris-setosa
9            4.9          3.1           1.5          0.1  Iris-setosa
10           5.4          3.7           1.5          0.2  Iris-setosa
11           4.8          3.4           1.6          0.2  Iris-setosa
12           4.8          3.0           1.4          0.1  Iris-setosa
13           4.3    

In [9]:
# statistical description of the data set
print(dataset.describe())

       sepal-length  sepal-width  petal-length  petal-width
count    150.000000   150.000000    150.000000   150.000000
mean       5.843333     3.054000      3.758667     1.198667
std        0.828066     0.433594      1.764420     0.763161
min        4.300000     2.000000      1.000000     0.100000
25%        5.100000     2.800000      1.600000     0.300000
50%        5.800000     3.000000      4.350000     1.300000
75%        6.400000     3.300000      5.100000     1.800000
max        7.900000     4.400000      6.900000     2.500000


##### Visualising the data
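
A minimal sketch of how the data could be visualised with `matplotlib.pyplot` and the `scatter_matrix` helper from `pandas.plotting`. The choice of plots is an assumption made for illustration, and the data frame is rebuilt from scikit-learn's bundled copy of the data (`load_iris`) so that the example runs without network access.

```python
import matplotlib.pyplot as plt
import pandas
from pandas.plotting import scatter_matrix
from sklearn.datasets import load_iris

# Assumption: rebuild the data frame from scikit-learn's bundled copy,
# with hyphenated column names and "Iris-" class labels
iris = load_iris()
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width']
dataset = pandas.DataFrame(iris.data, columns=names)
dataset['class'] = ['Iris-' + iris.target_names[i] for i in iris.target]

# box-and-whisker plot of each attribute
dataset[names].plot(kind='box', subplots=True, layout=(2, 2),
                    sharex=False, sharey=False)

# histogram of each attribute
hist_axes = dataset[names].hist()

# scatter plot matrix of every pair of attributes
scatter_axes = scatter_matrix(dataset[names])
```

In a notebook the figures render when the cell finishes; in a plain script, call `plt.show()` afterwards.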

# Sources

[Iris Flower Classification](https://www.neuraldesigner.com/learning/examples/iris_flowers_classification)<br>
[Iris Data Set](https://archive.ics.uci.edu/ml/datasets/iris)<br>
[Iris Data Set Wiki](https://en.wikipedia.org/wiki/Iris_flower_data_set)<br>
[Python Iris Data Set Implementation](https://www.kaggle.com/jchen2186/machine-learning-with-iris-dataset)<br>
[Python Iris Data Set Walkthrough](https://machinelearningmastery.com/machine-learning-in-python-step-by-step/)<br>