# ***Iris flower classification***
### **Here are the steps used in this notebook** <br>
*Step 1: Explore the iris data* <br>
*Step 2: Prepare the X and Y values* <br>
*Step 3: Import the machine learning algorithm* <br>
*Step 4: Train the ML model to fit the data* <br>
*Step 5: Predict on the given input* <br>

### ***Loading important packages***


In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [2]:
''' 
1. sepal length in cm
2. sepal width in cm
3. petal length in cm
4. petal width in cm
5. class:
-- Iris Setosa
-- Iris Versicolour
-- Iris Virginica
'''
columns=['sepal length','sepal width','petal length','petal width','class']
df=pd.read_csv('/content/iris.data',names=columns)
df.head()

,sepal length,sepal width,petal length,petal width,class
0,5.1,3.5,1.4,0.2,Iris-setosa
1,4.9,3.0,1.4,0.2,Iris-setosa
2,4.7,3.2,1.3,0.2,Iris-setosa
3,4.6,3.1,1.5,0.2,Iris-setosa
4,5.0,3.6,1.4,0.2,Iris-setosa


In [3]:
# Explore the data: count the samples in each class
df['class'].value_counts()

Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: class, dtype: int64

### ***Preparing the data***

In [4]:
''' The data has 3 classes: setosa, versicolor, virginica.
Map the class labels to the integers 0, 1, 2.
'''
# dictionary mapping each class label to an integer
Iris_labels={'Iris-setosa':0,'Iris-versicolor':1,'Iris-virginica':2}
# replace the string labels in the class column with the numeric labels using map
df['class']=df['class'].map(Iris_labels)

In [5]:
df.head()

,sepal length,sepal width,petal length,petal width,class
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0


In [6]:
X=df[['sepal length','sepal width','petal length','petal width']].values
Y=df[['class']].values
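
The double brackets in `df[['class']]` select a one-column DataFrame, so `Y` above is a column vector of shape `(150, 1)` rather than a flat array of shape `(150,)`. scikit-learn estimators expect a 1-D target and warn when given a column vector. A minimal sketch of the difference, using a tiny stand-in frame (`demo` and its values are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Tiny stand-in frame (hypothetical values) with the same column name as df
demo = pd.DataFrame({'class': [0, 1, 2, 1]})

y_2d = demo[['class']].values  # double brackets -> DataFrame -> shape (4, 1)
y_1d = demo['class'].values    # single brackets -> Series    -> shape (4,)

# ravel() flattens the column vector into the 1-D form
y_flat = y_2d.ravel()

print(y_2d.shape, y_1d.shape, y_flat.shape)
```

Either `df['class'].values` or `Y.ravel()` gives the 1-D form; the label values themselves are unchanged.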

In [7]:
# print(X)
# print(Y)

### ***Importing the ML algorithm***

In [8]:
#using logistic regression model
from sklearn.linear_model import LogisticRegression

In [9]:
model=LogisticRegression()

### ***Training the model***

In [11]:
# Y has shape (150, 1); ravel() flattens it to the 1-D target scikit-learn expects
model.fit(X,Y.ravel())


LogisticRegression()

### ***Training accuracy***

In [12]:
model.score(X,Y)

0.9733333333333334
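
The score above is measured on the same rows the model was trained on, so it may overstate how well the model does on flowers it has not seen. A minimal sketch of a held-out check follows. It assumes scikit-learn's bundled copy of the iris data as a stand-in for `iris.data`; the names `X_demo`, `y_demo`, and `clf`, the 80/20 split, `random_state=0`, and `max_iter=1000` are illustrative choices, not part of this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumption: the bundled iris copy stands in for /content/iris.data
X_demo, y_demo = load_iris(return_X_y=True)

# Hold out 20% of the rows; stratify keeps the three classes balanced
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=0
)

# max_iter is raised only to give the solver room to converge
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)  # accuracy on rows seen in training
test_acc = clf.score(X_test, y_test)     # accuracy on held-out rows

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
```

With only 30 held-out flowers, each misclassification moves the test accuracy by about 3 points, so the figure varies noticeably with the split.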

### ***Prediction of input***

In [13]:
# predict on the training inputs X; Y holds the expected labels
predicted=model.predict(X)
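
The cell above predicts on the training rows. Predicting for a new flower works the same way, as long as the input is a 2-D array with one row per flower and the four features in the training order. A minimal sketch, assuming scikit-learn's bundled iris copy as a stand-in for `iris.data` (its targets use the same 0, 1, 2 ordering as `Iris_labels`); `clf`, `label_names`, and `new_flower` are hypothetical names introduced here:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Assumption: the bundled iris copy stands in for /content/iris.data
X_demo, y_demo = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# Inverse of the Iris_labels mapping, to turn integers back into names
label_names = {0: 'Iris-setosa', 1: 'Iris-versicolor', 2: 'Iris-virginica'}

# One flower: sepal length, sepal width, petal length, petal width (cm)
new_flower = np.array([[5.1, 3.5, 1.4, 0.2]])  # shape (1, 4)

pred = clf.predict(new_flower)[0]          # predicted integer label
proba = clf.predict_proba(new_flower)[0]   # one probability per class
predicted_name = label_names[int(pred)]

print(predicted_name, proba.round(3))
```

`predict_proba` is worth printing alongside the label: for flowers near the versicolor/virginica boundary the top two probabilities can be close.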

### ***Checking the accuracy of prediction***
#### ***Comparing expected to predicted***

In [14]:
from sklearn import metrics

In [15]:
''' 
The classification report to check accuracy
'''
print(metrics.classification_report(Y,predicted))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       0.98      0.94      0.96        50
           2       0.94      0.98      0.96        50

    accuracy                           0.97       150
   macro avg       0.97      0.97      0.97       150
weighted avg       0.97      0.97      0.97       150



**The classification report shows that setosa (class 0) is classified perfectly, while the model confuses some versicolor (class 1) and virginica (class 2) samples with each other.**

In [17]:
''' To better see and understand the values predicted
for each class '''
print(metrics.confusion_matrix(Y,predicted))

[[50  0  0]
 [ 0 47  3]
 [ 0  1 49]]
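
In this matrix each row is a true class and each column a predicted class, so the off-diagonal entries are the mistakes: 3 versicolor flowers predicted as virginica and 1 virginica predicted as versicolor. The per-class figures in the classification report can be recomputed from it directly. A short sketch using the matrix printed above (the variable names are illustrative):

```python
import numpy as np

# Confusion matrix copied from the output above
# rows = true class, columns = predicted class
cm = np.array([[50,  0,  0],
               [ 0, 47,  3],
               [ 0,  1, 49]])

correct = np.diag(cm)                 # correctly classified count per class
recall = correct / cm.sum(axis=1)     # divide by row totals (true counts)
precision = correct / cm.sum(axis=0)  # divide by column totals (predicted counts)
accuracy = correct.sum() / cm.sum()   # 146 correct out of 150

print("recall:   ", np.round(recall, 2))     # about 1.00, 0.94, 0.98
print("precision:", np.round(precision, 2))  # about 1.00, 0.98, 0.94
print("accuracy: ", round(float(accuracy), 4))
```

These match the report: versicolor has recall 47/50 and precision 47/48, virginica has recall 49/50 and precision 49/52.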
