**Step 1: Importing required libraries**

In [None]:
import pandas as pd
import numpy as np

**Step 2: Loading the Data**

In [None]:
from sklearn.datasets import load_iris
iris_dataset = load_iris()

**Step 3: Exploring the Dataset**

In [None]:
print('Keys of iris dataset :\n{}'.format(iris_dataset.keys()))

Keys of iris dataset :
dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])


**Let us now check the description of the dataset, which is stored under the key 'DESCR'. Only the first 500 characters are printed here. To read more of the description, increase the slice limit.**

In [None]:
print(iris_dataset['DESCR'][:500] + '\n...')

.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                

...


In [None]:
print('Target names : {}'.format(iris_dataset['target_names']))

Target names : ['setosa' 'versicolor' 'virginica']


**As we can see, the dataset covers 3 species of the iris flower, with the species names:**
1. Setosa
2. Versicolor
3. Virginica

In [None]:
print('Feature names : {}'.format(iris_dataset['feature_names']))

Feature names : ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']


**The features recorded for each flower in the data are its**
1. Sepal Length
2. Sepal Width
3. Petal Length
4. Petal Width

**Also note that all the values are given in centimetres (cm).**

In [None]:
print('Type of data: {}'.format(type(iris_dataset['data'])))

# the value stored under the key 'data' is a NumPy array

Type of data: <class 'numpy.ndarray'>


In [None]:
print('Shape of data: {}'.format(iris_dataset['data'].shape))

Shape of data: (150, 4)


In [None]:
print('First five rows of data: \n{}'.format(iris_dataset['data'][:5]))

First five rows of data: 
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]


**From these rows we can see that the first 5 flowers all have the same petal width of 0.2 cm, and that among these five the first flower has the longest sepal at 5.1 cm.**
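
The same observation can be checked with pandas instead of reading the array by eye. This is a minimal sketch, not part of the original notebook: the names `iris_df` and `first_five` are introduced here for illustration only.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# Wrap the feature matrix in a DataFrame so each column carries its feature name.
# (iris_df and first_five are illustrative names, not from the original notebook.)
iris_df = pd.DataFrame(iris_dataset['data'],
                       columns=iris_dataset['feature_names'])

first_five = iris_df.head()
print(first_five)

# Number of distinct petal widths among the first five flowers.
print(first_five['petal width (cm)'].nunique())

# Longest sepal among the first five flowers.
print(first_five['sepal length (cm)'].max())
```

Labelled columns make it harder to mix up rows (flowers) with columns (features).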

In [None]:
print('Type of target: {}'.format(type(iris_dataset['target'])))

# the value stored under the key 'target' is also a NumPy array

Type of target: <class 'numpy.ndarray'>


In [None]:
print('Shape of target: {}'.format(iris_dataset['target'].shape))

Shape of target: (150,)


In [None]:
print('Target:\n {}'.format(iris_dataset['target']))

Target:
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]


**The values in 'target' are 0, 1 and 2. Each value is an index into 'target_names': 0 stands for SETOSA, 1 for VERSICOLOR and 2 for VIRGINICA. We can also write this as**
* 0 --> SETOSA
* 1 --> VERSICOLOR
* 2 --> VIRGINICA
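
Because each target value is an index into 'target_names', the mapping can be applied to the whole array at once. The sketch below assumes only NumPy and the dataset loaded earlier. The names `species` and `counts` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# Index the array of names with the integer labels to get one name per flower.
species = iris_dataset['target_names'][iris_dataset['target']]
print(species[:3], species[-3:])

# Count how many flowers fall into each class.
counts = np.bincount(iris_dataset['target'])
for name, count in zip(iris_dataset['target_names'], counts):
    print('{} : {}'.format(name, count))
```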

**Step 4: Splitting the data into TRAIN and TEST**

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state = 0)

In [None]:
# printing the shape of train and test data

print('X Train shape = {}'.format(X_train.shape))
print('Y Train shape = {}'.format(y_train.shape))
print('X Test shape = {}'.format(X_test.shape))
print('Y Test shape = {}'.format(y_test.shape))

X Train shape = (112, 4)
Y Train shape = (112,)
X Test shape = (38, 4)
Y Test shape = (38,)
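
The 112/38 split comes from the default behaviour of `train_test_split`, which holds out 25% of the rows when no size is given. The sketch below writes that default out explicitly and, as an optional variation that the original notebook does not use, passes `stratify` so that the three species stay balanced in the test set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()

# Same split as above, with the default test fraction written out explicitly.
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'],
    test_size=0.25, random_state=0)
print(X_train.shape, X_test.shape)

# Optional variation (not used in this notebook): keep class proportions equal.
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'],
    test_size=0.25, random_state=0, stratify=iris_dataset['target'])
stratified_counts = np.bincount(ys_test)
print(stratified_counts)
```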


**Step 5: Implementing Machine Learning Model**

In [None]:
# importing the required model
from sklearn.neighbors import KNeighborsClassifier

# instantiating the object of the class
knn = KNeighborsClassifier(n_neighbors = 1)

In [None]:
# Fitting the model
knn.fit(X_train, y_train)

KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
                     metric_params=None, n_jobs=None, n_neighbors=1, p=2,
                     weights='uniform')

**Now, we will create a new array to give as input to the model. It holds one flower, so its shape is 1 row and 4 columns. We create it as a 2D array because the 'predict' method, to which we are going to pass it, expects a 2D array with one row per sample and one column per feature.**

In [None]:
X_new = np.array([[5, 2.9, 1, 0.2]])

print('X_new Shape = {}'.format(X_new.shape))

X_new Shape = (1, 4)
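
If the measurements start out as a flat, 1D array, they can be reshaped into the single-row 2D form. This is a small sketch using NumPy only. The name `flower` is introduced here for illustration.

```python
import numpy as np

# A single flower as a flat, 1D array of four measurements.
flower = np.array([5, 2.9, 1, 0.2])
print('1D shape = {}'.format(flower.shape))

# reshape(1, -1) turns it into one row, with the column count inferred.
X_new = flower.reshape(1, -1)
print('2D shape = {}'.format(X_new.shape))
```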


**Predicting the class of flower depending on the new data that we created above**

In [None]:
prediction = knn.predict(X_new)

print('Prediction : {}'.format(prediction))
print('Prediction target name : {}'.format(
    iris_dataset['target_names'][prediction]))

Prediction : [0]
Prediction target name : ['setosa']
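
With `n_neighbors=1` and the Euclidean metric shown in the model summary above (`metric='minkowski'`, `p=2`), the prediction is the label of the single closest training flower. The sketch below reproduces that idea by hand with NumPy. It is an illustration of the principle, not how scikit-learn implements the search internally, and the names `distances`, `nearest` and `manual_prediction` are introduced here.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

X_new = np.array([[5, 2.9, 1, 0.2]])

# Euclidean distance from the new flower to every training flower.
distances = np.sqrt(((X_train - X_new) ** 2).sum(axis=1))

# The label of the closest training flower is the 1-nearest-neighbour prediction.
nearest = np.argmin(distances)
manual_prediction = y_train[nearest]

print('Manual prediction : {}'.format(manual_prediction))
print('Manual prediction target name : {}'.format(
    iris_dataset['target_names'][manual_prediction]))
```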


**Predicting the class of flowers based on the test data**

In [None]:
y_pred = knn.predict(X_test)
print('Test set predictions:\n {}'.format(y_pred))

Test set predictions:
 [2 1 0 2 0 2 0 1 1 1 2 1 1 1 1 0 1 1 0 0 2 1 0 0 2 0 0 1 1 0 2 1 0 2 2 1 0
 2]


**Calculating the accuracy of the model, i.e. the fraction of test flowers whose species it predicted correctly, in order to check whether it is an effective model**

In [None]:
print('Test set score: {}'.format(np.mean(y_pred == y_test)))

Test set score: 0.9736842105263158
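
A score of about 0.974 on 38 test flowers corresponds to 37 correct predictions out of 38. The same number can be obtained from scikit-learn directly. The sketch below compares the manual calculation with the classifier's `score` method and with `accuracy_score`. The names `manual`, `from_score` and `from_metric` are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Three equivalent ways to compute the fraction of correct predictions.
manual = np.mean(y_pred == y_test)
from_score = knn.score(X_test, y_test)
from_metric = accuracy_score(y_test, y_pred)

print('Manual      : {}'.format(manual))
print('knn.score   : {}'.format(from_score))
print('accuracy    : {}'.format(from_metric))
```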
