# ML model in Scikit-Learn for the IRIS dataset 
The IRIS dataset contains sepal and petal measurements for three species of iris: Setosa, Versicolor, and Virginica.
Each row is one sample, and the four feature columns are Sepal Length, Sepal Width, Petal Length and Petal Width. The last column is the correct flower class.

Data Set Characteristics:
*   Number of Instances: 150 (50 in each of three classes)
*   Number of Attributes: 4 numeric, predictive attributes and the class
*   Attribute Information: sepal length in cm, sepal width in cm, petal length in cm, petal width in cm
*   Class: Iris-Setosa, Iris-Versicolor, Iris-Virginica
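
The characteristics listed above can be checked directly against the copy of the dataset bundled with scikit-learn. This is a minimal sketch; the variable names `n_samples`, `n_features` and `class_counts` are illustrative and are not used elsewhere in this notebook. Note that scikit-learn returns the class labels separately as `iris.target` rather than as a last column.

```python
from collections import Counter

from sklearn import datasets

iris = datasets.load_iris()

# iris.data holds the four numeric features; iris.target holds the class index
n_samples, n_features = iris.data.shape

# Count how many samples belong to each class, keyed by class name
class_counts = Counter(str(iris.target_names[label]) for label in iris.target)

print(n_samples, n_features)
print(iris.feature_names)
print(dict(class_counts))
```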


In [0]:
#Import datasets, train_test_split, model_selection, estimators and accuracy score
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import model_selection
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

In [0]:
#Load IRIS dataset and assign X (features) and y (label)
iris = datasets.load_iris()
X, y = iris.data, iris.target

In [0]:
#Split the data into 80% training and 20% testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
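
With 150 samples, an 80/20 split leaves 120 rows for training and 30 for testing. The sketch below checks those sizes and, as an optional variation that is not part of the original notebook, passes `stratify=y` so that each class keeps the same share of the test set. The names `Xs_train`, `Xs_test`, `ys_train` and `ys_test` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Same split as in the notebook: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)
print(X_train.shape, X_test.shape)

# A plain random split does not guarantee equal class counts in the test set
print(np.bincount(y_test))

# Variation (assumption, not in the original): keep class proportions equal
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X, y, test_size=0.20, random_state=42, stratify=y
)
print(np.bincount(ys_test))
```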

In [0]:
#List the estimators
models = []
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('SVM', SVC(gamma='auto')))

In [0]:
#Evaluation metric

scoring = 'accuracy'

In [0]:
#10-fold cross validation; Output accuracy mean and std for each estimator
results = []
names = []
#Folds are not shuffled, so no random_state is passed; recent scikit-learn
#versions raise a ValueError if random_state is set while shuffle=False
kfold = model_selection.KFold(n_splits=10)
for name, model in models:
    cv_results = model_selection.cross_val_score(model, X_train, y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

KNN: 0.950000 (0.055277)
CART: 0.941667 (0.053359)
SVM: 0.958333 (0.041667)
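
In recent scikit-learn releases, a seed on `KFold` only takes effect together with `shuffle=True`. The sketch below shows one way to run the same comparison with shuffled, seeded folds. It is a variation rather than the notebook's original setup, so its scores may differ from the ones printed above; the dictionary `shuffled_scores` and the seeded `DecisionTreeClassifier` are assumptions added for reproducibility.

```python
from sklearn import datasets
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

seed = 7
models = [
    ('KNN', KNeighborsClassifier()),
    # Seeding the tree makes its tie-breaking repeatable between runs
    ('CART', DecisionTreeClassifier(random_state=seed)),
    ('SVM', SVC(gamma='auto')),
]

# shuffle=True is what makes random_state meaningful for KFold
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)

shuffled_scores = {}
for name, model in models:
    scores = cross_val_score(model, X_train, y_train, cv=kfold, scoring='accuracy')
    shuffled_scores[name] = scores
    print("%s: %f (%f)" % (name, scores.mean(), scores.std()))
```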


In [0]:
#Train the KNN model and test it on the test dataset
knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)
print(accuracy_score(y_test, predictions))

1.0


In [0]:
#Train the SVM model and test it on the test dataset
svm = SVC(gamma='auto')
svm.fit(X_train, y_train)
predictions = svm.predict(X_test)
print(accuracy_score(y_test, predictions))

1.0
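
The value printed by `accuracy_score` is the fraction of test samples whose predicted class matches the true class. A minimal sketch of that calculation for the KNN model follows, together with a confusion matrix that breaks the same predictions down per class. The names `manual_accuracy` and `cm` are illustrative, and the confusion matrix is an addition that the original notebook does not compute.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
predictions = knn.predict(X_test)

# Library metric and the equivalent manual calculation
accuracy = accuracy_score(y_test, predictions)
manual_accuracy = np.mean(predictions == y_test)
print(accuracy, manual_accuracy)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, predictions, labels=[0, 1, 2])
print(cm)
```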
