# SVM - Project Iris

In this project I analyze the Iris flower data set, also known as Fisher's Iris data set, a multivariate data set introduced by Sir Ronald Fisher in 1936 as an example of discriminant analysis.
The data set consists of 50 samples from each of three species of iris (Iris setosa, Iris virginica and Iris versicolor), 150 samples in total. Four characteristics of each sample were measured: the length and width of the sepals and petals, in centimeters.

The first step is to import the basic libraries. Seaborn can also provide the Iris data set (via `sns.load_dataset`), but in this project I read it from a CSV file.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
iris = pd.read_csv('../input/iris-flower-dataset/IRIS.csv')

Afterwards I will do a light exploratory analysis of the data.

In [None]:
iris.head()

In [None]:
iris.info()
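
Since the description above says each species contributes 50 samples and each flower has four measurements, a quick sanity check could look like the sketch below. It uses scikit-learn's bundled copy of the data (`load_iris`) as a stand-in so that it runs without the CSV file; that substitution, and the names `frame` and `counts`, are assumptions and not part of the original notebook.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris copy stands in for IRIS.csv here.
bunch = load_iris(as_frame=True)
frame = bunch.frame  # four feature columns plus an integer 'target' column

# Map the integer labels to species names for readability.
frame['species'] = frame['target'].map(dict(enumerate(bunch.target_names)))

counts = frame['species'].value_counts()       # samples per species
n_features = len(bunch.feature_names)          # number of measurements
n_missing = int(frame.isna().sum().sum())      # total missing cells

print(counts)
print('features:', n_features, 'missing values:', n_missing)
```

On the CSV-based `iris` frame, the same idea would be `iris['species'].value_counts()`.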

The pairplot below shows that the species form well-separated clusters (setosa in particular), which makes this data a good fit for an SVM classifier.

In [None]:
sns.pairplot(data=iris, hue='species', diag_kind='hist', palette='Dark2')

In [None]:
setosa = iris[iris['species']=='Iris-setosa']
sns.set_style('dark')
# fill=True replaces the deprecated shade=True argument
sns.kdeplot(x='sepal_width', y='sepal_length', data=setosa, cmap='plasma', fill=True)

# Training the Model

After this brief analysis of the data, I will train the model, starting by separating the features from the target and then splitting both into training and test sets.

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
X = iris.drop('species', axis=1)
y = iris['species']

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
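
By default `train_test_split` shuffles randomly, so the split (and the scores that follow) can change from run to run. Below is a minimal sketch of a reproducible, class-balanced split. The seed of 42 is an arbitrary choice, and the bundled scikit-learn data is used as a stand-in for the CSV; both are assumptions, as are the `*_demo` and `Xtr`/`Xte` names.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: bundled Iris data stands in for the CSV used in the notebook.
X_demo, y_demo = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class proportions
# the same in the training and test sets.
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=42, stratify=y_demo
)

print(len(Xtr), len(Xte))
print(Counter(yte.tolist()))
```

With the notebook's own variables, this would amount to adding `random_state=...` and `stratify=y` to the existing call.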

After splitting the data, I fit the SVM model to the training set.

In [None]:
from sklearn.svm import SVC

In [None]:
svc = SVC()

In [None]:
svc.fit(X_train, y_train)

In [None]:
pred = svc.predict(X_test)

Now it's time for a first look at the results, using `classification_report` and `confusion_matrix` from `sklearn.metrics`.

In [None]:
from sklearn.metrics import classification_report, confusion_matrix

In [None]:
print(classification_report(y_test, pred))

In [None]:
print(confusion_matrix(y_test, pred))
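
When reading the matrix, rows are the true classes and columns are the predicted ones, so the diagonal holds the correct predictions. The sketch below uses made-up labels (not results from this notebook) to show how accuracy follows from the matrix; the names `y_true`, `y_hat` and `cm` are illustrative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only; not output from the model above.
labels = ['setosa', 'versicolor', 'virginica']
y_true = ['setosa', 'setosa', 'versicolor', 'versicolor', 'virginica', 'virginica']
y_hat = ['setosa', 'setosa', 'versicolor', 'virginica', 'virginica', 'virginica']

# Passing labels explicitly fixes the row/column order.
cm = confusion_matrix(y_true, y_hat, labels=labels)

# Rows are true classes, columns are predicted classes,
# so the diagonal counts the correct predictions.
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(accuracy)
```

Here one versicolor sample is predicted as virginica, which shows up as the single off-diagonal entry in the second row.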

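The model already performs very well, but as an example I will apply grid search (`GridSearchCV`), which tries every combination of the given hyperparameter values with cross-validation and keeps the best one. It is most useful when the default settings do not give good enough results.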

In [None]:
from sklearn.model_selection import GridSearchCV 

In [None]:
# The duplicated value 100 in 'C' was removed; it only repeated identical fits
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001]}

In [None]:
grid_cv = GridSearchCV(SVC(), param_grid, refit=True, verbose=2)

In [None]:
grid_cv.fit(X_train, y_train)
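
Grid search fits one model per parameter combination and cross-validation fold, then (with `refit=True`) retrains the best combination on the whole training set. Below is a sketch of how such a search could be inspected afterwards. It assumes the bundled scikit-learn data, a fixed seed, and an illustrative grid; the names `demo_grid` and `search` are hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split
from sklearn.svm import SVC

# Assumption: bundled Iris data and a fixed seed, for a self-contained demo.
X_demo, y_demo = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(
    X_demo, y_demo, test_size=0.30, random_state=0, stratify=y_demo
)

# Illustrative grid: 4 values of C times 4 values of gamma.
demo_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001]}
n_candidates = len(ParameterGrid(demo_grid))

search = GridSearchCV(SVC(), demo_grid, cv=5, refit=True)
search.fit(Xtr, ytr)

print(n_candidates, 'candidates x 5 folds =', n_candidates * 5, 'fits')
print(search.best_params_)     # best combination found
print(search.best_score_)      # its mean cross-validated accuracy
print(search.score(Xte, yte))  # accuracy of the refitted model on the test set
```

On the notebook's own object, the corresponding attributes would be `grid_cv.best_params_` and `grid_cv.best_score_`.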

In [None]:
pred = grid_cv.predict(X_test)

Now I will again produce a classification report and a confusion matrix to look for possible improvements.

In [None]:
print(classification_report(y_test, pred))

In [None]:
print(confusion_matrix(y_test, pred))

From the results above, we see that grid search brought no improvement here: its scores are the same as those of the default model.
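
Because a single 45-sample test split gives a fairly coarse comparison, one way to double-check a conclusion like this is cross-validation on the full data set. The sketch below is an illustration under assumptions, not part of the original notebook: it uses scikit-learn's bundled Iris data and a hypothetical grid named `demo_grid`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

# Assumption: bundled Iris data stands in for the CSV used in the notebook.
X_demo, y_demo = load_iris(return_X_y=True)

# Hypothetical grid, used only for this comparison.
demo_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001]}

# 5-fold cross-validated accuracy of the default model.
default_scores = cross_val_score(SVC(), X_demo, y_demo, cv=5)

# Same outer folds, but each training fold runs its own inner grid search.
tuned_scores = cross_val_score(
    GridSearchCV(SVC(), demo_grid, cv=5), X_demo, y_demo, cv=5
)

print('default:', default_scores.mean())
print('tuned:  ', tuned_scores.mean())
```

If the two means are close, that would support the observation that tuning adds little on this data set.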

This concludes the project of applying an SVM model to the Iris data.