# SVM Classifier with the Iris Dataset

Support Vector Machines (SVMs) are supervised machine learning algorithms used for classification and regression tasks. For classification, an SVM works by finding the hyperplane that separates the classes with the widest possible margin.

## Overview

In this tutorial, we will cover:

1. Introduction to SVM Classifier
2. Understanding the Concept of Hyperplanes and Margins
3. SVM with the Iris Dataset
4. Evaluating the SVM Classifier
5. Advantages and Limitations of SVM

Let's start with the concepts behind the SVM classifier.

## Understanding the Concept of Hyperplanes and Margins

A hyperplane is a flat affine subspace with one dimension less than its ambient space. For instance, in a 2-dimensional space a hyperplane is a line, and in a 3-dimensional space it is a plane.

In the context of SVM, a hyperplane is used to separate data points of different classes. The objective is to find the optimal hyperplane that maximizes the margin between two classes.

The **margin** is the distance between the hyperplane and the nearest data point of either class. The goal of SVM is to maximize this margin. The data points that lie closest to the hyperplane, and therefore determine its position and orientation, are called **support vectors**.
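
In symbols, the ideas above can be sketched as follows. The notation is an assumption introduced here for illustration: a weight vector $\mathbf{w}$, a bias $b$, and labels $y_i \in \{-1, +1\}$, which is the common textbook convention for the hard-margin (linearly separable) case.

$$
\begin{aligned}
\text{hyperplane:}\quad & \mathbf{w}^\top \mathbf{x} + b = 0 \\
\text{distance from } \mathbf{x}_i \text{ to the hyperplane:}\quad & \frac{\lvert \mathbf{w}^\top \mathbf{x}_i + b \rvert}{\lVert \mathbf{w} \rVert} \\
\text{hard-margin SVM:}\quad & \min_{\mathbf{w},\, b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^2
\quad \text{subject to}\quad y_i\,(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 \ \text{ for all } i
\end{aligned}
$$

Under this scaling, the nearest points satisfy $y_i(\mathbf{w}^\top \mathbf{x}_i + b) = 1$, so their distance to the hyperplane is $1 / \lVert \mathbf{w} \rVert$. Minimizing $\lVert \mathbf{w} \rVert$ is therefore the same as maximizing the margin, and the points where the constraint holds with equality are the support vectors.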

SVMs can be used for both linear and non-linear classification. For non-linear classification, SVM relies on a technique called the kernel trick. A kernel function computes inner products as if the input data had been mapped into a higher-dimensional space, where a hyperplane can separate the classes, without ever constructing that mapping explicitly.
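
To make the kernel trick concrete, here is a minimal NumPy sketch. It assumes a degree-2 homogeneous polynomial kernel on 2-dimensional inputs; the helper names `phi` and `poly2_kernel` are hypothetical and chosen only for this illustration. It checks that the kernel value computed from the original inputs matches the inner product in the explicit 3-dimensional feature space.

```python
import numpy as np

def phi(x):
    """Explicit feature map matching the degree-2 polynomial kernel (2-D input)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Inner product computed after mapping both points into the 3-D feature space
explicit = np.dot(phi(x), phi(z))

# Same quantity computed directly from the 2-D inputs, with no mapping
implicit = poly2_kernel(x, z)

print(explicit, implicit)
```

Both numbers agree up to floating-point error. The kernel gets there without building `phi(x)`, which matters when the feature space is very large or, as with the RBF kernel, infinite-dimensional.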

Next, we'll apply the SVM classifier to the Iris dataset and visualize the decision boundaries.

In [None]:
import matplotlib.pyplot as plt
from sklearn import datasets
import numpy as np

# Load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Extract only the features of interest (sepal length and sepal width) and the relevant classes (setosa and versicolor)
X = X[y != 2, :2]
y = y[y != 2]

# Plot the data
plt.scatter(X[y == 0][:, 0], X[y == 0][:, 1], color='red', marker='o', label='setosa')
plt.scatter(X[y == 1][:, 0], X[y == 1][:, 1], color='blue', marker='x', label='versicolor')
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.legend(loc='upper left')
plt.title('Iris Dataset (setosa and versicolor)')
plt.show()

In [None]:
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1, stratify=y)

# Standardize the features
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

# Train an SVM classifier
svm = SVC(kernel='linear', C=1.0, random_state=1)
svm.fit(X_train_std, y_train)

# Predict the class labels
y_pred = svm.predict(X_test_std)

# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
accuracy
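
Accuracy alone hides which class the errors fall in. As a rough sketch of a fuller evaluation, the snippet below repeats the data preparation and training from the cell above so that it runs on its own, then prints a confusion matrix and a per-class report. The choice of `confusion_matrix` and `classification_report` is an addition for illustration, not part of the original workflow.

```python
from sklearn import datasets
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data preparation as above: setosa vs. versicolor, sepal length and width
iris = datasets.load_iris()
X = iris.data[iris.target != 2, :2]
y = iris.target[iris.target != 2]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Fit the scaler on the training data only, then apply it to the test data
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

svm = SVC(kernel='linear', C=1.0, random_state=1)
svm.fit(X_train_std, y_train)
y_pred = svm.predict(X_test_std)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall and F1 score
report = classification_report(
    y_test, y_pred, target_names=['setosa', 'versicolor'])
print(report)
```

Off-diagonal entries of the confusion matrix count the misclassified test samples, so a perfect classifier produces a diagonal matrix.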

In [None]:
from matplotlib.colors import ListedColormap

def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.02):
    # Setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])

    # Plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.3, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())

    # Plot class samples
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1],
                    alpha=0.8, c=colors[idx],
                    marker=markers[idx], label=cl,
                    edgecolor='black')

    # Highlight test samples
    if test_idx is not None:
        X_test, y_test = X[test_idx, :], y[test_idx]
        plt.scatter(X_test[:, 0], X_test[:, 1],
                    c='g', edgecolor='black', alpha=1.0,
                    linewidth=1, marker='o',
                    s=100, label='test set')

# Combine the training and test datasets for visualization
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))

# Plot the decision regions
plot_decision_regions(X_combined_std, y_combined, classifier=svm, test_idx=range(70, 100))
plt.xlabel('Sepal length [standardized]')
plt.ylabel('Sepal width [standardized]')
plt.legend(loc='upper left')
plt.title('SVM Classifier on Iris Dataset')
plt.show()
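
The decision boundary in the plot is determined by the support vectors. The following sketch, which refits the same linear model so that it is self-contained, shows one way to inspect them through the fitted attributes `coef_`, `intercept_`, `support_`, `support_vectors_` and `n_support_`. The margin computation assumes the usual scaling in which the margin boundaries sit at decision values of -1 and +1.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[iris.target != 2, :2]
y = iris.target[iris.target != 2]

X_train, _, y_train, _ = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)
X_train_std = StandardScaler().fit_transform(X_train)

svm = SVC(kernel='linear', C=1.0, random_state=1).fit(X_train_std, y_train)

# For a linear kernel the hyperplane is w . x + b = 0
w = svm.coef_[0]
b = svm.intercept_[0]

# Distance from the hyperplane to each margin boundary
margin = 1.0 / np.linalg.norm(w)

# Training points that define the boundary, and how many belong to each class
support_vectors = svm.support_vectors_
n_per_class = svm.n_support_

# Label (-1 or +1) times decision value: support vectors lie on or inside the margin
y_signed = np.where(y_train[svm.support_] == 1, 1, -1)
functional_margin = y_signed * svm.decision_function(support_vectors)

print("w =", w, "b =", b)
print("margin =", margin)
print("support vectors per class =", n_per_class)
```

Only these points enter the decision function. Removing any other training sample and refitting would leave the hyperplane unchanged.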

## Advantages and Limitations of SVM

### Advantages:

1. **Effective in High-Dimensional Spaces**: SVM works well when the number of features is large, including cases where it exceeds the number of samples.
2. **Versatility**: Different kernel functions can be specified for the decision function, making SVM versatile for various datasets.
3. **Memory Efficient**: SVM uses only a subset of training points (support vectors) in the decision function.
4. **Robust**: With a soft margin (controlled by the `C` parameter), SVM can tolerate some outliers and misclassified points, and with the kernel trick it can produce accurate results even when the data is not linearly separable.
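
The versatility point can be illustrated with a small sketch. It assumes a synthetic dataset of two concentric circles generated with `make_circles`, which is not part of the Iris example, and the parameter values are arbitrary choices for illustration. It compares a linear kernel with an RBF kernel on data that no straight line can separate.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: not linearly separable in the original 2-D space
X_c, y_c = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_c, y_c, test_size=0.3, random_state=0, stratify=y_c)

# Fit one classifier per kernel and record its test accuracy
scores = {}
for kernel in ('linear', 'rbf'):
    clf = SVC(kernel=kernel, C=1.0)
    clf.fit(Xc_train, yc_train)
    scores[kernel] = clf.score(Xc_test, yc_test)

print(scores)
```

The RBF kernel should score far higher here, because it can draw a closed boundary around the inner ring while the linear kernel is limited to a straight line.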

### Limitations:

1. **Sensitive to Kernel Choice**: The choice of kernel and its parameters can greatly influence the performance of SVM.
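2. **Not Suitable for Large Datasets**: Training a kernel SVM can be computationally intensive and time-consuming, because the fit time grows at least quadratically with the number of samples.
3. **No Direct Probability Estimates**: SVM outputs a signed distance to the hyperplane rather than a probability. Probability estimates require an additional calibration step, which adds training cost.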
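
As a hedged illustration of the last point, the sketch below contrasts the raw decision values with calibrated probabilities. It assumes a scikit-learn version in which `SVC` accepts `probability=True`, an option that fits an extra internal calibration step and makes training slower. The model is fit on the full two-class subset purely to keep the example short.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[iris.target != 2, :2]
y = iris.target[iris.target != 2]
X_std = StandardScaler().fit_transform(X)

# probability=True adds a calibration step on top of the usual SVM fit
clf = SVC(kernel='linear', C=1.0, probability=True, random_state=1)
clf.fit(X_std, y)

# Signed decision values: the sign gives the predicted side of the hyperplane
scores = clf.decision_function(X_std[:5])

# Calibrated class probabilities, one column per class
proba = clf.predict_proba(X_std[:5])

print(np.round(scores, 3))
print(np.round(proba, 3))
```

Because the probabilities come from a separate calibration, they can occasionally disagree with `predict` for samples close to the boundary.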

In conclusion, SVM is a powerful and versatile classifier that can be used for both linear and non-linear classification tasks. However, it's essential to choose the right kernel and tune the hyperparameters for optimal performance. The Iris dataset example demonstrated the capability of SVM in finding the optimal hyperplane that maximizes the margin between classes.
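
One possible way to choose the kernel and tune the hyperparameters is a cross-validated grid search. The sketch below is only an illustration: the grid values are hypothetical, and the pipeline with `StandardScaler` is an assumption made so that scaling is refit inside each cross-validation fold.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
X = iris.data[iris.target != 2, :2]
y = iris.target[iris.target != 2]

# Scaling lives inside the pipeline so each fold is standardized on its own training part
pipe = make_pipeline(StandardScaler(), SVC())

# Hypothetical search grid; gamma has no effect on the linear kernel
param_grid = {
    'svc__kernel': ['linear', 'rbf'],
    'svc__C': [0.1, 1.0, 10.0],
    'svc__gamma': ['scale', 0.1, 1.0],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

`best_score_` is the mean cross-validated accuracy of the best combination, so a held-out test set is still needed for a final, unbiased estimate.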