[Support vector machine (SVM)](https://en.wikipedia.org/wiki/Support_vector_machine)
---
- a supervised max-margin model
- supports linear and nonlinear classification, regression and outlier detection
  - nonlinearity is achieved by kernel functions
- suitable for small to medium-sized datasets, including nonlinear ones
  - does not scale well to very large datasets
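
To make the kernel bullet above concrete, here is a minimal sketch comparing a linear kernel with an RBF kernel. It assumes a synthetic two-ring dataset from `make_circles`, which is not part of this notebook and is used only for illustration; the parameter values are arbitrary choices.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Illustrative toy data (an assumption, not from this notebook):
# two concentric rings, which no straight line can separate.
X_rings, y_rings = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

# Same estimator, two different kernels.
linear_svm = SVC(kernel='linear').fit(X_rings, y_rings)
rbf_svm = SVC(kernel='rbf').fit(X_rings, y_rings)

# Training accuracy is enough here to show which kernel can fit the rings.
linear_acc = linear_svm.score(X_rings, y_rings)
rbf_acc = rbf_svm.score(X_rings, y_rings)
print(f'linear kernel training accuracy: {linear_acc:.2f}')
print(f'RBF kernel training accuracy:    {rbf_acc:.2f}')
```

The linear kernel should stay close to chance on data like this, while the RBF kernel can fit the ring structure.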

In [None]:
import numpy as np, pandas as pd, matplotlib.pyplot as plt, matplotlib as mpl
import sklearn as skl, sklearn.datasets as skds

A (mostly) linearly separable dataset - [the iris flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set)
---
- consists of 3 species of iris
  - Setosa, Versicolour, and Virginica
- 150 samples (50 per species) with 4 features
  - Sepal Length, Sepal Width, Petal Length and Petal Width

In [None]:
iris = skds.load_iris(as_frame=True)
print(iris.DESCR)

In [None]:
iris.data.head(2)

In [None]:
# the plot below uses the first two features
_,ax1 = plt.subplots()
# iris.data.plot(ax=ax1, kind='scatter', x='sepal length (cm)', y='sepal width (cm)', c=iris.target)
iris_scatter = ax1.scatter(iris.data['sepal length (cm)'], iris.data['sepal width (cm)'], c=iris.target)
ax1.set(xlabel=iris.feature_names[0], ylabel=iris.feature_names[1])
_ = ax1.legend(iris_scatter.legend_elements()[0],
               iris.target_names,
               loc='lower right',
               title='Classes')

- with the first two features (sepal length and sepal width),
  - setosa is linearly separable from the other two classes
  - however, versicolor and virginica overlap and are not linearly separable from each other
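
One rough way to check the observation above numerically is to fit a linear SVM on the two sepal features and compare training accuracy for each pair of groups. This is a sketch, not a proof of separability: the helper `linear_train_accuracy` is a hypothetical name introduced here, and `C=100` is an assumed value chosen to penalize misclassified points strongly.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris_2d = load_iris(as_frame=True)
X_sepal = iris_2d.data[['sepal length (cm)', 'sepal width (cm)']]
y_all = iris_2d.target

def linear_train_accuracy(X, y):
    """Fit a linear SVM and return its accuracy on the data it was trained on."""
    # A large C (assumed value) approximates a hard margin.
    return SVC(kernel='linear', C=100).fit(X, y).score(X, y)

# Setosa (target 0) versus the other two species combined.
setosa_acc = linear_train_accuracy(X_sepal, (y_all == 0).astype(int))

# Versicolor (target 1) versus virginica (target 2) only.
mask = y_all != 0
vv_acc = linear_train_accuracy(X_sepal[mask], y_all[mask])

print(f'setosa vs rest:          {setosa_acc:.2f}')
print(f'versicolor vs virginica: {vv_acc:.2f}')
```

A markedly lower accuracy for the second pair is consistent with the overlap visible in the scatter plot.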

In [None]:
from sklearn.decomposition import PCA
# project the 4 features onto the first three principal components
X_reduced = PCA(n_components=3).fit_transform(iris.data)
# create a fresh figure instead of reusing figure number 1
fig2 = plt.figure(figsize=(8,6))
ax2 = fig2.add_subplot(111, projection='3d', elev=-150, azim=110)
ax2.scatter(X_reduced[:,0], X_reduced[:,1], X_reduced[:,2], c=iris.target, s=40)
ax2.set(title="First three PCA dimensions",
        xlabel="1st Eigenvector", ylabel="2nd Eigenvector", zlabel="3rd Eigenvector")
# hide tick labels; assigning to _ suppresses the cell's echoed output
_ = ax2.xaxis.set_ticklabels([]), ax2.yaxis.set_ticklabels([]), ax2.zaxis.set_ticklabels([])