<a href="https://colab.research.google.com/github/slyofzero/ML-algorithms-from-SCRATCH/blob/main/SVC_Hyperplane_Visualization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AIM -

To visualize SVC hyperplanes for the linear, rbf, and poly kernels.

---

In [1]:
# Importing all necessary modules.
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris, make_blobs
import itertools

from plotly.subplots import make_subplots
import plotly.express as px
import plotly.graph_objects as go

import warnings
warnings.filterwarnings("ignore")

Sorry to butt in on your fun, but before we get into SVC hyperplane visualization, we need to understand how SVCs work. If you don't understand that then the whole visualization part won't make much sense.

So suppose we have data that looks something like the scatterplot below.

In [3]:
# Creating random blobs of data.
features, target = make_blobs(n_samples = 1000, n_features = 2, centers = 2, random_state = 42)
target = pd.Series(target).replace({0:False, 1:True}).values

px.scatter(x = features[:, 0], y = features[:, 1], color = target)

In the scatterplot, you can see that there are two classes - True and False. Both of these classes have certain attributes and make a blob on their respective side of the graph. The blob for all **True** values is on the right, while the blob for all the **False** values is on the left.

An SVC first identifies the points of each class that lie nearest to the other class. In our case, it'll be these three points -

\

<center><img src = "https://drive.google.com/uc?export=view&id=1oNh9oewaYMqfdoHMlcNJ_X4yvjAWDW-J" width = 70%></center>

\

<center><img src = "https://drive.google.com/uc?export=view&id=1zA0BClR3qrBUrL0EBJN9-WlXye3Y2TmP" width = 70%></center>

\

These three points are called the **Support Vectors** and are used to create a **hyperplane**. Now what's a hyperplane? In simple words, a hyperplane is just a line or a plane that helps us separate two classes. Any point on one side of the hyperplane belongs to one class, while any point on the other side belongs to the other class.

A hyperplane acts as the decision boundary of the model. With two features it is a line, with three features it is a plane, and with more features it becomes a higher-dimensional surface. To create a hyperplane, we find a line that passes between the Support Vectors of the two classes and maximise the distance between this line and the Support Vectors. This distance is called the **margin** or **margin width**.

\

<center><img src = "https://drive.google.com/uc?export=view&id=1RmmwKLMxz_0LOHgOFUUBDybu6BPnzFnF" width = 70%></center>

\
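As a rough sketch of the idea above, the snippet below fits a linear SVC on two blobs and reads the Support Vectors and the margin off the fitted model. It assumes scikit-learn's `SVC` with its default settings, and it reports the margin as the full gap between the two margin lines, `2 / ||w||`, which is twice the distance from the hyperplane to the nearest points. The variable names are only illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two blobs of points, similar to the scatterplot shown earlier.
X, y = make_blobs(n_samples=1000, n_features=2, centers=2, random_state=42)

# A linear kernel gives a straight-line hyperplane: w . x + b = 0.
clf = SVC(kernel="linear").fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# Full gap between the two margin lines.
margin_width = 2 / np.linalg.norm(w)

# Distance from every Support Vector to the hyperplane.
distances = np.abs(clf.support_vectors_ @ w + b) / np.linalg.norm(w)

print("Support Vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
print("Support Vector distances:", distances)
```

For well separated blobs, the printed distances should come out close to half the margin width, since the Support Vectors sit on the edges of the margin.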

The hyperplane can take different shapes depending on the kernel we choose. A linear kernel model would have a different hyperplane from an rbf kernel model even if both were trained on the same data. This is the main reason why you'll see different accuracy scores depending on the kernel you choose.
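A quick way to see this for yourself is to train one model per kernel on the same split and compare the test scores. The sketch below assumes scikit-learn's built-in Iris data restricted to two columns (sepal length and petal length) and default `SVC` hyperparameters; the exact scores depend on the split, so none are quoted here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, [0, 2]]  # keep sepal length and petal length only

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Same data, same split, only the kernel changes.
scores = {}
for kernel in ["linear", "rbf", "poly"]:
    model = SVC(kernel=kernel).fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)
    print(f"{kernel:>6}: test accuracy = {scores[kernel]:.3f}")
```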

To learn more about the theory behind SVMs check out this [StatQuest](https://youtu.be/efR1C6CvhmE) and [this article](https://www.ml-concepts.com/2022/01/23/3-svm-support-vector-machine/) from ml-concepts.com.

---

Now that the boring stuff is done, let's move on to Hyperplane Visualization. For this activity, I'll be using the [IRIS Dataset](https://www.kaggle.com/datasets/uciml/iris).

In [4]:
# Loading the dataset.
data = load_iris()

features_df = pd.DataFrame(data.data, columns = data.feature_names)
target = pd.DataFrame(data.target, columns = ["label"])
# Map each numeric label (0, 1, 2) to its species name.
target["species"] = target["label"].map(dict(enumerate(data.target_names)))

df = pd.concat([features_df, target], axis = 1)
df.head()

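| | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) | label | species |
|---|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | 0 | setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | 0 | setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | 0 | setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | 0 | setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | 0 | setosa |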


In [5]:
# Getting the information for each column.
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   sepal length (cm)  150 non-null    float64
 1   sepal width (cm)   150 non-null    float64
 2   petal length (cm)  150 non-null    float64
 3   petal width (cm)   150 non-null    float64
 4   label              150 non-null    int64  
 5   species            150 non-null    object 
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB


Okay, so the data has no null values, which makes the task a bit easier.

Let's plot the scatter plots for features which are related to one another. For example - Sepal Length and Sepal Width are related to each other because both are Sepal attributes, Sepal Length and Petal Length are related to each other because both are Length attributes, and so on.

In [6]:
# Plotting the feature distribution for each flower species.
plots_xy = [("sepal length (cm)", "petal length (cm)"), ("sepal width (cm)", "petal width (cm)"), ("sepal length (cm)", "sepal width (cm)"), ("petal length (cm)", "petal width (cm)")]

fig = make_subplots(rows = 2, cols = 2, subplot_titles = tuple(map(lambda item: f"{item[0].title()} Vs {item[1].title()}", plots_xy)))
plot_coords = list(itertools.product([1,2], repeat = 2))

for index, plot in enumerate(plots_xy):
  fig.add_traces(px.scatter(data_frame = df, x = plot[0], y = plot[1], color = "species").data, rows = plot_coords[index][0], cols = plot_coords[index][1])

fig.update_layout(width = 1200, height = 900, showlegend = False)
fig.show()

From the graph above, we can see that classifying the flower species would be much easier with the "Sepal Length Vs Petal Length" and the "Petal Length Vs Petal Width" graphs.

Now let's make a function that creates an SVC model for us, trained on whichever pair of attributes we choose.

In [7]:
# Importing the necessary modules.
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

In [8]:
# Making the function to create SVC models.
def make_model(df, feature1, feature2, kernel):
  # Use the DataFrame passed in (not the global variables) and keep only the two chosen features.
  X_train, X_test, y_train, y_test = train_test_split(df[[feature1, feature2]], df["label"], test_size = 0.3, random_state = 42)

  # Train an SVC with the requested kernel on the training split.
  model = SVC(kernel = kernel)
  model.fit(X_train, y_train)

  return model

Hyperplanes can be higher dimensional, and with non-linear kernels the boundary they trace on a 2D plot isn't always a straight line. Because of this, we'll be using `contour plots` to plot these hyperplanes. Let's create another function that takes the DataFrame, the names of the x and y columns, and the model as parameters and draws the contour plot for them.

In [9]:
def plot_hyperplanes(df, x_name, y_name, model):
  x, y = df[x_name], df[y_name]

  pred_input = np.c_[x, y]

  z =  model.predict(pred_input)

  fig = go.Figure(
      data = go.Contour(x = x, y = y, z = z, colorscale = "viridis")
  )

  scatter = px.scatter(data_frame = df, x = x, y = y, color = "species")

  scatter.update_traces(marker = {"line":{"width":1, "color":"black"}})

  fig.add_traces(
      scatter.data
  )

  fig.update_layout(height = 700, width = 1150, title = f"{x_name.title()} Vs {y_name.title()}", showlegend = False)

  fig.show()

test_model = make_model(df = df, feature1 = "sepal length (cm)", feature2 = "petal length (cm)", kernel = "linear")
plot_hyperplanes(df = df, x_name = "sepal length (cm)", y_name = "petal length (cm)", model = test_model)

The plot seems to be separating the classes pretty nicely, but the sizing of the plot is a bit of a problem. To fix this, I'll use a `meshgrid`.

A meshgrid takes in two series and makes a grid of coordinates out of them, that's all. Look at the code below for an example.

In [10]:
# Meshgrid Example.
a = [1,2]
b = [1,2,3]

mesh_a, mesh_b = np.meshgrid(a, b)

print(mesh_a)
print(mesh_b)

[[1 2]
 [1 2]
 [1 2]]
[[1 1]
 [2 2]
 [3 3]]
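To turn those two grids into actual (a, b) coordinate pairs, one option is to flatten both and stack them column-wise. This is a small sketch using `ravel` and `np.c_`, reusing the same toy lists as the example above.

```python
import numpy as np

a = [1, 2]
b = [1, 2, 3]
mesh_a, mesh_b = np.meshgrid(a, b)

# Flatten both grids and pair them up: one (a, b) row per grid point.
coords = np.c_[mesh_a.ravel(), mesh_b.ravel()]
print(coords)
# [[1 1]
#  [2 1]
#  [1 2]
#  [2 2]
#  [1 3]
#  [2 3]]
```

Every combination of a value from `a` and a value from `b` shows up exactly once, which is the kind of input a model's `predict` method can take to cover a whole plotting area.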


In [None]:
def plot_hyperplanes(df, x_name, y_name, model):
  x, y = df[x_name], df[y_name]

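  # Evenly spaced values that cover the data range with a little padding.
  x_series = np.arange(x.min() - 1, x.max() + 1, 0.2)
  y_series = np.arange(y.min() - 1, y.max() + 1, 0.2)

  # Build the grid and flatten it into (x, y) pairs for the model.
  mesh_x, mesh_y = np.meshgrid(x_series, y_series)
  pred_input = np.c_[mesh_x.ravel(), mesh_y.ravel()]

  # Predict a class for every grid point, then restore the grid shape.
  z = model.predict(pred_input).reshape(mesh_x.shape)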

  fig = go.Figure(
      data = go.Contour(x = x_series, y = y_series, z = z, colorscale = "viridis")
  )

  scatter = px.scatter(data_frame = df, x = x, y = y, color = "species")

  scatter.update_traces(marker = {"line":{"width":1, "color":"black"}})

  fig.add_traces(
      scatter.data
  )

  fig.update_layout(height = 700, width = 1150, title = f"{x_name.title()} Vs {y_name.title()}", showlegend = False)

  return fig
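The reshape step in the function above is easy to get wrong, so here is a small standalone check of it. It is only a sketch: it assumes scikit-learn's built-in Iris data with two columns and an rbf `SVC` trained on all rows, and the cell indices `i` and `j` are arbitrary picks. The point is that rows of `z` follow `y_series` and columns follow `x_series`, which is the layout `go.Contour` expects by default.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X = X[:, [0, 2]]  # sepal length and petal length

model = SVC(kernel="rbf").fit(X, y)

# Same grid construction as in plot_hyperplanes.
x_series = np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.2)
y_series = np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.2)
mesh_x, mesh_y = np.meshgrid(x_series, y_series)

z = model.predict(np.c_[mesh_x.ravel(), mesh_y.ravel()]).reshape(mesh_x.shape)

# z has one row per y value and one column per x value.
print(z.shape, (len(y_series), len(x_series)))

# Spot check: cell (i, j) holds the prediction for (x_series[j], y_series[i]).
i, j = 3, 5
direct = model.predict([[x_series[j], y_series[i]]])[0]
print(z[i, j] == direct)
```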

In [None]:
kernels = ["linear", "rbf", "poly"]

for x, y in plots_xy:
  fig = make_subplots(rows = 1, cols = 3, subplot_titles = tuple(map(lambda kernel:f"{kernel.title()} kernel", kernels)))
  for col, kernel in enumerate(kernels):
    svc_model = make_model(df = df, feature1 = x, feature2 = y, kernel = kernel)
    cp = plot_hyperplanes(df = df, x_name = x, y_name = y, model = svc_model)
    
    fig.add_traces(data = cp.data, rows = 1, cols = col + 1)

  fig.update_layout(height = 400, width = 1150, showlegend = False, title = f"{x.title()} Vs {y.title()}")
  fig.show()

In [None]:
for x, y in plots_xy:
  svc_lin = make_model(df = df, feature1 = x, feature2 = y, kernel = "linear")
  fig = plot_hyperplanes(df = df, x_name = x, y_name = y, model = svc_lin)
  fig.show()

In [None]:
for x, y in plots_xy:
  svc_rbf = make_model(df = df, feature1 = x, feature2 = y, kernel = "rbf")
  fig = plot_hyperplanes(df = df, x_name = x, y_name = y, model = svc_rbf)
  fig.show()

In [None]:
for x, y in plots_xy:
  svc_poly = make_model(df = df, feature1 = x, feature2 = y, kernel = "poly")
  fig = plot_hyperplanes(df = df, x_name = x, y_name = y, model = svc_poly)
  fig.show()