<h2 style="text-align:center;">Support Vector Machine (SVM) - Basic Classification</h2>

Support Vector Machine (SVM) is a **supervised machine learning algorithm** used for both classification and regression.  
It works by finding the **maximum-margin hyperplane**: the decision boundary that separates the classes while staying as far as possible from the nearest training points of each class.
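
A compact way to state this, using standard textbook notation that does not appear elsewhere in this notebook (weight vector $w$, offset $b$, training points $x_i$ with labels $y_i \in \{-1, +1\}$), is the hard-margin problem for linearly separable data:

$$
\min_{w,\,b} \ \tfrac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i \left( w^\top x_i + b \right) \ge 1 \quad \text{for all } i
$$

The hyperplane is $w^\top x + b = 0$ and the margin width is $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin. scikit-learn's `SVC` solves a soft-margin variant in which the parameter `C` controls how heavily margin violations are penalized.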

**Key Points about SVM:**
- **Type**: Supervised Classification Algorithm (can also be used for regression → SVR)
- **Goal**: Maximize the margin between classes
- **Decision Boundary**: Defined by the support vectors, the training points that lie closest to the boundary
- **Kernel Trick**: Can handle non-linear data using kernels (linear, polynomial, RBF, sigmoid)
- **Use Cases**: 
  - Spam detection
  - Image classification
  - Face recognition
  - Bioinformatics (e.g., cancer classification)
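
The kernel point in the list above can be illustrated with a small sketch. It assumes scikit-learn's synthetic `make_circles` data (not this notebook's dataset) and compares a linear kernel with an RBF kernel on two concentric rings, which no straight line can separate. The names `X_demo`, `y_demo`, and `kernel_scores` are illustrative only.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, illustrative data: two concentric rings (not linearly separable)
X_demo, y_demo = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

# Fit one SVC per kernel and record its accuracy on the held-out rows
kernel_scores = {}
for kernel in ("linear", "rbf"):
    model = SVC(kernel=kernel).fit(Xtr, ytr)
    kernel_scores[kernel] = model.score(Xte, yte)
    print(f"{kernel:>6} kernel test accuracy: {kernel_scores[kernel]:.2f}")
```

On data like this the RBF kernel should score far higher than the linear one, because it implicitly maps the points into a space where the rings become separable.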


In [1]:
# ➤ Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report


<h2 style="text-align:center;">Step 1: Load Dataset</h2>


In [3]:
df = pd.read_csv("./data/Social_Network_Ads.csv")
df.head()


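|   | User ID | Gender | Age | EstimatedSalary | Purchased |
|---|---------|--------|-----|-----------------|-----------|
| 0 | 15624510 | Male | 19 | 19000 | 0 |
| 1 | 15810944 | Male | 35 | 20000 | 0 |
| 2 | 15668575 | Female | 26 | 43000 | 0 |
| 3 | 15603246 | Female | 27 | 57000 | 0 |
| 4 | 15804002 | Male | 19 | 76000 | 0 |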


<h2 style="text-align:center;">Step 2: Feature Selection & Train-Test Split</h2>


In [4]:
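# Select columns by name so the code does not depend on column order
X = df[["Age", "EstimatedSalary"]].values   # features: Age & Estimated Salary
y = df["Purchased"].values                  # target: Purchased (0 or 1)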

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)
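
The test set reported later in this notebook holds 58 samples of class 0 and 22 of class 1, so the classes are imbalanced. One optional variation, shown here as a sketch on hypothetical labels rather than on the real dataset, is to pass `stratify` to `train_test_split` so that both splits keep the same class proportions. The 70/30 label mix below is an assumption chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 70% class 0, 30% class 1
y_demo = np.array([0] * 280 + [1] * 120)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature column

# stratify=y_demo keeps the class ratio the same in both splits
_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.20, random_state=0, stratify=y_demo
)

print("Train positive rate:", y_tr.mean())
print("Test positive rate: ", y_te.mean())
```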


<h2 style="text-align:center;">Step 3: Feature Scaling</h2>


In [5]:
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
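
To make the scaling step concrete, the sketch below standardizes a handful of rows. The values mirror the `df.head()` output above, but the train/test assignment is made up for illustration. It shows that the scaler's mean and standard deviation come from the training rows only and are then reused on the test row.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical split of a few (Age, EstimatedSalary) rows
X_train_demo = np.array([[19, 19000], [35, 20000], [26, 43000], [27, 57000]], dtype=float)
X_test_demo = np.array([[19, 76000]], dtype=float)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)  # learns mean and std from training rows only
X_test_scaled = scaler.transform(X_test_demo)        # reuses the training mean and std

print("Train column means:", X_train_scaled.mean(axis=0).round(6))
print("Train column stds: ", X_train_scaled.std(axis=0).round(6))
print("Scaled test row:   ", X_test_scaled.round(3))
```

Scaling matters for SVMs because the margin is measured in feature-space distance, so an unscaled salary column would dominate the age column.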


<h2 style="text-align:center;">Step 4: Train the SVM Model</h2>


In [6]:
classifier = SVC(kernel="linear", random_state=0)
classifier.fit(X_train, y_train)
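
With a linear kernel, the fitted model exposes the hyperplane directly through `coef_` and `intercept_`, and the support vectors through `support_vectors_` and `n_support_`. The sketch below uses synthetic `make_blobs` data as a stand-in for the scaled training set (an assumption, since the CSV is not bundled here) and checks that `decision_function` equals `w . x + b`.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class data standing in for the scaled training set
X_demo, y_demo = make_blobs(n_samples=100, centers=2, random_state=0)

clf = SVC(kernel="linear").fit(X_demo, y_demo)

# For a linear kernel the boundary is w . x + b = 0
w, b = clf.coef_[0], clf.intercept_[0]
manual_scores = X_demo @ w + b

print("Support vectors per class:", clf.n_support_)
print("w =", w, "b =", b)
print("Matches decision_function:",
      np.allclose(manual_scores, clf.decision_function(X_demo), atol=1e-6))
```

Only the support vectors determine `w` and `b`; removing any other training point and refitting would leave the boundary unchanged.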


<h2 style="text-align:center;">Step 5: Make Predictions</h2>


In [7]:
y_pred = classifier.predict(X_test)
y_pred[:10]  # show first 10 predictions


array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0], dtype=int64)

<h2 style="text-align:center;">Step 6: Model Evaluation</h2>


In [8]:
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Compare training and test accuracy to check for under- or overfitting
train_accuracy = classifier.score(X_train, y_train)
print("Training Accuracy:", train_accuracy)

test_accuracy = classifier.score(X_test, y_test)
print("Test Accuracy:", test_accuracy)

report = classification_report(y_test, y_pred)
print("Classification Report:\n", report)


Confusion Matrix:
 [[57  1]
 [ 6 16]]
Accuracy: 0.9125
Training Accuracy: 0.821875
Test Accuracy: 0.9125
Classification Report:
               precision    recall  f1-score   support

           0       0.90      0.98      0.94        58
           1       0.94      0.73      0.82        22

    accuracy                           0.91        80
   macro avg       0.92      0.86      0.88        80
weighted avg       0.91      0.91      0.91        80
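
The figures in the classification report follow directly from the confusion matrix. As a worked check, the sketch below recomputes accuracy and the class 1 precision, recall, and F1 score from the four counts printed above. The variable names are illustrative and chosen so they do not overwrite the notebook's own `cm` and `accuracy`.

```python
import numpy as np

# Confusion matrix reported above: rows are true classes, columns are predictions
cm_reported = np.array([[57, 1],
                        [6, 16]])
tn, fp, fn, tp = cm_reported.ravel()

acc = (tp + tn) / cm_reported.sum()   # correct predictions / all predictions
precision_1 = tp / (tp + fp)          # of predicted buyers, how many bought
recall_1 = tp / (tp + fn)             # of actual buyers, how many were found
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={acc:.4f} precision={precision_1:.2f} recall={recall_1:.2f} f1={f1_1:.2f}")
# prints: accuracy=0.9125 precision=0.94 recall=0.73 f1=0.82
```

The low recall for class 1 comes from the 6 buyers in the bottom-left cell that the model predicted as non-buyers.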



<h2 style="text-align:center;">Summary</h2>

In this notebook, we implemented **Support Vector Machine (SVM)** for binary classification 
using the **Social_Network_Ads** dataset.

✔ We imported the dataset and selected relevant features (Age, Estimated Salary).  
✔ Performed **train-test split** and applied **feature scaling**.  
✔ Trained an **SVM model with linear kernel**.  
✔ Evaluated model performance using **confusion matrix, accuracy, a training vs. test accuracy comparison, and classification report**.  

**Key Takeaway:**  
SVM is a powerful classification algorithm that performs well in high-dimensional spaces 
and can remain effective even when the number of features exceeds the number of samples.  
Its ability to use kernels makes it suitable for both linear and non-linear classification problems.  
On this two-feature dataset, the linear kernel reached 91% test accuracy, with a recall of 0.73 on the purchasing class.
