# SVM (Support Vector Machine) – Step-by-Step (Classification)
Predict **Pass / Fail** using an **SVM classifier**.

You'll learn:
- Why SVM finds the **best separating boundary** (max margin)
- Why **scaling** is important
- How to train, predict, and evaluate
- How to get probabilities (`probability=True`)
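To make the max-margin idea concrete, here is a minimal sketch on four hypothetical points (`X_toy`, `y_toy` are made up for illustration and are not the Pass/Fail data used later). For a linear SVM the boundary is `w·x + b = 0` and the margin width is `2 / ||w||`. A very large `C` is used here as an approximation of a hard-margin SVM.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy points on a line
X_toy = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
y_toy = np.array([0, 0, 1, 1])

# A very large C leaves almost no room for margin violations
clf = SVC(kernel='linear', C=1e6)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]                 # normal vector of the separating line
b = clf.intercept_[0]            # offset of the separating line
margin_width = 2.0 / np.linalg.norm(w)

print('w:', w, 'b:', b)
print('Margin width:', margin_width)
print('Support vectors:\n', clf.support_vectors_)
```

The closest points from opposite classes are `[1, 0]` and `[3, 0]`, so the widest possible margin is about 2 and only those two points should end up as support vectors.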

In [None]:
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

## 1) Create a small dataset

In [None]:
# Each row is one student: [hours, practice_tests]
X = np.array([
    [1, 0],
    [2, 0],
    [2, 1],
    [3, 1],
    [3, 2],
    [4, 2],
    [5, 2],
    [6, 3],
    [7, 3],
    [8, 4],
])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])  # 0=Fail, 1=Pass

print('X shape:', X.shape)
print('y shape:', y.shape)
print('First 5 rows of X:\n', X[:5])
print('First 5 labels:', y[:5])

## 2) Train/Test split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)
print('Train size:', len(X_train))
print('Test size:', len(X_test))
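A quick way to see what `stratify=y` does is to count the classes on each side of the split. The sketch below uses placeholder features (`X_demo` is hypothetical) with the same label pattern as the dataset, 4 Fail and 6 Pass.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder features; only the labels matter for this check
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

# bincount returns [number of 0s, number of 1s]
print('Train class counts:', np.bincount(y_tr, minlength=2))
print('Test class counts: ', np.bincount(y_te, minlength=2))
```

With only 10 rows, the test set has just 3 samples, so any score computed on it moves in steps of one third.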

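## 3) Scale features (important for SVM)

The RBF kernel is based on distances between points, so a feature with a larger numeric range would dominate the result. The scaler is fitted on the training set only and then applied to the test set, so no information from the test set leaks into training.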

In [None]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print('Before scaling (first train row):', X_train[0])
print('After scaling  (first train row):', X_train_scaled[0])
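As a sketch of what `StandardScaler` computes, the example below reproduces it by hand on five rows taken from the dataset (`X_demo` and `demo_scaler` are names introduced only for this illustration). Each column is shifted by its mean and divided by its population standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Five rows copied from the dataset above
X_demo = np.array([[1, 0], [2, 1], [3, 2], [6, 3], [8, 4]], dtype=float)

demo_scaler = StandardScaler()
X_demo_scaled = demo_scaler.fit_transform(X_demo)

# Same transformation written out manually: (x - mean) / std, per column
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

print('Learned means:   ', demo_scaler.mean_)
print('Learned std devs:', demo_scaler.scale_)
print('Scaled column means:', X_demo_scaled.mean(axis=0))
print('Scaled column stds: ', X_demo_scaled.std(axis=0))
print('Matches manual formula:', np.allclose(X_demo_scaled, X_manual))
```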

## 4) Create the SVM model (RBF kernel)

In [None]:
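svm = SVC(
    kernel='rbf',        # non-linear boundary via the radial basis function kernel
    C=1.0,               # regularization: smaller C tolerates more margin violations
    gamma='scale',       # kernel width: 1 / (n_features * X.var())
    probability=True,    # enables predict_proba (adds an internal cross-validation step)
    random_state=42      # seeds the shuffling used for the probability estimates
)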
svm
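To show what `gamma='scale'` resolves to, the sketch below computes `1 / (n_features * X.var())` by hand and checks that a model given this explicit value behaves the same. It refits on all 10 rows of the dataset purely for illustration. The names `X_demo`, `gamma_manual`, `svm_scale`, and `svm_manual` are introduced here and are not part of the notebook.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo = np.array([[1, 0], [2, 0], [2, 1], [3, 1], [3, 2],
                   [4, 2], [5, 2], [6, 3], [7, 3], [8, 4]], dtype=float)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

X_demo_scaled = StandardScaler().fit_transform(X_demo)

# X.var() is the variance over all entries of the (scaled) matrix
n_features = X_demo_scaled.shape[1]
gamma_manual = 1.0 / (n_features * X_demo_scaled.var())

svm_scale = SVC(kernel='rbf', C=1.0, gamma='scale').fit(X_demo_scaled, y_demo)
svm_manual = SVC(kernel='rbf', C=1.0, gamma=gamma_manual).fit(X_demo_scaled, y_demo)

same_scores = np.allclose(
    svm_scale.decision_function(X_demo_scaled),
    svm_manual.decision_function(X_demo_scaled),
)
print('gamma computed by hand:', gamma_manual)
print('Same decision scores:', same_scores)
```

After standardization every column has variance 1, so with two features the value comes out at about 0.5.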

## 5) Train the model

In [None]:
svm.fit(X_train_scaled, y_train)
print('SVM trained!')

## 6) Predict and evaluate

In [None]:
y_pred = svm.predict(X_test_scaled)
y_prob = svm.predict_proba(X_test_scaled)

print('Predictions:', y_pred)
print('Actual:     ', y_test)

print('\nAccuracy:', accuracy_score(y_test, y_pred))
print('\nConfusion Matrix:\n', confusion_matrix(y_test, y_pred))
print('\nClassification Report:\n', classification_report(y_test, y_pred))

# Columns follow svm.classes_, i.e. [P(Fail), P(Pass)]; the test set has only 3 rows
print('\nProbabilities [P(Fail), P(Pass)] per test row:\n', y_prob)
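The numbers in the classification report can be derived directly from the confusion matrix. The sketch below uses hypothetical labels (`y_true_demo`, `y_pred_demo`), chosen only to keep the arithmetic easy to follow. They are not the model's real output.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical labels for illustration (0=Fail, 1=Pass)
y_true_demo = np.array([0, 1, 1, 0, 1])
y_pred_demo = np.array([0, 1, 0, 0, 1])

# With labels=[0, 1], rows are actual classes and columns are predicted classes,
# so ravel() yields: true negatives, false positives, false negatives, true positives
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted Pass, how many really passed
recall = tp / (tp + fn)      # of the real Pass, how many were found

print('TN, FP, FN, TP:', tn, fp, fn, tp)
print('Accuracy:', accuracy)
print('Precision (Pass):', precision, '| sklearn:', precision_score(y_true_demo, y_pred_demo))
print('Recall (Pass):   ', recall, '| sklearn:', recall_score(y_true_demo, y_pred_demo))
```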

## 7) Try your own input (new student)

In [None]:
new_student = np.array([[4, 1]])
new_student_scaled = scaler.transform(new_student)

pred = svm.predict(new_student_scaled)[0]
proba = svm.predict_proba(new_student_scaled)[0]

print('New student [hours, practice_tests]:', new_student[0])
print('Predicted class (0=Fail, 1=Pass):', pred)
print('Probabilities [P(Fail), P(Pass)]:', proba)
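Besides probabilities, a fitted `SVC` also exposes `decision_function`, the signed score that `predict` is based on. For two classes, a positive score maps to the second class (Pass here) and a negative score to the first. The sketch below refits on all 10 rows purely for illustration. `demo_scaler` and `svm_demo` are names introduced here. On very small datasets, the probabilities from `predict_proba` can disagree with `predict`, so the sign of the score is the more direct check.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo = np.array([[1, 0], [2, 0], [2, 1], [3, 1], [3, 2],
                   [4, 2], [5, 2], [6, 3], [7, 3], [8, 4]], dtype=float)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

demo_scaler = StandardScaler().fit(X_demo)
svm_demo = SVC(kernel='rbf', C=1.0, gamma='scale')
svm_demo.fit(demo_scaler.transform(X_demo), y_demo)

# Same new student as above: [hours, practice_tests]
new_scaled = demo_scaler.transform(np.array([[4.0, 1.0]]))

score = svm_demo.decision_function(new_scaled)[0]   # signed score relative to the boundary
label = svm_demo.predict(new_scaled)[0]

print('Decision score:', score)
print('Predicted class (0=Fail, 1=Pass):', label)
print('Sign of score agrees with prediction:', label == int(score > 0))
```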

## 8) Optional: compare linear vs RBF kernels

In [None]:
for kernel in ['linear', 'rbf']:
    m = SVC(kernel=kernel, C=1.0, gamma='scale', probability=True, random_state=42)
    m.fit(X_train_scaled, y_train)
    acc = accuracy_score(y_test, m.predict(X_test_scaled))
    print(f'kernel={kernel} -> accuracy={acc:.3f}')
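Because the test set here has only 3 rows, a single accuracy number per kernel is very coarse. One possible alternative, sketched below, is stratified cross-validation with the scaler and the model wrapped in a pipeline so that scaling is refitted inside every fold. The choice of 4 folds is an assumption, made because the smaller class has only 4 samples. `X_demo`, `cv`, and `cv_results` are names introduced for this example.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo = np.array([[1, 0], [2, 0], [2, 1], [3, 1], [3, 2],
                   [4, 2], [5, 2], [6, 3], [7, 3], [8, 4]], dtype=float)
y_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# 4 folds: the Fail class has 4 samples, so each fold can hold one of them
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)

cv_results = {}
for kernel in ['linear', 'rbf']:
    # The pipeline fits the scaler on each training fold only
    pipe = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0, gamma='scale'))
    scores = cross_val_score(pipe, X_demo, y_demo, cv=cv)
    cv_results[kernel] = scores
    print(f'kernel={kernel} -> fold accuracies={scores}, mean={scores.mean():.3f}')
```

Even with cross-validation, 10 samples are too few to draw firm conclusions about which kernel is better. The point is the evaluation pattern, not the numbers.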