# SVM

## What is SVM?

Support Vector Machine (SVM) is a supervised machine learning algorithm mainly used for classification (it can also be used for regression).

Its main idea is to separate the classes with the decision boundary that leaves the widest possible gap (the margin) between them.

## Hyperplane

A hyperplane is the decision boundary that separates data points of different classes:

- In 2D → a line
- In 3D → a plane
- In higher dimensions → a hyperplane
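To make this concrete, a hyperplane can be written as `w · x + b = 0`, and the sign of `w · x + b` tells which side a point falls on. The following is a minimal sketch of that idea; the values of `w` and `b` and the helper `side` are made up for illustration and do not come from a trained model.

```python
import numpy as np

# Hypothetical hyperplane parameters in 2D (illustrative values only).
w = np.array([1.0, -1.0])   # normal vector of the hyperplane
b = 0.5                     # offset term

def side(x):
    """Return +1 or -1 depending on which side of w . x + b = 0 the point x lies."""
    return 1 if np.dot(w, x) + b >= 0 else -1

points = np.array([[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
labels = [side(p) for p in points]
print(labels)
```

In 2D this hyperplane is the line `x1 - x2 + 0.5 = 0`; a trained linear SVM learns `w` and `b` from the data instead of having them fixed by hand.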

## Margin

Margin is the distance between the hyperplane and the nearest data points from each class.

SVM tries to maximize this margin.

A larger margin usually means better generalization and less overfitting.
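For a linear SVM with weight vector `w`, the distance from the hyperplane to the nearest points is `1 / ||w||`, so the full gap between the two classes is `2 / ||w||`. The sketch below checks this on a tiny hand-made dataset (`X_toy`, `y_toy` are illustrative, not part of the notebook's data). A very large `C` is used here as an assumption to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: class 0 lies on the line y = 0, class 1 on the line y = 2,
# so the widest separating gap should be about 2 units.
X_toy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.0, 2.0]])
y_toy = np.array([0, 0, 1, 1])

# Large C approximates a hard-margin SVM (no tolerated violations).
clf = SVC(kernel="linear", C=1e6).fit(X_toy, y_toy)

w = clf.coef_[0]                           # learned normal vector
margin_width = 2.0 / np.linalg.norm(w)     # full gap between the classes
print("w:", w, "margin width:", margin_width)
```

The printed width should come out close to 2, matching the geometric gap between the two rows of points.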

## Support Vectors

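Support vectors are the training points closest to the hyperplane; they lie on or inside the margin.

These points alone determine the position of the hyperplane.

Removing or moving a support vector generally changes the decision boundary, whereas removing any other training point leaves it unchanged.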
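A fitted scikit-learn `SVC` exposes its support vectors through the `support_` (indices) and `support_vectors_` (coordinates) attributes. The sketch below uses a small made-up dataset to illustrate that refitting on the support vectors alone gives essentially the same boundary; the data and the variable names are assumptions for the example.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data: (0, 0) and (2, 2) are the closest pair across the two classes;
# (-2, -2) and (4, 4) lie far from the boundary.
X_toy = np.array([[0.0, 0.0], [-2.0, -2.0], [2.0, 2.0], [4.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])

clf_full = SVC(kernel="linear", C=1e6).fit(X_toy, y_toy)
sv_idx = clf_full.support_                 # indices of the support vectors
print("Support vector indices:", sv_idx)
print("Support vectors:\n", clf_full.support_vectors_)

# Refit using only the support vectors and compare the two boundaries.
clf_sv_only = SVC(kernel="linear", C=1e6).fit(X_toy[sv_idx], y_toy[sv_idx])
same_boundary = (
    np.allclose(clf_full.coef_, clf_sv_only.coef_, atol=1e-3)
    and np.allclose(clf_full.intercept_, clf_sv_only.intercept_, atol=1e-3)
)
print("Same boundary after dropping the other points:", same_boundary)
```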

## Kernel (Idea Only)

Sometimes data is not linearly separable.

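A kernel lets the SVM act as if the data had been mapped into a higher-dimensional space where separation becomes easier, without computing that mapping explicitly (the "kernel trick").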

Common kernels:

- linear
- rbf
- polynomial (`poly` in scikit-learn)
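The "higher dimension" idea can be checked by hand for a simple case. The sketch below uses a simplified degree-2 polynomial kernel `k(x, z) = (x · z)²` (an assumption for illustration; scikit-learn's `poly` kernel also has `gamma` and `coef0` terms) and shows that it equals an ordinary dot product after an explicit mapping `phi` from 2D into 3D.

```python
import numpy as np

def phi(x):
    """Explicit 3D feature map matching the kernel k(x, z) = (x . z)**2 for 2D inputs."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

def poly2_kernel(x, z):
    """Simplified degree-2 polynomial kernel, computed in the original 2D space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))   # dot product in the 3D feature space
implicit = poly2_kernel(x, z)       # same value, without ever building phi
print(explicit, implicit)
```

Both numbers agree, which is why the SVM never needs to construct the higher-dimensional features: the kernel value is all it uses.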

In [1]:
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix


In [2]:
# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target


In [3]:
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


In [4]:
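# Feature scaling: SVMs are sensitive to feature scale. Fit the scaler on the
# training set only, then apply the same transform to the test set.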
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
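The fit-on-train, transform-on-test pattern can also be expressed with a scikit-learn pipeline, which keeps the scaler and the classifier together so the test data never influences the scaling statistics. This is an illustrative alternative, not part of the original notebook; the names `pipe`, `X_tr`, `X_te`, `y_tr`, and `y_te` are chosen here for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same dataset and split settings as the notebook, starting from unscaled features.
X_raw, y_raw = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=42
)

# fit() scales using training statistics only; score() reuses them on the test set.
pipe = make_pipeline(StandardScaler(), SVC())
pipe.fit(X_tr, y_tr)
pipe_accuracy = pipe.score(X_te, y_te)
print("Pipeline accuracy:", pipe_accuracy)
```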


In [5]:
# Create SVM model (scikit-learn defaults: RBF kernel, C=1.0, gamma='scale')
model = SVC()


In [6]:
# Train model
model.fit(X_train, y_train)


In [7]:
# Prediction
y_pred = model.predict(X_test)


In [8]:
# Evaluation
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))


Accuracy: 0.9824561403508771
Confusion Matrix:
 [[41  2]
 [ 0 71]]
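The confusion matrix can be read by hand. In scikit-learn's convention, rows are true labels and columns are predicted labels, and in this dataset class 0 is malignant and class 1 is benign. The short sketch below recomputes the accuracy and per-class recall from the reported numbers; the variable names are chosen for this example.

```python
# Confusion matrix reported above: rows = true class, columns = predicted class.
cm = [[41, 2],    # true malignant: 41 correct, 2 predicted as benign
      [0, 71]]    # true benign:     0 predicted as malignant, 71 correct

total = sum(sum(row) for row in cm)
accuracy = (cm[0][0] + cm[1][1]) / total                # correct / all
recall_malignant = cm[0][0] / (cm[0][0] + cm[0][1])     # malignant cases found
recall_benign = cm[1][1] / (cm[1][0] + cm[1][1])        # benign cases found
precision_benign = cm[1][1] / (cm[0][1] + cm[1][1])     # benign predictions that are right

print("Test samples:", total)
print("Accuracy:", accuracy)
print("Recall (malignant):", recall_malignant)
print("Recall (benign):", recall_benign)
print("Precision (benign):", precision_benign)
```

The two errors are both malignant cases predicted as benign, which is the costlier kind of mistake for this task even though overall accuracy is high.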


### Linear and RBF kernel

In [9]:
svm_linear = SVC(kernel='linear')
svm_linear.fit(X_train, y_train)
pred_linear = svm_linear.predict(X_test)

print("Linear Kernel Accuracy:", accuracy_score(y_test, pred_linear))


Linear Kernel Accuracy: 0.956140350877193


In [10]:
# RBF is SVC's default kernel, so this is the same model as `model` above
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(X_train, y_train)
pred_rbf = svm_rbf.predict(X_test)

print("RBF Kernel Accuracy:", accuracy_score(y_test, pred_rbf))


RBF Kernel Accuracy: 0.9824561403508771
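On a test set of 114 samples, the gap between the two accuracies corresponds to three samples, so a single split is a noisy basis for choosing a kernel. One possible way to get a steadier comparison is cross-validation; the sketch below is an assumption about how that could be done (5 folds, scaling inside a pipeline), not a step from the original notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_breast_cancer(return_X_y=True)

cv_means = {}
for kernel in ("linear", "rbf"):
    # Scaling sits inside the pipeline so each fold is scaled on its own training part.
    pipe = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(pipe, X_all, y_all, cv=5)
    cv_means[kernel] = scores.mean()
    print(f"{kernel}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```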
