# Support Vector Machine (SVM)

- Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for both classification and regression.
- However, it is mostly used in classification problems.
- In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features), with the value of each feature being the value of a particular coordinate.
- SVM finds the best possible boundary (called a hyperplane) that maximizes the margin between the two classes.

# Hyperplane

- A hyperplane is an (n−1)-dimensional flat (affine) subspace of an n-dimensional Euclidean space; it divides that space into two disconnected regions (half-spaces).
- In the context of Support Vector Machines (SVM), a hyperplane is used to separate data points of different classes as distinctly as possible.
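
The definition above can be written compactly. The symbols used here (a weight vector w, an offset b, and a decision function f) are notation assumed for illustration; they do not appear in the original text.

```latex
H = \{\, x \in \mathbb{R}^n : w^\top x + b = 0 \,\}, \qquad w \in \mathbb{R}^n,\ w \neq 0,\ b \in \mathbb{R}

f(x) = \operatorname{sign}\!\left(w^\top x + b\right)
```

Points with w^T x + b > 0 fall on one side of H and points with w^T x + b < 0 on the other, which is how a linear classifier assigns one of two classes.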

#### For example:

- In a 2-dimensional space (like a flat plane), a hyperplane is a line (1-dimensional).
- In a 3-dimensional space, a hyperplane is a plane (2-dimensional).
- In a 4-dimensional space, a hyperplane is a 3-dimensional object that we can't easily visualize.
- In SVM, the goal is to find the optimal hyperplane: the one that maximizes the margin between the two classes. The data points that lie closest to this hyperplane are called support vectors, and they determine the hyperplane's position.
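
To make the ideas of hyperplane, margin, and support vectors concrete, here is a minimal sketch on a made-up, linearly separable 2-D dataset. The data and the names `X_toy`, `y_toy`, and `clf` are invented for illustration; the very large `C` is an assumption used to approximate a hard margin.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D data (made up for illustration)
X_toy = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
                  [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations
clf = SVC(kernel="linear", C=1e6)
clf.fit(X_toy, y_toy)

w = clf.coef_[0]        # normal vector of the separating hyperplane (here, a line)
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margin boundaries

print("w =", w, "b =", b)
print("support vectors:")
print(clf.support_vectors_)
print("margin width:", margin_width)
```

Only the points listed in `support_vectors_` constrain the fitted line; removing any of the other points would probably leave the hyperplane unchanged.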

In [13]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.decomposition import PCA

## Load and Explore Dataset

In [14]:
# Load dataset
df = pd.read_csv("heart.csv")
df.head()

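| Unnamed: 0 | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 52 | 1 | 0 | 125 | 212 | 0 | 1 | 168 | 0 | 1.0 | 2 | 2 | 3 | 0 |
| 1 | 53 | 1 | 0 | 140 | 203 | 1 | 0 | 155 | 1 | 3.1 | 0 | 0 | 3 | 0 |
| 2 | 70 | 1 | 0 | 145 | 174 | 0 | 1 | 125 | 1 | 2.6 | 0 | 0 | 3 | 0 |
| 3 | 61 | 1 | 0 | 148 | 203 | 0 | 1 | 161 | 0 | 0.0 | 2 | 1 | 3 | 0 |
| 4 | 62 | 0 | 0 | 138 | 294 | 1 | 1 | 106 | 0 | 1.9 | 1 | 3 | 2 | 0 |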
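
Beyond `head()`, a few quick checks help confirm column types, missing values, and class balance before modelling. The sketch below re-types the five rows displayed above so that it runs without `heart.csv`; it assumes the first displayed column is just the row index, and the name `sample` is hypothetical.

```python
import pandas as pd

# The five rows displayed above, re-typed so the snippet runs without heart.csv.
# Assumption: the first displayed column is the row index, not a feature.
columns = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "target"]
rows = [
    [52, 1, 0, 125, 212, 0, 1, 168, 0, 1.0, 2, 2, 3, 0],
    [53, 1, 0, 140, 203, 1, 0, 155, 1, 3.1, 0, 0, 3, 0],
    [70, 1, 0, 145, 174, 0, 1, 125, 1, 2.6, 0, 0, 3, 0],
    [61, 1, 0, 148, 203, 0, 1, 161, 0, 0.0, 2, 1, 3, 0],
    [62, 0, 0, 138, 294, 1, 1, 106, 0, 1.9, 1, 3, 2, 0],
]
sample = pd.DataFrame(rows, columns=columns)

print(sample.dtypes)                              # data type of each column
missing_per_column = sample.isna().sum()          # count of missing values per column
print(missing_per_column)
class_counts = sample["target"].value_counts()    # how many rows per class
print(class_counts)
```

On the full dataset the same three checks would be run on `df`. If `df.columns` turns out to contain a real `Unnamed: 0` column (a saved row index), it would be passed to the model as a feature unless it is dropped first.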


## Preprocess the Data

In [15]:
# Handling missing values
df.dropna(inplace=True)

# Encode categorical variables
label_encoders = {}
for col in df.select_dtypes(include=['object']).columns:
    le = LabelEncoder()
    df[col] = le.fit_transform(df[col])
    label_encoders[col] = le

## Split Dataset into Training and Testing Sets

In [16]:
# Define features and target variable
X = df.drop(columns=['target'])
y = df['target']

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Standardize Features

In [17]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
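
Standardizing matters for SVMs because the kernels work on dot products and distances, so a feature with a large numeric range can dominate the others. The sketch below uses synthetic data (the arrays `train` and `test` and the name `demo_scaler` are invented) to show what fitting on the training rows only does to each split.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 3 features with mean around 50 and std around 10
rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

demo_scaler = StandardScaler()
train_scaled = demo_scaler.fit_transform(train)  # learns mean and std from the training rows only
test_scaled = demo_scaler.transform(test)        # reuses those training statistics

# Training columns end up with mean 0 and std 1 (up to floating-point error)
print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
# Test columns are only approximately standardized, which is expected
print(test_scaled.mean(axis=0), test_scaled.std(axis=0))
```

Calling `fit_transform` on the test set instead would let test-set statistics leak into preprocessing, which is why the cell above uses `transform` there.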

## Train and Evaluate SVM Models

In [18]:
# Train SVM with Linear Kernel
svm_linear = SVC(kernel='linear', C=1)
svm_linear.fit(X_train, y_train)
y_pred_linear = svm_linear.predict(X_test)

# Train SVM with Polynomial Kernel
svm_poly = SVC(kernel='poly', degree=3, C=1)
svm_poly.fit(X_train, y_train)
y_pred_poly = svm_poly.predict(X_test)

# Train SVM with RBF Kernel
svm_rbf = SVC(kernel='rbf', C=1, gamma='scale')
svm_rbf.fit(X_train, y_train)
y_pred_rbf = svm_rbf.predict(X_test)
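
The three kernels differ only in how they measure similarity between two samples. As a rough sketch, the code below computes each kernel matrix by hand on a small random array (`A` is made up) and compares it with scikit-learn's pairwise kernel functions. It assumes the `SVC` defaults `coef0=0` and `gamma='scale'`, where `'scale'` corresponds to `1 / (n_features * X.var())`.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel

# Small random array standing in for a scaled feature matrix (4 samples, 3 features)
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))

# Value that gamma='scale' would take for this array
gamma = 1.0 / (A.shape[1] * A.var())

# Linear kernel: plain dot products
K_linear = A @ A.T
# Polynomial kernel: (gamma * <x, z> + coef0) ** degree, with degree=3 and coef0=0
K_poly = (gamma * (A @ A.T) + 0.0) ** 3
# RBF kernel: exp(-gamma * squared Euclidean distance)
sq_dists = ((A[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1)
K_rbf = np.exp(-gamma * sq_dists)

linear_ok = np.allclose(K_linear, linear_kernel(A))
poly_ok = np.allclose(K_poly, polynomial_kernel(A, degree=3, gamma=gamma, coef0=0.0))
rbf_ok = np.allclose(K_rbf, rbf_kernel(A, gamma=gamma))
print(linear_ok, poly_ok, rbf_ok)
```

The linear kernel can only produce a flat hyperplane in the original feature space, while the polynomial and RBF kernels correspond to hyperplanes in an implicit higher-dimensional space, which shows up as a curved boundary in the original features.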

## Evaluate Model Performance

In [19]:
# Print accuracy scores
print(f'Linear Kernel Accuracy : {accuracy_score(y_test, y_pred_linear) : .2f}')
print(f'Polynomial Kernel Accuracy : {accuracy_score(y_test, y_pred_poly) : .2f}')
print(f'RBF Kernel Accuracy : {accuracy_score(y_test, y_pred_rbf) : .2f}')

Linear Kernel Accuracy :  0.81
Polynomial Kernel Accuracy :  0.91
RBF Kernel Accuracy :  0.89
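
Accuracy alone does not show which class the errors fall on. The imports at the top include `classification_report` and `confusion_matrix`, so one possible next step is sketched below. It runs on synthetic data from `make_classification` rather than `heart.csv`, so its numbers say nothing about the heart dataset; all names ending in `_demo`, `_tr`, or `_te` are invented for the example.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary classification data with 13 features (stand-in only)
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

demo_model = SVC(kernel="rbf", C=1, gamma="scale").fit(X_tr, y_tr)
y_hat = demo_model.predict(X_te)

cm = confusion_matrix(y_te, y_hat)   # rows: true class, columns: predicted class
print(cm)
print(classification_report(y_te, y_hat))  # per-class precision, recall, F1
```

In the notebook itself, the same two calls could be applied to `y_test` together with `y_pred_linear`, `y_pred_poly`, or `y_pred_rbf`.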
