A support vector machine (SVM) is a supervised machine learning algorithm used for classification and regression tasks.
For classification, it finds the hyperplane that separates the classes with the largest margin, that is, the largest distance between the hyperplane and the nearest training points of each class. Those nearest points are the support vectors, and they alone determine where the boundary lies. A wide margin tends to improve generalization and makes the boundary less sensitive to noise in the data. SVMs can also learn non-linear decision boundaries through kernel functions, which implicitly map the inputs into a higher-dimensional space where a linear separation may be possible. For regression (SVR), the same machinery fits a function that keeps most training points within a tolerance band (epsilon) around the prediction.
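The margin and kernel ideas above can be illustrated with a small sketch. It is not part of the original notebook: the six toy points, the `make_circles` settings, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Part 1: maximum margin on a tiny, linearly separable toy set (made-up points).
X_toy = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]], dtype=float)
y_toy = np.array([0, 0, 0, 1, 1, 1])

linear_svm = SVC(kernel="linear", C=1e3)  # a large C approximates a hard margin
linear_svm.fit(X_toy, y_toy)

w = linear_svm.coef_[0]                  # normal vector of the separating hyperplane
margin_width = 2.0 / np.linalg.norm(w)   # distance between the two margin lines
print("Support vectors:\n", linear_svm.support_vectors_)
print("Margin width:", margin_width)

# Part 2: concentric circles cannot be split by a straight line,
# but an RBF kernel can separate them.
X_circ, y_circ = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)
linear_acc = SVC(kernel="linear").fit(X_circ, y_circ).score(X_circ, y_circ)
rbf_acc = SVC(kernel="rbf").fit(X_circ, y_circ).score(X_circ, y_circ)
print("Linear kernel training accuracy:", linear_acc)
print("RBF kernel training accuracy:   ", rbf_acc)
```

For the toy points, the gap between the two classes is 5/√2 ≈ 3.54 by plain geometry, so the fitted margin width should come out close to that value. On the circles data, the linear kernel should do far worse than the RBF kernel, which is the point of the kernel trick.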

In [4]:
import seaborn as sns
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.metrics import (
    confusion_matrix,
    classification_report,
    mean_absolute_error,
    mean_squared_error,
    r2_score
)

# SVM CLASSIFIER

In [5]:

# Load dataset (Iris dataset)
df = sns.load_dataset('iris')
print("First 5 rows of Iris dataset:\n", df.head())

# Features (X) and target (y)
X = df.drop('species', axis=1)
y = df['species']

# Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# SVM Classifier with RBF Kernel
model = SVC(kernel='rbf')
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

# Evaluation
print("\n===== SVM Classification Results (Iris) =====")
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

First 5 rows of Iris dataset:
    sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa

===== SVM Classification Results (Iris) =====
Confusion Matrix:
 [[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
Classification Report:
               precision    recall  f1-score   support

      setosa       1.00      1.00      1.00        10
  versicolor       1.00      1.00      1.00         9
   virginica       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30
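A perfect score on a 30-sample test set is encouraging, but it rests on very few examples. One way to get a more stable estimate is k-fold cross-validation, sketched below. This is an illustrative addition, not the notebook's own method: it assumes scikit-learn's bundled copy of Iris (`load_iris`) instead of the seaborn copy so that it runs without a download, and it adds a `StandardScaler` step that the cell above does not use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled Iris data stands in for sns.load_dataset('iris').
X_iris, y_iris = load_iris(return_X_y=True)

# Scaling lives inside the pipeline so each fold is scaled using its own training part only.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Stratified folds keep the three species balanced in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(clf, X_iris, y_iris, cv=cv)

print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```

The spread of the fold accuracies gives a rough sense of how much a single train/test split can vary.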



# SVM REGRESSOR

In [6]:
# Load dataset (Tips dataset)
df = sns.load_dataset('tips')
print("\nFirst 5 rows of Tips dataset:\n", df.head())

# Features (X) and target (y)
X = df.drop('tip', axis=1)
y = df['tip']

# Define numeric & categorical features
numeric_features = ['total_bill', 'size']
categorical_features = ['sex', 'smoker', 'day', 'time']

# Preprocessing (Scaling + One-Hot Encoding)
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), numeric_features),
    ('cat', OneHotEncoder(), categorical_features)
])

# Pipeline: Preprocessing + SVR Model
pipeline = Pipeline([
    ('preprocess', preprocessor),
    ('svr', SVR(kernel='rbf', C=100, gamma=0.1))  # RBF kernel; C and gamma are fixed values, no search is run here
])

# Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the model
pipeline.fit(X_train, y_train)

# Predictions
y_pred = pipeline.predict(X_test)

# Evaluation
mse = mean_squared_error(y_test, y_pred)  # computed once, reused for RMSE
print("\n===== SVM Regression Results (Tips) =====")
print("MAE :", mean_absolute_error(y_test, y_pred))
print("MSE :", mse)
print("RMSE:", np.sqrt(mse))
print("R² Score:", r2_score(y_test, y_pred))



First 5 rows of Tips dataset:
    total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4

===== SVM Regression Results (Tips) =====
MAE : 0.7604232038093937
MSE : 0.9937527294028048
RMSE: 0.996871470854094
R² Score: 0.20497959160342127
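An R² of about 0.20 means the model explains only a small share of the variance in tips, so the SVR settings are worth revisiting. A minimal sketch of a cross-validated grid search over `C`, `gamma`, and `epsilon` follows. It is an assumption-laden illustration rather than the notebook's procedure: `build_svr_search` is a hypothetical helper, the grid values are arbitrary starting points, and the demo frame is synthetic stand-in data (not the Tips dataset) so the block runs without a download.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR


def build_svr_search(numeric_features, categorical_features):
    """Hypothetical helper: same pipeline shape as above, wrapped in a grid search."""
    preprocessor = ColumnTransformer([
        ("num", StandardScaler(), numeric_features),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
    ])
    pipeline = Pipeline([
        ("preprocess", preprocessor),
        ("svr", SVR(kernel="rbf")),
    ])
    # Arbitrary illustrative grid; parameter names use the "<step>__<param>" convention.
    param_grid = {
        "svr__C": [0.1, 1, 10, 100],
        "svr__gamma": ["scale", 0.01, 0.1],
        "svr__epsilon": [0.1, 0.5],
    }
    return GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")


# Synthetic stand-in data (NOT the Tips dataset), only to show the call pattern.
rng = np.random.default_rng(0)
n = 120
demo = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "size": rng.integers(1, 6, n),
    "time": rng.choice(["Lunch", "Dinner"], n),
})
demo_tip = 0.15 * demo["total_bill"] + rng.normal(0, 0.5, n)

search = build_svr_search(["total_bill", "size"], ["time"])
search.fit(demo, demo_tip)
print("Best parameters:", search.best_params_)
print("Best cross-validated R²:", search.best_score_)
```

With the real data, the same helper could be called with the `numeric_features` and `categorical_features` lists defined above and fitted on `X_train`, `y_train`; the held-out test set should then be scored only once, with the selected model.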
