Q1. Relationship Between Polynomial Functions and Kernel Functions in Machine Learning
In Support Vector Machines (SVM), kernel functions help transform data into a higher-dimensional space where it can be separated linearly. A polynomial kernel is a type of kernel function that maps input features into a higher-degree polynomial space without explicitly computing the transformation. It is defined as:

$$K(x_i, x_j) = (x_i^T x_j + c)^d$$

where:

- $x_i, x_j$ are input feature vectors,
- $c$ is a constant (controls influence of higher-order terms),
- $d$ is the degree of the polynomial.
This is useful when the data is not linearly separable in the original feature space but becomes separable in a higher-dimensional one.
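To make the "without explicitly computing the transformation" point concrete, the following minimal sketch compares the kernel value with the dot product of an explicit feature map for the special case $d = 2$, $c = 1$ on 2-D inputs. The helper names `poly_kernel` and `phi_degree2` and the two sample vectors are illustrative assumptions, not part of the original notebook.

```python
import numpy as np


def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x^T y + c)^d."""
    return (np.dot(x, y) + c) ** d


def phi_degree2(x):
    """Explicit feature map for d=2, c=1 on 2-D inputs (hypothetical helper)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])


# Two arbitrary sample vectors (assumed for illustration)
x_i = np.array([1.0, 2.0])
x_j = np.array([3.0, -1.0])

# Kernel trick: one dot product in the original 2-D space
kernel_value = poly_kernel(x_i, x_j, c=1.0, d=2)

# Explicit route: map both vectors to 6-D, then take the dot product
explicit_value = np.dot(phi_degree2(x_i), phi_degree2(x_j))

print(kernel_value, explicit_value)
```

Both values agree up to floating-point rounding (here $(1 \cdot 3 + 2 \cdot (-1) + 1)^2 = 4$), but the kernel never builds the 6-dimensional vectors. For larger $d$ and more features the explicit map grows quickly, which is why the kernel form is preferred.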

Q2. Implementing SVM with a Polynomial Kernel in Python (Scikit-learn)

In [1]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Load dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # Use only the first two features (sepal length, sepal width)
y = iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train SVM with a degree-3 polynomial kernel
# (scikit-learn defaults apply: gamma="scale", coef0=0.0)
svm_poly = SVC(kernel="poly", degree=3, C=1.0)
svm_poly.fit(X_train, y_train)

# Predictions
y_pred = svm_poly.predict(X_test)

# Evaluate model
accuracy = accuracy_score(y_test, y_pred)
print(f"Polynomial SVM Accuracy: {accuracy:.2f}")


Polynomial SVM Accuracy: 0.67
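One detail worth checking: scikit-learn's `SVC` computes the polynomial kernel as `(gamma * <x, x'> + coef0) ** degree`, so the constant $c$ from Q1 corresponds to the `coef0` parameter, which defaults to `0.0`. The sketch below, which assumes the same two-feature Iris split as above, trains the model with `coef0=0.0` and `coef0=1.0` so the two settings can be compared. The `coef0_scores` dictionary is an illustrative name, and no particular result is claimed here.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data and split as the cell above
iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Compare the homogeneous kernel (coef0=0) with the inhomogeneous one (coef0=1)
coef0_scores = {}
for coef0 in (0.0, 1.0):
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="poly", degree=3, coef0=coef0, C=1.0),
    )
    model.fit(X_train, y_train)
    coef0_scores[coef0] = model.score(X_test, y_test)

for coef0, acc in coef0_scores.items():
    print(f"coef0={coef0}: test accuracy = {acc:.2f}")
```

With `coef0=0.0` the kernel contains only degree-3 terms, while a positive `coef0` also mixes in lower-order terms, so it is reasonable to treat `coef0` as a hyperparameter to tune alongside `degree` and `C`.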


Q3. Effect of Increasing Epsilon on Support Vectors in SVR
In Support Vector Regression (SVR), epsilon (ε) defines a tube of tolerance around the prediction: errors smaller than ε are ignored. Increasing ε results in:

- Fewer support vectors, because more data points fall inside the ε-tube and incur no loss.
- A simpler model with less flexibility.
- Higher bias and lower variance, meaning the model is smoother but may underfit.

Conversely, decreasing ε results in more support vectors, making the model more flexible but more prone to overfitting.
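The effect can be checked empirically. The following sketch fits an RBF `SVR` to a synthetic noisy sine curve (the data, the seed, and the three ε values are assumptions chosen for illustration) and counts the support vectors for each ε.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine curve (assumed data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Fit one SVR per epsilon and record how many training points are support vectors
sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    svr = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    sv_counts[eps] = len(svr.support_)

for eps, n_sv in sv_counts.items():
    print(f"epsilon={eps}: {n_sv} support vectors out of {len(X)}")
```

Only points on or outside the ε-tube become support vectors, so the count for the widest tube should be much smaller than for the narrowest one.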

Q4. Effect of Kernel, C, Epsilon, and Gamma on SVR Performance
Parameter	Effect	When to Increase?	When to Decrease?
Kernel Function	Defines the implicit feature transformation	Not a numeric setting: choose a non-linear kernel (RBF, polynomial) if the relationship is non-linear	Choose a linear kernel if the relationship is approximately linear
C (Regularization Parameter)	Controls the trade-off between a flat (simple) function and penalizing deviations larger than ε	If the model underfits and you want smaller training errors, at the risk of overfitting	If the model overfits and you want better generalization, tolerating larger training errors
Epsilon (ε)	Defines the margin of tolerance within which errors are not penalized	If you want a simpler model (fewer support vectors)	If you want higher precision (more support vectors)
Gamma (γ) in RBF Kernel	Controls how far the influence of a single training point reaches	If the model underfits a complex relationship (higher model complexity)	If the model overfits (smoother fit)
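As a rough illustration of the gamma row, the sketch below fits an RBF `SVR` with three gamma values on a synthetic noisy sine curve and records training and test R² for each. The dataset, the seed, and the specific values of `C`, `epsilon`, and `gamma` are assumptions for demonstration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed data)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# For each gamma, store (train R^2, test R^2)
gamma_scores = {}
for gamma in (0.01, 1.0, 100.0):
    svr = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=gamma)
    svr.fit(X_train, y_train)
    gamma_scores[gamma] = (
        svr.score(X_train, y_train),
        svr.score(X_test, y_test),
    )

for gamma, (r2_train, r2_test) in gamma_scores.items():
    print(f"gamma={gamma}: train R^2 = {r2_train:.2f}, test R^2 = {r2_test:.2f}")
```

A large gap between training and test R² at a high gamma suggests overfitting, and low scores on both at a small gamma suggest underfitting. The same loop can be repeated over `C` or `epsilon` to probe the other rows of the table.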
Q5. Assignment: SVM Classifier with Hyperparameter Tuning
Steps:
1. Load a dataset of your choice.
2. Split it into training and testing sets.
3. Preprocess data (scaling/normalization).
4. Train an SVC classifier on the training set.
5. Make predictions on the test set.
6. Evaluate performance using accuracy, precision, recall, and F1-score.
7. Tune hyperparameters using GridSearchCV or RandomizedSearchCV.
8. Retrain the model with the best parameters.
9. Save the trained classifier for future use.
### **Implementation using Breast Cancer Dataset**

In [2]:
import joblib
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report

# Load dataset
cancer = datasets.load_breast_cancer()
X, y = cancer.data, cancer.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train initial SVC model
svc = SVC(kernel="rbf", C=1.0, gamma="scale")
svc.fit(X_train, y_train)

# Predictions
y_pred = svc.predict(X_test)

# Evaluate model
print("Initial Model Performance:")
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))

# Hyperparameter tuning using GridSearchCV
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': ['scale', 0.1, 1, 10],
    'kernel': ['rbf', 'poly', 'sigmoid']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

# Best parameters
print(f"Best Parameters: {grid_search.best_params_}")

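# Retrain with the best parameters. GridSearchCV (refit=True by default) has
# already refit best_estimator_ on the full training set, so this explicit
# fit is redundant and kept only to make the retraining step visible.
best_svc = grid_search.best_estimator_
best_svc.fit(X_train, y_train)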

# Final predictions
y_pred_best = best_svc.predict(X_test)

# Evaluate tuned model
print("Tuned Model Performance:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_best):.2f}")
print(classification_report(y_test, y_pred_best))

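# Save the trained model. The fitted scaler is not saved here, so new data
# must be scaled with the same scaler before calling predict().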
joblib.dump(best_svc, "svm_classifier.pkl")
print("Trained model saved as 'svm_classifier.pkl'")


Initial Model Performance:
Accuracy: 0.98
              precision    recall  f1-score   support

           0       1.00      0.95      0.98        43
           1       0.97      1.00      0.99        71

    accuracy                           0.98       114
   macro avg       0.99      0.98      0.98       114
weighted avg       0.98      0.98      0.98       114

Best Parameters: {'C': 1, 'gamma': 'scale', 'kernel': 'rbf'}
Tuned Model Performance:
Accuracy: 0.98
              precision    recall  f1-score   support

           0       1.00      0.95      0.98        43
           1       0.97      1.00      0.99        71

    accuracy                           0.98       114
   macro avg       0.99      0.98      0.98       114
weighted avg       0.98      0.98      0.98       114

Trained model saved as 'svm_classifier.pkl'
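One possible refinement, sketched below under stated assumptions, is to wrap the scaler and the classifier in a single `Pipeline`. The scaler is then refit inside each cross-validation fold, so validation folds do not influence the scaling, and the saved file contains both preprocessing and model. The reduced parameter grid, the step names `scaler` and `svc`, and the file name `svm_pipeline.pkl` are illustrative choices, not part of the original notebook.

```python
import os
import tempfile

import joblib
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same dataset and split as above
cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.2, random_state=42
)

# Scaling and classification in one estimator
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Pipeline parameters are addressed as <step name>__<parameter>
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1],
    "svc__kernel": ["rbf", "poly"],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Save scaler + classifier together (hypothetical file name, temp directory)
model_path = os.path.join(tempfile.gettempdir(), "svm_pipeline.pkl")
joblib.dump(search.best_estimator_, model_path)

# Reload and predict on raw, unscaled features
loaded = joblib.load(model_path)
test_accuracy = loaded.score(X_test, y_test)
print(f"Best parameters: {search.best_params_}")
print(f"Reloaded pipeline test accuracy: {test_accuracy:.2f}")
```

Because the reloaded object is the whole pipeline, `loaded.predict(...)` can be called directly on unscaled feature rows.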
