Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?
In machine learning, kernel functions allow algorithms like SVMs to operate in higher-dimensional spaces without explicitly transforming the data. A polynomial kernel is one such kernel that corresponds to a polynomial transformation of the input space.

 Polynomial Kernel Function:
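\[
K(x, x') = \left(\gamma \, x^{\top} x' + r\right)^{d}
\]

\(\gamma\): scale factor applied to the dot product (1 in the textbook form; scikit-learn's `SVC` defaults to `gamma='scale'`)

\(r\): constant term that controls the offset (`coef0` in scikit-learn)

\(d\): degree of the polynomial (`degree` in scikit-learn)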

 Relationship:
The polynomial kernel implicitly maps data into a space where polynomial decision boundaries can separate it.

This avoids the need to explicitly compute the high-dimensional polynomial features, thanks to the kernel trick.
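To make the kernel trick concrete, here is a minimal sketch for the special case of 2-D inputs with \(\gamma = 1\), \(r = 1\), \(d = 2\). The helper names `poly_kernel` and `explicit_features` are illustrative and not part of any library. It checks that the kernel value equals the dot product of explicitly constructed degree-2 features.

```python
import math


def poly_kernel(x, z, gamma=1.0, r=1.0, d=2):
    """Polynomial kernel K(x, z) = (gamma * <x, z> + r) ** d."""
    dot = sum(a * b for a, b in zip(x, z))
    return (gamma * dot + r) ** d


def explicit_features(x):
    """Explicit degree-2 feature map for a 2-D input (valid for gamma=1, r=1, d=2)."""
    x1, x2 = x
    s = math.sqrt(2.0)
    return [1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2]


x = [1.0, 2.0]
z = [3.0, -1.0]

# Implicit route: one dot product in the original 2-D space.
k_implicit = poly_kernel(x, z)

# Explicit route: map both points to 6-D, then take the dot product there.
phi_x, phi_z = explicit_features(x), explicit_features(z)
k_explicit = sum(a * b for a, b in zip(phi_x, phi_z))

print(k_implicit, k_explicit)  # the two values agree up to floating-point rounding
```

The explicit map already needs 6 features for 2 inputs at degree 2, and that count grows quickly with input dimension and degree, which is why computing the kernel directly is attractive.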

In [1]:
# Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Binary classification for simplicity (class 0 vs. not 0)
y = (y == 0).astype(int)

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Polynomial kernel SVM
clf = SVC(kernel='poly', degree=3, C=1.0, gamma='scale')
clf.fit(X_train, y_train)

# Predict and evaluate
y_pred = clf.predict(X_test)
print("Accuracy with polynomial kernel:", accuracy_score(y_test, y_pred))


Accuracy with polynomial kernel: 1.0


### Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?
Ans: \
In Support Vector Regression (SVR), the parameter ε defines a tolerance tube around the prediction within which errors incur no penalty:

- Only predictions that fall outside the ε-tube count as errors.
- Larger ε → fewer support vectors, because more points fall inside the no-penalty zone.
- Smaller ε → more support vectors, because more points lie on or outside the tube boundary.

In short: increasing ε reduces model complexity but may also lower accuracy.
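The effect can be checked empirically. The following is a small sketch on synthetic data (a noisy sine curve chosen here purely for illustration, not taken from the assignment) that fits `SVR` with several ε values and counts the support vectors through the fitted model's `support_` attribute.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: y = sin(x) + Gaussian noise (illustrative assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

sv_counts = {}
for eps in [0.01, 0.1, 0.3, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of training points on or outside the epsilon-tube.
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps:<5} support vectors: {sv_counts[eps]}")
```

On data like this, the count should drop sharply as ε grows past the noise level, since most points then sit inside the tube.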

### **Q4. How do kernel, C, epsilon, and gamma affect SVR performance?**
Ans: \

####  **Kernel** (e.g., `linear`, `poly`, `rbf`):
- Determines the **shape of the regression function** (linear or non-linear).
- Use `rbf` for most non-linear data.
- Use `linear` when data appears linearly correlated.

####  **C (Regularization Parameter)**:
- **High \(C\)**: Model tries to fit the training data more closely → may overfit.
- **Low \(C\)**: More regularization → model is simpler and may underfit.

>  **Use case**: Increase \( C \) if the model is underfitting; decrease if it's overfitting.

####  **ε (Epsilon-tube width)**:
- Controls the **width of the no-penalty zone** in SVR.
- **Large ε**: Less sensitivity to small errors, smoother model.
- **Small ε**: Captures more variation, may overfit.

>  **Use case**: Start with default, increase ε for smoother predictions, especially with noisy data.

####  **γ (Gamma, used by the `rbf`, `poly`, and `sigmoid` kernels)**:
- Defines how far the **influence** of a single training point reaches.
- **High γ**: Each point influences only its close neighbors → may overfit.
- **Low γ**: Influence spreads far → may underfit.

>  **Use case**: Increase γ to capture more complexity, decrease for smoother generalization.

---

###  Summary Table:

| Parameter | Effect | When to Increase | When to Decrease |
|----------|--------|------------------|------------------|
| Kernel   | Defines shape of prediction function | Switch to a non-linear kernel when linear models fail | Use `linear` when the data is simple |
| C        | Controls trade-off between bias and variance | Model underfitting | Model overfitting |
| ε        | Controls tolerance zone in SVR | Data is noisy | Need more precise fit |
| γ        | Controls locality of influence | Complex patterns | Simpler generalization |
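
In practice these parameters interact, so they are usually tuned together rather than one at a time. Below is a minimal sketch, assuming a synthetic noisy sine dataset and an arbitrary illustrative grid, that searches over kernel, C, ε, and γ for `SVR` with `GridSearchCV`. Since γ has no effect on the linear kernel, the grid is split into two dictionaries.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic data for illustration only: y = sin(x) + Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# gamma is ignored by the linear kernel, so it only appears in the rbf grid.
param_grid = [
    {"kernel": ["linear"], "C": [0.1, 1, 10], "epsilon": [0.01, 0.1, 0.5]},
    {
        "kernel": ["rbf"],
        "C": [0.1, 1, 10],
        "epsilon": [0.01, 0.1, 0.5],
        "gamma": ["scale", 0.1, 1, 10],
    },
]

svr_search = GridSearchCV(SVR(), param_grid, cv=5, scoring="r2")
svr_search.fit(X, y)

print("Best parameters:", svr_search.best_params_)
print("Best cross-validated R^2:", round(svr_search.best_score_, 3))
```

The grid values are starting points, not recommendations. A real search would typically use a log-spaced range for C and γ and scale the features first.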

In [2]:
"""
Q5. Assignment:
 Import the necessary libraries and load the dataset
 Split the dataset into training and testing sets
 Preprocess the data using any technique of your choice (e.g. scaling, normalization)
 Create an instance of the SVC classifier and train it on the training data
 Use the trained classifier to predict the labels of the testing data
 Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
 Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
 Train the tuned classifier on the entire dataset
 Save the trained classifier to a file for future use.
"""
# 1. Import Libraries
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score
import joblib



In [3]:
# 2. Load Dataset
iris = load_iris()
X = iris.data
y = iris.target


In [4]:
# 3. Train-Test Split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)


In [5]:
# 4. Preprocessing - Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)


In [6]:
# 5. Train Initial SVC Model
svc = SVC()
svc.fit(X_train_scaled, y_train)


In [7]:
# 6. Evaluate Initial Model
y_pred = svc.predict(X_test_scaled)
print("Initial Model Performance:\n")
print(classification_report(y_test, y_pred))


Initial Model Performance:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       0.88      0.93      0.90        15
           2       0.93      0.87      0.90        15

    accuracy                           0.93        45
   macro avg       0.93      0.93      0.93        45
weighted avg       0.93      0.93      0.93        45



In [8]:
# 7. Hyperparameter Tuning using GridSearchCV
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': ['scale', 0.01, 0.1, 1],
    'kernel': ['rbf', 'linear', 'poly']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train_scaled, y_train)

print("Best Parameters:", grid_search.best_params_)


Best Parameters: {'C': 1, 'gamma': 0.1, 'kernel': 'rbf'}
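
One caveat with the search above: the scaler was fitted on the whole training set before cross-validation, so each validation fold has already influenced the scaling statistics. A possible alternative, sketched here as an assumption and not as the notebook's method, wraps the scaler and classifier in a `Pipeline` so that scaling is refitted inside every fold. The names `pipe`, `pipe_grid`, and `pipe_search` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Scaling becomes a pipeline step, so it is fitted on the training folds only.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
pipe_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1, 1],
    "svc__kernel": ["rbf", "linear", "poly"],
}

pipe_search = GridSearchCV(pipe, pipe_grid, cv=5, scoring="accuracy")
pipe_search.fit(X_train, y_train)  # raw features; the pipeline handles scaling

print("Best parameters:", pipe_search.best_params_)
print("Held-out test accuracy:", pipe_search.score(X_test, y_test))
```

The selected parameters may differ from the ones printed above, because the cross-validation folds are scaled differently.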


In [9]:
# 8. Train Best Model on Entire Dataset
best_model = grid_search.best_estimator_

# Preprocess entire dataset
X_scaled = scaler.fit_transform(X)

# Retrain on full dataset
best_model.fit(X_scaled, y)

# Evaluate on the same data used for training (an optimistic estimate; use cross_val_score for an unbiased one)
y_full_pred = best_model.predict(X_scaled)
print("Final Model Accuracy:", accuracy_score(y, y_full_pred))


Final Model Accuracy: 0.98
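
The last assignment step, saving the trained classifier, could look like the sketch below. It assumes the hyperparameters reported by the grid search above (`C=1`, `gamma=0.1`, `kernel='rbf'`) and bundles the scaler with the classifier so both are stored in a single file. The file name `svc_iris_model.joblib` and the temporary directory are placeholders chosen for this example.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hyperparameters taken from the grid search result reported above.
final_model = make_pipeline(StandardScaler(), SVC(C=1, gamma=0.1, kernel="rbf"))
final_model.fit(X, y)

# Placeholder location; any writable path works.
model_path = os.path.join(tempfile.gettempdir(), "svc_iris_model.joblib")
joblib.dump(final_model, model_path)

# Reload and confirm the restored model predicts on raw, unscaled features.
loaded_model = joblib.load(model_path)
print(loaded_model.predict(X[:5]))
```

Saving the pipeline instead of the bare `SVC` means the scaling statistics travel with the model, so new data does not need to be scaled separately before calling `predict`.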
