### Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

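In SVMs, a **polynomial function** of the inner product of two inputs can serve as a **kernel function**. The kernel implicitly maps the original input space to a higher-dimensional feature space, where data that is not linearly separable in the original space may become separable.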

The **polynomial kernel** is defined as:

K(x, x′) = (xᵀx′ + c)^d

Where:
- x, x′ = input feature vectors
- c = constant ≥ 0 that trades off the influence of higher-order versus lower-order terms (c = 0 gives a homogeneous kernel)
- d = degree of the polynomial

It enables the SVM to learn complex boundaries **without explicitly computing the transformation** (kernel trick).
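
To make the kernel trick concrete, the minimal sketch below checks numerically that, for d = 2 and c = 1 on 2-D inputs, the kernel value equals an ordinary dot product in a 6-dimensional feature space. The helper names `poly_kernel` and `phi` are illustrative assumptions, not part of any library.

```python
import numpy as np

def poly_kernel(x, z, c=1.0, d=2):
    """Polynomial kernel K(x, z) = (x^T z + c)^d."""
    return (np.dot(x, z) + c) ** d

def phi(x, c=1.0):
    """Explicit feature map for d = 2 on 2-D inputs (illustrative helper)."""
    x1, x2 = x
    return np.array([
        x1 ** 2,
        x2 ** 2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, z)        # computed in the original 2-D space
k_explicit = np.dot(phi(x), phi(z))   # computed in the 6-D feature space

print(k_implicit, k_explicit)
```

Both values agree (about 4 for these inputs), but the kernel version never builds the 6-dimensional vectors. For larger d and more features the explicit map grows quickly, which is why the implicit form matters.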

### Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# SVM with Polynomial Kernel
model = SVC(kernel='poly', degree=3, C=1)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy
print("Accuracy with Polynomial Kernel:", accuracy_score(y_test, y_pred))
```

Output:

```
Accuracy with Polynomial Kernel: 0.9736842105263158
```
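
In scikit-learn the polynomial kernel is computed as (gamma · xᵀx′ + coef0)^degree, so `coef0` plays the role of c in the formula from Q1, and its default of 0 gives a homogeneous kernel. The sketch below sets `coef0=1.0` and scales the features inside a pipeline. The parameter values are illustrative assumptions, not tuned choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# coef0 corresponds to c in K(x, x') = (gamma * x^T x' + coef0)^degree
poly_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="poly", degree=3, gamma="scale", coef0=1.0, C=1.0),
)

# 5-fold cross-validation gives a less split-dependent estimate than one split
scores = cross_val_score(poly_svm, X, y, cv=5)
print("Mean 5-fold CV accuracy:", scores.mean())
```

Cross-validation is used here only to avoid relying on a single train-test split; the single-split accuracy above remains the assignment's reported result.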


### Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In **Support Vector Regression (SVR)**:
- **Epsilon (ε)** sets the half-width of the **ε-insensitive tube** around the regression function; errors smaller than ε incur no penalty.
- Increasing ε → **wider tube** → fewer support vectors (simpler, less sensitive model).
- Decreasing ε → **narrower tube** → more support vectors (more complex, more sensitive model).

**Example:**
If ε is large, more data points lie strictly inside the ε-tube. Only points on or outside the tube become support vectors, so fewer of them are needed to define the regression function.
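
The effect can be observed directly. The sketch below fits `SVR` on a synthetic noisy sine curve (the data, seed, and ε values are assumptions chosen for illustration) and counts the support vectors for each ε.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine curve
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

epsilons = [0.01, 0.1, 0.5]
n_support = []
for eps in epsilons:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support.append(len(model.support_))
    print(f"epsilon={eps}: {n_support[-1]} support vectors out of {len(X)}")
```

On data like this the count should drop noticeably as ε grows, since more points fall strictly inside the tube.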

### Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? 

Let’s break it down:

- **Kernel function**:
  - Determines the shape of the fitted regression function (linear, polynomial, RBF).
  - RBF works well when relationships are non-linear.
  
- **C (Regularization parameter)**:
  - High C → penalizes errors outside the ε-tube heavily → low bias, high variance
  - Low C → tolerates more slack → high bias, low variance

- **Epsilon (ε)**:
  - Margin within which predictions are not penalized
  - Larger ε → simpler model
  - Smaller ε → captures finer details (risk of overfitting)

- **Gamma (γ)** (kernel coefficient for RBF/poly kernels; the intuition below applies to RBF):
  - High γ → points must be close to influence each other → complex model
  - Low γ → more general model with smoother boundaries
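
The γ intuition above follows from the RBF kernel formula K(x, x′) = exp(−γ‖x − x′‖²). A minimal sketch, with an illustrative helper name `rbf_kernel_value` and arbitrarily chosen points and γ values:

```python
import numpy as np

def rbf_kernel_value(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([0.0, 0.0])
z = np.array([1.0, 1.0])   # squared distance from x is 2

k_low = rbf_kernel_value(x, z, gamma=0.1)    # exp(-0.2): still strongly similar
k_high = rbf_kernel_value(x, z, gamma=10.0)  # exp(-20): effectively zero
print(k_low, k_high)
```

With high γ the same pair of points has almost no similarity, so each training point only influences predictions in its immediate neighbourhood.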

**Example scenario:**
- Large C, small ε, high γ → prone to overfitting
- Moderate C, moderate ε, moderate γ → balanced performance
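
The two scenarios can be compared on synthetic data. This is a rough sketch under assumed settings (noisy sine data, the specific parameter values, and the seeds are all illustrative); exact errors will vary with the data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic noisy sine data (illustrative only)
rng = np.random.default_rng(42)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

configs = {
    "large C, small eps, high gamma": dict(C=1000.0, epsilon=0.001, gamma=100.0),
    "moderate C, eps, gamma": dict(C=1.0, epsilon=0.1, gamma=1.0),
}

results = {}
for name, params in configs.items():
    model = SVR(kernel="rbf", **params).fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[name] = (train_mse, test_mse)
    print(f"{name}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")
```

A gap between a low training error and a higher test error for the first configuration is the usual sign of overfitting; the moderate configuration typically shows similar errors on both sets.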


### Q5. Assignment Task: SVM Classification with Tuning and Saving the Model

```python
# Import Libraries
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report
import joblib

# Load dataset
wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = pd.Series(wine.target)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocessing - Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train initial model
svc = SVC(kernel='rbf')
svc.fit(X_train_scaled, y_train)
y_pred = svc.predict(X_test_scaled)

# Evaluation
print("Initial Model Performance:")
print(classification_report(y_test, y_pred))
```

Output:

```
Initial Model Performance:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       1.00      1.00      1.00        14
           2       1.00      1.00      1.00         8

    accuracy                           1.00        36
   macro avg       1.00      1.00      1.00        36
weighted avg       1.00      1.00      1.00        36
```



```python
# Hyperparameter Tuning using GridSearchCV
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': [0.01, 0.1, 1],
    'kernel': ['rbf', 'poly']
}

grid = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid.fit(X_train_scaled, y_train)

print("Best Parameters:", grid.best_params_)

# Retrain best model on full dataset
# (the scaler fitted on the training split is reused so inputs are scaled consistently)
best_model = grid.best_estimator_
best_model.fit(scaler.transform(X), y)

# Save model
joblib.dump(best_model, "svm_classifier.pkl")
# The model expects scaled inputs, so the scaler is saved as well
# (the file name "svm_scaler.pkl" is an arbitrary choice)
joblib.dump(scaler, "svm_scaler.pkl")
print("Model saved as svm_classifier.pkl")
```

Output:

```
Best Parameters: {'C': 1, 'gamma': 0.01, 'kernel': 'rbf'}
Model saved as svm_classifier.pkl
```
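
One possible alternative, sketched here as an assumption rather than as part of the assignment solution, is to put the scaler and the classifier in a single `Pipeline`. Each cross-validation fold then fits its own scaler, and one saved file contains everything needed for prediction. The file name `svm_pipeline.pkl` and the temporary directory are hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaling lives inside the pipeline, so each CV fold fits its own scaler
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.01, 0.1, 1],
    "svc__kernel": ["rbf", "poly"],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the tuned pipeline on the held-out test split
test_accuracy = search.score(X_test, y_test)
print("Held-out accuracy:", test_accuracy)

# Persist the whole fitted pipeline (hypothetical file name) and reload it
model_path = os.path.join(tempfile.mkdtemp(), "svm_pipeline.pkl")
joblib.dump(search.best_estimator_, model_path)
loaded = joblib.load(model_path)

# The reloaded pipeline accepts raw, unscaled features
same_predictions = np.array_equal(loaded.predict(X_test), search.predict(X_test))
print("Reloaded model reproduces predictions:", same_predictions)
```

Because the reloaded object scales its own inputs, callers do not need to remember a separate preprocessing step.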
