Q1:
Polynomial functions underlie the polynomial kernel, one type of kernel function used in machine learning. A kernel function computes inner products in a higher-dimensional feature space without constructing that space explicitly, which lets Support Vector Machines (SVMs) classify data that is not linearly separable in the original input space. The polynomial kernel is given by:

$$K(x, y) = (x \cdot y + c)^d$$

where $d$ is the polynomial degree and $c$ is a constant.
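
As a quick check of the formula, the sketch below evaluates the polynomial kernel directly and compares it with an inner product in an explicit feature space for degree 2 and 2-D inputs. The helper names `poly_kernel` and `phi` are illustrative and not part of any library. The last line uses scikit-learn's `polynomial_kernel`, which computes `(gamma * <x, y> + coef0) ** degree`, so `gamma=1.0` and `coef0=1.0` are set here to match the formula with c = 1.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d for two 1-D vectors."""
    return (np.dot(x, y) + c) ** d


def phi(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = x
    return np.array([
        x1 ** 2,
        x2 ** 2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])


x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Kernel evaluated directly in the 2-D input space
k_direct = poly_kernel(x, y, c=1.0, d=2)

# Same value obtained as an inner product in the 6-D feature space
k_explicit = np.dot(phi(x), phi(y))

# scikit-learn's implementation, parameterised to match the formula
k_sklearn = polynomial_kernel(
    x.reshape(1, -1), y.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(k_direct, k_explicit, k_sklearn)
```

All three values agree, which illustrates why the kernel can stand in for an explicit higher-dimensional mapping. Note that `SVC(kernel='poly')` uses `coef0=0.0` by default, so its default kernel corresponds to c = 0.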

Q2.

In [2]:
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate a synthetic dataset: 5 features (3 informative, 1 redundant, 1 noise)
X, y = make_classification(n_samples=100, n_features=5, n_informative=3, 
                           n_redundant=1, n_repeated=0, random_state=42)

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train SVM with polynomial kernel
svm_poly = SVC(kernel='poly', degree=3, C=1.0)
svm_poly.fit(X_train, y_train)

# Predict and evaluate
y_pred = svm_poly.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)



Accuracy: 0.85


Q3:
Increasing the epsilon (ε) value in Support Vector Regression (SVR) widens the tolerance for prediction errors. More data points then fall inside the ε-tube around the fitted function, where they incur no loss and do not become support vectors, so the number of support vectors drops. This reduces model complexity but may also decrease accuracy, because errors smaller than ε are ignored.
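
The effect can be observed with a small experiment. The sketch below is an illustration on assumed synthetic data (a noisy sine curve, chosen here only for demonstration): it fits an RBF-kernel `SVR` for several ε values and counts the support vectors through the fitted model's `support_` attribute.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data: sine curve plus Gaussian noise (assumed for illustration)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Fit one SVR per epsilon value and record how many support vectors it keeps
sv_counts = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
```

With a very small ε almost every point lies on or outside the tube, while a large ε leaves only a few points outside it.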

Q4:
Kernel function: Determines the transformation of the input space. Examples:
Linear: For simple, linearly separable data.
Polynomial: For complex, non-linear patterns.
RBF (Radial Basis Function): For highly non-linear data.
C parameter (Regularization):
Higher C → Narrower margin, closer fit to the training data, but may overfit.
Lower C → Wider margin, more tolerance for misclassified points; often generalizes better, but may underfit if too low.
Epsilon (ε) (SVR tolerance):
Higher ε → Fewer support vectors, simpler model.
Lower ε → More support vectors, more precise model.
Gamma (γ) (Kernel coefficient for RBF and Polynomial):
Higher γ → More complex decision boundary, can overfit.
Lower γ → Simpler model, may underfit.
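
To make the role of γ concrete, the following sketch compares training and test accuracy of an RBF-kernel `SVC` for a small, a moderate, and a large γ. The two-moons dataset and the specific γ values are assumptions picked for illustration; the same loop could be repeated over `C`.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Noisy two-class toy data (assumed for illustration)
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Record (train accuracy, test accuracy) for each gamma
results = {}
for gamma in [0.01, 1.0, 100.0]:
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_train, y_train)
    results[gamma] = (clf.score(X_train, y_train), clf.score(X_test, y_test))
    print(f"gamma={gamma}: train={results[gamma][0]:.2f}, test={results[gamma][1]:.2f}")
```

A large gap between training and test accuracy at high γ is the usual sign of overfitting, while low accuracy on both sets at very small γ suggests underfitting.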

Q5.

In [3]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import joblib

# Load dataset (Example: Breast Cancer dataset)
from sklearn.datasets import load_breast_cancer
data = load_breast_cancer()
X, y = data.data, data.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocess data (Scaling)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train SVC classifier
svc = SVC(kernel='poly', degree=3, C=1.0)
svc.fit(X_train, y_train)

# Predict and evaluate
y_pred = svc.predict(X_test)
print(classification_report(y_test, y_pred))

# Hyperparameter tuning
param_grid = {'C': [0.1, 1, 10], 'degree': [2, 3, 4], 'kernel': ['poly']}
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

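# Train best model on entire dataset.
# The search ran on scaled features, so scaling is bundled with the classifier
# in a pipeline; fitting on raw X alone would not match the tuned setup.
from sklearn.pipeline import make_pipeline
best_svc = make_pipeline(StandardScaler(), SVC(**grid_search.best_params_))
best_svc.fit(X, y)

# Save the trained model (scaler + classifier)
joblib.dump(best_svc, "svc_model.pkl")
print("Model saved as svc_model.pkl")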


              precision    recall  f1-score   support

           0       1.00      0.65      0.79        43
           1       0.83      1.00      0.90        71

    accuracy                           0.87       114
   macro avg       0.91      0.83      0.85       114
weighted avg       0.89      0.87      0.86       114

Model saved as svc_model.pkl
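
A saved model can later be restored with `joblib.load`. The sketch below is a self-contained round trip under a few assumptions: the scaler and classifier are bundled in a single pipeline so that raw features can be passed at prediction time, and a temporary file named `svc_pipeline.pkl` (a hypothetical name) is used instead of the file written above.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaler and classifier travel together, so the saved file is usable on raw data
model = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, C=1.0))
model.fit(X_train, y_train)

# Save to a temporary location, then load the model back
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "svc_pipeline.pkl")
    joblib.dump(model, model_path)
    loaded = joblib.load(model_path)

# The restored pipeline should reproduce the original predictions
same_predictions = np.array_equal(model.predict(X_test), loaded.predict(X_test))
print("Loaded model reproduces predictions:", same_predictions)
```

Files written by `joblib.dump` are pickle-based, so they should only be loaded from trusted sources and ideally with the same scikit-learn version that produced them.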
