

	1.	Polynomial and Kernel Functions Relationship:
	•	Polynomial kernels are a type of kernel function, for example K(x, z) = (x · z + c)^d. They measure the similarity between two data points as a polynomial of their dot product. This is equivalent to taking a dot product in a higher-dimensional space of polynomial feature combinations, but the transformation is never computed explicitly (the kernel trick). In SVMs and other kernelized models, this allows non-linear decision boundaries in the original input space while the model remains linear in the implicit feature space.
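To make the "without explicitly transforming data" point concrete, here is a minimal sketch for 2-D inputs and a degree-2 kernel with c = 1. It checks that the kernel value matches an ordinary dot product in a 6-dimensional feature space. The helper names poly_kernel and explicit_features are illustrative and not part of any library.

```python
import numpy as np

def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree."""
    return (np.dot(x, z) + coef0) ** degree

def explicit_features(x):
    """Explicit degree-2 feature map for a 2-D point (matches coef0=1)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Kernel trick: one dot product in the original 2-D space
kernel_value = poly_kernel(x, z)
# Same quantity computed the long way, in the 6-D feature space
explicit_value = np.dot(explicit_features(x), explicit_features(z))

print(kernel_value, explicit_value)
```

The two numbers agree up to floating-point rounding. Scikit-Learn's 'poly' kernel uses the slightly more general form (gamma * <x, z> + coef0) ** degree.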
	2.	Implementing SVM with a Polynomial Kernel in Scikit-Learn:
	•	Using Scikit-Learn’s SVC, specify the kernel='poly' parameter to apply a polynomial kernel. Adjust the degree parameter to control the complexity of the polynomial. Example:

from sklearn.svm import SVC
model = SVC(kernel='poly', degree=3, C=1)
model.fit(X_train, y_train)
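
The snippet above assumes X_train and y_train already exist. A self-contained sketch of the same idea follows, using synthetic make_moons data (an assumption chosen only for illustration) to compare a few values of degree. Setting coef0=1 is also a choice made here, not something the notes prescribe.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, non-linearly separable data (illustrative only)
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for degree in (1, 2, 3, 5):
    # coef0=1 lets the kernel include lower-order terms as well as the top degree
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="poly", degree=degree, coef0=1, C=1),
    )
    model.fit(X_train, y_train)
    scores[degree] = model.score(X_test, y_test)

for degree, acc in scores.items():
    print(f"degree={degree}: test accuracy={acc:.3f}")
```

Higher degrees give the model more flexibility, which can help on curved class boundaries but also raises the risk of overfitting.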


	3.	Effect of Increasing Epsilon on Support Vectors in SVR:
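	•	In Support Vector Regression (SVR), the epsilon parameter defines a tube around the fitted function: predictions that fall within epsilon of the true value incur no penalty. Only training points lying on or outside this tube become support vectors. Increasing epsilon widens the tube so that more points fall inside it, which typically leads to fewer support vectors and a simpler, less precise model.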
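A quick way to see this effect is to fit SVR with several epsilon values and count the support vectors. The sketch below uses a synthetic noisy sine wave (illustrative data, not from these notes) and reads the count from the fitted model's support_ attribute.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic noisy sine wave (illustrative data)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_support = {}
for epsilon in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=epsilon)
    model.fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support[epsilon] = len(model.support_)

for epsilon, count in n_support.items():
    print(f"epsilon={epsilon}: {count} support vectors out of {len(X)}")
```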
	4.	Effects of Kernel, C, Epsilon, and Gamma in SVR:
	•	Kernel: Controls the shape of the fitted regression function (e.g., linear, polynomial, RBF). Use RBF for non-linear patterns and linear for simpler, roughly linear data.
	•	C: Regularization parameter; lower values give a flatter, smoother function that tolerates larger errors outside the epsilon tube, while higher values penalize those errors more heavily and fit the training data more tightly, at the risk of overfitting.
	•	Epsilon: Defines the margin of tolerance in SVR; higher values allow larger deviations from the true values without penalties.
	•	Gamma: Determines how far the influence of a single training point reaches (used by the RBF, polynomial, and sigmoid kernels); high gamma values make each point's influence very local and produce a more complex, wiggly fit, while low values yield a smoother function.
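
Because these parameters interact, they are usually tuned together rather than one at a time. The following is a minimal sketch, assuming synthetic sine data and an arbitrary small grid, that searches C, epsilon, and gamma for an RBF SVR with cross-validation. The step names "scale" and "svr" are labels chosen here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic noisy sine wave (illustrative data)
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Scaling lives inside the pipeline so it is re-fit on each CV training fold
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])
param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.01, 0.1, 1],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="r2")
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best cross-validated R^2: {search.best_score_:.3f}")
```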

Practical Assignment

	1.	Library Import and Data Loading:
	•	Use libraries like pandas, scikit-learn, and joblib (for saving the model).
	2.	Data Split:
	•	Use train_test_split to divide the data.
	3.	Data Preprocessing:
	•	Apply scaling/normalization, e.g., using StandardScaler or MinMaxScaler.
	4.	Training an SVC Classifier:
	•	Train the SVC classifier with a kernel (e.g., ‘linear’, ‘poly’, or ‘rbf’).
	5.	Evaluation:
	•	Use metrics such as accuracy, precision, recall, or F1-score. Scikit-Learn’s classification_report can help.
	6.	Hyperparameter Tuning:
	•	Use GridSearchCV or RandomizedSearchCV to tune parameters (e.g., C, gamma, degree, etc.) and improve model performance.
	7.	Train on Full Dataset and Save the Model:
	•	Train the tuned classifier on the entire dataset, applying the same preprocessing as during tuning, then use joblib.dump() to save the model (together with the fitted scaler) for future use.

Example Code

import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report
import joblib  # for saving the model

# Load dataset
df = pd.read_csv('your_dataset.csv')  # replace with your file; expects a 'target' column
X = df.drop('target', axis=1)
y = df['target']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocess (e.g., scaling)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train SVC model
svc = SVC(kernel='rbf', C=1)
svc.fit(X_train, y_train)

# Evaluate
y_pred = svc.predict(X_test)
print(classification_report(y_test, y_pred))

# Hyperparameter tuning
param_grid = {'C': [0.1, 1, 10], 'gamma': [1, 0.1, 0.01]}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid, refit=True, verbose=2)
grid.fit(X_train, y_train)

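# Re-train on full data (scaled like the training data) and save model with its scaler
best_model = grid.best_estimator_
final_scaler = StandardScaler()
X_full = final_scaler.fit_transform(X)
best_model.fit(X_full, y)
joblib.dump({'scaler': final_scaler, 'model': best_model}, 'svm_model.pkl')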
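
One possible variation on the workflow above is to wrap the scaler and classifier in a Pipeline, so that scaling is re-fit inside every cross-validation fold and a single object is saved. This sketch makes several assumptions: the built-in breast cancer dataset stands in for 'your_dataset.csv', RandomizedSearchCV is used in place of GridSearchCV, and the file name svm_pipeline.pkl is hypothetical.

```python
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Built-in dataset used as a stand-in for your own data
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="rbf"))])
param_distributions = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [1, 0.1, 0.01, 0.001],
}
# Try 8 randomly chosen combinations instead of the full grid of 16
search = RandomizedSearchCV(
    pipe, param_distributions, n_iter=8, cv=5, random_state=0
)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.3f}")

# Refit the tuned pipeline on all data, then save and reload it
final_model = search.best_estimator_.fit(X, y)
joblib.dump(final_model, "svm_pipeline.pkl")  # hypothetical file name
reloaded = joblib.load("svm_pipeline.pkl")
```

The reloaded pipeline accepts raw, unscaled features, since the fitted scaler is stored inside it.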
