In [None]:
# Q1. Relationship between Polynomial Functions and Kernel Functions in Machine Learning

Polynomial functions and kernel functions are related through the polynomial kernel, a specific kernel function that applies a polynomial to the dot product of two inputs. It is used in kernel-based learning algorithms such as Support Vector Machines (SVMs).

A kernel function computes the similarity between two data points in a transformed feature space without explicitly computing the transformation. This is known as the kernel trick.
The polynomial kernel is defined as K(x, y) = (x · y + c)^d, where x and y are feature vectors, x · y is their dot product, c is a constant offset (the coef0 parameter in scikit-learn), and d is the degree of the polynomial.
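
The kernel trick can be checked numerically. The sketch below compares the polynomial kernel for d = 2 and c = 1 on 2D inputs with the dot product of an explicit 6-dimensional feature map. The helper names `poly_kernel` and `phi_degree2` and the sample vectors are illustrative assumptions, not part of any library.

```python
import numpy as np

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    return (np.dot(x, y) + c) ** d

def phi_degree2(x, c=1.0):
    """Explicit feature map matching the degree-2 kernel on 2D inputs."""
    x1, x2 = x
    return np.array([
        x1 ** 2,
        x2 ** 2,
        np.sqrt(2) * x1 * x2,
        np.sqrt(2 * c) * x1,
        np.sqrt(2 * c) * x2,
        c,
    ])

# Arbitrary example vectors
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, y)                 # computed entirely in 2D
k_explicit = phi_degree2(x) @ phi_degree2(y)   # dot product in the 6D space

print("Kernel value:        ", k_implicit)
print("Explicit dot product:", k_explicit)
```

Both values agree up to floating-point rounding, which is why the SVM never has to build the higher-dimensional vectors itself.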

Relationship:
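Polynomial kernels allow an SVM to learn non-linear decision boundaries by implicitly mapping the data into a higher-dimensional space, where a linear separator corresponds to a polynomial boundary in the original space.
For example, on 2D data, a degree-2 polynomial kernel lets the classifier draw a quadratic boundary such as a circle, ellipse, or parabola.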
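
As a rough illustration of the quadratic-boundary claim, the sketch below compares a linear kernel with a degree-2 polynomial kernel on two concentric circles. The dataset (`make_circles`) and every parameter value are assumptions chosen for the demonstration.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric circles: no straight line can separate the classes
X, y = make_circles(n_samples=300, noise=0.05, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Same classifier, two kernels
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
poly_acc = SVC(kernel="poly", degree=2, coef0=1).fit(X_train, y_train).score(X_test, y_test)

print(f"Linear kernel accuracy:   {linear_acc:.2f}")
print(f"Degree-2 kernel accuracy: {poly_acc:.2f}")
```

On data like this the degree-2 kernel should score clearly higher, since a circle is a quadratic curve.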

# Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In Support Vector Regression (SVR): Epsilon (ϵ) defines a margin of tolerance where predictions are not penalized. The model ignores errors within this margin.

Effect:
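Increasing ϵ widens the tube within which errors are ignored. Only training points on or outside the tube become support vectors, so a larger ϵ generally results in fewer support vectors and a simpler model.
Decreasing ϵ narrows the tube, so more points fall outside it, leading to more support vectors and a more complex model.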
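
A small experiment can make this effect visible. The sketch below fits an SVR to a noisy sine curve for three values of ϵ and counts the support vectors. The synthetic data and the chosen ϵ values are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic regression data: sine curve plus Gaussian noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

sv_counts = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors out of {len(X)}")
```

The count should drop as ϵ grows, because more training points end up strictly inside the tube.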

# Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
# and provide examples of when you might want to increase or decrease its value?

Effect of Parameters on SVR Performance:
    
Kernel Function: Defines the transformation of data into a higher-dimensional space. Example: Use RBF kernel for non-linear relationships.

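C Parameter: Controls the trade-off between keeping the fitted function simple (flat) and penalizing training errors larger than ϵ.
             Low C: Stronger regularization, more tolerance for points outside the ϵ-tube (simple model). Decrease C when the data is noisy or the model overfits.
             High C: Weaker regularization, less tolerance for such errors (complex model). Increase C when the model underfits.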

Epsilon (ϵ): Defines the margin of tolerance for prediction errors.
             Low ϵ: High sensitivity (complex model).
             High ϵ: Low sensitivity (simpler model).
                
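Gamma Parameter: Defines how far the influence of a single data point reaches (most relevant for the RBF kernel).
                 Low γ: Far-reaching influence, smoother fitted function (underfitting risk if too low).
                 High γ: Influence limited to nearby points, fits tightly around the data (overfitting risk).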
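
In practice these parameters interact, so they are usually tuned together rather than one at a time. The following is a minimal sketch of a cross-validated grid search over C, ϵ, and γ for an RBF-kernel SVR. The synthetic data, the grid values, and the step names `scale` and `svr` are assumptions, not recommendations.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data: noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Scaling lives inside the pipeline so it is refitted on each CV training fold
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])

param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.01, 0.1, 1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best CV MSE:", -search.best_score_)
```

Keeping the scaler inside the pipeline avoids leaking statistics from the validation folds into training.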


In [None]:
# Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

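# With n_features=2, the defaults (n_informative=2, n_redundant=2) raise a ValueError,
# so request two informative features and no redundant ones.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Degree-3 polynomial kernel; coef0 is the constant c in (x.y + c)^d
svm_poly = SVC(kernel='poly', degree=3, coef0=1, C=1.0)
svm_poly.fit(X_train, y_train)
y_pred = svm_poly.predict(X_test)

print("Accuracy with Polynomial Kernel:", accuracy_score(y_test, y_pred))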


In [None]:
# Q5. Assignment:
#  Import the necessary libraries and load the dataset
#  Split the dataset into training and testing sets
#  Preprocess the data using any technique of your choice (e.g. scaling, normalization)
#  Create an instance of the SVC classifier and train it on the training data
#  Use the trained classifier to predict the labels of the testing data
#  Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
#  Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
#  Train the tuned classifier on the entire dataset
#  Save the trained classifier to a file for future use.

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import joblib

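# Load the dataset ('your_dataset.csv' and the 'target' column are placeholders)
data = pd.read_csv('your_dataset.csv')

X = data.drop(columns=['target'])
y = data['target']

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training data only, then apply it to both splits
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Baseline SVC with default hyperparameters
svc = SVC()

svc.fit(X_train_scaled, y_train)

y_pred = svc.predict(X_test_scaled)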

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")

# Hyperparameter grid; gamma is ignored when kernel='linear'
param_grid = {
    'C': [0.1, 1, 10, 100],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': ['scale', 'auto']
}
# 5-fold cross-validated search on the training split
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train_scaled, y_train)

print("Best parameters:", grid_search.best_params_)

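# Refit the best configuration on the entire dataset. Scaling and classifier are
# bundled in one pipeline so the saved file can be applied to raw, unscaled data.
from sklearn.pipeline import make_pipeline

best_svc = make_pipeline(StandardScaler(), SVC(**grid_search.best_params_))
best_svc.fit(X, y)

joblib.dump(best_svc, 'svc_classifier.pkl')
print("Trained classifier saved as 'svc_classifier.pkl'")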
