In [None]:
'''

Q1. What is the Relationship Between Polynomial Functions and Kernel Functions in Machine Learning Algorithms?
In machine learning, a kernel function computes the inner product of two inputs in a higher-dimensional feature space without constructing that space explicitly. Data that is not linearly separable in the original space may become linearly separable there. A polynomial kernel is a kernel whose implicit feature space consists of polynomial combinations of the input features. It is defined as:

K(x_i, x_j) = (x_i^T x_j + c)^d

Where:

x_i and x_j are input vectors.
c is a constant that trades off the influence of higher-order versus lower-order terms.
d is the degree of the polynomial.
Relationship:

The polynomial kernel allows an SVM or another kernel-based algorithm to fit non-linear decision boundaries by implicitly mapping the data into a higher-dimensional space where a linear classifier can be applied.
For example, a quadratic polynomial kernel can capture quadratic relationships in the data without explicitly computing the polynomial features, thanks to the kernel trick.
'''
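The kernel trick described in Q1 can be checked numerically. The following is a minimal sketch, assuming 2-dimensional inputs, d = 2, and c = 1; the helper names `poly_kernel` and `explicit_features` are hypothetical and chosen only for illustration. It compares the kernel value with the dot product of the explicitly expanded quadratic features.

```python
import math


def poly_kernel(x, z, c=1.0, d=2):
    """Polynomial kernel K(x, z) = (x . z + c)^d."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + c) ** d


def explicit_features(x, c=1.0):
    """Explicit feature map for d = 2 and 2-dimensional input (illustrative only)."""
    x1, x2 = x
    return [
        x1 * x1,
        x2 * x2,
        math.sqrt(2.0) * x1 * x2,
        math.sqrt(2.0 * c) * x1,
        math.sqrt(2.0 * c) * x2,
        c,
    ]


x = (1.0, 2.0)
z = (3.0, -1.0)

# Kernel value computed in the original 2-dimensional space
kernel_value = poly_kernel(x, z)

# Same quantity computed as a dot product in the 6-dimensional feature space
explicit_value = sum(a * b for a, b in zip(explicit_features(x), explicit_features(z)))

print(kernel_value, explicit_value)
```

Both numbers agree up to floating-point rounding, which is why the explicit 6-dimensional mapping never has to be built when the kernel is used.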

In [None]:
"Q2. How Can We Implement an SVM with a Polynomial Kernel in Python Using Scikit-learn?"
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create an SVM classifier with a polynomial kernel
# degree=3 gives a cubic kernel; coef0 (the constant c in the kernel formula) defaults to 0.0
svm_model = SVC(kernel='poly', degree=3, C=1.0)
svm_model.fit(X_train, y_train)

# Predict the labels for the test set
y_pred = svm_model.predict(X_test)

# Evaluate the performance using accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')


In [None]:
'''
Q3. How Does Increasing the Value of Epsilon Affect the Number of Support Vectors in SVR?
In Support Vector Regression (SVR), the epsilon (ϵ) parameter defines a margin of tolerance where no penalty is given to errors (i.e., the distance between the predicted and actual values) that fall within this margin.

Increasing ϵ:
Allows more data points to fall within the margin without being penalized.
As a result, fewer support vectors are needed because more data points are considered "within tolerance" and do not influence the model.
Decreasing ϵ:
Reduces the margin, so more data points will lie outside this margin and become support vectors.
In summary, increasing ϵ typically decreases the number of support vectors, while decreasing ϵ increases the number of support vectors.
'''
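The effect described in Q3 can be observed by counting support vectors for several values of ϵ. This is a minimal sketch on synthetic data; the noisy sine curve, the random seed, and the specific ϵ values are assumptions made for illustration, not part of the assignment.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed data: noisy sine curve)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Fit an RBF-kernel SVR for increasing epsilon and record the support-vector count
sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
```

On data like this, the count drops as ϵ grows, because more points end up inside the tolerance tube and stop contributing to the model.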

In [None]:
'''
Q4. How Does the Choice of Kernel Function, C Parameter, Epsilon Parameter, and Gamma Parameter Affect the Performance of Support Vector Regression (SVR)?
1. Kernel Function:

The kernel function determines the shape of the regression function that SVR can fit. Common choices are the linear, polynomial, and RBF (Radial Basis Function) kernels.
When to use:
Linear Kernel: When the target is approximately a linear function of the features.
Polynomial Kernel: When the target depends on polynomial combinations of the features.
RBF Kernel: When the relationship is non-linear and exhibits complex patterns.
2. C Parameter (Regularization Parameter):

Controls the trade-off between a smooth (flat) regression function and fitting the training points closely.
Small C: The model is more regularized, tolerating larger errors on some training points but yielding a smoother function.
Large C: The model penalizes every training error outside the ϵ margin heavily, which may lead to overfitting.
When to use:
Small C: Use when you want to avoid overfitting and prefer a smoother model.
Large C: Use when you want to minimize training error.
3. Epsilon Parameter (ϵ):

In SVR, ϵ defines a margin of tolerance where errors are not penalized.
Small ϵ: The model will try to predict as close to the actual values as possible, leading to more support vectors.
Large ϵ: The model will allow more error without penalizing, leading to fewer support vectors.
When to use:
Small ϵ: Use when high accuracy is needed, and you can tolerate more complexity.
Large ϵ: Use when you can tolerate some error and want a simpler model.
4. Gamma Parameter (for RBF Kernel):

Controls how far the influence of a single training example reaches.
Small Gamma: Each training point influences predictions over a wide region, leading to a smoother model.
Large Gamma: Each training point influences only its immediate neighborhood, leading to a more complex model that may overfit.
When to use:
Small Gamma: Use when you expect the data to be less complex or when you want to avoid overfitting.
Large Gamma: Use when you expect the data to be complex and want to capture intricate patterns.
'''
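In practice, the four choices discussed in Q4 are usually tuned together rather than one at a time. The sketch below runs a small cross-validated grid search over an SVR; the synthetic data, the grid values, and the use of a scaling pipeline are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed data: noisy sine curve)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Scale features before SVR; make_pipeline names the SVR step "svr"
pipeline = make_pipeline(StandardScaler(), SVR())

# Illustrative grid; gamma is ignored by the linear kernel
param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [0.1, 1.0, 10.0],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": ["scale", 0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="neg_mean_squared_error")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated MSE:", -search.best_score_)
```

Because the target here is a sine curve, the search is expected to prefer the RBF kernel over the linear one; on real data the preferred combination depends on the problem.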

In [None]:
'''
Q5. Assignment
Let's walk through the assignment step by step:

1. Import the Necessary Libraries and Load the Dataset:
'''
import joblib
import pandas as pd
from sklearn import datasets
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load dataset (Example: Iris dataset)
iris = datasets.load_iris()
X = iris.data
y = iris.target

"2. Split the Dataset into Training and Testing Sets:"
# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

"3. Preprocess the Data (e.g., Scaling):"
