In [None]:
# Q1. What is the relationship between polynomial functions and kernel functions in machine learning
# algorithms?
"""Polynomial functions give rise to one particular family of kernel functions: the polynomial kernel, which is commonly used in
support vector machines (SVMs).
In an SVM, the goal is to find the hyperplane that separates the data points of different classes with the maximum margin. In some
cases, however, the data points are not linearly separable in the original input space. To handle this, an SVM can use a kernel
function that implicitly maps the inputs into a higher-dimensional feature space, where the data points may be linearly separable.
A kernel function takes two data points and returns a similarity score equal to the dot product of their images in the feature
space, without ever computing the transformation explicitly. This is known as the kernel trick, and it lets an SVM work
efficiently in high-dimensional feature spaces.
The polynomial kernel is K(x, z) = (gamma * <x, z> + coef0) ** degree. With coef0 > 0 it corresponds to a feature space made up of
all monomials of the input features up to the given degree, so an SVM with a polynomial kernel fits a polynomial decision boundary
in the original input space.
"""
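
# The kernel trick described above can be checked numerically. This is a minimal sketch, assuming a 2-D input and the
# degree-2 polynomial kernel (<x, z> + 1) ** 2; the helper phi and the sample points x and z are illustrative and not
# part of the original notebook.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(v):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = v
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, s * x1 * x2, x2 ** 2])


# Two arbitrary sample points (assumed values, chosen only for the demonstration)
x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Kernel evaluated directly in the 2-D input space: (<x, z> + 1) ** 2
k_direct = (x @ z + 1.0) ** 2

# Same quantity via the explicit 6-D feature map followed by a dot product
k_explicit = phi(x) @ phi(z)

# scikit-learn's polynomial kernel with matching parameters
k_sklearn = polynomial_kernel(
    x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(k_direct, k_explicit, k_sklearn)
```

# All three values agree, which is why the SVM never needs to build the 6-D vectors itself.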

In [1]:
# Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the iris dataset (150 samples, 4 features, 3 classes)
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Hold out 30% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# SVM classifier with a degree-3 polynomial kernel
svm_poly = SVC(kernel='poly', degree=3)
svm_poly.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = svm_poly.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)


Accuracy: 0.9777777777777777


In [None]:
# Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?
"""In epsilon-Support Vector Regression (SVR), epsilon sets the half-width of the tube around the regression function inside which
errors are not penalized. The larger the value of epsilon, the wider the tube and the more training points fall inside it. Points
that lie strictly inside the tube are not support vectors, so increasing epsilon generally reduces the number of support vectors
and yields a sparser model. The decrease is not guaranteed to be strict at every step, because the fitted function itself changes
with epsilon, and a very large epsilon can underfit the data."""
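
# The effect of epsilon on sparsity can be observed directly. This is a minimal sketch, assuming a synthetic noisy sine
# curve as the dataset; the names X_demo, y_demo and sv_counts, the epsilon values and the other settings are
# illustrative choices, not taken from the original notebook.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise (assumed data)
rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(0.0, 5.0, size=200)).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.1, size=200)

# Fit one SVR per epsilon and record how many training points become support vectors
sv_counts = {}
for eps in [0.01, 0.1, 0.3, 0.5]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X_demo, y_demo)
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
```

# On data like this, the narrowest tube keeps far more support vectors than the widest one.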

In [None]:
# Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
# affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
# and provide examples of when you might want to increase or decrease its value?
"""Kernel Function: The kernel function implicitly maps the input data to a higher-dimensional feature space in which a linear
regression function can fit the data more easily. Common kernel functions used in SVR include linear, polynomial, radial basis
function (RBF), and sigmoid. The right choice depends on the nature of the data.
For example, if the relationship between the input and output variables is nonlinear, a nonlinear kernel such as
RBF or polynomial may improve the performance of SVR, while a linear kernel is often sufficient and faster for roughly linear data.

C Parameter: The C parameter controls the trade-off between the flatness (simplicity) of the regression function and the penalty
on training errors larger than epsilon. A smaller value of C regularizes more strongly and tolerates larger deviations, while a
larger value of C penalizes deviations more heavily and fits the training data more closely. Increasing the value of C may
reduce the training error of the SVR model, but it may also increase overfitting. If the data has a large number of noisy
examples or outliers, a smaller value of C may be appropriate so that the model is more tolerant of errors.

Epsilon Parameter: The epsilon parameter sets the half-width of the tube around the regression function within which no penalty
is given for errors. A larger value of epsilon places more training examples inside the tube, which results in fewer support
vectors and a simpler model. If the targets are noisy, a larger value of epsilon (roughly the noise level) keeps the model from
chasing the noise; if high precision is required, a smaller value of epsilon is appropriate.

Gamma Parameter: For the RBF, polynomial, and sigmoid kernels, the gamma parameter controls how far the influence of a single
training example reaches, and therefore how smooth the fitted regression function is. A smaller value of gamma creates a smoother
function, while a larger value of gamma creates a more complex and wiggly one. Increasing the value of gamma may reduce the
training error of the SVR model, but it may also increase overfitting. If the target varies rapidly over short distances in the
input space, a larger value of gamma may be appropriate; if the underlying relationship is smooth or the data is scarce, a
smaller value is usually safer.

"""
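
# The gamma trade-off described above can be illustrated by comparing training and test scores. This is a minimal sketch,
# assuming a synthetic noisy sine curve; the names X_syn, y_syn and scores, the gamma values and the fixed C and epsilon
# are illustrative choices, not taken from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise (assumed data)
rng = np.random.default_rng(42)
X_syn = rng.uniform(0.0, 5.0, size=(300, 1))
y_syn = np.sin(X_syn).ravel() + rng.normal(scale=0.1, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.3, random_state=42)

# Sweep gamma while holding C and epsilon fixed; store (train R^2, test R^2) per setting
scores = {}
for gamma in [1e-4, 1.0, 100.0]:
    svr = SVR(kernel="rbf", C=1.0, epsilon=0.1, gamma=gamma).fit(X_tr, y_tr)
    scores[gamma] = (svr.score(X_tr, y_tr), svr.score(X_te, y_te))
    print(f"gamma={gamma}: train R^2={scores[gamma][0]:.3f}, test R^2={scores[gamma][1]:.3f}")
```

# A very small gamma underfits (low training score). A growing gap between training and test score at large gamma would
# indicate overfitting. The same loop can be repeated for C or epsilon.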


In [2]:
# Import the necessary libraries and load the dataset:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import pickle

iris = load_iris()
X = iris.data
y = iris.target


In [3]:
# Split the dataset into training and testing set:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)


In [4]:
# Preprocess the data using StandardScaler:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


In [5]:
# Create an instance of the SVC classifier and train it on the training data:
svc = SVC()
svc.fit(X_train, y_train)


In [6]:
# Use the trained classifier to predict the labels of the testing data:
y_pred = svc.predict(X_test)


In [7]:
# Evaluate the performance of the classifier using accuracy:
acc = accuracy_score(y_test, y_pred)
print(f"Accuracy: {acc:.2f}")


Accuracy: 1.00


In [8]:
# Tune the hyperparameters of the SVC classifier using GridSearchCV:
from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.1, 1, 10, 100],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'gamma': ['scale', 'auto']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best score: {grid_search.best_score_:.2f}")


Best parameters: {'C': 10, 'gamma': 'scale', 'kernel': 'linear'}
Best score: 0.95


In [9]:
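# Train the tuned classifier on the entire dataset, using the best parameters found by the grid search.
# The features go through the fitted scaler so the final model sees the same preprocessing as during tuning.
svc_tuned = SVC(**grid_search.best_params_)
svc_tuned.fit(scaler.transform(X), y)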


In [10]:
# Save the trained classifier to a file for future use:
with open("svc_tuned.pickle", "wb") as f:
    pickle.dump(svc_tuned, f)
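
# A saved classifier is only useful if new inputs receive the same preprocessing it was trained with. This is a minimal
# sketch, assuming the scaler and classifier are bundled in a scikit-learn Pipeline so both are pickled together. The
# file name svc_pipeline.pickle and the variable names are hypothetical; the SVC parameters are the best ones reported
# by the grid search above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_iris(return_X_y=True)

# Bundle scaling and classification so that both are saved in one object
model = make_pipeline(StandardScaler(), SVC(C=10, kernel="linear", gamma="scale"))
model.fit(X_all, y_all)

# Round trip through a temporary file: save, reload, and compare predictions
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "svc_pipeline.pickle")  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

same_predictions = np.array_equal(model.predict(X_all), restored.predict(X_all))
print("Predictions match after reload:", same_predictions)
```

# The restored pipeline accepts raw, unscaled features, because the scaler travels with the classifier.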
