## 1

In [None]:
'''
Polynomial functions and kernel functions are both used in machine learning to map data into
a higher-dimensional space where it is easier to classify or analyze. Polynomial feature maps
can be seen as a special case: the polynomial kernel computes the inner product of two points
after such a map, without building the expanded features explicitly.
Kernel functions are often used so that data that is not linearly separable in the input space
becomes linearly separable after the transformation, which makes it easier to find a linear
boundary between the classes. One common kernel is the radial basis function (RBF) kernel,
defined as K(x, y) = exp(-gamma ||x - y||^2), where gamma is a hyperparameter.
Polynomial functions can also be applied directly as an explicit feature transformation, but
this is less common than using a kernel because the number of features grows quickly with the degree.
The polynomial kernel is the kernel built on polynomial functions. It is defined as
K(x, y) = (gamma x^T y + r)^d, where gamma, r, and d are hyperparameters and x^T y is the
dot product of the vectors x and y.
'''
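The two kernel formulas above can be checked numerically. The sketch below is only an illustration: the random inputs and the values chosen for gamma, r, and d are assumptions, not part of the original answer. It compares the formulas written out in NumPy with scikit-learn's `polynomial_kernel` and `rbf_kernel`, and shows for the degree-2 case that the kernel equals a dot product of explicit polynomial features.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

# Illustrative inputs and hyperparameters (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(5, 3))
gamma, r, d = 0.5, 1.0, 3

# Polynomial kernel: K(x, y) = (gamma * x^T y + r)^d
K_poly_manual = (gamma * X @ Y.T + r) ** d
K_poly_sklearn = polynomial_kernel(X, Y, degree=d, gamma=gamma, coef0=r)

# RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)
sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
K_rbf_manual = np.exp(-gamma * sq_dists)
K_rbf_sklearn = rbf_kernel(X, Y, gamma=gamma)

poly_matches = np.allclose(K_poly_manual, K_poly_sklearn)
rbf_matches = np.allclose(K_rbf_manual, K_rbf_sklearn)
print(poly_matches, rbf_matches)


def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (homogeneous case, r = 0)."""
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])


a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])
kernel_value = (a @ b) ** 2        # kernel evaluated in the input space
feature_value = phi(a) @ phi(b)    # dot product in the expanded feature space
print(kernel_value, feature_value)
```

The last two numbers agree, which is the sense in which a kernel "transforms" the data: it returns the inner product in the expanded space without ever computing `phi`.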

## 2

In [None]:
from sklearn.svm import SVC
svm_model = SVC(kernel='poly', degree=3, gamma='scale')  # polynomial kernel of degree 3

## 3

In [None]:
'''
In SVR, support vectors are the training examples that lie on the boundary of the
epsilon-insensitive zone or outside it. Examples strictly inside the zone incur no loss and
do not affect the fitted function, so only the support vectors determine the regression
function and contribute to the prediction.
When the value of epsilon is increased, the epsilon-insensitive zone becomes wider, so more
training examples fall inside it and fewer remain on or outside its boundary.
As a result, increasing epsilon generally leads to a decrease in the number of support vectors.
'''
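A quick way to check how epsilon affects the number of support vectors is to fit SVR with several epsilon values and count them. The sketch below uses a synthetic noisy sine curve; the data, the noise level, and the epsilon values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed for illustration): noisy sine curve.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_support = {}
for eps in [0.01, 0.1, 0.5, 1.0]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points kept as support vectors
    n_support[eps] = len(model.support_)
    print(f"epsilon={eps}: {n_support[eps]} support vectors")
```

Only the count is compared here. The exact numbers depend on the data, the kernel, and C.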

## 4

In [None]:
'''
Kernel function: The choice of kernel function determines how the input data is mapped to a 
higher-dimensional space to enable linear separation. Common kernel functions include linear, 
polynomial, radial basis function (RBF), and sigmoid. Each kernel has its own strengths and 
weaknesses, and the choice of kernel depends on the problem at hand. For example, the RBF kernel 
is often used for non-linear problems, while the linear kernel is useful for linearly separable data.

C parameter: The C parameter determines the trade-off between maximizing the margin and minimizing
the training error. A smaller value of C gives a wider margin but tolerates more margin violations
and misclassifications on the training data, while a larger value of C gives a narrower margin and
penalizes training errors more heavily. For example, if the dataset has noise or outliers, a smaller
C value may help to reduce the effect of these points.

Epsilon parameter: In SVR, the epsilon parameter controls the width of the epsilon-insensitive zone,
which is the region around the regression function where errors are not penalized. The larger the
value of epsilon, the wider the zone and the more tolerant the model is to small errors.
A smaller value of epsilon gives a stricter model that penalizes even small deviations.
A larger epsilon value can be useful for noisy datasets, while a smaller value may
be appropriate for datasets with less noise.

Gamma parameter: The gamma parameter determines the width of the RBF kernel and controls the
influence of each training example on the model. A smaller gamma value means that each training
example has a wider influence, while a larger gamma value means that each example has a narrower
influence. If gamma is too large, the model may overfit the training data, while if gamma is
too small, the model may underfit. A larger gamma value is often used for datasets with
a high degree of non-linearity, while a smaller gamma value may be more appropriate when the
decision boundary is close to linear.
'''
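The effect of gamma described above can be observed by comparing training and test accuracy for a very small, a moderate, and a very large gamma. This is a minimal sketch on the synthetic `make_moons` dataset; the dataset, the noise level, and the three gamma values are assumptions picked to make the contrast visible.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, non-linearly separable data (assumed for illustration).
X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

scores = {}
for gamma in [0.01, 1, 1000]:
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_tr, y_tr)
    # Store (training accuracy, test accuracy) for each gamma
    scores[gamma] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
    print(f"gamma={gamma}: train={scores[gamma][0]:.3f}, test={scores[gamma][1]:.3f}")
```

A large gap between training and test accuracy at a large gamma is the usual sign of overfitting. Low accuracy on both sets at a small gamma points to underfitting.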

## 5

In [2]:
#Import the necessary libraries and load the dataset:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score,f1_score
import joblib

In [3]:
iris = load_iris()
x, y = iris.data, iris.target

In [None]:
#Split the dataset into training and testing sets:

In [8]:
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

In [None]:
#Preprocess the data using StandardScaler:

In [9]:
scalerobj = StandardScaler()
# Fit the scaler on the training data only, then reuse its statistics for the test data
X_train = scalerobj.fit_transform(X_train)
X_test = scalerobj.transform(X_test)

In [None]:
#Create an instance of the SVC classifier and train it on the training data:

In [10]:
clf = SVC()
clf.fit(X_train, y_train)

In [None]:
#Use the trained classifier to predict the labels of the testing data:

In [11]:
y_pred = clf.predict(X_test)

In [None]:
#Evaluate the performance of the classifier using accuracy and F1-score:

In [12]:
accuracy = accuracy_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred, average='weighted')
print(f"Accuracy: {accuracy}")
print(f"F1-score: {f1}")

Accuracy: 1.0
F1-score: 1.0
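With `test_size=0.2`, the test set holds only 30 of the 150 iris samples, so a single perfect score says little about how stable the result is. One possible cross-check, sketched below, is 5-fold cross-validation over the whole dataset. Wrapping the scaler and classifier with `make_pipeline` is an assumption added here so that scaling is refit inside each fold; it is not part of the original cells.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Scaling lives inside the pipeline, so each fold is scaled using its own training part only.
model = make_pipeline(StandardScaler(), SVC())
cv_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

print("Fold accuracies:", cv_scores)
print("Mean accuracy:", cv_scores.mean())
```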


In [None]:
#Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance:

In [13]:
from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto', 0.1, 1, 10],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)


In [14]:
grid_search.best_params_

{'C': 1, 'gamma': 0.1, 'kernel': 'sigmoid'}
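Rather than retyping the best parameters by hand, the fitted search object can be used directly: with the default `refit=True`, `GridSearchCV` refits the best configuration on all the data passed to `fit` and exposes it as `best_estimator_`. The sketch below is one possible variant, not the notebook's method. It puts the scaler in a `Pipeline` so that cross-validation folds are scaled independently, and it splits the grid into two dictionaries because `gamma` has no effect on the linear kernel. The step names `scaler` and `svc` are arbitrary labels chosen for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Step names ("scaler", "svc") are arbitrary; they prefix the parameter names below.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

param_grid = [
    # gamma is ignored by the linear kernel, so it is left out of this sub-grid
    {"svc__kernel": ["linear"], "svc__C": [0.1, 1, 10, 100]},
    {
        "svc__kernel": ["poly", "rbf", "sigmoid"],
        "svc__C": [0.1, 1, 10, 100],
        "svc__gamma": ["scale", "auto", 0.1, 1, 10],
    },
]

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
tuned_model = search.best_estimator_  # already refit on the full training split
test_accuracy = tuned_model.score(X_test, y_test)
print(test_accuracy)
```

The selected parameters may differ from the ones shown above, since the folds here are scaled separately and the grid is organized differently.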

In [None]:
# Train the tuned classifier with the best parameters found (fit on the training set):

In [15]:
clf_tuned = SVC(C=1, gamma=0.1, kernel='sigmoid')

In [16]:
clf_tuned.fit(X_train,y_train)

In [17]:
# Predict with the tuned model (clf_tuned), not the untuned clf
y_pred_tuned = clf_tuned.predict(X_test)

In [18]:
accuracy = accuracy_score(y_test, y_pred_tuned)
f1 = f1_score(y_test, y_pred_tuned, average='weighted')
print(f"Accuracy: {accuracy}")
print(f"F1-score: {f1}")

Accuracy: 1.0
F1-score: 1.0


In [None]:
#Save the trained classifier to a file for future use.

In [19]:
joblib.dump(clf_tuned, 'tuned_SVC.joblib')

['tuned_SVC.joblib']
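A saved classifier expects inputs scaled the same way as its training data, so the fitted scaler has to be available wherever the model is loaded. One way to handle this, sketched here as an assumption and not as the notebook's approach, is to save the scaler and classifier together as a single pipeline and restore them with `joblib.load`. The file name `tuned_SVC_pipeline.joblib` is hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaler and classifier are bundled so they are saved and restored together.
model = make_pipeline(StandardScaler(), SVC(C=1, gamma=0.1, kernel="sigmoid"))
model.fit(X_train, y_train)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tuned_SVC_pipeline.joblib")  # hypothetical file name
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored pipeline takes raw (unscaled) features and reproduces the predictions.
same_predictions = np.array_equal(model.predict(X_test), restored.predict(X_test))
print(same_predictions)
```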