# **Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?**

Polynomial functions are often used as kernel functions in machine learning algorithms, particularly in Support Vector Machines (SVMs). A kernel function returns the dot product of two input vectors as if they had been mapped into a higher-dimensional feature space, without ever computing that mapping explicitly. The polynomial kernel does this by applying a polynomial to the ordinary dot product: `K(x, y) = (gamma * <x, y> + coef0) ** degree`. Because the data may be linearly separable in the higher-dimensional space even when it is not in the original one, a hyperplane found there corresponds to a nonlinear decision boundary in the input space.
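The following is a minimal sketch of that idea for the degree-2 case, assuming 2-D inputs with `gamma = 1` and `coef0 = 1`. The helper names `poly_kernel` and `explicit_map` are illustrative, not part of any library. It checks numerically that the kernel value matches the dot product of the explicitly mapped vectors.

```python
import numpy as np

def poly_kernel(u, v, gamma=1.0, coef0=1.0, degree=2):
    """Polynomial kernel K(u, v) = (gamma * <u, v> + coef0) ** degree."""
    return (gamma * np.dot(u, v) + coef0) ** degree

def explicit_map(u):
    """Explicit degree-2 feature map for a 2-D input (assumes gamma=1, coef0=1)."""
    u1, u2 = u
    s = np.sqrt(2.0)
    return np.array([1.0, s * u1, s * u2, u1 ** 2, s * u1 * u2, u2 ** 2])

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

# The kernel works in the 2-D input space ...
kernel_value = poly_kernel(u, v)
# ... while the explicit route first maps both vectors into 6 dimensions.
explicit_value = np.dot(explicit_map(u), explicit_map(v))

print(kernel_value, explicit_value)
```

The two numbers agree up to floating-point rounding, which is why the SVM never needs to build the 6-dimensional vectors itself.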

# **Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?**


In [7]:
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


# load the dataset
dataset = load_iris()
x = dataset.data
y = dataset.target

# split the dataset into training and testing sets
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# scale the features (fit the scaler on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# create an instance of SVC with a polynomial kernel
svm_classifier = SVC(kernel='poly', degree=3)  # degree is the degree of the polynomial

# train the classifier on the training data
svm_classifier.fit(X_train_scaled, Y_train)

# use the trained classifier to predict the labels of the testing data
y_pred = svm_classifier.predict(X_test_scaled)

# evaluate the performance of the classifier
accuracy = accuracy_score(Y_test, y_pred)
print(accuracy)

0.9666666666666667
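A single train/test split gives only one estimate of accuracy. As a hedged sketch, the polynomial degree could be compared with 5-fold cross-validation instead. The degrees tried here (2, 3, 4) and the use of a pipeline are assumptions for illustration, not part of the original cell.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

mean_scores = {}
for degree in (2, 3, 4):
    # The pipeline refits the scaler inside each fold, so no test-fold
    # statistics leak into the scaling step.
    model = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=degree))
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[degree] = scores.mean()
    print(f"degree={degree}: mean CV accuracy = {scores.mean():.3f}")
```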


# **Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?**


Increasing epsilon in Support Vector Regression (SVR) widens the tube of tolerance around the predicted values: errors smaller than epsilon incur no penalty. Consequently, more training points fall inside the tube and fewer are selected as support vectors, because only points on or outside the tube boundary become support vectors. Conversely, decreasing epsilon narrows the tube, which typically leads to more support vectors being selected.
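This effect can be observed with a small experiment. The sketch below assumes a synthetic noisy sine curve and three arbitrary epsilon values. It fits an RBF-kernel SVR for each value and counts the support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic data (an assumption for illustration): noisy sine curve.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

support_counts = {}
for epsilon in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=epsilon).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    support_counts[epsilon] = len(model.support_)
    print(f"epsilon={epsilon}: {support_counts[epsilon]} support vectors")
```

With a tube much narrower than the noise level, most points end up as support vectors. With a wide tube, only a few remain.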

# **Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?**


* Kernel function: The choice of kernel function (e.g., linear, polynomial, radial basis function) determines how the input space is transformed. Different kernel functions may capture different types of relationships in the data.

* C parameter: This parameter trades off the flatness of the hyperplane against the amount of training error allowed. Increasing C allows the model to fit the training data more closely, potentially leading to overfitting, while decreasing it encourages a wider margin and may improve generalization.

* Epsilon parameter: Epsilon determines the width of the margin around the predicted values within which no penalty is associated with errors. Increasing epsilon increases the margin of tolerance, potentially reducing the number of support vectors.

* Gamma parameter: This parameter defines how far the influence of a single training example reaches, with low values meaning far and high values meaning close. High gamma values can lead to overfitting, while low values can lead to underfitting.

Examples of when you might want to increase or decrease each parameter:

* Kernel function: Start with a linear kernel when the relationship looks roughly linear, and try polynomial or RBF kernels when the data is clearly nonlinear.
* C parameter: Increase C when the training data is fairly clean and the model underfits; decrease it when the data is noisy or you want a smoother, flatter fitted function.
* Epsilon parameter: Increase epsilon when you have noisy data or want a wider margin of tolerance; decrease it when you want to enforce a stricter tolerance for errors.
* Gamma parameter: Increase gamma when the model underfits data with intricate local structure; decrease it when the model overfits or you want a simpler, smoother model.
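In practice these parameters interact, so they are usually tuned together rather than one at a time. The sketch below shows one way to do that with `GridSearchCV` on an RBF-kernel `SVR`. The synthetic data and the grid values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic regression data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(120, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=120)

# Candidate values for each parameter discussed above.
param_grid = {
    "C": [0.1, 1, 10],
    "epsilon": [0.01, 0.1, 0.5],
    "gamma": [0.1, 1, 10],
}

search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",  # higher (closer to 0) is better
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best CV score (negative MSE):", search.best_score_)
```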

# **Q5. Assignment:**
* Import the necessary libraries and load the dataset
* Split the dataset into training and testing sets
* Preprocess the data using any technique of your choice (e.g. scaling, normalization)
* Create an instance of the SVC classifier and train it on the training data
* Use the trained classifier to predict the labels of the testing data
* Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
* Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
* Train the tuned classifier on the entire dataset
* Save the trained classifier to a file for future use.

In [11]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import pickle

# load the data
data = load_iris()
x = data.data
y = data.target

# split the data into train test data 
X_train,X_test,Y_train,Y_test = train_test_split(x,y,test_size=0.2,random_state=42)

# preprocess the data with StandardScaler (fit on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# define hyperparameters to tune
# (degree is only used by the 'poly' kernel; gamma is ignored by 'linear')
param_grid ={
    'C': [0.1,1,10],
    'kernel' : ['linear','poly','rbf'],
    'gamma' : [0.1,1,10],
    'degree' : [2,3,4],
}

# create the SVC classifier and search the grid with 5-fold cross-validation
svm_classifier = SVC()
grid_search = GridSearchCV(svm_classifier, param_grid=param_grid, cv=5)
grid_search.fit(X_train_scaled, Y_train)

# get the best hyperparameters
best_params = grid_search.best_params_
print('Best hyperparameters:', best_params)

# retrain the tuned classifier on the scaled training set
# (the test set stays held out so the evaluation below remains unbiased)
best_svm_classifier = SVC(**best_params)
best_svm_classifier.fit(X_train_scaled, Y_train)

# save the trained classifier for future use
with open('svm_classifier.pkl', 'wb') as f:
    pickle.dump(best_svm_classifier, f)

# evaluate the performance of the classifier
y_pred = best_svm_classifier.predict(X_test_scaled)
accuracy = accuracy_score(Y_test, y_pred)
print("Accuracy", accuracy)


Best hyperparameters: {'C': 10, 'degree': 2, 'gamma': 0.1, 'kernel': 'linear'}
Accuracy 0.9666666666666667
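Note that the pickle file above stores only the classifier, so the fitted scaler would also be needed to make predictions on new, unscaled data. One possible approach, sketched here as an assumption rather than as part of the original solution, is to bundle the scaler and the classifier in a single pipeline and save that instead. The file name `svm_pipeline.pkl` and the temporary directory are hypothetical, and the hyperparameters are copied from the grid-search output above.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scaler and classifier travel together, so raw features can be passed in later.
pipeline = make_pipeline(StandardScaler(), SVC(C=10, kernel="linear"))
pipeline.fit(X_train, y_train)

# Hypothetical location: a temporary directory, to avoid overwriting real files.
model_path = os.path.join(tempfile.mkdtemp(), "svm_pipeline.pkl")
with open(model_path, "wb") as f:
    pickle.dump(pipeline, f)

# Later (for example in another session): load the pipeline and predict directly.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

same_predictions = np.array_equal(pipeline.predict(X_test), restored.predict(X_test))
print("restored model reproduces predictions:", same_predictions)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.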
