In [None]:
Q1. What is the relationship between polynomial functions and kernel functions in machine learning 
algorithms?

In [None]:
Answer : Polynomial functions and kernel functions are both mathematical tools used in machine learning, particularly in the
context of support vector machines (SVMs) and kernelized methods.

Polynomial Functions: Polynomial functions are mathematical functions of the form f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_0, where
a_n, a_(n-1), ..., a_0 are coefficients and n is a non-negative integer. In the context of SVMs, polynomial kernels implicitly map
input features into a higher-dimensional space, where it can be easier to find a hyperplane that separates the different classes.

Kernel Functions: Kernel functions are functions that compute the dot product between two transformed feature vectors in a
higher-dimensional space, without explicitly computing the transformation. The kernel trick is a method used in machine learning
where the dot product in the higher-dimensional space is replaced by a kernel function. This allows algorithms like SVMs to
operate implicitly in a higher-dimensional space without ever computing the transformed features.

Relationship:
Polynomial kernels in SVMs are a specific type of kernel function. The polynomial kernel can be expressed as K(x, y) = (x⋅y + c)^d,
where d is the degree of the polynomial and c is a constant. Scikit-learn parameterizes it as (gamma * x⋅y + coef0)^degree.

- So, polynomial kernels are a type of kernel function used in SVMs to achieve a similar effect as explicitly transforming 
  features using polynomial functions.
- In summary, the relationship is that polynomial kernels are a specific instance of kernel functions, and they leverage the concept
  of polynomial transformations to work in higher-dimensional spaces.
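
The equivalence between the polynomial kernel and an explicit polynomial transformation can be checked numerically. The sketch below assumes 2-D inputs, d = 2 and c = 1; the helper names `poly_kernel` and `phi` are illustrative and not part of any library.

```python
import math


def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + c) ** d


def phi(x, c=1.0):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = x
    return [
        x1 * x1,
        x2 * x2,
        math.sqrt(2) * x1 * x2,
        math.sqrt(2 * c) * x1,
        math.sqrt(2 * c) * x2,
        c,
    ]


x = [1.0, 2.0]
y = [3.0, -1.0]

# Kernel evaluated directly in the original 2-D space.
kernel_value = poly_kernel(x, y)

# Ordinary dot product after mapping both points into the 6-D feature space.
explicit_value = sum(a * b for a, b in zip(phi(x), phi(y)))

print(kernel_value, explicit_value)
```

Both values agree up to floating-point rounding, although the kernel never builds the 6-dimensional vectors.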

In [None]:
Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [3]:
# Answer : 
# To implement a Support Vector Machine (SVM) with a polynomial kernel in Python using Scikit-learn, you can use the
# SVC (Support Vector Classification) class. Here's an example code snippet :

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

dataset = load_iris()

X = pd.DataFrame(dataset.data, columns = dataset.feature_names)

In [6]:
Y = dataset.target

In [8]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X,Y, test_size=0.25, random_state = 42)

In [13]:
from sklearn.svm import SVC

In [15]:
classifier = SVC(kernel = 'poly')

In [16]:
classifier.fit(x_train, y_train)

In [17]:
y_pred = classifier.predict(x_test)

In [19]:
from sklearn.metrics import accuracy_score, classification_report , confusion_matrix

In [20]:
print(accuracy_score(y_test,y_pred))

0.9736842105263158


In [21]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       1.00      0.91      0.95        11
           2       0.92      1.00      0.96        12

    accuracy                           0.97        38
   macro avg       0.97      0.97      0.97        38
weighted avg       0.98      0.97      0.97        38



In [22]:
print(confusion_matrix(y_test, y_pred))

[[15  0  0]
 [ 0 10  1]
 [ 0  0 12]]


In [None]:
In this example, we use the Iris dataset, split it into training and testing sets, and then train an SVM classifier with a
polynomial kernel using the SVC class. The kernel parameter is set to 'poly'. The degree of the polynomial is controlled by the
degree parameter, which is left at its default of 3 here. The features are not standardized in the code above, although scaling
is generally recommended for SVMs.
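
A variant that standardizes the features and sets the degree explicitly could look like the sketch below. It assumes the same Iris split as above; the values `degree=3` and `coef0=1.0` are illustrative choices, not tuned settings.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# The pipeline fits the scaler on the training split only and reuses
# the learned mean and variance when predicting on the test split.
poly_svm = make_pipeline(
    StandardScaler(),
    SVC(kernel="poly", degree=3, coef0=1.0),  # illustrative hyperparameters
)
poly_svm.fit(X_train, y_train)

y_pred = poly_svm.predict(X_test)
scaled_accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy with scaling: {scaled_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline keeps test-set statistics out of the preprocessing step.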

In [None]:
Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In [None]:
Answer : In Support Vector Regression (SVR), the epsilon-insensitive loss function is used to determine the error tolerance 
around the predicted values. The parameter associated with this error tolerance is commonly denoted as epsilon and is a crucial
parameter in SVR.

The epsilon parameter controls the width of the margin (the epsilon-tube) around the predicted values within which no penalty is
incurred. Instances that fall outside this margin contribute to the loss function. In SVR, the support vectors are the instances
that lie on the boundary of this margin or outside it; instances strictly inside it do not influence the fitted function.

Here's how the relationship between epsilon and the number of support vectors typically works:
Smaller Epsilon: A smaller value of epsilon narrows the acceptable error margin, so only instances very close to the predicted
values incur no penalty. More instances fall on or outside this narrower margin, and the number of support vectors tends to
increase.

Larger Epsilon: Conversely, a larger value of epsilon widens the margin and tolerates larger errors. Fewer instances fall on or
outside it, so increasing epsilon generally reduces the number of support vectors and gives a sparser, flatter model.
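
This trend can be observed by counting support vectors for several epsilon values. The sketch below assumes a synthetic noisy sine curve and an RBF kernel; the data and the epsilon values are illustrative, and the exact counts depend on them.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors.
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
```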

In [None]:
Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter 
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works 
and provide examples of when you might want to increase or decrease its value?

In [None]:
Answer :
Support Vector Regression (SVR) is a type of regression algorithm that uses support vector machines to model the relationship 
between input features and the target variable. The choice of kernel function, C parameter, epsilon parameter, and gamma parameter 
can significantly impact the performance of SVR. Let's discuss each parameter and its effects:

Kernel Function:
The kernel function determines the shape of the regression function that the SVR can fit. Common kernel functions include linear,
polynomial, radial basis function (RBF), and sigmoid.
  The choice of the kernel depends on the nature of the data. For example:
  - Use a linear kernel when the relationship between inputs and outputs is expected to be linear.
  - RBF kernel is suitable for capturing non-linear relationships.
  - Polynomial kernel is useful when the relationship has a polynomial form.
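
One way to compare kernels is to fit the same data with each and look at the held-out score. The sketch below assumes a synthetic noisy sine curve and default hyperparameters, so the ranking it prints applies to this toy problem only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic non-linear target: a sine curve with Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

kernel_scores = {}
for kernel in ("linear", "poly", "rbf"):
    model = make_pipeline(StandardScaler(), SVR(kernel=kernel))
    model.fit(X_train, y_train)
    # score() returns the R^2 coefficient on the held-out split.
    kernel_scores[kernel] = model.score(X_test, y_test)
    print(f"{kernel}: R^2 = {kernel_scores[kernel]:.3f}")
```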

C Parameter:
- The C parameter controls the trade-off between keeping the regression function flat (simple) and penalizing training points whose error exceeds epsilon.
- A small C tolerates more of these deviations, giving a smoother function that may underfit.
- A large C penalizes deviations heavily, so the model fits the training data closely, potentially leading to overfitting.
- Increase C when the model underfits; decrease it when the model overfits or the targets are noisy.
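
The effect of C on how closely the model follows the training data can be sketched as follows, assuming a synthetic noisy sine curve and an RBF kernel; the C values are illustrative.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 2 * np.pi, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

train_mse = {}
for C in (0.01, 1.0, 100.0):
    model = SVR(kernel="rbf", C=C, epsilon=0.1).fit(X, y)
    # Training error only: a low value here says nothing about generalization.
    train_mse[C] = mean_squared_error(y, model.predict(X))
    print(f"C={C}: training MSE = {train_mse[C]:.4f}")
```

A low training error at large C should be checked against a held-out set before concluding that the larger value is better.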

Epsilon Parameter (ε):
- Epsilon defines the margin of tolerance where no penalty is given to errors.
- It is the defining parameter of the epsilon-insensitive loss function used by SVR.
- A smaller epsilon requires the model to fit the data more precisely, potentially leading to overfitting.
- A larger epsilon allows for more errors within the margin, resulting in a smoother and simpler model.
- Adjust epsilon based on the level of noise and the desired tolerance for errors.

Gamma Parameter:
- Gamma applies to the RBF, polynomial, and sigmoid kernels. For the RBF kernel it sets how far the influence of a single training example reaches.
- Small gamma values give each example a wide reach and a smoother, more generalized fit, while large gamma values make the fit depend closely on nearby training points.
- High gamma can lead to overfitting, especially when the number of training samples is small.
- Low gamma can lead to underfitting and oversimplification of the model.
- Experiment with gamma values based on the complexity of the problem and the amount of available data.
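
The gap between training and test scores at different gamma values can be sketched as below. The example assumes a synthetic noisy sine curve and an RBF kernel with a fixed, fairly large C; the three gamma values are deliberately extreme and purely illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic 1-D regression problem: a sine curve with Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(80, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for gamma in (0.001, 1.0, 10000.0):
    model = SVR(kernel="rbf", C=100.0, epsilon=0.1, gamma=gamma)
    model.fit(X_train, y_train)
    # Store (training R^2, test R^2) for each gamma.
    scores[gamma] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"gamma={gamma}: train R^2 = {scores[gamma][0]:.3f}, "
          f"test R^2 = {scores[gamma][1]:.3f}")
```

A large gap between the two scores at high gamma is the overfitting pattern described above.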

In [None]:
Q5. Assignment:
- Import the necessary libraries and load the dataset.
- Split the dataset into training and testing sets.
- Preprocess the data using any technique of your choice (e.g. scaling, normalization).
- Create an instance of the SVC classifier and train it on the training data.
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, 
  precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to 
  improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use.

In [63]:
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report , confusion_matrix

In [64]:
dataset = load_wine()
X = pd.DataFrame(data = dataset.data, columns = dataset.feature_names)
Y = dataset.target
x_train, x_test, y_train, y_test = train_test_split(X,Y, test_size = 0.25, random_state = 42)

In [65]:
# Baseline: SVC with default settings (RBF kernel) trained on the unscaled features
classifier = SVC()
classifier.fit(x_train, y_train)

In [66]:
y_pred = classifier.predict(x_test)

In [67]:
print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test,y_pred))

0.7111111111111111
[[15  0  0]
 [ 0 13  5]
 [ 0  8  4]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        15
           1       0.62      0.72      0.67        18
           2       0.44      0.33      0.38        12

    accuracy                           0.71        45
   macro avg       0.69      0.69      0.68        45
weighted avg       0.70      0.71      0.70        45
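
The baseline above does not yet cover the preprocessing, tuning, retraining, and saving steps of the assignment. A minimal sketch of those remaining steps follows, assuming the same wine split. The parameter grid and the file name `svc_wine.joblib` are illustrative assumptions, and the model is written to the system temporary directory only to keep the example self-contained.

```python
import os
import tempfile

import joblib
from sklearn.base import clone
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Preprocessing and classifier in one pipeline, so that scaling is
# re-fitted inside every cross-validation fold.
pipeline = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Illustrative grid; the values are assumptions, not recommendations.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

tuned_accuracy = accuracy_score(y_test, search.predict(X_test))
print("Best parameters:", search.best_params_)
print(f"Tuned test accuracy: {tuned_accuracy:.3f}")

# Retrain an unfitted copy of the best pipeline on the entire dataset.
final_model = clone(search.best_estimator_).fit(X, y)

# Save the trained pipeline and load it back to confirm the round trip.
model_path = os.path.join(tempfile.gettempdir(), "svc_wine.joblib")
joblib.dump(final_model, model_path)
restored = joblib.load(model_path)
```

Saving the whole pipeline rather than the bare classifier means the scaler travels with the model, so new data can be passed to `restored.predict` without separate preprocessing.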

