# 1.
## What is the relationship between polynomial functions and kernel functions in machine learning algorithms?
### ->Polynomial functions and kernel functions have a close relationship in machine learning algorithms, particularly in kernel methods such as Support Vector Machines (SVMs).

### ->A polynomial function is a mathematical function that consists of one or more terms, with each term being a constant multiplied by one or more variables raised to non-negative integer exponents. Polynomial functions can capture nonlinear relationships between input features.

### ->In machine learning, kernel functions are used to implicitly map the input features into a higher-dimensional feature space, where a linear model may be able to separate data that is not linearly separable in the original space. A kernel computes the dot product between two feature vectors in that higher-dimensional space without ever transforming the data explicitly (the "kernel trick"), which lets nonlinear relationships be captured in a computationally efficient manner.

### ->The two are related because some kernel functions correspond exactly to a polynomial feature map: evaluating the kernel on the original inputs gives the same result as expanding the inputs into polynomial terms (products of features up to a given degree) and taking an ordinary dot product in that expanded space.

### ->The polynomial kernel function is a type of kernel function that computes the dot product between two vectors as if they were mapped into a higher-dimensional space using a polynomial function. It enables SVMs to learn nonlinear decision boundaries by implicitly working with polynomial functions of the input features.
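### ->As a small illustration of this equivalence, the sketch below uses the simplest case, K(x, y) = (<x, y>)^2 on 2-D inputs. The helper `phi` and the sample vectors are assumptions introduced only for this example; they are not part of any library.

```python
import numpy as np


def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (illustrative helper)."""
    x1, x2 = v
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])


x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

# Kernel evaluated directly in the original 2-D space: (<x, y>)^2
kernel_value = np.dot(x, y) ** 2

# Same quantity computed as an ordinary dot product in the 3-D feature space
explicit_value = np.dot(phi(x), phi(y))

print(np.isclose(kernel_value, explicit_value))
```

### ->The kernel needs only one dot product in 2-D, while the explicit route has to build every polynomial term first. That gap widens quickly as the degree and the number of features grow.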

### ->The polynomial kernel function is defined as:
#### K(x, y) = (gamma * <x, y> + coef0)^degree
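### ->In this formula, `degree` is the polynomial degree, `gamma` scales the dot product, and `coef0` is a constant offset that controls how much the lower-order terms contribute. The sketch below evaluates the formula by hand and compares it with scikit-learn's `polynomial_kernel` helper. The random inputs and parameter values are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 3))  # 4 samples, 3 features
B = rng.normal(size=(5, 3))  # 5 samples, 3 features

gamma, coef0, degree = 0.5, 1.0, 3

# Direct evaluation of K(x, y) = (gamma * <x, y> + coef0)^degree for all pairs
K_manual = (gamma * (A @ B.T) + coef0) ** degree

# scikit-learn's pairwise helper with the same parameters
K_sklearn = polynomial_kernel(A, B, degree=degree, gamma=gamma, coef0=coef0)

print(K_manual.shape)
print(np.allclose(K_manual, K_sklearn))
```

### ->Note that `SVC(kernel='poly')` uses `degree=3`, `gamma='scale'`, and `coef0=0.0` by default, so the values above must be passed explicitly to reproduce them in a classifier.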

# 2.
## How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [1]:
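from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic binary classification data. random_state=None draws new data on
# every run, so the accuracy printed below will vary between runs.
X, y = make_classification(n_samples=1000, n_features=5, n_redundant=2, shuffle=True, random_state=None)

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# SVM with a degree-3 polynomial kernel
clf = SVC(kernel='poly', degree=3, random_state=42)
clf.fit(X_train, y_train)

# Accuracy on the held-out test set
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))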

0.885
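### ->A single train/test split gives only one noisy estimate. As a possible extension, the sketch below compares several polynomial degrees with 5-fold cross-validation. The fixed seed, the scaling step, the `coef0=1.0` setting, and the list of degrees are all assumptions added for this example and are not part of the cell above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Fixed seed (an assumption) so repeated runs use the same synthetic data
X, y = make_classification(n_samples=1000, n_features=5, n_redundant=2, random_state=0)

scores_by_degree = {}
for degree in (2, 3, 4):
    # Scaling inside the pipeline is refitted on each training fold
    model = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=degree, coef0=1.0))
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores_by_degree[degree] = scores.mean()
    print(f"degree={degree}: mean CV accuracy = {scores.mean():.3f}")
```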


# 3.
## How does increasing the value of epsilon affect the number of support vectors in SVR?
### ->In Support Vector Regression (SVR), the parameter epsilon (ε) determines the width of the epsilon-insensitive tube around the regression function. Errors smaller than ε are ignored by the loss, so ε trades off how precisely the model fits the data against how sparse the solution is (the number of support vectors).
### ->Support vectors are the training points that lie on the boundary of the epsilon-insensitive tube or outside it. Only these points have non-zero dual coefficients (α) in the solution of the optimization problem; points strictly inside the tube do not affect the fitted function.
### ->When you increase the value of epsilon (ε), the tube becomes wider. As a result, more training points fall strictly inside it, incur no loss, and stop being support vectors.
### ->In other words, increasing epsilon allows more data points to be within the acceptable margin of error. Consequently, the number of support vectors generally decreases as epsilon grows, giving a sparser but coarser model. Decreasing epsilon has the opposite effect.
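### ->Points strictly inside the tube have zero dual coefficients, so the support-vector count typically falls as epsilon grows. The sketch below checks this empirically on a noisy sine curve. The synthetic data, the RBF kernel, `C=1.0`, and the epsilon values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(5 * rng.random((200, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)  # noise std of 0.1

sv_counts = {}
for eps in (0.01, 0.1, 0.5, 1.0):
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points kept as support vectors
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} of {len(X)} points are support vectors")
```

### ->Exact counts depend on the data and on the scikit-learn version, so only the trend is meaningful here.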

# 4.
## How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?
### ->In Support Vector Regression (SVR), the choice of kernel function, C parameter, epsilon parameter, and gamma parameter can significantly impact the performance and behavior of the model. Let's explore each parameter and its effect:
### 1] Kernel function: The kernel function determines the type of non-linear mapping used to implicitly transform the input features into a higher-dimensional space. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid. Each kernel has different characteristics and is suitable for different types of data and relationships. Here are some considerations:
#### Linear kernel: Suitable when the target depends approximately linearly on the features, without complex patterns.
#### Polynomial kernel: Useful when data has non-linear patterns with different degrees of complexity. The degree parameter controls the degree of the polynomial.
#### RBF kernel: Effective when data has non-linear patterns and the degree of complexity is not known. The gamma parameter controls the width of the Gaussian function.
#### Sigmoid kernel: Appropriate for data with non-linear patterns resembling sigmoid functions.
### 2] C parameter (Regularization parameter): The C parameter determines the trade-off between model complexity and training error. In SVR it controls the penalty applied to training points whose prediction error exceeds epsilon, that is, points outside the tube. Considerations for the C parameter:
#### Smaller C: Tolerates more and larger deviations outside the tube and favors a flatter function, leading to a simpler model with potentially higher bias but lower variance. It often generalizes better on noisy data but may underfit.
#### Larger C: Penalizes deviations outside the tube heavily, resulting in a potentially more complex model that fits the training data more closely. It can lead to lower bias but higher variance, so it is worth trying when the model underfits and the data has little noise.
### 3] Epsilon parameter: The epsilon parameter (ε) defines the width of the epsilon-insensitive tube around the regression function in SVR. It determines the acceptable margin of error: points within the tube incur no loss. Considerations for the epsilon parameter:
#### Smaller epsilon: Constrains the allowable margin of error tightly. It results in a more sensitive model that tries to fit the data precisely. It might lead to overfitting if the noise level is high.
#### Larger epsilon: Allows a wider margin of error, resulting in a more lenient model that tolerates higher deviations. It may produce a more generalizable model that is less affected by noise, but an epsilon that is large relative to the variation in the target makes the model underfit.
### 4] Gamma parameter: The gamma parameter is the kernel coefficient of the RBF, polynomial, and sigmoid kernels (it has no effect on the linear kernel). For the RBF kernel it defines how far the influence of a single training example reaches, and so it affects the flexibility of the model. Considerations for the gamma parameter:
#### Smaller gamma: Gives each training example a broader range of influence. It tends to produce a smoother fitted function and can generalize better, but a value that is too small underfits.
#### Larger gamma: Gives each training example a narrower range of influence. It leads to a more complex and intricate fitted function, potentially overfitting the training data. Increase it only when a smoother model clearly underfits.
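### ->Because these parameters interact, they are usually tuned together instead of one at a time. The sketch below runs a small grid search over C, epsilon, and gamma for an RBF-kernel SVR. The synthetic data, the grid values, and the step name `svr` are assumptions made for illustration; they are starting points, not recommended settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = 5 * rng.random((200, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=200)

# Scaling lives inside the pipeline so it is refitted on each CV training fold
pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])

param_grid = {
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.1, 1, 10],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print(search.best_params_)
print(f"best CV MSE: {-search.best_score_:.4f}")
```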

# 5.
## Assignment:
####  Import the necessary libraries and load the dataset
####  Split the dataset into training and testing sets
####  Preprocess the data using any technique of your choice (e.g. scaling, normalization)
####  Create an instance of the SVC classifier and train it on the training data
####  Use the trained classifier to predict the labels of the testing data
####  Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
####  Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
####  Train the tuned classifier on the entire dataset
####  Save the trained classifier to a file for future use.

In [2]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the wine dataset (178 samples, 13 features, 3 classes)
wine = load_wine()
X = wine.data
y = wine.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Baseline: default SVC (RBF kernel) trained on the raw, unscaled features
svm = SVC()
svm.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = svm.predict(X_test)
print(accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

0.8055555555555556
[[14  0  0]
 [ 0 11  3]
 [ 0  4  4]]
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        14
           1       0.73      0.79      0.76        14
           2       0.57      0.50      0.53         8

    accuracy                           0.81        36
   macro avg       0.77      0.76      0.76        36
weighted avg       0.80      0.81      0.80        36
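### ->The assignment lists a preprocessing step that the baseline cell above does not perform. The wine features sit on very different scales (proline is in the hundreds, hue is close to 1), and an RBF kernel is sensitive to that. The sketch below is one possible way to add the step, assuming `StandardScaler` as the chosen technique and the same split as above.

```python
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# The scaler is fitted on the training split only, then reused on the test split
scaled_svm = make_pipeline(StandardScaler(), SVC())
scaled_svm.fit(X_train, y_train)

scaled_accuracy = accuracy_score(y_test, scaled_svm.predict(X_test))
print(scaled_accuracy)
```

### ->Wrapping the scaler and the classifier in one pipeline keeps test-set statistics out of the fit and lets the whole object be passed to GridSearchCV later.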



In [8]:
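from sklearn.model_selection import GridSearchCV

classifier = SVC()
# Same split as before (identical random_state), repeated so this cell runs on its own
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

# Candidate values for the regularization strength and the kernel type
parameter = {
    "C": [100, 10, 1, 0.1, 0.01],
    "kernel": ["poly", "rbf", "linear"],
}
# 5-fold cross-validated search over all 15 combinations, scored by accuracy
clf = GridSearchCV(classifier, param_grid=parameter, cv=5, scoring='accuracy')
clf.fit(X_train, y_train)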

In [14]:
clf.best_params_

{'C': 0.1, 'kernel': 'linear'}

In [15]:
y_pred=clf.predict(X_test)

In [20]:
import pickle as pkl
with open('svc_assignment.pkl', 'wb') as f:
    pkl.dump(clf, f)
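### ->The pickled object above is the whole GridSearchCV instance, whose best estimator was refitted on the training split only. The assignment also asks for a model trained on the entire dataset. The sketch below is one way to do that with the hyperparameters reported by `clf.best_params_`, followed by a save and reload round trip. The file name `svc_final.pkl` and the temporary directory are hypothetical choices for this example.

```python
import os
import pickle as pkl
import tempfile

import numpy as np
from sklearn.datasets import load_wine
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)

# Hyperparameters reported by the grid search above
best_params = {"C": 0.1, "kernel": "linear"}

# Fit a fresh classifier on every available sample
final_model = SVC(**best_params).fit(X, y)

# Hypothetical location; a temporary directory keeps the sketch free of side effects
model_path = os.path.join(tempfile.mkdtemp(), "svc_final.pkl")
with open(model_path, "wb") as f:
    pkl.dump(final_model, f)

# Reload and confirm the round trip preserves the model's predictions
with open(model_path, "rb") as f:
    restored = pkl.load(f)

print(np.array_equal(final_model.predict(X), restored.predict(X)))
```

### ->A pickled model should be loaded with the same scikit-learn version that created it, and pickle files from untrusted sources should never be opened.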