### Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?

Polynomial functions and kernel functions both appear in machine learning algorithms, but they play different roles. A polynomial function models the relationship between input variables and output variables directly, for example in polynomial regression. A kernel function computes the similarity (an inner product) between two samples as if they had been mapped into a higher-dimensional space, where it is easier to find a linear boundary that separates the data points, without computing that mapping explicitly.

In machine learning, the polynomial kernel is a kernel function commonly used with support vector machines (SVMs) and other kernelized models. It represents the similarity of vectors (training samples) in a feature space over polynomials of the original variables, which allows non-linear models to be learned ². It generalizes the linear kernel: instead of using the dot product of two input vectors directly, it raises a shifted and scaled dot product to a chosen degree, so the model can capture interactions between features ³.
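
To make the link between polynomials and kernels concrete, here is a minimal sketch for 2-D inputs and a degree-2 kernel. The helper names `poly_kernel` and `phi` are illustrative assumptions, not part of any library. It checks that evaluating the kernel in the input space gives the same value as a dot product in an explicit polynomial feature space.

```python
import numpy as np

def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree, computed in the input space."""
    return (np.dot(x, y) + coef0) ** degree

def phi(x):
    """Explicit feature map matching the degree-2 kernel above for 2-D inputs."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([x1 * x1, x2 * x2, s * x1 * x2, s * x1, s * x2, 1.0])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

# Same similarity value, computed two ways
k_implicit = poly_kernel(x, y)           # no feature map needed
k_explicit = np.dot(phi(x), phi(y))      # dot product in the 6-D feature space
print(k_implicit, k_explicit)
```

The kernel never builds the six polynomial features, which is why kernelized models stay practical even when the implied feature space is very large.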

### Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

To implement an SVM with a polynomial kernel in Python using Scikit-learn, you can follow these steps:

1. Import the necessary libraries:
```python
from sklearn import svm
```

2. Create an instance of the `SVC` class (Support Vector Classifier) and set the `kernel` parameter to `'poly'` to specify the polynomial kernel:
```python
clf = svm.SVC(kernel='poly')
```

3. Fit the model to your training data using the `fit` method:
```python
clf.fit(X_train, y_train)
```

4. Make predictions on new data using the `predict` method:
```python
y_pred = clf.predict(X_test)
```

Here, `X_train` and `y_train` represent your training data and labels, while `X_test` is your test data.

Note that the default kernel for the `SVC` class is `'rbf'`, so the polynomial kernel must be requested explicitly with `kernel='poly'`. You can also set additional parameters for the polynomial kernel, such as the degree of the polynomial via the `degree` parameter (default 3) and the constant term via `coef0` (default 0).

For example, to set the degree explicitly (3 is also the default value), you can modify step 2 as follows:
```python
clf = svm.SVC(kernel='poly', degree=3)
```
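
Putting the steps together, the following is a minimal end-to-end sketch. It assumes a synthetic two-class dataset from `make_moons` in place of real data, and it adds a `StandardScaler` step and `coef0=1`, neither of which appears in the steps above.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data standing in for a real dataset (an assumption)
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale first, then fit a degree-3 polynomial-kernel SVM
clf = make_pipeline(StandardScaler(), SVC(kernel='poly', degree=3, coef0=1))
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
test_accuracy = clf.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling learned on the training data is reused automatically at prediction time.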

### Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In **Support Vector Regression (SVR)**, the value of **epsilon** determines the width of the tube around the predicted values. It defines an **epsilon-insensitive region** where errors within this region are not penalized. The number of support vectors in SVR is influenced by the value of epsilon.

When the value of epsilon is decreased, the tube narrows, so more data points fall on or outside its boundary and become support vectors²³⁴. Conversely, increasing epsilon widens the tube, so more points lie strictly inside it and fewer become support vectors²³.

The number of support vectors is affected by several factors, including the value of epsilon and the tolerance for error. The count does not change in proportion to epsilon¹⁵: how quickly it falls as epsilon grows depends on the noise in the dataset, the kernel, the other hyperparameters, and the specific implementation.
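
The effect can be checked empirically. The sketch below assumes a synthetic noisy sine curve and three arbitrary epsilon values; it fits an RBF `SVR` for each and counts the support vectors through the fitted model's `support_` attribute.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data: a sine curve with Gaussian noise (an assumption)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

n_support = {}
for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points used as support vectors
    n_support[eps] = len(model.support_)
    print(f"epsilon={eps}: {n_support[eps]} support vectors")
```

On data like this, the narrowest tube should keep most training points as support vectors, while the widest tube keeps far fewer.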

### Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?

The choice of kernel function, C parameter, epsilon parameter, and gamma parameter can significantly affect the performance of Support Vector Regression (SVR). Here's how each parameter works and some examples of when you might want to increase or decrease its value:

1. **Kernel function**: The kernel function implicitly maps the input data into a higher-dimensional space, where a linear function can fit a relationship that is non-linear in the original inputs. The choice of kernel function depends on the data's characteristics and the task's complexity. For example, the **linear kernel** is suitable when the target is roughly a linear function of the inputs, while the **polynomial kernel** and **radial basis function (RBF) kernel** are suitable for non-linear relationships.

2. **C parameter**: The C parameter controls the trade-off between fitting the training data closely and keeping the model simple. A smaller value of C applies stronger regularization and gives a smoother function that tolerates more training errors, while a larger value of C penalizes training errors more heavily and may lead to overfitting. If you have noisy data or outliers, you may want to decrease C to reduce overfitting; if the model underfits, increase it.

3. **Epsilon parameter**: The epsilon parameter defines the epsilon-tube: predictions within a distance epsilon of the actual value incur no penalty in the training loss function. If you have noisy data, you may want to increase epsilon so that small deviations are ignored; if you need a closer fit, decrease it, at the cost of more support vectors.

4. **Gamma parameter**: The gamma parameter defines how far the influence of a single training example reaches in the RBF kernel. A small gamma means that the influence of each example is far-reaching, while a large gamma means that each example has only local influence. If your model is overfitting, you may want to decrease gamma so that each example influences a wider region and the fitted function becomes smoother; if it is underfitting, increase gamma.

It's important to note that these parameters are interdependent and should be tuned together for optimal performance. You can use techniques such as grid search or random search to find the optimal values for these parameters.
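
As an illustration of tuning these parameters together, the sketch below runs a small grid search over all four. The synthetic sine data, the grid values, and the mean-squared-error scoring are assumptions chosen only to keep the example short.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Synthetic 1-D regression data (an assumption)
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

# Small illustrative grid; gamma is ignored by the linear kernel
param_grid = {
    'kernel': ['linear', 'rbf'],
    'C': [0.1, 1, 10],
    'epsilon': [0.01, 0.1, 0.5],
    'gamma': ['scale', 0.1, 1],
}

search = GridSearchCV(SVR(), param_grid, cv=5, scoring='neg_mean_squared_error')
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)  # negative MSE: closer to 0 is better
```

For larger grids, `RandomizedSearchCV` samples a fixed number of combinations instead of trying them all.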

### Q5. Assignment:
1. Import the necessary libraries and load the dataset

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```

```python
df = pd.read_csv('heart.csv')
df.head()
```

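| Unnamed: 0 | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |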


2. Split the dataset into training and testing sets

```python
X = df.drop('target', axis=1)
y = df['target']
```

```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
```

3. Preprocess the data using any technique of your choice (e.g. scaling, normalisation)

```python
from sklearn.preprocessing import StandardScaler
```

```python
scaler = StandardScaler()
scaled_X_train = scaler.fit_transform(X_train)
scaled_X_test = scaler.transform(X_test)
```

4. Create an instance of the SVC classifier and train it on the training data

```python
from sklearn.svm import SVC
```

```python
model = SVC()
model.fit(scaled_X_train, y_train)
```

```
SVC()
```

5. Use the trained classifier to predict the labels of the testing data

```python
y_pred = model.predict(scaled_X_test)
```

```python
y_pred
```

```
array([0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0,
       1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0,
       1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1,
       1, 0, 1], dtype=int64)
```

6. Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score
```

```python
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(f1_score(y_test, y_pred))
```

```
0.8241758241758241
              precision    recall  f1-score   support

           0       0.80      0.80      0.80        41
           1       0.84      0.84      0.84        50

    accuracy                           0.82        91
   macro avg       0.82      0.82      0.82        91
weighted avg       0.82      0.82      0.82        91

0.8399999999999999
```
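
To show how these metrics relate to one another, here is a short worked example. The four counts are not notebook output: they are reconstructed from the reported accuracy, F1-score, and class supports, so treat them as an inference.

```python
# Confusion-matrix counts inferred from the reported scores (not notebook output)
tp, fn = 42, 8   # class 1 has 50 test samples
tn, fp = 33, 8   # class 0 has 41 test samples

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)      # of the predicted positives, how many are correct
recall = tp / (tp + fn)         # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, f1)
```

With these counts, the accuracy is 75/91 and the precision, recall, and F1-score for class 1 all come out at about 0.84, which agrees with the report.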


7. Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance

```python
from sklearn.model_selection import GridSearchCV
```

```python
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
    'gamma': ['scale', 'auto'] + [0.001, 0.01, 0.1, 1],
}
```

```python
grid = GridSearchCV(SVC(), param_grid=param_grid, verbose=3)
```

8. Train the tuned classifier on the entire dataset

```python
grid.fit(scaled_X_train, y_train)
grid.get_params().keys()
```

```
Fitting 5 folds for each of 36 candidates, totalling 180 fits
[CV 1/5] END .C=0.1, gamma=scale, kernel=linear;, score=0.860 total time=   0.0s
[CV 2/5] END .C=0.1, gamma=scale, kernel=linear;, score=0.814 total time=   0.0s
[CV 3/5] END .C=0.1, gamma=scale, kernel=linear;, score=0.786 total time=   0.0s
[CV 4/5] END .C=0.1, gamma=scale, kernel=linear;, score=0.929 total time=   0.0s
[CV 5/5] END .C=0.1, gamma=scale, kernel=linear;, score=0.786 total time=   0.0s
[CV 1/5] END ....C=0.1, gamma=scale, kernel=rbf;, score=0.837 total time=   0.0s
[CV 2/5] END ....C=0.1, gamma=scale, kernel=rbf;, score=0.814 total time=   0.0s
[CV 3/5] END ....C=0.1, gamma=scale, kernel=rbf;, score=0.690 total time=   0.0s
[CV 4/5] END ....C=0.1, gamma=scale, kernel=rbf;, score=0.905 total time=   0.0s
[CV 5/5] END ....C=0.1, gamma=scale, kernel=rbf;, score=0.738 total time=   0.0s
[CV 1/5] END ..C=0.1, gamma=auto, kernel=linear;, score=0.860 total time=   0.0s
[CV 2/5] END ..C=0.1, gamma=auto, kernel=linear

[CV 4/5] END .......C=1, gamma=1, kernel=linear;, score=0.881 total time=   0.0s
[CV 5/5] END .......C=1, gamma=1, kernel=linear;, score=0.786 total time=   0.0s
[CV 1/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.558 total time=   0.0s
[CV 2/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.535 total time=   0.0s
[CV 3/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.571 total time=   0.0s
[CV 4/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.571 total time=   0.0s
[CV 5/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.619 total time=   0.0s
[CV 1/5] END ..C=10, gamma=scale, kernel=linear;, score=0.837 total time=   0.0s
[CV 2/5] END ..C=10, gamma=scale, kernel=linear;, score=0.837 total time=   0.0s
[CV 3/5] END ..C=10, gamma=scale, kernel=linear;, score=0.714 total time=   0.0s
[CV 4/5] END ..C=10, gamma=scale, kernel=linear;, score=0.881 total time=   0.0s
[CV 5/5] END ..C=10, gamma=scale, kernel=linear;, score=0.786 total time=   0.0s
[CV 1/5] END .....C=10, gamm
```

```
dict_keys(['cv', 'error_score', 'estimator__C', 'estimator__break_ties', 'estimator__cache_size', 'estimator__class_weight', 'estimator__coef0', 'estimator__decision_function_shape', 'estimator__degree', 'estimator__gamma', 'estimator__kernel', 'estimator__max_iter', 'estimator__probability', 'estimator__random_state', 'estimator__shrinking', 'estimator__tol', 'estimator__verbose', 'estimator', 'n_jobs', 'param_grid', 'pre_dispatch', 'refit', 'return_train_score', 'scoring', 'verbose'])
```

```python
grid.best_params_
```

```
{'C': 0.1, 'gamma': 'scale', 'kernel': 'linear'}
```

```python
y_pred = grid.predict(scaled_X_test)
```

```python
print(accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(f1_score(y_test, y_pred))
```

```
0.8131868131868132
              precision    recall  f1-score   support

           0       0.80      0.78      0.79        41
           1       0.82      0.84      0.83        50

    accuracy                           0.81        91
   macro avg       0.81      0.81      0.81        91
weighted avg       0.81      0.81      0.81        91

0.8316831683168315
```
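
One possible variation on the search above is to put the scaler inside a `Pipeline`, so that it is refit on the training part of every cross-validation fold instead of once on the whole training set. The sketch below is not the notebook's method. It assumes synthetic data from `make_classification` in place of `heart.csv`, and the step names `scaler` and `svc` are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset (an assumption)
X, y = make_classification(n_samples=300, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])

# Parameters of a pipeline step are addressed as <step name>__<parameter>
param_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__kernel': ['linear', 'rbf'],
    'svc__gamma': ['scale', 'auto', 0.01, 0.1],
}

grid = GridSearchCV(pipe, param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)

test_score = grid.score(X_test, y_test)
print(grid.best_params_)
print(test_score)
```

Because `refit=True` is the default, `grid` ends up holding the best pipeline retrained on all of `X_train`, so it can be used for prediction on raw, unscaled inputs.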


9. Save the trained classifier to a file for future use.

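```python
classifier = SVC(C=0.1, gamma='scale', kernel='linear')
```

```python
classifier.fit(scaled_X_train, y_train)
```

```
SVC(C=0.1, kernel='linear')
```

```python
import pickle
```

```python
# Use a context manager so the file is closed once the model is written
with open('svcmodel.pkl', 'wb') as f:
    pickle.dump(classifier, f)
```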
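
For completeness, here is a minimal sketch of the round trip: saving a model and loading it back for prediction. It assumes a small synthetic model in place of the trained classifier and writes to a hypothetical temporary path, so it does not touch `svcmodel.pkl`.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic model standing in for the trained classifier (an assumption)
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
classifier = SVC(C=0.1, kernel='linear').fit(X, y)

# Hypothetical temporary path used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'svcmodel_demo.pkl')
with open(path, 'wb') as f:
    pickle.dump(classifier, f)

# Load the model back and confirm it predicts the same labels
with open(path, 'rb') as f:
    loaded = pickle.load(f)

same_predictions = bool((loaded.predict(X) == classifier.predict(X)).all())
print(same_predictions)
```

A model trained on scaled features expects scaled input, so the fitted scaler usually needs to be saved alongside it. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.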