**Q1. What is the relationship between polynomial functions and kernel functions in machine learning algorithms?**

In machine learning, kernel functions play a crucial role in support vector machines (SVMs) and other algorithms. A polynomial kernel is a type of kernel function. Kernel functions are used to transform input data into a higher-dimensional space, allowing the algorithm to find a non-linear decision boundary in the original feature space. The polynomial kernel, in particular, is defined as \( K(x, y) = (x \cdot y + c)^d \), where \(d\) is the degree of the polynomial, and \(c\) is a constant.

**Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?**

Here's an example of how to implement an SVM with a polynomial kernel using Scikit-learn:


In [34]:


from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocess the data (scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create an instance of the SVC classifier with a polynomial kernel
svm_classifier = SVC(kernel='poly', degree=3, C=1.0)

# Train the classifier on the training data
svm_classifier.fit(X_train_scaled, y_train)

# Use the trained classifier to predict the labels of the testing data
y_pred = svm_classifier.predict(X_test_scaled)

# Evaluate the performance of the classifier using accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')



Accuracy: 0.9666666666666667



**Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?**

In Support Vector Regression (SVR), epsilon (\(\varepsilon\)) is the margin of tolerance where no penalty is given to errors. Increasing the value of epsilon allows for a larger margin of error in the prediction. As epsilon increases, the SVR model becomes more tolerant of errors, and the number of support vectors may increase because more data points fall within the expanded margin of tolerance.

**Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)?**

Support Vector Regression (SVR) is a machine learning algorithm used for regression tasks. It relies on the concept of support vectors and aims to find a hyperplane that best captures the relationships between input features and corresponding target values. The choice of kernel function, C parameter, epsilon parameter (ε), and gamma parameter can significantly impact the performance of SVR. Let's explore each parameter:

1. **Kernel Function:**
   - The kernel function determines the type of transformation applied to the input features.
   - Common kernel functions include Linear, Polynomial, Radial Basis Function (RBF), and Sigmoid.
   - The choice of the kernel depends on the nature of the data. RBF is often a good default choice and works well for a wide range of problems.
   - Example:
     ```python
     from sklearn.svm import SVR
     model = SVR(kernel='rbf')
     ```

2. **C Parameter:**
   - The C parameter controls the trade-off between achieving a smooth decision boundary and fitting the training data accurately.
   - A small C value makes the decision boundary smooth (increased bias, lower variance), while a large C value aims for correct classification of all training points (reduced bias, higher variance).
   - It helps prevent overfitting by penalizing large coefficients.
   - Example:
     ```python
     from sklearn.svm import SVR
     model = SVR(C=1.0)
     ```

3. **Epsilon Parameter (ε):**
   - The epsilon parameter determines the margin of tolerance for errors in SVR. It defines a tube around the regression line within which errors are not penalized.
   - Smaller values of ε result in a narrower tube, making the model more sensitive to errors.
   - Larger values of ε create a wider tube, allowing more errors to be tolerated.
   - The choice of ε depends on the desired balance between model complexity and fitting to noise.
   - Example:
     ```python
     from sklearn.svm import SVR
     model = SVR(epsilon=0.1)
     ```

4. **Gamma Parameter:**
   - The gamma parameter defines the influence of a single training example, with low values meaning 'far' and high values meaning 'close.'
   - A small gamma value considers a broader influence, leading to a smoother decision boundary.
   - A large gamma value makes the model focus on points that are closer, resulting in a more complex decision boundary.
   - Avoid high gamma values when the number of features is large, as it may lead to overfitting.
   - Example:
     ```python
     from sklearn.svm import SVR
     model = SVR(gamma='scale')
     ```

In summary, tuning these parameters involves finding a balance between bias and variance, and the optimal values depend on the specific characteristics of the dataset. It is common to perform a grid search or use techniques like cross-validation to find the best combination of hyperparameters for SVR.



**Q5. Assignment:**



In [35]:
# Import necessary libraries
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix,classification_report
import joblib

# Load dataset
digits = load_digits()
x = digits.data
y = digits.target


In [36]:
digits.target_names

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [37]:
x.shape,y.shape

((1797, 64), (1797,))

In [38]:

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)


In [39]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((1437, 64), (360, 64), (1437,), (360,))

In [40]:

# Preprocess the data (scaling)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)


In [41]:

# Create an instance of the SVC classifier
svm_classifier = SVC()

# Train the classifier on the training data
svm_classifier.fit(X_train_scaled, y_train)


In [42]:

# Use the trained classifier to predict the labels of the testing data
y_pred = svm_classifier.predict(X_test_scaled)

# Evaluate the performance of the classifier using accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')


Accuracy: 0.9805555555555555


In [43]:

# Tune hyperparameters using GridSearchCV
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'poly', 'rbf'],
    'gamma': [0.1, 1, 10]
    }


In [44]:

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train_scaled, y_train)

# Get the best parameters
best_params = grid_search.best_params_
print(f'Best Parameters: {best_params}')


Best Parameters: {'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}


In [45]:

# Train the tuned classifier on the entire dataset
tuned_svm_classifier = SVC(**best_params)
tuned_svm_classifier.fit(X_train_scaled, y_train)


In [46]:
y_pred = tuned_svm_classifier.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 0.625


In [47]:
print(confusion_matrix(y_test,y_pred))

[[30  3  0  0  0  0  0  0  0  0]
 [ 0 28  0  0  0  0  0  0  0  0]
 [ 0 13 20  0  0  0  0  0  0  0]
 [ 0  5  0 16  0  0  0  0  0 13]
 [ 0 27  0  0 11  0  8  0  0  0]
 [ 0 17  1  0  0 23  2  0  0  4]
 [ 0  0  0  0  0  0 35  0  0  0]
 [ 0 14  0  0  0  0  0 20  0  0]
 [ 0 22  0  0  0  0  0  0  8  0]
 [ 0  6  0  0  0  0  0  0  0 34]]


In [48]:
print(classification_report(y_test,y_pred))

              precision    recall  f1-score   support

           0       1.00      0.91      0.95        33
           1       0.21      1.00      0.34        28
           2       0.95      0.61      0.74        33
           3       1.00      0.47      0.64        34
           4       1.00      0.24      0.39        46
           5       1.00      0.49      0.66        47
           6       0.78      1.00      0.88        35
           7       1.00      0.59      0.74        34
           8       1.00      0.27      0.42        30
           9       0.67      0.85      0.75        40

    accuracy                           0.62       360
   macro avg       0.86      0.64      0.65       360
weighted avg       0.88      0.62      0.65       360



In [49]:

# Save the trained classifier to a file for future use
joblib.dump(tuned_svm_classifier, 'tuned_svm_classifier.pkl')


['tuned_svm_classifier.pkl']