# **ASSIGNMENT**

**Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?**

Polynomial functions and kernel functions are both mathematical tools used in machine learning, particularly in the context of support vector machines (SVMs) and kernel methods. Here's a brief overview of their relationship:

1. **Polynomial Functions:**
   - In the context of SVMs, polynomial functions are often used as basis functions to transform input data into a higher-dimensional space.
   - The idea is to map the input data from its original space to a higher-dimensional space, where a hyperplane can better separate the data points.
   - The polynomial kernel function is a specific type of kernel function used in SVMs that employs polynomial functions.

2. **Kernel Functions:**
   - In machine learning, a kernel function is a method of implicitly mapping input data into a higher-dimensional space without explicitly computing the new representation.
   - Kernels play a crucial role in kernelized algorithms, such as kernelized SVMs. They allow these algorithms to operate in a higher-dimensional space without explicitly computing the transformation.
   - Polynomial kernels are a commonly used type of kernel function. The polynomial kernel \(K(x, y) = (x \cdot y + c)^d\) is built from polynomial functions: \(d\) is the degree of the polynomial, and \(c \ge 0\) is a constant that balances the influence of higher-order and lower-order terms.

3. **Relationship:**
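   - A polynomial kernel is the inner product of two polynomial feature vectors: \(K(x, y) = \phi(x) \cdot \phi(y)\), where \(\phi\) maps an input to its monomials of degree up to \(d\) (for \(c > 0\), with suitable scaling constants).
   - Evaluating \((x \cdot y + c)^d\) directly gives the same value without ever constructing \(\phi(x)\), which is what makes the polynomial kernel a kernel function rather than just a polynomial transformation of the input data.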
   - Other types of kernel functions include linear kernels, radial basis function (RBF) kernels, and more. Each type of kernel function corresponds to a different way of transforming the input data.

In summary, polynomial functions are often used as the basis for polynomial kernel functions in machine learning algorithms like SVMs. The kernel functions, including polynomial kernels, enable the algorithms to operate in higher-dimensional spaces efficiently, allowing for better decision boundaries and improved performance on complex datasets.
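The link between the polynomial feature map and the polynomial kernel can be checked numerically. The sketch below assumes 2-dimensional inputs, degree \(d = 2\), and \(c = 1\); the helper names `phi` and `poly_kernel` and the sample vectors are illustrative and not part of any library.

```python
import numpy as np

def phi(v):
    """Explicit degree-2 polynomial feature map for a 2-D input (illustrative helper)."""
    x1, x2 = v
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, x2 ** 2, s * x1 * x2])

def poly_kernel(x, y, c=1.0, d=2):
    """Polynomial kernel K(x, y) = (x . y + c)^d."""
    return (np.dot(x, y) + c) ** d

# Arbitrary sample vectors chosen for the demonstration
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(y))  # inner product in the 6-D feature space
implicit = poly_kernel(x, y)       # same quantity, computed in the original 2-D space

print(f"explicit feature map: {explicit:.6f}")
print(f"kernel evaluation:    {implicit:.6f}")
```

Both values agree (up to floating-point rounding), even though the kernel evaluation never builds the 6-dimensional feature vector.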

**Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?**

Implementing an SVM with a polynomial kernel in Python using Scikit-learn is relatively straightforward. Scikit-learn provides the `SVC` (Support Vector Classification) class, which allows you to create an SVM model with different kernel functions, including the polynomial kernel.

Here's a simple example of implementing an SVM with a polynomial kernel using Scikit-learn:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load a sample dataset (you can replace this with your own dataset)
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an SVM model with a polynomial kernel
degree_of_polynomial = 3  # Set the degree of the polynomial kernel
svm_model = SVC(kernel='poly', degree=degree_of_polynomial)

# Train the SVM model
svm_model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = svm_model.predict(X_test)

# Evaluate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")
```

In this example:

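1. We load the Iris dataset as a sample dataset.
2. We split the dataset into training and testing sets.
3. We create an `SVC` model with a polynomial kernel using the `kernel='poly'` parameter. The `degree` parameter specifies the degree of the polynomial. Scikit-learn computes this kernel as \((\gamma \, x \cdot y + r)^d\), where \(\gamma\) is set by `gamma`, \(r\) by `coef0` (default 0), and \(d\) by `degree`, so `coef0` plays the role of the constant \(c\) in Q1.
4. We train the model using the training data.
5. We make predictions on the test set.
6. We evaluate the accuracy of the model using the `accuracy_score` function from Scikit-learn's `metrics` module.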
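
A single train/test split gives only one estimate of accuracy, so one possible way to choose the degree is cross-validation. The following is a minimal sketch, not a recommended configuration: the candidate degrees `(2, 3, 4)`, the choice `coef0=1.0`, the 5-fold setting, and the added `StandardScaler` step are all assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

cv_scores = {}
for degree in (2, 3, 4):  # candidate degrees (assumed for this sketch)
    # Scaling inside the pipeline is refit on each training fold
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="poly", degree=degree, coef0=1.0),
    )
    scores = cross_val_score(model, X, y, cv=5)
    cv_scores[degree] = scores.mean()
    print(f"degree={degree}: mean CV accuracy = {cv_scores[degree]:.3f}")
```

Putting the scaler inside the pipeline keeps the preprocessing from seeing the held-out fold during cross-validation.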


**Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?**

In Support Vector Regression (SVR), epsilon (often denoted as ε) is a hyperparameter that controls the width of the margin around the predicted values within which no penalty is associated with errors. It is part of the formulation of the loss function in SVR, and it introduces a tolerance for errors that fall within the margin.

The SVR optimization problem aims to find a function that fits the training data while allowing for a certain amount of error. The margin of tolerance is defined by the epsilon-insensitive loss function, which ignores errors smaller than epsilon. The SVR loss function is typically defined as:

\[ L(y, f(x)) = \max(0, |y - f(x)| - \epsilon) \]

Here:
- \(y\) is the true output,
- \(f(x)\) is the predicted output,
- \(|y - f(x)|\) is the absolute difference between the true and predicted outputs,
- \(\epsilon\) is the epsilon parameter.
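
As a small worked example of this loss, the sketch below evaluates it with NumPy for three predictions. The function name `epsilon_insensitive_loss` and the sample values are illustrative assumptions.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """L(y, f(x)) = max(0, |y - f(x)| - epsilon), applied element-wise."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

y_true = np.array([3.0, 3.0, 3.0])
y_pred = np.array([2.7, 2.0, 4.5])  # absolute errors: 0.3, 1.0, 1.5

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.5)
print(losses)  # the 0.3 error lies inside the tube, so its loss is 0
```

With \(\epsilon = 0.5\), the error of 0.3 costs nothing, while the errors of 1.0 and 1.5 are penalized only by the amount they exceed the tube (0.5 and 1.0).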

Now, let's discuss the impact of increasing the value of epsilon on the number of support vectors:

1. **Smaller Epsilon (Tight Margin):**
   - Smaller values of epsilon result in a tighter margin (a narrower epsilon tube) around the predicted values.
   - With a tight margin, the SVR model becomes sensitive to small errors: more training points lie on or outside the tube, and those points become support vectors.

2. **Larger Epsilon (Wider Margin):**
   - Larger values of epsilon lead to a wider margin, allowing for larger errors without incurring a penalty.
   - A wider margin means that more training points fall strictly inside the tube. These points contribute no loss and are not support vectors, so the number of support vectors drops.

In summary, increasing the value of epsilon in SVR tends to decrease the number of support vectors because a larger epsilon allows for a wider margin of tolerance for errors. The model becomes less sensitive to small deviations, and data points that fall within the wider margin are not treated as support vectors. The choice of epsilon should be made carefully based on the problem at hand and the desired balance between fitting the training data and generalizing to new data.
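
This trend can be observed on a toy problem. The sketch below assumes synthetic data (a noisy sine curve), an RBF kernel with `C=1.0`, and three arbitrary epsilon values; the exact counts depend on the data and settings, so only the direction of the trend should be read from it.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression data (assumed for illustration): noisy sine curve
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 5.0, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=100)

sv_counts = {}
for epsilon in (0.01, 0.1, 0.5):
    model = SVR(kernel="rbf", C=1.0, epsilon=epsilon).fit(X, y)
    # support_ holds the indices of the training points that are support vectors
    sv_counts[epsilon] = len(model.support_)
    print(f"epsilon={epsilon}: {sv_counts[epsilon]} support vectors out of {len(X)}")
```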

**Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?**

Support Vector Regression (SVR) is a machine learning algorithm that uses support vector machines for regression tasks. The performance of SVR is influenced by several hyperparameters, including the choice of kernel function, the C parameter, the epsilon parameter (ε), and the gamma parameter (γ). Let's discuss each parameter and how it affects the SVR performance:

1. **Kernel Function:**
   - **Description:** The kernel function determines the type of mapping used to transform the input features into a higher-dimensional space. Common choices include linear, polynomial, and radial basis function (RBF) kernels.
   - **Effect on Performance:** The choice of the kernel depends on the nature of the data. RBF kernels are often versatile and work well in various scenarios. Linear kernels are suitable for linear relationships, while polynomial kernels are useful for capturing non-linear relationships.
   - **Example:** If you suspect that the relationship between input and output is non-linear, you might choose an RBF kernel (`kernel='rbf'`). Experiment with different kernels to see which performs best on your specific dataset.

2. **C Parameter:**
   - **Description:** The C parameter controls the trade-off between achieving a low training error and keeping the fitted function smooth. A smaller C applies stronger regularization and yields a smoother function, while a larger C penalizes errors outside the epsilon tube more heavily, so the model fits the training data more closely.
   - **Effect on Performance:** Larger values of C can lead to overfitting, especially if the data has noise. Smaller values of C promote a simpler model but may result in underfitting.
   - **Example:** If your training data has outliers or noise, you may want to decrease C (`C=0.1` or `C=1`) to prevent the model from fitting the noise too closely.

3. **Epsilon Parameter (ε):**
   - **Description:** Epsilon defines the margin of tolerance for errors within which no penalty is associated with the training data. It is part of the epsilon-insensitive loss function.
   - **Effect on Performance:** Smaller epsilon values make the model more sensitive to errors, potentially resulting in more support vectors. Larger epsilon values increase the margin and make the model more tolerant of errors.
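   - **Example:** If you want the model to be less sensitive to small errors in the training data, you might increase epsilon above Scikit-learn's default of 0.1 (for example, `epsilon=0.2` or `epsilon=0.5`). Epsilon is measured in the units of the target variable, so sensible values depend on the scale of the targets.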

4. **Gamma Parameter (γ):**
   - **Description:** The gamma parameter defines how far the influence of a single training example reaches, with low values meaning far-reaching influence and high values meaning influence limited to nearby points. In Scikit-learn it is the kernel coefficient for the RBF, polynomial, and sigmoid kernels; the linear kernel ignores it.
   - **Effect on Performance:** Smaller gamma values result in a smoother fitted function, while larger gamma values lead to a more complex function that follows individual training points, potentially causing overfitting.
   - **Example:** If you observe overfitting with an RBF kernel, you might try decreasing gamma (`gamma=0.1` or `gamma=0.01`) to make the fitted function smoother.

In summary, the choice of kernel function, C parameter, epsilon parameter, and gamma parameter in SVR depends on the characteristics of your data. It often involves a trade-off between fitting the training data well and generalizing to new data. Experimentation and tuning these parameters using techniques like cross-validation are crucial for achieving optimal SVR performance on a specific dataset.
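
The four hyperparameters discussed above can be tuned together with cross-validation. The sketch below is one possible setup under stated assumptions: the synthetic noisy-sine data, the candidate values in `param_grid`, the step names `"scaler"` and `"svr"`, and the mean-squared-error scoring are all illustrative choices, not recommended defaults.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data (assumed for illustration)
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(120, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=120)

pipeline = Pipeline([("scaler", StandardScaler()), ("svr", SVR())])

# Candidate values for each hyperparameter; gamma is ignored by the linear kernel
param_grid = {
    "svr__kernel": ["linear", "rbf"],
    "svr__C": [0.1, 1, 10],
    "svr__epsilon": [0.05, 0.2],
    "svr__gamma": ["scale", 0.1, 1.0],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated MSE:", -search.best_score_)
```

The `svr__` prefix routes each parameter to the `SVR` step of the pipeline, and the scoring is negated because `GridSearchCV` always maximizes its score.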

**Q5. Assignment:**

- Import the necessary libraries and load the dataset
- Split the dataset into training and testing sets
- Preprocess the data using any technique of your choice (e.g. scaling, normalization)
- Create an instance of the SVC classifier and train it on the training data
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use

**Note: You can use any dataset of your choice for this assignment, but make sure it is suitable for
classification and has a sufficient number of features and samples.**

```python
# Import necessary libraries
from sklearn import datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report
import joblib

# Load the dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Preprocess the data - Standard Scaling (fit on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create an instance of the SVC classifier
svc_classifier = SVC()

# Train the classifier on the training data
svc_classifier.fit(X_train_scaled, y_train)

# Use the trained classifier to predict labels on the testing data
y_pred = svc_classifier.predict(X_test_scaled)

# Evaluate the performance using accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

# Evaluate the performance using classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

# Tune hyperparameters using GridSearchCV
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly'], 'gamma': ['scale', 'auto']}
grid_search = GridSearchCV(SVC(), param_grid, cv=3)
grid_search.fit(X_train_scaled, y_train)

# Get the best parameters from the grid search
best_params = grid_search.best_params_
print("Best Hyperparameters:", best_params)

# Train the tuned classifier on the entire dataset.
# The hyperparameters were tuned on scaled features, so the final model
# wraps the scaler and the SVC in one pipeline and is fit on the raw data.
tuned_classifier = make_pipeline(StandardScaler(), SVC(**best_params))
tuned_classifier.fit(X, y)

# Save the trained pipeline (scaler + classifier) to a file using joblib
joblib.dump(tuned_classifier, 'tuned_svc_classifier.joblib')
```


```
Accuracy: 1.0
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

Best Hyperparameters: {'C': 0.1, 'gamma': 'scale', 'kernel': 'linear'}

['tuned_svc_classifier.joblib']
```
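
For future use, a saved model can be restored with `joblib.load`. The sketch below is self-contained, so it trains a stand-in model rather than reading the file written above: the scaler-plus-SVC pipeline, the temporary directory, and the hyperparameters (copied from the reported best parameters) are assumptions for illustration.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Stand-in model (assumed): scaler + linear SVC, fit on the full dataset
model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=0.1)).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tuned_svc_classifier.joblib")
    joblib.dump(model, path)       # save the fitted pipeline
    restored = joblib.load(path)   # load it back, as a later session would

# The restored pipeline applies the same scaling and gives the same predictions
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print("Restored model matches original:", same_predictions)
```

Saving the scaler together with the classifier means new, unscaled samples can be passed straight to `restored.predict`. Files produced by `joblib.dump` should only be loaded from trusted sources, since loading can execute arbitrary code.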

-------------------------