In [1]:
'''Q1. What is the relationship between polynomial functions and kernel functions in machine learning
algorithms?'''


In [2]:
'''In machine learning algorithms, polynomial functions and kernel functions are related through their role in transforming data:

- **Polynomial Functions**: These are mathematical expressions where features are raised to various powers. For example, a quadratic polynomial involves terms like \( x^2 \) and \( xy \).

- **Kernel Functions**: In machine learning, particularly in SVMs, kernel functions allow algorithms to operate in higher-dimensional spaces without explicitly computing the transformed features. A polynomial kernel, for instance, computes the inner product of features in a high-dimensional space corresponding to a polynomial function.

**Relationship**: A polynomial kernel function is a type of kernel function that computes the dot product in a space where features are transformed into polynomial terms. This allows SVMs to learn complex boundaries by implicitly mapping the data into higher dimensions, akin to using polynomial features.'''


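The relationship described above can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original answer: it uses 2-dimensional inputs and the degree-2 kernel K(x, y) = (1 + x·y)^2, for which the explicit feature map `phi` (a helper name introduced here) is known in closed form. It compares the dot products of the explicitly mapped features with the output of scikit-learn's `polynomial_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(X):
    """Explicit feature map for the kernel (1 + x.y)^2 on 2-D inputs (illustrative helper)."""
    x1, x2 = X[:, 0], X[:, 1]
    s = np.sqrt(2.0)
    # Columns: 1, sqrt(2)*x1, sqrt(2)*x2, x1^2, sqrt(2)*x1*x2, x2^2
    return np.column_stack([np.ones_like(x1), s * x1, s * x2, x1**2, s * x1 * x2, x2**2])


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))  # small random sample, for illustration only

# Dot products computed in the explicit 6-dimensional feature space
K_explicit = phi(X) @ phi(X).T

# The same quantity computed by the kernel, without building the features
K_kernel = polynomial_kernel(X, X, degree=2, gamma=1.0, coef0=1.0)

kernels_match = np.allclose(K_explicit, K_kernel)
print("Kernel equals explicit dot product:", kernels_match)
```

The kernel evaluates one dot product in the original 2-dimensional space per pair of points, while the explicit route builds six features per point first. For higher degrees and more input features, that gap grows quickly, which is why the implicit form is useful.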

In [3]:
'''Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?'''


In [4]:
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Create SVM model with polynomial kernel
svm_poly = SVC(kernel='poly', degree=3, C=1.0, random_state=42)

# Fit the model
svm_poly.fit(X_train, y_train)

# Predict and evaluate
y_pred = svm_poly.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")


Accuracy: 0.98


In [5]:
'''Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?'''


In [6]:
'''Increasing epsilon in Support Vector Regression (SVR) generally reduces the number of support vectors. Epsilon sets the half-width of the epsilon-insensitive tube around the fitted function: training points whose residuals fall strictly inside the tube incur no loss and do not become support vectors, while points on or outside the tube do. A larger epsilon widens the tube, so more points fall inside it and fewer remain as support vectors. The resulting model is sparser and less sensitive to small deviations in individual points, at the cost of a coarser fit.'''
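The effect of epsilon on the support vector count can be observed directly. The following is a minimal sketch on synthetic data; the noisy sine curve, the epsilon values, and the variable names are assumptions chosen for illustration, not taken from the assignment.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem (assumed for illustration): noisy sine curve
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

epsilons = [0.01, 0.1, 0.5]
n_support = {}
for eps in epsilons:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training points kept as support vectors
    n_support[eps] = len(model.support_)
    print(f"epsilon={eps}: {n_support[eps]} support vectors out of {len(X)}")
```

On data like this the count should drop as epsilon grows, because a wider tube covers more of the noise. The exact numbers depend on the data, the kernel, and C.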

In [7]:
'''Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
and provide examples of when you might want to increase or decrease its value?'''


In [8]:
'''### Parameters Affecting SVR Performance

1. **Kernel Function**:
   - **Role**: Determines the family of regression functions the model can fit.
   - **Examples**:
     - **Linear Kernel**: Best when the target depends roughly linearly on the features.
     - **Polynomial Kernel**: Captures polynomial relationships; a higher degree gives a more flexible fit.
     - **RBF Kernel**: Captures general non-linear relationships; its flexibility is controlled by gamma.
   - **When to Adjust**:
     - Use linear when the relationship is close to linear or the feature count is large relative to the sample count.
     - Use polynomial or RBF for complex, non-linear patterns.

2. **C Parameter**:
   - **Role**: Controls the trade-off between keeping the regression function flat (simple) and penalizing training errors that exceed epsilon.
   - **Examples**:
     - **High C**: Penalizes points outside the epsilon tube heavily, so the model fits the training data more closely, which may lead to overfitting.
     - **Low C**: Tolerates more points outside the tube, giving a flatter function that may generalize better or, if too low, underfit.
   - **When to Adjust**:
     - Increase C for a stricter fit (if the model is underfitting and the data is clean).
     - Decrease C for a more regularized model (if the model is overfitting or the data has outliers).

3. **Epsilon Parameter**:
   - **Role**: Defines the half-width of the tube around the prediction within which errors incur no penalty.
   - **Examples**:
     - **Large Epsilon**: Allows a wider tolerance, leading to fewer support vectors and a smoother model.
     - **Small Epsilon**: Requires closer adherence to the training data, typically resulting in more support vectors and a more complex model.
   - **When to Adjust**:
     - Increase epsilon for a smoother model (if the targets are noisy).
     - Decrease epsilon for a more precise fit (if small errors matter).

4. **Gamma Parameter** (for RBF, polynomial, and sigmoid kernels):
   - **Role**: Controls how far the influence of a single training example reaches; for the RBF kernel, a higher gamma means a narrower kernel.
   - **Examples**:
     - **High Gamma**: Each point influences only its close neighbours, producing a wiggly function that fits the training data tightly and may overfit.
     - **Low Gamma**: Each point influences a wide region, producing a smoother function that may underfit.
   - **When to Adjust**:
     - Increase gamma for a more flexible model (if the data has fine-grained structure and the model is underfitting).
     - Decrease gamma for a simpler model (if the data is noisy or the model is overfitting).

In summary, these parameters jointly set the balance between fitting the training data and generalizing to new data. Suitable values depend on the problem and are usually chosen by cross-validation.'''
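To make the parameter effects above concrete, the sketch below fits an RBF-kernel SVR under a few hand-picked settings and reports training and test R^2. The synthetic dataset, the specific values of C, gamma, and epsilon, and the setting labels are all assumptions for illustration; they are not recommended defaults.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Synthetic 1-D regression data (assumed for illustration)
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(2 * X).ravel() + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Hand-picked settings that vary one parameter at a time around a baseline
settings = {
    "baseline": dict(C=1.0, gamma=1.0, epsilon=0.1),
    "low gamma": dict(C=1.0, gamma=0.001, epsilon=0.1),
    "high gamma": dict(C=1.0, gamma=100.0, epsilon=0.1),
    "low C": dict(C=0.01, gamma=1.0, epsilon=0.1),
    "large epsilon": dict(C=1.0, gamma=1.0, epsilon=0.5),
}

scores = {}
for name, params in settings.items():
    model = SVR(kernel="rbf", **params).fit(X_train, y_train)
    # score() returns the coefficient of determination R^2
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print(f"{name:>14}: train R^2={scores[name][0]:.3f}, test R^2={scores[name][1]:.3f}")
```

Comparing the train and test columns shows which settings underfit (both scores low) and which fit the training data more closely than they generalize. In practice these values would be searched with cross-validation rather than set by hand.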

In [9]:
'''Q5. Assignment:
- Import the necessary libraries and load the dataset
- Split the dataset into training and testing sets
- Preprocess the data using any technique of your choice (e.g. scaling, normalization)
- Create an instance of the SVC classifier and train it on the training data
- Use the trained classifier to predict the labels of the testing data
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy,
precision, recall, F1-score)
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to
improve its performance
- Train the tuned classifier on the entire dataset
- Save the trained classifier to a file for future use.'''

In [10]:
import joblib
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Scale the features (fit the scaler on the training set only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create an instance of SVC with a linear kernel
svc = SVC(kernel='linear', random_state=42)

# Train the classifier
svc.fit(X_train_scaled, y_train)

# Predict on the test set
y_pred = svc.predict(X_test_scaled)

# Evaluate performance
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")


Accuracy: 0.98
Precision: 0.98
Recall: 0.98
F1 Score: 0.98
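
The remaining assignment steps (tuning, retraining on the full dataset, and saving) could be completed along the following lines. This is a sketch under assumptions, not the original author's solution: the parameter grid, the use of a `Pipeline` to keep scaling inside cross-validation, and the file name `svc_iris_tuned.joblib` in the system temp directory are all choices made here for illustration.

```python
import os
import tempfile

import joblib
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scaling lives inside the pipeline so each CV fold is scaled on its own training part
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Illustrative grid (assumed values); gamma is ignored by the linear kernel
param_grid = {
    "svc__kernel": ["linear", "rbf", "poly"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1, 1],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print(f"Test accuracy of tuned model: {test_accuracy:.2f}")

# Retrain a fresh copy of the best pipeline on the entire dataset
final_model = clone(search.best_estimator_).fit(X, y)

# Save to a file (hypothetical file name) and load it back as a sanity check
model_path = os.path.join(tempfile.gettempdir(), "svc_iris_tuned.joblib")
joblib.dump(final_model, model_path)
loaded_model = joblib.load(model_path)
print("Reloaded model predicts:", loaded_model.predict(X[:5]))
```

Saving the whole pipeline, rather than the classifier alone, means the scaler fitted on the full dataset travels with the model, so new data can be passed to `loaded_model.predict` without a separate preprocessing step.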
