In [37]:
# Q1. What is the relationship between polynomial functions and kernel functions in machine learning
# algorithms?

In [38]:
# A polynomial function models a non-linear relationship between input features and a target. A kernel function computes
# the inner product of two points in a (usually higher-dimensional) feature space without constructing that space explicitly.
# The polynomial kernel K(x, y) = (gamma * <x, y> + coef0)^degree corresponds to a feature space of polynomial terms up to
# the chosen degree, so a linear algorithm such as an SVM can learn a polynomial decision boundary in the original space.
# The transformation makes linear separation more likely, but it does not guarantee it.
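# A minimal sketch of this equivalence for 2-D inputs and a degree-2 kernel with gamma=1 and coef0=1 (values chosen only
# for illustration). The helper phi and the points a and b are assumptions made for this example, not part of the original
# notebook. It compares an explicit inner product in the expanded feature space with the kernel value.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(v):
    """Explicit degree-2 feature map for a 2-D point (illustrative helper)."""
    x1, x2 = v
    r2 = np.sqrt(2.0)
    return np.array([1.0, r2 * x1, r2 * x2, x1 ** 2, x2 ** 2, r2 * x1 * x2])


a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

# Inner product computed explicitly in the 6-dimensional feature space
explicit = phi(a) @ phi(b)

# Same quantity from the kernel, computed directly in the original 2-D space
implicit = polynomial_kernel(
    a.reshape(1, -1), b.reshape(1, -1), degree=2, gamma=1, coef0=1
)[0, 0]

print(explicit, implicit)
```

# Both values agree up to floating-point error, which is why the SVM never needs to build the expanded features.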


In [39]:
# Q2. How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?

In [40]:
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

In [41]:
iris = datasets.load_iris()
X = iris.data[:, :2]  # keep only the first two features (sepal length and sepal width)
y = iris.target

In [42]:
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

In [43]:
svm_poly = SVC(kernel='poly', degree=3, C=1)
svm_poly.fit(X_scaled, y)

In [44]:
y_pred = svm_poly.predict(X_scaled)

In [45]:
# Q3. How does increasing the value of epsilon affect the number of support vectors in SVR?

In [46]:
# Increasing 𝜖 generally decreases the number of support vectors in SVR. A larger 𝜖 widens the 𝜖-insensitive tube around
# the regression function, so more data points fall strictly inside it. Those points incur no loss and do not influence
# the fit; only points on or outside the tube boundary become support vectors.
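# A small sketch of this effect on synthetic data (a noisy sine curve). The data, the epsilon values, and the names
# X_demo, y_demo, and sv_counts are assumptions made for illustration; exact counts depend on the data and on C.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: sine curve plus Gaussian noise
rng = np.random.default_rng(0)
X_demo = np.sort(rng.uniform(0, 5, size=(100, 1)), axis=0)
y_demo = np.sin(X_demo).ravel() + rng.normal(scale=0.1, size=100)

# Fit one SVR per epsilon and record how many training points are support vectors
sv_counts = {}
for eps in (0.01, 0.1, 0.5):
    model = SVR(kernel='rbf', C=1.0, epsilon=eps).fit(X_demo, y_demo)
    sv_counts[eps] = len(model.support_)

print(sv_counts)
```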

In [47]:
# Q4. How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter
# affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works
# and provide examples of when you might want to increase or decrease its value?

In [48]:
# Kernel:
# Linear: Use when the target is approximately a linear function of the features
# Polynomial: Use when the relationship is non-linear and well described by feature interactions up to a chosen degree
# RBF (Radial Basis Function): Use when the relationship is non-linear and you're not sure about its form; a common default
# Sigmoid: Behaves like a tanh activation; rarely the best choice and not guaranteed to be a valid (positive semi-definite) kernel

# C (Regularization) Parameter:
# High C: Penalizes errors outside the ε-tube heavily, so the model fits the training data closely (less regularization); suits clean data
# Low C: Tolerates more errors in exchange for a flatter function (more regularization); suits noisy data

# Epsilon (ε) Parameter:
# High ε: Widens the tube within which errors are ignored, giving a simpler model with fewer support vectors; suits noisy targets
# Low ε: Narrows the tube, so small errors are penalized and more points become support vectors; suits precise targets

# Gamma (γ) Parameter (RBF, polynomial, and sigmoid kernels):
# High γ: Each training point influences only its close neighborhood, giving a flexible fit that can overfit
# Low γ: Each training point influences a wide region, giving a smoother fit that can underfit
# These are general guidelines, and the best approach is to experiment with different values and evaluate the model's performance using metrics like mean squared error (MSE) or R-squared.
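# One way to run such an experiment is a cross-validated grid search scored by MSE. This is a sketch on synthetic data;
# the dataset, the grid values, and the names X_reg, y_reg, svr_pipe, svr_grid, and svr_search are assumptions chosen
# for illustration, not recommended settings.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression data: noisy sine curve
rng = np.random.default_rng(42)
X_reg = rng.uniform(-3, 3, size=(120, 1))
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.2, size=120)

# Scale inside the pipeline so each CV fold fits its own scaler
svr_pipe = make_pipeline(StandardScaler(), SVR(kernel='rbf'))
svr_grid = {
    'svr__C': [0.1, 1, 10],
    'svr__epsilon': [0.01, 0.1, 0.5],
    'svr__gamma': [0.1, 1, 10],
}

# scikit-learn maximizes scores, so MSE is exposed as its negative
svr_search = GridSearchCV(svr_pipe, svr_grid, cv=5, scoring='neg_mean_squared_error')
svr_search.fit(X_reg, y_reg)

print("Best parameters:", svr_search.best_params_)
print("Best CV MSE:", -svr_search.best_score_)
```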

In [49]:
# Q5. Assignment:
# - Import the necessary libraries and load the dataset
# - Split the dataset into training and testing sets
# - Preprocess the data using any technique of your choice (e.g. scaling, normalization)
# - Create an instance of the SVC classifier and train it on the training data
# - Use the trained classifier to predict the labels of the testing data
# - Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy,
# precision, recall, F1-score)
# - Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to
# improve its performance
# - Train the tuned classifier on the entire dataset
# - Save the trained classifier to a file for future use.

# You can use any dataset of your choice for this assignment, but make sure it is suitable for
# classification and has a sufficient number of features and samples.

In [50]:
# Step 1: Import necessary libraries and load the dataset

In [51]:
import pandas as pd
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV

In [52]:
from sklearn.datasets import load_iris
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target

In [53]:
# Step 2: Split the dataset into training and testing sets

In [54]:
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [55]:
# Step 3: Preprocess the data using StandardScaler

In [56]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

In [57]:
# Step 4: Create an instance of the SVC classifier and train it on the training data

In [58]:
svm = SVC()
svm.fit(X_train_scaled, y_train)

In [59]:
# Step 5: Use the trained classifier to predict the labels of the testing data

In [60]:
y_pred = svm.predict(X_test_scaled)

In [61]:
# Step 6: Evaluate the performance of the classifier using accuracy and classification report

In [62]:
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, y_pred))

Accuracy: 1.0
Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30



In [63]:
# Step 7: Tune the hyperparameters of the SVC classifier using GridSearchCV

In [64]:
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf', 'poly'], 'gamma': ['scale', 'auto']}
grid_search = GridSearchCV(svm, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train_scaled, y_train)

print("Best Parameters:", grid_search.best_params_)
print("Best Score:", grid_search.best_score_)

Best Parameters: {'C': 10, 'gamma': 'scale', 'kernel': 'linear'}
Best Score: 0.9583333333333334
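
# The search above runs on data that was scaled once using the whole training set, so every CV fold sees statistics from
# its own validation split. A possible alternative, sketched here under the assumption that the same grid is wanted, puts
# the scaler and the classifier in a Pipeline so scaling is refit per fold. The names pipe, pipe_grid, and pipe_search
# are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_iris, y_iris, test_size=0.2, random_state=42)

# Scaler and classifier are tuned and cross-validated as a single estimator
pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
pipe_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__kernel': ['linear', 'rbf', 'poly'],
    'svc__gamma': ['scale', 'auto'],
}
pipe_search = GridSearchCV(pipe, pipe_grid, cv=5, scoring='accuracy')
pipe_search.fit(X_tr, y_tr)

print("Best Parameters:", pipe_search.best_params_)
print("Test accuracy:", pipe_search.score(X_te, y_te))
```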


In [65]:
# Step 8: Train the tuned classifier on the entire dataset

In [66]:
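# X_scaled from Q2 contains only the first two features, so rescale all four features of the full dataset
X_full_scaled = StandardScaler().fit_transform(X)
tuned_svm = SVC(**grid_search.best_params_)
tuned_svm.fit(X_full_scaled, y)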

In [67]:
# Step 9: Save the trained classifier to a file for future use



In [68]:
import joblib
joblib.dump(tuned_svm, 'tuned_svm_iris.joblib')

['tuned_svm_iris.joblib']
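
# A saved SVC expects inputs scaled the same way as its training data, so the scaler has to be available at prediction
# time as well. One option, sketched here, is to save a pipeline that bundles both. The hyperparameters are copied from
# the grid search output above; the file name svm_iris_pipeline.joblib and the temporary directory are assumptions made
# for this example.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_all, y_all = load_iris(return_X_y=True)

# Scaler and tuned classifier trained together on the entire dataset
final_model = make_pipeline(StandardScaler(), SVC(C=10, kernel='linear', gamma='scale'))
final_model.fit(X_all, y_all)

# Save to a temporary directory (hypothetical location for this sketch)
model_path = os.path.join(tempfile.mkdtemp(), 'svm_iris_pipeline.joblib')
joblib.dump(final_model, model_path)

# Reload and predict on raw, unscaled samples; the pipeline applies the scaling
loaded_model = joblib.load(model_path)
print(loaded_model.predict(X_all[:5]))
```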