
# Machine Learning Implementation Test

**Submitted by: Man Desai**

**1. Implement a Decision Tree Classifier with hyperparameter tuning and report its evaluation metrics.**

In [1]:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

In [2]:
#Loading the iris dataset from sklearn library
iris = load_iris()

In [4]:
X = iris.data
y = iris.target

In [5]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

The Iris dataset has 3 classes, so this is a multi-class classification problem.

In [6]:
#Splitting the dataset into training and testing set.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [8]:
#Initializing the decision tree classifier
dtree = DecisionTreeClassifier()

In [12]:
param_grid = {
    'criterion': ['gini'],
    'max_depth': [None, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'min_samples_split': [2, 5, 10],
    'max_features': ['sqrt', 'log2']
}

In [13]:
# Perform Grid Search to find the best hyperparameters
grid_search = GridSearchCV(dtree, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

In [14]:
#Printing the best hyperparameters
print("Best Parameters:", grid_search.best_params_)

Best Parameters: {'criterion': 'gini', 'max_depth': None, 'max_features': 'sqrt', 'min_samples_leaf': 4, 'min_samples_split': 5}


In [18]:
#Refitting a tree with the best hyperparameters printed above
dtree_best = DecisionTreeClassifier(criterion='gini', max_depth=None, max_features='sqrt',
                                    min_samples_leaf=4, min_samples_split=5)
dtree_best.fit(X_train, y_train)
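Retyping the best parameters by hand is easy to get wrong. As a minimal sketch, assuming the default `refit=True`, the fitted search object already holds a model retrained on the whole training set as `best_estimator_`. The `random_state=0` on the tree is an assumption added here so that the `max_features` subsampling is repeatable; it was not part of the run above, so the selected parameters may differ.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

param_grid = {
    'criterion': ['gini'],
    'max_depth': [None, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'min_samples_split': [2, 5, 10],
    'max_features': ['sqrt', 'log2'],
}

# random_state=0 is an assumption for repeatability, not the original setting
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5, scoring='accuracy')
search.fit(X_train, y_train)

# With refit=True (the default), best_estimator_ is already trained on X_train
best_tree = search.best_estimator_
test_accuracy = accuracy_score(y_test, best_tree.predict(X_test))
print("Best CV accuracy:", search.best_score_)
print("Test accuracy:", test_accuracy)
```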

In [20]:
#Predicting on test set
y_pred = dtree_best.predict(X_test)

#Evaluation
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = f1_score(y_test, y_pred, average='macro')

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1-score:", f1)

Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-score: 1.0
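Macro averages compress the three classes into a single number each. A per-class view could be sketched as below with `confusion_matrix` and `classification_report`. The tree uses the best parameters printed above; `random_state=0` is an assumption added for repeatability, so the numbers need not match the scores reported here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, classification_report

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42)

# random_state=0 is an assumption, not part of the original run
clf = DecisionTreeClassifier(criterion='gini', max_depth=None, max_features='sqrt',
                             min_samples_leaf=4, min_samples_split=5, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall and F1 for each class separately
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```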


**2. Implement a Naive Bayes Classifier with hyperparameter tuning**

In [21]:
from sklearn.naive_bayes import GaussianNB

In [22]:
#Initializing the naive bayes classifier
nb_clf = GaussianNB()

In [23]:
#Defining the hyperparameters
param_grid = {
    'var_smoothing': np.logspace(0,-9, num=100)
}
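For context on what this grid tunes: in scikit-learn's `GaussianNB`, `var_smoothing` is the fraction of the largest feature variance that is added to the variances for numerical stability. The sketch below checks this through the fitted attribute `epsilon_`; the value `1e-2` is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

var_smoothing = 1e-2  # arbitrary value for illustration
nb = GaussianNB(var_smoothing=var_smoothing).fit(X_train, y_train)

# Amount added to the variances: var_smoothing times the largest feature variance
expected_epsilon = var_smoothing * X_train.var(axis=0).max()
print(nb.epsilon_, expected_epsilon)

# The grid covers 100 values from 1 down to 1e-9 on a log scale
grid = np.logspace(0, -9, num=100)
print(grid[0], grid[-1])
```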

In [25]:
grid_search = GridSearchCV(nb_clf, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

In [26]:
print("Best Parameters:", grid_search.best_params_)

Best Parameters: {'var_smoothing': 0.02310129700083159}


In [27]:
nb_clf_best = GaussianNB(var_smoothing=0.02310129700083159)
nb_clf_best.fit(X_train, y_train)

In [29]:
#Prediction
y_pred = nb_clf_best.predict(X_test)

#Evaluation
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = f1_score(y_test, y_pred, average='macro')

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1-score:", f1)

Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1-score: 1.0


**3. Implement a Support Vector Machine with hyperparameter tuning**


In [30]:
from sklearn.svm import SVC

In [32]:
#Initializing the svm classifier
svc = SVC()

In [33]:
#Parameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto'],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid']
}

In [34]:
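#Same split as in section 1 (same test_size and random_state), repeated so this section can run on its own
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)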

In [35]:
grid_search = GridSearchCV(svc, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

In [36]:
print("Best Parameters:", grid_search.best_params_)

Best Parameters: {'C': 1, 'gamma': 'scale', 'kernel': 'poly'}


In [37]:
svc_best = SVC(C=1, gamma='scale', kernel='poly')
svc_best.fit(X_train, y_train)

In [38]:
#Predictions
y_pred = svc_best.predict(X_test)

#evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = f1_score(y_test, y_pred, average='macro')

print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1-score:", f1)

Accuracy: 0.9777777777777777
Precision: 0.9761904761904763
Recall: 0.9743589743589745
F1-score: 0.974320987654321


**4. Explain the various types of kernels with their formulas**

**Linear Kernel:**

$$K(x, y) = x^\top y$$

The linear kernel is the simplest kernel function: it is the dot product of the two input vectors, so the classifier stays linear in the original input space.
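As a small check of the formula, the dot product can be computed by hand and compared with scikit-learn's `linear_kernel`. The helper name `linear_k` and the two vectors are made up for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import linear_kernel

def linear_k(x, y):
    """K(x, y) = x^T y (hypothetical helper for illustration)."""
    return float(np.dot(x, y))

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])

manual = linear_k(x, y)
# The library function expects 2D arrays of shape (n_samples, n_features)
library = linear_kernel(x.reshape(1, -1), y.reshape(1, -1))[0, 0]
print(manual, library)
```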


**Sigmoid Kernel:**

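$$K(x, y) = \tanh(\alpha\, x^\top y + c)$$

The sigmoid kernel applies the hyperbolic tangent to a scaled and shifted dot product, with slope $\alpha$ and offset $c$. It comes from neural networks, where tanh is a common activation function. Unlike the other kernels listed here, it is not positive semi-definite for every choice of $\alpha$ and $c$.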
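A sketch of the same formula in NumPy, compared with scikit-learn's `sigmoid_kernel`, where `gamma` plays the role of α and `coef0` the role of c. The helper name `sigmoid_k` and the values of `alpha` and `c` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import sigmoid_kernel

def sigmoid_k(x, y, alpha, c):
    """K(x, y) = tanh(alpha * x^T y + c) (hypothetical helper for illustration)."""
    return float(np.tanh(alpha * np.dot(x, y) + c))

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 2.0])
alpha, c = 0.1, 1.0  # arbitrary values

manual = sigmoid_k(x, y, alpha, c)
library = sigmoid_kernel(x.reshape(1, -1), y.reshape(1, -1), gamma=alpha, coef0=c)[0, 0]
print(manual, library)
```

Because of the tanh, the kernel value always lies strictly between -1 and 1.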


**Gaussian Radial Basis Function Kernel:**

$$K(x, y) = \exp\left(-\gamma \lVert x - y \rVert^2\right)$$

The Gaussian kernel, also known as the Radial Basis Function (RBF) kernel, implicitly maps the input vectors into an infinite-dimensional space. It measures the similarity of two input vectors as a Gaussian function of the distance between them, so the value is 1 for identical vectors and decays towards 0 as they move apart. The parameter $\gamma$ controls the width of the Gaussian: a larger $\gamma$ gives a narrower Gaussian.
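The effect of gamma can be seen on a single pair of points. The sketch below computes the formula by hand, compares it with scikit-learn's `rbf_kernel`, and evaluates two gamma values; the helper name `rbf_k`, the vectors, and the gamma values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf_k(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2) (hypothetical helper for illustration)."""
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

x = np.array([1.0, 2.0])
y = np.array([2.0, 4.0])  # squared distance to x is 1 + 4 = 5

wide = rbf_k(x, y, gamma=0.1)     # wide Gaussian: the points still look similar
narrow = rbf_k(x, y, gamma=10.0)  # narrow Gaussian: similarity is almost 0
same = rbf_k(x, x, gamma=10.0)    # identical points always give 1

library = rbf_kernel(x.reshape(1, -1), y.reshape(1, -1), gamma=0.1)[0, 0]
print(wide, narrow, same, library)
```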

**Polynomial Kernel:**

$$K(x, y) = (x^\top y + c)^d$$

The polynomial kernel implicitly maps the input vectors into a higher-dimensional space of polynomial feature combinations. It has two parameters: the degree $d$ and the constant offset $c$.
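To make the "higher-dimensional space" concrete, here is a sketch for two-dimensional inputs with d = 2 and c = 0, where the explicit feature map is small enough to write out. The helper names `poly_k` and `phi` are made up for illustration. Note that scikit-learn's `polynomial_kernel` also scales the dot product by `gamma`, so `gamma=1.0` is passed to match the formula above.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def poly_k(x, y, c, d):
    """K(x, y) = (x^T y + c)^d (hypothetical helper for illustration)."""
    return float((np.dot(x, y) + c) ** d)

def phi(v):
    """Explicit feature map for 2D inputs with d=2, c=0: (v1^2, sqrt(2)*v1*v2, v2^2)."""
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])

via_kernel = poly_k(x, y, c=0.0, d=2)       # computed in the original 2D space
via_features = float(np.dot(phi(x), phi(y)))  # computed in the mapped 3D space
library = polynomial_kernel(x.reshape(1, -1), y.reshape(1, -1),
                            degree=2, gamma=1.0, coef0=0.0)[0, 0]
print(via_kernel, via_features, library)
```

The kernel gives the same number as the dot product in the mapped space without ever building the mapped vectors.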