# **Naive Bayes classification :**
# **Aim :**

The aim of this example is to demonstrate the application of Naive Bayes classification on the MNIST dataset, which consists of handwritten digits, and evaluate the performance of different Naive Bayes variants.

# **Title :**
"Naive Bayes Classification on MNIST Handwritten Digits Dataset"

# **Dataset Source :**
The MNIST dataset used in this example is a widely used dataset in machine learning and computer vision. Here it is loaded from OpenML through scikit-learn's `fetch_openml`; it is also available from other online repositories.

# **Theory :**
Naive Bayes is a probabilistic classification algorithm based on Bayes' theorem with the "naive" assumption of feature independence. In the context of this example, we will apply three variants of Naive Bayes classifiers:

**Gaussian Naive Bayes (GNB):**

Aim: To classify the data assuming features follow a Gaussian distribution.
Theory: GNB models feature likelihoods as Gaussian distributions. It is suitable for continuous data, such as the pixel values in the MNIST dataset.
Application: We apply GNB to the MNIST dataset to classify handwritten digits.

**Multinomial Naive Bayes (MNB):**

Aim: To classify the data with non-negative integer or count-like features.
Theory: MNB is suitable for discrete data, often used in text classification. It models the probability of a feature given a class as a multinomial distribution.
Application: MNB expects non-negative features, so it is not appropriate for the standardized MNIST features used here, which include negative values. It is imported in the code below but not fitted.

**Bernoulli Naive Bayes (BNB):**

Aim: To classify the data with binary features.
Theory: BNB is suitable for binary data, where features are binary variables (0s and 1s). It models the probability of features as a set of Bernoulli distributions.
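Application: MNIST pixel values are not binary, but scikit-learn's `BernoulliNB` binarizes its inputs at a threshold (`binarize=0.0` by default), so it is applied to the standardized features in the code below.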
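
To make the Gaussian variant concrete, here is a minimal pure-Python sketch of the idea: estimate a prior, a mean, and a variance per class and feature, then pick the class with the highest log posterior. The function names, the `eps` variance floor, and the toy data are assumptions for illustration only; this is not how scikit-learn's `GaussianNB` is implemented internally.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate a prior, per-feature means, and per-feature variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # eps is a hypothetical small floor that keeps variances strictly positive
        variances = [
            sum((v - m) ** 2 for v in col) / n + eps
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior (up to a shared constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        # "Naive" independence: per-feature Gaussian log-likelihoods simply add up
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical toy data: two continuous features, two classes
X_toy = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.4], [2.9, 0.6]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_model = fit_gaussian_nb(X_toy, y_toy)
near_class_0 = predict_gaussian_nb(toy_model, [1.1, 2.0])
near_class_1 = predict_gaussian_nb(toy_model, [3.1, 0.5])
print(near_class_0, near_class_1)
```

On MNIST the same computation runs over 784 pixel features and 10 classes.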





In [None]:
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import MinMaxScaler

In [None]:
# Load the MNIST dataset
mnist = fetch_openml("mnist_784")
X, y = mnist.data, mnist.target
y = y.astype(int)



In [None]:
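# Apply Min-Max scaling
# Note: the scaler is fitted on the full dataset before the split, and the
# StandardScaler applied below re-scales every feature again.
scaler = MinMaxScaler()
X = scaler.fit_transform(X)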

In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
# Gaussian Naive Bayes
gnb = GaussianNB()
gnb.fit(X_train, y_train)
gnb_predictions = gnb.predict(X_test)
gnb_accuracy = accuracy_score(y_test, gnb_predictions)
print("Gaussian Naive Bayes Accuracy:", gnb_accuracy)

Gaussian Naive Bayes Accuracy: 0.525


In [None]:
# Bernoulli Naive Bayes
bnb = BernoulliNB()
bnb.fit(X_train, y_train)
bnb_predictions = bnb.predict(X_test)
bnb_accuracy = accuracy_score(y_test, bnb_predictions)
print("Bernoulli Naive Bayes Accuracy:", bnb_accuracy)

Bernoulli Naive Bayes Accuracy: 0.834


# **Conclusion:**
In this example, we applied Naive Bayes classification to the MNIST dataset of handwritten digits. We fitted Gaussian Naive Bayes and Bernoulli Naive Bayes to the standardized pixel values. Multinomial Naive Bayes was imported but not fitted, since it expects non-negative features and standardization produces negative values.

Performance was measured by accuracy on the held-out test set. Gaussian Naive Bayes reached about 0.525, while Bernoulli Naive Bayes reached about 0.834. Although Gaussian Naive Bayes is the variant designed for continuous features, it performed worse here than the Bernoulli variant, which (with scikit-learn's default `binarize=0.0`) thresholds each standardized pixel at zero, that is, at its training-set mean. The choice of Naive Bayes variant therefore depends on how well its distributional assumption matches the data, not only on whether the features are continuous.
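
The Bernoulli variant fitted in the code above can be sketched in a few lines of pure Python. The sketch below assumes inputs are binarized with `value > threshold` and that Laplace smoothing with `alpha=1.0` is used; the function names and the toy data are hypothetical and only illustrate the idea.

```python
import math

def binarize(row, threshold=0.0):
    """Map each value to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if v > threshold else 0 for v in row]

def fit_bernoulli_nb(X, y, alpha=1.0, threshold=0.0):
    """Estimate a log prior and smoothed per-feature 'on' probabilities per class."""
    model = {}
    for c in sorted(set(y)):
        rows = [binarize(r, threshold) for r, label in zip(X, y) if label == c]
        n = len(rows)
        p_on = [(sum(col) + alpha) / (n + 2 * alpha) for col in zip(*rows)]
        model[c] = (math.log(n / len(X)), p_on)
    return model

def predict_bernoulli_nb(model, x, threshold=0.0):
    """Pick the class with the highest log posterior for the binarized input."""
    bits = binarize(x, threshold)
    def score(c):
        log_prior, p_on = model[c]
        # Both "on" and "off" features contribute to the likelihood
        return log_prior + sum(
            math.log(p) if bit else math.log(1 - p) for bit, p in zip(bits, p_on)
        )
    return max(model, key=score)

# Hypothetical standardized-looking features (values on both sides of zero)
X_toy = [[0.9, -1.1, 0.4], [1.2, -0.7, -0.2], [0.5, -0.9, 0.8],
         [-1.0, 0.8, 0.3], [-0.6, 1.1, -0.5], [-1.2, 0.9, 0.1]]
y_toy = ["a", "a", "a", "b", "b", "b"]
bnb_model = fit_bernoulli_nb(X_toy, y_toy)
bnb_prediction = predict_bernoulli_nb(bnb_model, [0.7, -0.3, 0.0])
print(bnb_prediction)
```

After standardization, a threshold of zero asks whether a pixel is brighter than its average, which is one plausible reason the binarized model copes reasonably well with digit images.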

# **Support Vector Machine (SVM) classification**
# **Aim :**
The aim of this example is to demonstrate the application of Support Vector Machine (SVM) classification on the MNIST dataset, which contains handwritten digits, and to evaluate the performance of different SVM variants, including Linear SVM, Polynomial SVM, and Radial Basis Function (RBF) SVM.

# **Title :**
"SVM Classification on MNIST Handwritten Digits Dataset"

# **Dataset Source :**
The MNIST dataset used in this example is a widely used dataset in machine learning and computer vision. It can be obtained from various sources, including the scikit-learn library or online repositories.

# **Theory :**
Support Vector Machine (SVM) is a powerful supervised machine learning algorithm used for classification and regression tasks. In this example, we applied three variants of SVM classifiers:

**Linear SVM:**

Aim: To classify data using a linear decision boundary.
Theory: Linear SVM seeks to find the optimal hyperplane that best separates the data into different classes while maximizing the margin between the classes. It works well when data is linearly separable.
Application: Linear SVM is applied to the MNIST dataset to classify handwritten digits using a linear decision boundary.

**Polynomial SVM:**

Aim: To classify data using a polynomial decision boundary.
Theory: Polynomial SVM allows for more complex decision boundaries by applying polynomial kernel functions. It can capture non-linear relationships in the data.
Application: Polynomial SVM is applied to the MNIST dataset to classify handwritten digits using polynomial decision boundaries.

**Radial Basis Function (RBF) SVM:**

Aim: To classify data using an RBF kernel-based decision boundary.
Theory: RBF SVM uses a radial basis function kernel to create complex, non-linear decision boundaries. It is highly flexible and can capture intricate patterns in the data.
Application: RBF SVM is applied to the MNIST dataset to classify handwritten digits using RBF kernel-based decision boundaries.
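
The three variants differ only in the kernel function used to compare two samples. The sketch below writes the kernels out in pure Python. The parameter names mirror `SVC`'s `degree`, `gamma`, and `coef0`, but the functions themselves and the default values chosen here (for example `gamma=1.0`) are assumptions for illustration, not scikit-learn's defaults.

```python
import math

def linear_kernel(a, b):
    """Plain dot product: corresponds to a linear decision boundary."""
    return sum(x * y for x, y in zip(a, b))

def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=0.0):
    """(gamma * <a, b> + coef0) ** degree."""
    return (gamma * linear_kernel(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    """exp(-gamma * ||a - b||^2): similarity decays with squared distance."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

u, v = [1.0, 2.0], [2.0, 0.5]
print(linear_kernel(u, v))      # 3.0
print(polynomial_kernel(u, v))  # 27.0
print(rbf_kernel(u, v))
print(rbf_kernel(u, u))         # 1.0 (identical points are maximally similar)
```

Because the polynomial and RBF kernels depend on `gamma` (and `degree`, `coef0`), their accuracy can change considerably with these settings and with feature scaling.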

In [None]:
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

In [None]:
# Load the MNIST dataset
mnist = fetch_openml("mnist_784")
X, y = mnist.data, mnist.target
y = y.astype(int)



In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Standardize the features (optional)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
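# Linear SVM
# Note: fitted on the first 1,000 training samples only; the polynomial and RBF models below use the same subset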
linear_svm = SVC(kernel='linear', C=1.0)
linear_svm.fit(X_train[:1000], y_train[:1000])
linear_svm_predictions = linear_svm.predict(X_test)
linear_svm_accuracy = accuracy_score(y_test, linear_svm_predictions)
print("Linear SVM Accuracy:", linear_svm_accuracy)

Linear SVM Accuracy: 0.8907142857142857


In [None]:
# Polynomial SVM
poly_svm = SVC(kernel='poly', degree=3, C=1.0)
poly_svm.fit(X_train[:1000], y_train[:1000])
poly_svm_predictions = poly_svm.predict(X_test)
poly_svm_accuracy = accuracy_score(y_test, poly_svm_predictions)
print("Polynomial SVM Accuracy:", poly_svm_accuracy)

Polynomial SVM Accuracy: 0.479


In [None]:
# Radial Basis Function (RBF) SVM
rbf_svm = SVC(kernel='rbf', C=1.0)
rbf_svm.fit(X_train[:1000], y_train[:1000])
rbf_svm_predictions = rbf_svm.predict(X_test)
rbf_svm_accuracy = accuracy_score(y_test, rbf_svm_predictions)
print("RBF SVM Accuracy:", rbf_svm_accuracy)

RBF SVM Accuracy: 0.8794285714285714


# **Conclusion:**

In this example, we aimed to apply Support Vector Machine (SVM) classification to the MNIST dataset, which contains handwritten digits. We demonstrated the use of three different SVM variants (Linear SVM, Polynomial SVM, and RBF SVM) on this dataset.

The performance of each SVM variant was measured by accuracy on the test set, with each model fitted on the first 1,000 training samples. Linear SVM reached about 0.891 and RBF SVM about 0.879, while Polynomial SVM (degree 3) reached only about 0.479. More flexible kernels did not automatically perform better here: the choice of SVM variant depends on the data, the amount of training data, and the kernel hyperparameters.

In conclusion, SVMs are versatile classifiers that can be used to classify complex datasets, and the choice of the SVM type depends on the specific characteristics of the data and the problem at hand.

# **Artificial Neural Network (ANN)**
# **Aim:**
The aim of this example is to demonstrate the application of an Artificial Neural Network (ANN) for classification on the MNIST dataset, which contains handwritten digits, and evaluate its performance.

# **Title:**
"Handwritten Digit Classification Using Artificial Neural Networks (ANN) on the MNIST Dataset"

# **Dataset Source:**
The MNIST dataset used in this example is a well-known dataset in the field of machine learning and computer vision. It is available from various sources, including the scikit-learn library and online repositories.

# **Theory :**
Artificial Neural Networks (ANNs) are a class of machine learning models inspired by the human brain. ANNs consist of interconnected layers of artificial neurons that process and transform data. In this example, we applied a feedforward neural network for image classification using the MNIST dataset. Here's an explanation of the key components:

**Input Layer:**

Aim: To receive input data, which are pixel values of the MNIST images (28x28 pixels).
Theory: The input layer has 784 neurons, corresponding to the 28x28 pixel values in each image.

**Hidden Layer(s):**

Aim: To capture complex patterns and features in the data.
Theory: One hidden layer with 128 neurons and another hidden layer with 64 neurons are used. These layers apply activation functions (ReLU) to model non-linear relationships within the data.

**Output Layer:**

Aim: To produce class predictions for the 10 possible digits (0-9).
Theory: The output layer consists of 10 neurons with a softmax activation function, which assigns probabilities to each class, enabling multi-class classification.

**Training:**

Aim: To adjust the weights of the network to minimize a loss function.
Theory: The model is trained using the backpropagation algorithm and the Adam optimizer. It minimizes the sparse categorical cross-entropy loss function.

**Evaluation:**

Aim: To assess the model's performance.
Theory: The model is evaluated on a separate test dataset to calculate accuracy, which measures the percentage of correctly classified digits.
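
The components above can be traced by hand on a very small network. The following pure-Python sketch runs one forward pass (dense layer, ReLU, dense layer, softmax) and computes the sparse categorical cross-entropy for one sample. The layer sizes, weights, and helper names are hypothetical and chosen only to keep the arithmetic readable; Keras performs the same steps with matrix operations.

```python
import math

def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output neuron."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]

def relu(z):
    """ReLU activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in z]

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(z)  # subtracting the max improves numerical stability
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, true_class):
    """Negative log-probability assigned to the integer class label."""
    return -math.log(probs[true_class])

# Hypothetical tiny network: 3 inputs -> 2 hidden neurons (ReLU) -> 3 outputs (softmax)
W1, b1 = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]
W2, b2 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]

x = [1.0, 2.0, 3.0]
hidden = relu(dense(x, W1, b1))
probs = softmax(dense(hidden, W2, b2))
loss = sparse_categorical_crossentropy(probs, true_class=2)
print(hidden, probs, loss)
```

For the architecture described above (784 inputs, hidden layers of 128 and 64 neurons, 10 outputs), the number of trainable parameters is (784 × 128 + 128) + (128 × 64 + 64) + (64 × 10 + 10) = 109,386.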


In [None]:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

In [None]:
# Load the MNIST dataset
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

In [None]:
mnist = fetch_openml("mnist_784")
X, y = mnist.data, mnist.target
y = y.astype(int)



In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Standardize the features (optional)
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
# Build the ANN model
model = Sequential()

# First hidden layer with 128 neurons and ReLU activation; input_shape declares the 784 input features (MNIST image size)
model.add(Dense(128, activation='relu', input_shape=(784,)))

# Hidden layer with 64 neurons and ReLU activation
model.add(Dense(64, activation='relu'))

# Output layer with 10 neurons (for the 10 digits) and softmax activation
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model (the test set is also passed as validation data here)
model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(X_test, y_test))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7f63dc8b7010>

In [None]:
# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_accuracy)

Test Accuracy: 0.9685714244842529


# **Conclusion:**
In this example, we aimed to apply an Artificial Neural Network (ANN) to the MNIST dataset, a classic handwritten digit classification problem. We used a feedforward neural network architecture with two hidden layers, trained it on the MNIST training data, and evaluated its performance on a test dataset.

The ANN model achieved a test accuracy of about 0.969, demonstrating its ability to learn complex patterns and features from the image data. The choice of network architecture, activation functions, and optimization algorithm can significantly affect the model's performance, and the model can be tuned further to improve on this result.

In conclusion, ANN models are effective for image classification tasks like MNIST, and their performance can be improved with additional architectural modifications and hyperparameter tuning.

# **K-Nearest Neighbor (KNN) Algorithm**
# **Theory :**
* K-Nearest Neighbors (K-NN) is one of the simplest machine learning algorithms and is based on the supervised learning technique.
* K-NN assumes that similar cases lie close together, so it assigns a new case to the category whose stored cases it most resembles.
* K-NN stores all the available data and classifies a new data point based on similarity. When new data appears, it can therefore be assigned to a well-suited category.
* K-NN can be used for regression as well as classification, but it is mostly used for classification problems.
* K-NN is a non-parametric algorithm, which means it makes no assumptions about the underlying data distribution.
* It is also called a lazy learner because it does not learn a model from the training set. During training it only stores the dataset, and it defers all computation until a new data point has to be classified.
* Example: Suppose we have an image of a creature that looks similar to both a cat and a dog, and we want to know which of the two it is. Because K-NN works on a similarity measure, it compares the features of the new image with those of the stored cat and dog images and assigns it to the category with the most similar features.
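
The points above can be condensed into a minimal pure-Python sketch: store the training data, measure the Euclidean distance from the query to every stored sample, and take a majority vote among the `k` closest. The function name `knn_predict`, the two features, and the cat/dog values are hypothetical illustrations.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # "Lazy" learning: all work happens here, at prediction time
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: [body weight in kg, shoulder height in cm]
train_X = [[4.0, 25.0], [5.0, 23.0], [3.5, 24.0],
           [20.0, 55.0], [25.0, 60.0], [30.0, 58.0]]
train_y = ["cat", "cat", "cat", "dog", "dog", "dog"]

small_animal = knn_predict(train_X, train_y, [4.5, 26.0], k=3)
large_animal = knn_predict(train_X, train_y, [24.0, 57.0], k=3)
print(small_animal, large_animal)
```

With `k=1`, as in the scikit-learn code in this notebook, the prediction is simply the label of the single closest training sample.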

# **Decision Tree :**

In [None]:
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

In [None]:
# Load the MNIST dataset
mnist = fetch_openml("mnist_784")



In [None]:
X, y = mnist.data, mnist.target
y = y.astype(int)

In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Standardize the features (optional)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
# Fit the K-NN classifier to the training set
# n_neighbors=1 with the Minkowski metric and p=2 gives 1-nearest-neighbor with Euclidean distance
knnClassifier = KNeighborsClassifier(n_neighbors=1, metric='minkowski', p=2)
knnClassifier.fit(X_train, y_train)

In [None]:
#Predicting the KNN test set result
y_predKnn= knnClassifier.predict(X_test)
knn_accuracy = accuracy_score(y_test, y_predKnn)
print("KNeighborsClassifier Accuracy:", knn_accuracy)

KNeighborsClassifier Accuracy: 0.9454285714285714


In [None]:
#Fitting Decision Tree classifier to the training set
from sklearn.tree import DecisionTreeClassifier
DecClassifier= DecisionTreeClassifier(criterion='entropy', random_state=0)
DecClassifier.fit(X_train, y_train)

In [None]:
#Predicting the Decision Tree test set result
y_predDec= DecClassifier.predict(X_test)
dec_accuracy = accuracy_score(y_test, y_predDec)
print("DecisionTreeClassifier Accuracy:", dec_accuracy)

DecisionTreeClassifier Accuracy: 0.8792142857142857


In [None]:
# Create the confusion matrix for KNN (rows: true digits, columns: predicted digits)
from sklearn.metrics import confusion_matrix
cmk= confusion_matrix(y_test, y_predKnn)
print(cmk)

[[1312    1    4    3    0    6   14    1    2    0]
 [   1 1586    6    0    2    0    1    2    1    1]
 [  13   13 1291   19    5    4    9   10   11    5]
 [   1    2   11 1344    2   25    1   20   16   11]
 [   0    6    8    0 1215    3    6   10    1   46]
 [   6    2    1   41    5 1173   17    0   20    8]
 [  14    4    1    1    4    8 1361    0    2    1]
 [   3   14    5    6   16    1    0 1406    1   51]
 [   8   10   10   28    3   36    4   10 1229   19]
 [   5    3    5    8   28    4    0   44    4 1319]]


In [None]:
# Create the confusion matrix for the Decision Tree
from sklearn.metrics import confusion_matrix
cmd= confusion_matrix(y_test, y_predDec)
print(cmd)

[[1237    1   21    4    7   11   19    8   18   17]
 [   1 1541    8    8    4    8    1   12   13    4]
 [  10   17 1201   31   17   14   22   29   29   10]
 [   4   10   44 1206    5   57    8   27   44   28]
 [   3    2   10    9 1136    8   20   13   26   68]
 [  22   15   13   69   12 1042   30    8   39   23]
 [  11    7   23    5   21   28 1261    6   27    7]
 [   4   10   38   16   15    7    2 1369    9   33]
 [   8   18   22   50   35   28   26   12 1117   41]
 [   8    7   14   24   81   31    5   26   25 1199]]


In [None]:
# Random Forest
# Import the libraries used in this section
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

In [None]:
# Load the MNIST dataset
mnist = fetch_openml("mnist_784")



In [None]:
X, y = mnist.data, mnist.target
y = y.astype(int)

In [None]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
# Standardize the features (optional)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

In [None]:
# Fit the Random Forest classifier to the training set
from sklearn.ensemble import RandomForestClassifier
classifier= RandomForestClassifier(n_estimators= 10, criterion="entropy")
classifier.fit(X_train, y_train)

In [None]:
#Predicting the test set result
y_pred= classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("RandomForestClassifier Accuracy:",accuracy)

RandomForestClassifier Accuracy: 0.9441428571428572


In [None]:
#Creating the Confusion matrix
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)
print(cm)

[[1315    0    8    0    1    3    6    1    7    2]
 [   0 1581    5    4    1    1    0    4    3    1]
 [  10    4 1320    6    6    2    9   11    9    3]
 [   4    3   38 1324    1   21    4   12   16   10]
 [   5    2    6    1 1230    0    4    4    3   40]
 [   8    7    6   49    4 1169    9    2   14    5]
 [   8    2    6    0   12   12 1352    1    3    0]
 [   3    9   21    4   15    1    0 1421    3   26]
 [   4   15   39   32    6   24    7    9 1212    9]
 [   7    8    3   22   39   15    3   20    9 1294]]
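
A confusion matrix like the one above contains everything needed to recompute accuracy and per-class metrics. The sketch below copies the Random Forest matrix printed above into a plain list of lists (rows are true digits, columns are predicted digits) and derives accuracy, recall, and precision from it. The variable names are illustrative.

```python
# Random Forest confusion matrix from the output above
# (rows: true digit 0-9, columns: predicted digit 0-9)
rf_cm = [
    [1315, 0, 8, 0, 1, 3, 6, 1, 7, 2],
    [0, 1581, 5, 4, 1, 1, 0, 4, 3, 1],
    [10, 4, 1320, 6, 6, 2, 9, 11, 9, 3],
    [4, 3, 38, 1324, 1, 21, 4, 12, 16, 10],
    [5, 2, 6, 1, 1230, 0, 4, 4, 3, 40],
    [8, 7, 6, 49, 4, 1169, 9, 2, 14, 5],
    [8, 2, 6, 0, 12, 12, 1352, 1, 3, 0],
    [3, 9, 21, 4, 15, 1, 0, 1421, 3, 26],
    [4, 15, 39, 32, 6, 24, 7, 9, 1212, 9],
    [7, 8, 3, 22, 39, 15, 3, 20, 9, 1294],
]

n_classes = len(rf_cm)
correct = sum(rf_cm[i][i] for i in range(n_classes))  # diagonal = correct predictions
total = sum(sum(row) for row in rf_cm)                # all test samples
rf_accuracy = correct / total

# Recall: share of each true digit that was predicted correctly (row-wise)
recall = [rf_cm[i][i] / sum(rf_cm[i]) for i in range(n_classes)]
# Precision: share of each predicted digit that was actually correct (column-wise)
precision = [rf_cm[i][i] / sum(row[i] for row in rf_cm) for i in range(n_classes)]

print(correct, total, rf_accuracy)
print([round(r, 3) for r in recall])
```

The diagonal sums to 13,218 of 14,000 test samples, which reproduces the reported accuracy of about 0.944. Digit 8 has the lowest recall in this matrix (1,212 of 1,357 correctly classified).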


# **Aim :**
The aim of this study is to apply various machine learning algorithms, including Decision Trees, K-Nearest Neighbors (KNN), and Neural Networks (ANN), to the MNIST dataset, which contains grayscale images of handwritten digits (0-9). The objective is to perform handwritten digit classification and evaluate the performance of these algorithms.

# **Title :**
"Handwritten Digit Classification Using Multiple Machine Learning Algorithms on the MNIST Dataset"

# **Dataset Source :**
The MNIST dataset used in this study is sourced from the MNIST database, a widely used dataset in machine learning research. It comprises 28x28 pixel images of handwritten digits, totaling 70,000 examples, where 60,000 are training samples and 10,000 are test samples. In the code above, all 70,000 images are loaded with `fetch_openml` and re-split 80/20 with `train_test_split`.

Concise theoretical summaries of Decision Trees, K-Nearest Neighbors (KNN), and Neural Networks (ANN) follow.

# **Decision Tree :**
# **Theory (Explanation of Algorithm) :**
Decision Tree is a supervised learning algorithm used for classification and regression tasks. The algorithm works by partitioning the dataset into subsets based on the values of attributes. Here are the key components:

**Tree Structure:**

Aim: Create a tree-like structure for classification.
Theory: Decision Trees recursively partition the feature space based on the most discriminative features. Nodes represent features, branches depict decision rules, and leaf nodes hold class labels.

**Splitting Criteria:**

Aim: Determine the best feature and value for splitting.
Theory: Decision Trees use impurity measures like Gini Index or Entropy to find the most informative splits that maximize information gain.

**Training:**

Aim: Build the tree by recursively partitioning the data.
Theory: The model is trained by splitting data at each node based on feature values. The process continues until reaching a stopping criterion.

**Prediction:**

Aim: Classify new instances.
Theory: During prediction, new instances traverse the tree from the root node to leaf nodes, following decision rules to predict class labels.
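
The splitting criteria mentioned above can be computed directly. This pure-Python sketch defines entropy, Gini impurity, and information gain for a candidate split. The function names and the toy label lists are hypothetical; a real tree learner would evaluate many candidate thresholds per feature and keep the split with the highest gain.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: probability of mislabeling a randomly drawn sample."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted

parent = [0, 0, 0, 0, 1, 1, 1, 1]
perfect_gain = information_gain(parent, [0, 0, 0, 0], [1, 1, 1, 1])
useless_gain = information_gain(parent, [0, 0, 1, 1], [0, 0, 1, 1])

print(entropy(parent))  # 1.0 (two equally likely classes)
print(gini(parent))     # 0.5
print(perfect_gain)     # 1.0 (children are pure)
print(useless_gain)     # 0.0 (children are as mixed as the parent)
```

The `criterion='entropy'` setting used for `DecisionTreeClassifier` in this notebook selects the entropy-based measure.
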
# **K-Nearest Neighbors (KNN) :**
# **Theory (Explanation of Algorithm) :**
K-Nearest Neighbors is a non-parametric and instance-based learning algorithm for classification.

**Nearest Neighbor Search:**

Aim: Find the training samples closest to a new instance.
Theory: KNN classifies an instance by finding its k nearest neighbors based on a distance metric (e.g., Euclidean distance).

**Decision Rule:**

Aim: Assign class label based on majority voting.
Theory: The class label of the majority of the k-nearest neighbors is assigned to the new instance.

**Hyperparameter k:**

Aim: Determine the number of neighbors.
Theory: The choice of the hyperparameter k significantly affects the model's performance.

**Prediction:**

Aim: Classify new instances.
Theory: KNN predicts by selecting the class label based on the majority vote among its k-nearest neighbors.