# Neural Networks 

- Author: Stan Baek  
- Department of Electrical & Computer Engineering
- United States Air Force Academy
- Date: Aug 08, 2023  

*2024-11-26: Comments heavily revised by Stan Baek*


**A note on this document**
This document is a Jupyter notebook, which lets text and executable code coexist in an easy-to-read format. Each block contains either text or executable code. To run a code block, press `Shift+Enter` or `Ctrl+Enter`, or click the arrow on the block. Earlier blocks of code must be run before the later blocks that depend on them.

## MNIST-784

MNIST-784 is a widely used dataset in machine learning and computer vision. MNIST stands for the "Modified National Institute of Standards and Technology" database. The dataset contains a large collection of handwritten digits (0 through 9), commonly used for training and testing machine learning models, especially for image classification. Each image is grayscale with a resolution of 28x28 pixels, which gives the 784 pixels per image that the dataset name refers to.
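
As a minimal sketch of the 28x28 = 784 relationship, the snippet below reshapes a flattened vector of 784 values into a 28x28 grid. The array `flat` is a hypothetical stand-in for one row of the dataset, not real image data.

```python
import numpy as np

# Hypothetical flattened image: 784 values standing in for one MNIST row.
flat = np.arange(784)

# Row-major reshape: the first 28 values become the top row of the image.
image = flat.reshape(28, 28)

print(image.shape)    # (28, 28)
print(image[1, 0])    # 28, the first value of the second row
```

The same `reshape(28, 28)` call is what turns a dataset row back into a displayable image.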

In [None]:
# Import necessary libraries
from sklearn import datasets
import joblib
from pathlib import Path

# Download the MNIST dataset on first use and cache it locally

mnist_filename = "./data/mnist_784.pkl"
path = Path(mnist_filename)

if not path.is_file():

    Path("data").mkdir(parents=True, exist_ok=True)

    # download the dataset. It will take about a minute.
    mnist_dataset = datasets.fetch_openml("mnist_784")
    joblib.dump(mnist_dataset, mnist_filename)

# Load the MNIST dataset
mnist = joblib.load(mnist_filename)

# Split the data into features and labels
features = mnist.data
labels = mnist.target

# features is a pandas.core.frame.DataFrame object
# info() prints its summary directly and returns None, so no print() is needed
features.info()

# labels is a pandas.core.series.Series object
labels.info()

As the output above shows, the dataset comprises 70,000 entries. Let's display the first few entries.

In [None]:
import matplotlib.pyplot as plt

# Reshape and display the first few images
rows = 5  # Change this value to display more images
cols = 10  # Change this value to display more images
for i in range(rows):
    for j in range(cols):
        data = features.iloc[i * cols + j]
        image = data.to_numpy().reshape(28, 28)
        plt.subplot(rows, cols, i * cols + j + 1)
        plt.imshow(image, cmap="gray")
        plt.title(f"{labels.iloc[i*cols+j]}")
        plt.axis("off")

In [None]:
# features is a pandas.core.frame.DataFrame object
print(features.shape)

print(features.head())

Let's print labels.

In [None]:
print(labels.head())

The very first step is to normalize the pixel values to the range [0, 1]. For the MNIST dataset, which consists of grayscale pixel values ranging from 0 to 255, normalizing by dividing all pixel values by 255 is a straightforward way to achieve this. This normalization ensures that each pixel value is between 0 and 1, making the dataset more suitable for training neural networks and other machine learning models.
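
A small sketch of this scaling on a hypothetical handful of pixel values (not taken from the dataset) shows the effect on range and data type:

```python
import numpy as np

# Hypothetical 8-bit grayscale pixel values covering the full 0-255 range.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Dividing by a float promotes the result to float64 and maps 0..255 to 0..1.
scaled = pixels / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```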

In [None]:
from sklearn.model_selection import train_test_split

# Preprocess the data
features_normalized = features / 255.0  # Normalize the pixel values to the range [0, 1]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    features_normalized, labels, test_size=0.2, random_state=42
)

### Deliverable 2A

Complete Deliverable 1 first and come back here.


Use the SVM method to classify the handwritten digits. 

Use the `Radial Basis Function (RBF) kernel` to classify this dataset. The RBF kernel is a good choice when the decision boundary is not expected to be linear or polynomial. It is a versatile kernel for handling non-linear data.
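
To give some intuition for the RBF kernel, here is a minimal NumPy sketch of the standard formula $K(x, y) = \exp(-\gamma \lVert x - y \rVert^2)$. The helper name `rbf_kernel_value` and the value `gamma=0.1` are assumptions chosen for illustration; this is not the code the deliverable asks for.

```python
import numpy as np

def rbf_kernel_value(x, y, gamma):
    """Return exp(-gamma * ||x - y||^2) for two 1-D vectors (illustrative helper)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

a = [0.0, 0.0]
b = [3.0, 4.0]  # Euclidean distance 5 from a, so squared distance 25

print(rbf_kernel_value(a, a, gamma=0.1))  # identical points give 1.0
print(rbf_kernel_value(a, b, gamma=0.1))  # exp(-2.5), roughly 0.082
```

The similarity decays smoothly with distance, which is what lets an SVM with this kernel form curved decision boundaries.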

**You can copy and paste your Lab 7 code.**

**Ensure you import necessary libraries**

Warning: It will take about 3-7 minutes.

In [None]:
# TODO: Write your code for SVM with RBF kernel that classifies the handwritten digits.
# Ensure you print out the accuracy score.



In [None]:
import joblib
from sklearn.metrics import accuracy_score

file_path = "lab8_svm_rbf_classifier.joblib"

# Save the object to a file
joblib.dump(svm_rbf_classifier, file_path)

# Load the object from the file
svm_rbf_classifier = joblib.load(file_path)

# Evaluate the model
y_pred = svm_rbf_classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

Use the following index of misclassification to display the misclassified digits.

In [None]:
import numpy as np

mis_indices_svm = np.where(y_pred != y_test.values)[0]  # index of misclassification

# TODO: Write your code to display the first 10 misclassified digits.
# You should add the predicted values on top of the digits.
# To do this, use plt.title.



### Deliverable 1

Use `MLPClassifier` to classify the handwritten digits.

Use the following parameters:

- one hidden layer with 100 neurons
- `sgd` for the solver
- random state = 1
- initial learning rate = 0.1
- default max iterations (200)
- default tolerance (1e-4)
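
To make the network shape concrete, the sketch below runs one forward pass through a 784-100-10 network in plain NumPy. It uses ReLU in the hidden layer and softmax at the output, which match the `MLPClassifier` defaults for multiclass problems. The weights and the input are random placeholders (assumptions for illustration), so the output probabilities are meaningless; only the shapes and the parameter count matter here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out = 784, 100, 10

# Random placeholder weights; a trained network would have learned values.
W1 = rng.normal(scale=0.01, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.01, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

x = rng.random(n_in)                 # stand-in for one normalized image

hidden = np.maximum(0.0, x @ W1 + b1)  # ReLU hidden layer
logits = hidden @ W2 + b2
exp = np.exp(logits - logits.max())    # shift for numerical stability
probs = exp / exp.sum()                # softmax over the 10 digit classes

n_params = W1.size + b1.size + W2.size + b2.size
print(probs.shape)   # (10,)
print(n_params)      # 79510
```

The count 79,510 comes from 784 x 100 + 100 weights and biases in the hidden layer plus 100 x 10 + 10 in the output layer.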


In [None]:
# Import necessary libraries
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, classification_report


# Create an MLP Classifier
# You may need to adjust the hyperparameters (hidden_layer_sizes, activation, etc.) based on your needs
clf = MLPClassifier(
    hidden_layer_sizes=(100,),
    max_iter=200,
    solver="sgd",
    verbose=1,
    tol=1e-4,
    random_state=1,
    learning_rate_init=0.1,
)

# Train the classifier
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Display classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

### Deliverable 2B

Complete Deliverable 2A and come back here.

Let's confirm whether the digits misclassified by the SVM classifier are now correctly classified by NN. Display these misclassified digits along with their predicted values from NN. To do this, use the same `mis_indices_svm` in conjunction with the `y_pred` from the NN classifier.
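
The indexing idea can be sketched on toy arrays. Everything below is hypothetical: `pred_a` and `pred_b` stand in for the predictions of two classifiers, and the labels are strings, as they are in the OpenML version of the dataset.

```python
import numpy as np

y_true = np.array(["3", "5", "8", "1", "0"])
pred_a = np.array(["3", "6", "8", "7", "0"])  # hypothetical first classifier
pred_b = np.array(["3", "5", "8", "7", "0"])  # hypothetical second classifier

# Positions where the first classifier is wrong.
mis_a = np.where(pred_a != y_true)[0]

# Of those positions, the ones the second classifier gets right.
fixed_by_b = mis_a[pred_b[mis_a] == y_true[mis_a]]

print(mis_a)       # [1 3]
print(fixed_by_b)  # [1]
```

Indexing the second prediction array with the first classifier's error positions is what lets the two classifiers be compared on exactly the same samples.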

In [None]:
# TODO: Write your code to display the first 10 digits misclassified by SVM.
# You should add the values predicted by NN on top of the digits.



Use 5-fold cross-validation to find the accuracy scores and their average.
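
As a rough sketch of the mechanics, the example below runs 5-fold cross-validation with `cross_val_score`. To keep it fast it makes two substitutions, both assumptions for illustration only: it uses the small 8x8 `load_digits` dataset bundled with scikit-learn instead of MNIST-784, and a k-nearest-neighbors classifier as a stand-in for the model you are evaluating.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Small 8x8 digits dataset bundled with scikit-learn (not MNIST-784).
X, y = load_digits(return_X_y=True)

# Stand-in classifier, chosen only because it runs quickly.
model = KNeighborsClassifier(n_neighbors=3)

# cv=5 fits the model five times, each time scoring on a different held-out fold.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")

print("Fold accuracies:", scores)
print(f"Mean accuracy: {scores.mean():.4f}")
```

Any scikit-learn estimator can be passed in place of `model`; the returned array has one accuracy score per fold.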

In [None]:
# TODO: Perform 5-fold cross-validation


### Deliverable 3

Use different parameters for `MLPClassifier`.

In [None]:
# Two hidden layers with 100 neurons each. The rest are the same as Deliverable 1.

# Create an MLP Classifier
# You may need to adjust the hyperparameters (hidden_layer_sizes, activation, etc.) based on your needs
clf = MLPClassifier(
    hidden_layer_sizes=(100, 100),
    max_iter=200,
    solver="sgd",
    verbose=1,
    tol=1e-4,
    random_state=1,
    learning_rate_init=0.1,
)

# Train the classifier
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Display classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

In [None]:
# The activation function is sigmoid ("logistic" in scikit-learn). The rest are the same as Deliverable 1.

# Create an MLP Classifier
# You may need to adjust the hyperparameters (hidden_layer_sizes, activation, etc.) based on your needs
clf = MLPClassifier(
    hidden_layer_sizes=(100,),
    max_iter=200,
    activation="logistic",
    solver="sgd",
    verbose=1,
    tol=1e-4,
    random_state=1,
    learning_rate_init=0.1,
)

# Train the classifier
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Display classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

In [None]:
# learning_rate_init = default. The rest are the same as Deliverable 1.

# Create an MLP Classifier
# You may need to adjust the hyperparameters (hidden_layer_sizes, activation, etc.) based on your needs
clf = MLPClassifier(
    hidden_layer_sizes=(100,),
    max_iter=200,
    solver="sgd",
    verbose=1,
    tol=1e-4,
    random_state=1,
)

# Train the classifier
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Display classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

In [None]:
# tol is 1e-6. The rest are the same as Deliverable 1.

# Create an MLP Classifier
# You may need to adjust the hyperparameters (hidden_layer_sizes, activation, etc.) based on your needs
clf = MLPClassifier(
    hidden_layer_sizes=(100,),
    max_iter=200,
    solver="sgd",
    verbose=1,
    tol=1e-6,
    random_state=1,
    learning_rate_init=0.1,
)

# Train the classifier
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Evaluate the performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

# Display classification report
print("Classification Report:\n", classification_report(y_test, y_pred))

TODO: `Discuss your findings in terms of the number of iterations, accuracy, computational time, and the final loss.`
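
One possible way to collect these numbers is sketched below. A fitted `MLPClassifier` exposes `n_iter_` (iterations run) and `loss_` (final loss), and `time.perf_counter` can time the call to `fit`. The sketch trains on the small bundled `load_digits` dataset with a 20-neuron hidden layer so that it finishes quickly; both choices are assumptions for illustration and differ from the deliverables above.

```python
import time

from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

# Small 8x8 digits dataset (not MNIST-784); its pixel values range from 0 to 16.
X, y = load_digits(return_X_y=True)
X = X / 16.0

clf = MLPClassifier(
    hidden_layer_sizes=(20,),  # smaller than the deliverables, to keep this fast
    solver="sgd",
    learning_rate_init=0.1,
    random_state=1,
    max_iter=200,
)

start = time.perf_counter()
clf.fit(X, y)
elapsed = time.perf_counter() - start

print(f"Iterations run: {clf.n_iter_}")
print(f"Final loss:     {clf.loss_:.4f}")
print(f"Training time:  {elapsed:.2f} s")
```

Recording the same three values, together with the test accuracy, for each configuration in Deliverable 3 gives a compact basis for the discussion.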

If you see the following message, click on `scrollable element`. 

```
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...
```
