# Digit Recognition Using the MNIST Dataset

## 1. Introduction

This notebook aims to build a model to classify handwritten digits using the MNIST dataset.

## 2. About the data

The MNIST (Modified National Institute of Standards and Technology) dataset is a large database of handwritten digits that is commonly used for training and testing in the field of machine learning. The dataset contains 60,000 training images and 10,000 testing images. Each image is a 28x28 grayscale image, associated with a label from 0 to 9.

## 3. Data loading

We start by importing the necessary libraries and loading the data.

### Import libraries 

In [3]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import pickle

In [4]:
# Load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

## 4. Data exploration

Let's explore what these images look like.

In [5]:
# Split the dataset into a training set and a testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


## 5. Data preprocessing

We need to normalize our pixel values (between 0 and 255) to be between 0 and 1.

In [6]:
# Initialize a Random Forest classifier with 100 trees
forest = RandomForestClassifier(n_estimators=100, random_state=42)

# Fit the Random Forest classifier to the training data
forest.fit(X_train, y_train)

# Use the fitted model to make predictions on the test data
predictions = forest.predict(X_test)

## 6. Model building

Let's build a simple neural network model.

In [7]:
# Print the accuracy of the model
print("Accuracy: ", accuracy_score(y_test, predictions))

# Save the model to disk
pickle.dump(forest, open('model.pkl', 'wb'))

Accuracy:  1.0


We'll compile the model with an appropriate optimizer and loss function.

## 8. Model training

Let's train the model.

## 9. Save the model

After training, we can save our model for future use. Let's save it using TensorFlow's saved model format.
This function will create a directory named mnist_model in the current working directory, and will contain the architecture, optimizer, and learned parameters of our model.

## 10. Model evaluation

Finally, let's evaluate the performance of our model on the test data.

In [None]:
# # Get the predicted labels for the test set
# test_predictions = model.predict(test_images)
# test_predictions = np.argmax(test_predictions, axis=1)

# # Calculate accuracy
# accuracy = accuracy_score(test_labels, test_predictions)

# # Calculate precision
# precision = precision_score(test_labels, test_predictions, average='macro')

# # Calculate recall
# recall = recall_score(test_labels, test_predictions, average='macro')

# # Calculate F1-score
# f1 = f1_score(test_labels, test_predictions, average='macro')

# print('Test accuracy:', accuracy)
# print('Precision:', precision)
# print('Recall:', recall)
# print('F1-score:', f1)

Test accuracy: 0.9834
Precision: 0.9833615723771738
Recall: 0.9831984222663305
F1-score: 0.9832697916031774
