## Understanding the MNIST Dataset

- What is MNIST? The name stands for 'Modified National Institute of Standards and Technology'. It is a widely used benchmark dataset of handwritten digits.
- Contents of MNIST dataset: Grayscale images of handwritten digits, each of size 28x28 pixels
- Labels in MNIST: Integer values representing the digit in the corresponding image, ranging from 0 to 9

## Dataset Statistics
- Size of training dataset: 60,000 images
- Size of testing dataset: 10,000 images

In [49]:
from keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [50]:
# .size is the total number of array elements (pixels or labels), not the number of images
print(train_images.size)
print(train_labels.size)
print(test_images.size)
print(test_labels.size)

47040000
60000
7840000
10000


In [51]:
# Check: number of images x 28 x 28 pixels should equal the image array sizes
print(60000*28*28)
print(10000*28*28)

47040000
7840000


In [52]:
print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)


In [53]:
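from keras.models import Sequential
from keras.layers import Dense

# Fully connected network: 784 input pixels -> 512 ReLU units -> 10 softmax outputs (one per digit)
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(28 * 28,)))
model.add(Dense(10, activation='softmax'))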
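
As a rough check on the architecture defined above, the number of trainable parameters in each `Dense` layer can be worked out by hand: one weight per input-output pair plus one bias per output unit. The sketch below does this arithmetic in plain Python. The helper name `dense_param_count` is introduced here for illustration and is not part of Keras.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_params = dense_param_count(28 * 28, 512)  # first Dense layer
output_params = dense_param_count(512, 10)       # softmax output layer
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

These figures should agree with what `model.summary()` reports for this model.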

In [54]:
# The labels are integers 0-9, so the sparse variant of categorical cross-entropy is used
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
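
The `sparse_categorical_crossentropy` loss takes integer class labels and, per sample, is the negative log of the probability the model assigned to the true class. A minimal NumPy sketch of that idea follows. The function name and the toy probabilities are made up for illustration, and this is not Keras's own implementation.

```python
import numpy as np

def sparse_categorical_crossentropy_np(labels, probs, eps=1e-7):
    """Mean negative log-probability of the true class.

    labels: integer class ids, shape (n,)
    probs:  predicted class probabilities, shape (n, n_classes)
    """
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs)))

# Toy example with 3 classes (hypothetical values)
toy_labels = np.array([0, 2])
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.1, 0.8]])
toy_loss = sparse_categorical_crossentropy_np(toy_labels, toy_probs)
print(toy_loss)
```

Because the labels stay as integers, no one-hot encoding step is needed before training.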

In [55]:
# Flatten each 28x28 image to a 784-element vector and scale pixels from 0-255 to [0, 1]
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32') / 255
print(train_images.shape)

(60000, 784)
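
The reshape and scaling step can be tried on a small synthetic batch without loading the dataset. The sketch below assumes uint8 images with values from 0 to 255, as in MNIST. The array `fake_images` is random data standing in for real digits.

```python
import numpy as np

# Synthetic stand-in for a batch of 5 MNIST-style images (uint8, 0-255)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-long row; -1 lets NumPy infer the batch size
flat = fake_images.reshape((-1, 28 * 28))

# Convert to float32 before dividing so the result is a float in [0, 1]
scaled = flat.astype('float32') / 255

print(flat.shape, scaled.dtype, scaled.min(), scaled.max())
```

Using `-1` for the first dimension avoids hard-coding the number of images, which makes the same line usable for both the training and test sets.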


In [56]:
import numpy as np
print(np.unique(train_images))

[0.         0.00392157 0.00784314 0.01176471 0.01568628 0.01960784
 0.02352941 0.02745098 0.03137255 0.03529412 0.03921569 0.04313726
 0.04705882 0.05098039 0.05490196 0.05882353 0.0627451  0.06666667
 0.07058824 0.07450981 0.07843138 0.08235294 0.08627451 0.09019608
 0.09411765 0.09803922 0.10196079 0.10588235 0.10980392 0.11372549
 0.11764706 0.12156863 0.1254902  0.12941177 0.13333334 0.13725491
 0.14117648 0.14509805 0.14901961 0.15294118 0.15686275 0.16078432
 0.16470589 0.16862746 0.17254902 0.1764706  0.18039216 0.18431373
 0.1882353  0.19215687 0.19607843 0.2        0.20392157 0.20784314
 0.21176471 0.21568628 0.21960784 0.22352941 0.22745098 0.23137255
 0.23529412 0.23921569 0.24313726 0.24705882 0.2509804  0.25490198
 0.25882354 0.2627451  0.26666668 0.27058825 0.27450982 0.2784314
 0.28235295 0.28627452 0.2901961  0.29411766 0.29803923 0.3019608
 0.30588236 0.30980393 0.3137255  0.31764707 0.32156864 0.3254902
 0.32941177 0.33333334 0.3372549  0.34117648 0.34509805 0.3490196
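
The values printed above are consistent with the original 0 to 255 integer intensities divided by 255: for example, 0.00392157 is approximately 1/255. The sketch below illustrates this on a synthetic array holding every possible uint8 value. It is not run against the actual dataset.

```python
import numpy as np

all_levels = np.arange(256, dtype=np.uint8)   # every possible uint8 intensity
scaled_levels = all_levels.astype('float32') / 255

unique_levels = np.unique(scaled_levels)
print(len(unique_levels))                     # number of distinct scaled values
print(unique_levels[:4])                      # smallest few values
```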

In [57]:
model.fit(train_images, train_labels, epochs=10, batch_size=128)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f042f341e80>

In [58]:
# Apply the same flattening and scaling as for the training images
test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32') / 255

In [59]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)

print("test_loss:", test_loss)
print("test_accuracy:", test_accuracy)

test_loss: 0.06602343916893005
test_accuracy: 0.9807999730110168
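
Accuracy here is the fraction of test images whose predicted class matches the label. Read that way, the reported figure of roughly 0.9808 on 10,000 images corresponds to about 192 misclassified digits. The sketch below shows the calculation on made-up predictions. `toy_probs` and `toy_labels` are illustrative values, not model output.

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples over 3 classes
toy_probs = np.array([[0.8, 0.1, 0.1],
                      [0.2, 0.5, 0.3],
                      [0.3, 0.3, 0.4],
                      [0.6, 0.3, 0.1]])
toy_labels = np.array([0, 1, 2, 1])

predicted = np.argmax(toy_probs, axis=1)            # most probable class per row
accuracy = float(np.mean(predicted == toy_labels))  # fraction of correct predictions

# Errors implied by the reported test accuracy on 10,000 images
implied_errors = round((1 - 0.9808) * 10000)

print(accuracy, implied_errors)
```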


In [62]:
model.save('model/mnist_v1.h5')

In [63]:
import numpy as np
from tensorflow.keras.preprocessing.image import load_img
import glob
from keras.models import load_model


model = load_model('model/mnist_v1.h5')

def preprocess_image(image_path):
    """Load an image as 28x28 grayscale and return a (1, 784) float32 array scaled to [0, 1]."""
    img = load_img(image_path, target_size=(28, 28), color_mode="grayscale")
    img_array = np.array(img).reshape(-1, 28 * 28)
    img_array = img_array.astype('float32') / 255.0
    return img_array


# Preprocess every .JPG file in data/ and stack the results into one (n_images, 784) array
image_files = glob.glob('data/*.JPG')
images = np.vstack([preprocess_image(img_file) for img_file in image_files])

predictions = model.predict(images)
predicted_classes = np.argmax(predictions, axis=1)

# Print the filename and corresponding predicted class for each image
for img_file, pred_class in zip(image_files, predicted_classes):
    print(f"File: {img_file}, Predicted Class: {pred_class}")

File: data/zero.JPG, Predicted Class: 0
File: data/two.JPG, Predicted Class: 3
File: data/one.JPG, Predicted Class: 3
File: data/nine.JPG, Predicted Class: 3
File: data/five.JPG, Predicted Class: 5
File: data/4.JPG, Predicted Class: 4
File: data/1.JPG, Predicted Class: 8
File: data/5.JPG, Predicted Class: 5
File: data/eight.JPG, Predicted Class: 8
File: data/six.JPG, Predicted Class: 6
File: data/three.JPG, Predicted Class: 3
File: data/8.JPG, Predicted Class: 8
File: data/2.JPG, Predicted Class: 3
File: data/four.JPG, Predicted Class: 4
File: data/7.JPG, Predicted Class: 3
File: data/6.JPG, Predicted Class: 6
File: data/3.JPG, Predicted Class: 3
File: data/9.JPG, Predicted Class: 3
File: data/0.JPG, Predicted Class: 0
File: data/seven.JPG, Predicted Class: 3
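
Many of the predictions above collapse to class 3, which can be a sign that the input images do not look like the training data. One possible cause, offered here as an assumption rather than a diagnosis, is polarity: MNIST digits are light strokes on a dark background, while photographs of pen on paper are usually dark on light. The sketch below shows one way to invert an image when its background appears bright. The function name `match_mnist_polarity` and the 0.5 threshold are illustrative choices.

```python
import numpy as np

def match_mnist_polarity(img_array, threshold=0.5):
    """Invert a [0, 1]-scaled image if its background looks bright.

    The median pixel is used as a rough proxy for the background,
    since most pixels in a digit image are background.
    """
    img_array = np.asarray(img_array, dtype='float32')
    if np.median(img_array) > threshold:
        return 1.0 - img_array   # dark-on-light -> light-on-dark
    return img_array

# Synthetic example: white page (1.0) with a dark vertical stroke (0.0)
page = np.ones((28, 28), dtype='float32')
page[4:24, 13:15] = 0.0
fixed = match_mnist_polarity(page)
print(fixed.min(), fixed.max(), np.median(fixed))
```

In the notebook's `preprocess_image`, such a step would be applied after scaling to [0, 1] and before prediction. Cropping and centering the digit may also matter, since MNIST digits are size-normalized and centered in the frame.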
