# Save MNIST Numpy Arrays as Images for the CNN

## Background

The MNIST dataset that ships with the keras library stores its data as numpy arrays, which works well for the Gabor filter image classifier. For the neural network, however, the fastai learners used here are set up to read image files rather than numpy arrays, so the arrays need to be converted to jpg images. This notebook uses the PIL library to convert the numpy arrays to images, and sorts the images into one folder per digit label to streamline the neural network setup.
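As a minimal sketch of the conversion step, the snippet below turns a single 28x28 `uint8` array into a PIL image and reads it back after JPEG encoding. The array is a synthetic stand-in for an MNIST digit (an assumption, so the example runs without downloading the dataset), and the in-memory buffer is only there to avoid writing files. Because JPEG is a lossy format, the restored pixel values may differ slightly from the original array.

```python
import io

import numpy as np
from PIL import Image

# Synthetic stand-in for one MNIST digit: 28x28 grayscale, dtype uint8
arr = np.zeros((28, 28), dtype=np.uint8)
arr[10:18, 10:18] = 255  # a white square on a black background

# A 2-D uint8 array becomes an 8-bit grayscale image
img = Image.fromarray(arr)
mode, size = img.mode, img.size

# Encode as JPEG into an in-memory buffer, then decode it again
buf = io.BytesIO()
img.save(buf, format="JPEG")
buf.seek(0)
restored = np.asarray(Image.open(buf))

# Largest per-pixel change introduced by the JPEG round trip
max_abs_diff = int(np.abs(restored.astype(int) - arr.astype(int)).max())
print(mode, size, restored.shape, max_abs_diff)
```

If exact pixel values matter, saving with a `.png` extension instead would be a lossless alternative; this notebook keeps jpg.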

## Imports

In [8]:
from keras.datasets import mnist #handwritten digit dataset

import numpy as np
from PIL import Image as im
import os

## Prepare Arrays for Export

In [9]:
#load images into proper numpy arrays
(train_X, train_y), (test_X, test_y) = mnist.load_data()

#take a subset of the dataset, since training on all 60000 images takes too long
#here, the first 1000 training images and the first 200 testing images are used
train_X = train_X[0:1000,:,:]
test_X = test_X[0:200,:,:]
train_y = train_y[0:1000]
test_y = test_y[0:200]

export_path = './MNIST_sample'
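Since the subset is simply the first 1000 (or 200) samples, it can be worth checking that every digit is still represented before exporting. The sketch below counts labels per class with `np.bincount`. The helper name `class_counts` and the randomly generated `demo_y` labels are hypothetical stand-ins; in the notebook the same call could be applied to `train_y` and `test_y`.

```python
import numpy as np

def class_counts(labels, num_classes=10):
    """Return how many samples carry each label 0..num_classes-1."""
    return np.bincount(labels, minlength=num_classes)

# Hypothetical stand-in for train_y: 1000 random digit labels
rng = np.random.default_rng(0)
demo_y = rng.integers(0, 10, size=1000).astype(np.uint8)

counts = class_counts(demo_y)
for digit, n in enumerate(counts):
    print(f"digit {digit}: {n} images")
```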

## Export Arrays as Images

In [10]:
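for y_idx, y in enumerate(train_y):
    #create the class folder if it does not exist yet (os.listdir fails otherwise)
    class_dir = f"{export_path}/train/{y}"
    os.makedirs(class_dir, exist_ok=True)
    #name each file after the number of images already in its class folder
    num_imgs = len(os.listdir(class_dir))
    train_img = im.fromarray(train_X[y_idx,:,:])
    train_img.save(f"{class_dir}/{num_imgs:08d}.jpg")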

In [11]:
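for y_idx, y in enumerate(test_y):
    #create the class folder if it does not exist yet (os.listdir fails otherwise)
    class_dir = f"{export_path}/test/{y}"
    os.makedirs(class_dir, exist_ok=True)
    #name each file after the number of images already in its class folder
    num_imgs = len(os.listdir(class_dir))
    test_img = im.fromarray(test_X[y_idx,:,:])
    test_img.save(f"{class_dir}/{num_imgs:08d}.jpg")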
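The two export loops differ only in the split name, so one possible variation is a shared helper followed by a quick count of the files in each class folder. The sketch below is an illustration under stated assumptions, not the notebook's own method: `export_split` and `count_images` are hypothetical names, the data is a small synthetic array, and files are named after the sample index rather than the current folder size, so re-running the export overwrites files instead of adding duplicates.

```python
import os
import tempfile

import numpy as np
from PIL import Image

def export_split(images, labels, split_dir):
    """Save each array as <split_dir>/<label>/<sample index>.jpg."""
    for idx, (arr, label) in enumerate(zip(images, labels)):
        class_dir = os.path.join(split_dir, str(label))
        os.makedirs(class_dir, exist_ok=True)
        Image.fromarray(arr).save(os.path.join(class_dir, f"{idx:08d}.jpg"))

def count_images(split_dir):
    """Return {label folder name: number of .jpg files inside it}."""
    return {
        name: sum(f.endswith(".jpg") for f in os.listdir(os.path.join(split_dir, name)))
        for name in sorted(os.listdir(split_dir))
    }

# Synthetic stand-ins for train_X / train_y: six blank 28x28 images
demo_X = np.zeros((6, 28, 28), dtype=np.uint8)
demo_y = np.array([0, 1, 1, 7, 7, 7], dtype=np.uint8)

# Export into a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    train_dir = os.path.join(tmp, "train")
    export_split(demo_X, demo_y, train_dir)
    counts = count_images(train_dir)

print(counts)
```

In the notebook itself, the same helper could be called once with `train_X`, `train_y` and once with `test_X`, `test_y`, and the per-folder counts compared against the expected totals of 1000 and 200.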