MNIST stands for Modified National Institute of Standards and Technology.
The dataset contains 70,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9: 60,000 for training and 10,000 for testing.


In [None]:
# --- import libraries --------------------------------------------------
from numpy import mean
from numpy import std
from numpy import argmax
from matplotlib import pyplot as plt
from sklearn.model_selection import KFold
from tensorflow.keras.datasets import mnist			#import MNIST dataset
from tensorflow.keras.models import load_model
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array
from google.colab import files
import shutil
from google.colab import drive

trai=1      # 0 = do not train the model, 1 = train the model again
pred=0      # 0 = do not predict a new image, 1 = predict a new image

2. Load train and test dataset
trainX is a NumPy array, and its shape attribute returns a tuple with the size of each dimension. trainX.shape[0] is the size of the first dimension, which is the number of images. Each image is reshaped to (28, 28, 1): a single channel for grayscale. A color image would be (28, 28, 3).

to_categorical() converts the integer labels in trainY and testY to categorical (one-hot) format. It is a Keras utility that takes an array of numerical labels and returns an array of one-hot encoded vectors representing the same labels. Each one-hot vector has a 1 in the position corresponding to the class label and 0s elsewhere.
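
As a rough illustration of what one-hot encoding produces, here is a plain-Python sketch. The helper `one_hot` is hypothetical and is not how Keras implements to_categorical(); it only mirrors the output layout for integer labels.

```python
def one_hot(labels, num_classes):
    """Return one one-hot row (list of floats) per integer label."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes   # all zeros ...
        row[label] = 1.0            # ... except the position of the class
        encoded.append(row)
    return encoded


labels = [5, 0, 4]                  # example digit labels
encoded = one_hot(labels, num_classes=10)
for label, row in zip(labels, encoded):
    print(label, row)
```

Each row has length 10 (one slot per digit), which matches the 10 units of the softmax output layer.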

In [None]:
def load_dataset():
	(trainX, trainY), (testX, testY) = mnist.load_data()   # load dataset

	#to see size of X(features) and its labels (Y)
	#print('Train: X=%s, y=%s' % (trainX.shape, trainY.shape))
	#print('Test: X=%s, y=%s' % (testX.shape, testY.shape))
	trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))  # reshape dataset to have a single color channel
	testX = testX.reshape((testX.shape[0], 28, 28, 1))
	trainY = to_categorical(trainY)                        # one hot encode target values
	testY = to_categorical(testY)
	return trainX, trainY, testX, testY

Preparing Data

In [None]:
#prepare the data---- scale pixels  -------------------------------------------------------
def prep_pixels(train, test):
	train_norm = train.astype('float32')  # convert from integers to floats
	test_norm = test.astype('float32')
	train_norm = train_norm / 255.0       # normalized or scaled to range 0-1
	test_norm = test_norm / 255.0
	return train_norm, test_norm          # return normalized images for train and test

**Define model**

Conv2D: This is the function or layer used to add a 2D convolutional layer to the model.

32: This parameter specifies the number of filters or convolutional kernels to be applied in the layer. In this case, there will be 32 filters, and each filter will learn different features from the input data.

(3, 3): This parameter specifies the size of each filter. In this case, the filters will have a size of 3x3 pixels.

activation='relu': This parameter sets the activation function to be applied after the convolution operation. In this case, the Rectified Linear Unit (ReLU) activation function is used, which introduces non-linearity to the model.

kernel_initializer='random_uniform': Initializers define how the initial random weights of Keras layers are set. This parameter sets the initialization method for the filter weights. In this case, the filter weights are randomly drawn from a uniform distribution.

input_shape=(28, 28, 1): This parameter specifies the shape of the input data that will be fed into the convolutional layer. The input shape is set as (28, 28, 1), indicating that the input images have a height and width of 28 pixels and a single channel (grayscale images). The input shape is important for defining the input dimensions of the first layer in the model.

Overall, this line of code adds a convolutional layer with 32 filters of size 3x3, followed by ReLU activation, random weight initialization, and an input shape of (28, 28, 1) to a neural network model.
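
To make the convolution step concrete, here is a minimal plain-Python sketch of a single 3x3 filter applied to a tiny image, followed by ReLU. It assumes stride 1 and no padding, omits the bias term, and uses a hand-picked kernel for illustration; in the real layer the 32 kernels are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and apply ReLU."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # weighted sum of the k x k window whose top-left corner is (r, c)
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(max(0, total))   # ReLU: negative responses become 0
        output.append(row)
    return output


# 4x4 image with a dark left half and a bright right half
image = [[0, 0, 1, 1] for _ in range(4)]
# hand-picked kernel that responds to dark-to-bright vertical edges
kernel = [[-1, 0, 1] for _ in range(3)]

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

A 4x4 input with a 3x3 filter yields a 2x2 feature map, the same size rule that turns a 28x28 input into 26x26 in the first layer of the model.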


**MaxPooling2D**: This is the function or layer used to add a 2D max pooling layer to the model.
(2, 2): This parameter specifies the size of the pooling window or pool size. In this case, the pooling window is 2x2 pixels. The pooling window slides over the input feature maps, and the maximum value within each window is selected.
The MaxPooling2D layer reduces the spatial dimensions of the input feature maps while retaining the most significant features. It divides the feature maps into non-overlapping regions defined by the pool size and selects the maximum value within each region.

By specifying (2, 2) as the pool size, the spatial dimensions of the feature maps will be reduced by a factor of 2. For example, if the input feature maps have dimensions of 28x28, the output feature maps after the pooling layer will have dimensions of 14x14.

Max pooling layers are commonly used in convolutional neural networks to downsample the feature maps, reduce overfitting, and extract the most relevant features. They help in capturing important spatial information while reducing the computational complexity of the model.
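
The pooling operation described above can be sketched in plain Python. The function `max_pool2d` is a hypothetical helper for illustration, assuming non-overlapping windows (stride equal to the pool size).

```python
def max_pool2d(feature_map, pool=2):
    """Keep the maximum of each non-overlapping pool x pool window."""
    pooled = []
    for r in range(0, len(feature_map) - pool + 1, pool):
        row = []
        for c in range(0, len(feature_map[0]) - pool + 1, pool):
            window = [feature_map[r + i][c + j]
                      for i in range(pool) for j in range(pool)]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 1, 5, 6],
               [2, 2, 7, 8]]

pooled = max_pool2d(feature_map)
print(pooled)
```

The 4x4 input becomes a 2x2 output: each spatial dimension is halved.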

Flatten Layer: After the last max pooling layer, the flatten layer converts the multidimensional feature maps into a flattened vector. It reshapes the output of the previous layers into a 1D vector, which can be fed into a fully connected layer. The flatten layer keeps every value produced by the convolutional layers but discards the explicit 2D layout, transforming the data into a format suitable for traditional dense (fully connected) layers.

The purpose of this sequence is to progressively extract higher-level features from the input data while reducing the spatial dimensions. The convolutional layer detects local patterns, edges, and textures, while the max pooling layer captures the most prominent features. The flatten layer then prepares the data for further processing by fully connected layers, which can learn complex relationships and make predictions.
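
The way the spatial size shrinks through the model defined in the next code cell can be traced with simple arithmetic. This sketch assumes the Keras defaults used there: Conv2D with no padding and stride 1, and MaxPooling2D with stride equal to the pool size. The helper names are made up for illustration.

```python
def conv_out(size, kernel):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output width/height of non-overlapping pooling (leftover rows dropped)."""
    return size // pool


trace = []
size = 28
size = conv_out(size, 3); trace.append(('Conv2D 32', size))     # 26
size = pool_out(size, 2); trace.append(('MaxPooling2D', size))  # 13
size = conv_out(size, 3); trace.append(('Conv2D 64', size))     # 11
size = conv_out(size, 3); trace.append(('Conv2D 64', size))     # 9
size = pool_out(size, 2); trace.append(('MaxPooling2D', size))  # 4

flat_length = size * size * 64   # 64 feature maps from the last Conv2D
for name, s in trace:
    print('%-14s -> %dx%d' % (name, s, s))
print('Flatten        -> %d values' % flat_length)
```

Under these assumptions the first pooling layer receives 26x26 maps (not 28x28) and outputs 13x13, and the dense layer receives a vector of 1024 values.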

Dense layer:
In a dense layer, each neuron is connected to every neuron in the previous layer.
A fully connected (dense) layer with 100 units is added. It uses ReLU activation and random uniform weight initialization. The final dense layer with 10 units is added, representing the output layer. It uses the softmax activation function, which is commonly used for multi-class classification problems.
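
As a small illustration of what the softmax output layer computes, here is a plain-Python sketch. The input scores are made-up values, not outputs of the model.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.1]          # hypothetical raw outputs for 3 classes
probs = softmax(scores)
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```

The largest score gets the largest probability, which is why argmax over the softmax output gives the predicted digit.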

Stochastic Gradient Descent (SGD) optimization algorithm

learning_rate=0.01: The learning rate determines the step size at which the optimizer updates the weights during training. A higher learning rate can lead to faster convergence but may risk overshooting the optimal solution, while a lower learning rate can help converge more accurately but may take longer.

Momentum helps accelerate the learning process by accumulating the past gradients and incorporating them into the current gradient update. It adds a fraction (0.9 in this case) of the previous update to the current update, allowing the optimizer to navigate through flatter regions of the optimization landscape more efficiently.

The learning rate determines the step size, and momentum helps to smooth out the optimization process.
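
The momentum update can be sketched on a one-dimensional toy problem. This is a simplified illustration of the classic momentum rule (velocity = momentum * velocity - learning_rate * gradient), applied to the made-up loss f(w) = w², not to the CNN itself.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single weight."""
    velocity = momentum * velocity - learning_rate * grad  # keep 90% of the last update
    return w + velocity, velocity


w, velocity = 5.0, 0.0            # arbitrary starting point
for step in range(500):
    grad = 2.0 * w                # derivative of the toy loss f(w) = w**2
    w, velocity = sgd_momentum_step(w, velocity, grad)

print('w after 500 steps: %.6f' % w)
```

The weight moves toward the minimum at w = 0; the velocity term carries part of each previous step into the next one.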

In [None]:
#---- define cnn model --------------------------------------------------------
def define_model():
	model = Sequential()                   # sequential type model
	model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='random_uniform', input_shape=(28, 28, 1))) # input layer (convolutional)
	model.add(MaxPooling2D((2, 2)))        # Downsamples input dimension by taking maximum value over an input window
	model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='random_uniform')) # 2nd (convolutional) layer
	model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='random_uniform')) # 3rd (convolutional) layer
	model.add(MaxPooling2D((2, 2)))        # Downsamples input dimension by taking maximum value over an input window
	model.add(Flatten())                   # Flatten layer
	model.add(Dense(100, activation='relu', kernel_initializer='random_uniform'))            # core layer
	model.add(Dense(10, activation='softmax'))                                           # output layer (dimension=10)
	# compile model


	opt = SGD(learning_rate=0.01, momentum=0.9)                                          # select optimizer type
	model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])  # compile model
	return model

model.fit() trains the model for a fixed number of epochs.

To estimate the performance of a model for a given training run, we can further split the training set into a train and a validation dataset. Performance on the train and validation datasets over each run can then be plotted to provide learning curves and insight into how well the model is learning the problem. The Keras API supports this through the "validation_data" argument of model.fit().

history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=2)
Trains the model for a fixed number of epochs (dataset iterations).

batch_size: Number of samples per gradient update.
•	The training set is divided into batches. The batch_size is the number of training instances each batch contains. The number of batches per epoch is:  # of batches = ceil(size of the train dataset / batch_size), that is, the quotient rounded up so that a final, smaller batch holds any leftover samples.

Batches and steps. Considering batch_size=128
•	According to the above equation, we get 469 batches: 60,000 / 128 = 468.75, rounded up. The last batch holds the remaining 96 samples. (Always adding 1 would overcount when the division is exact, e.g. 60,000 / 32 = 1875.)
•	In one epoch, the fit() method processes 469 steps. The model parameters are updated 469 times in each epoch of optimization.
•	The algorithm takes 128 training instances and updates the model parameters. By default, fit() shuffles the training data before each epoch.
•	Then, it takes the next 128 training instances and updates the model parameters again. This repeats until all 469 steps of the epoch are complete.
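
The step count per epoch can be checked with a few lines of Python. Rounding the quotient up covers the case where the last batch is smaller than batch_size; the helper name is made up for illustration.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of gradient updates needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


steps_128 = steps_per_epoch(60000, 128)           # 468.75 rounded up
last_batch = 60000 - (steps_128 - 1) * 128        # samples left for the final batch
steps_32 = steps_per_epoch(60000, 32)             # divides exactly, nothing to add

print(steps_128, last_batch, steps_32)
```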


Epochs: Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided.

verbose:  'auto', 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch.
validation_data: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data.

**K-Folds cross-validator.** Provides train/test indices to split data into train/test sets. The dataset is split into k folds; because shuffle=True is used here, the samples are shuffled before being divided. Each fold serves once as the test set while the remaining k-1 folds form the training set.
random_state=1: This parameter sets the random seed for reproducibility. With a fixed seed, the shuffling and fold splitting yield the same folds each time the code is executed, which is useful for reproducible experiments.
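
The idea behind the fold splitting can be sketched without scikit-learn. The generator below is a hypothetical stand-in that shuffles indices with a fixed seed and rotates the test fold; it will not reproduce the exact indices that KFold returns.

```python
import random


def kfold_indices(n_samples, n_folds, seed=1):
    """Yield (train_ix, test_ix) lists, using each fold once as the test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # fixed seed: same folds every run
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)  # spread leftover samples
        test_ix = indices[start:start + size]
        train_ix = indices[:start] + indices[start + size:]
        yield train_ix, test_ix
        start += size


folds = list(kfold_indices(n_samples=10, n_folds=5))
for train_ix, test_ix in folds:
    print('train:', sorted(train_ix), 'test:', sorted(test_ix))
```

With 60,000 MNIST training images and 5 folds, the same scheme gives 48,000 training and 12,000 test images per fold.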

On the for loop: The test set for each fold is used to evaluate the model both during each epoch of the training run, so that we can later create learning curves, and at the end of the run, so that we can estimate the performance of the model. As such, we keep track of the resulting history from each run, as well as the classification accuracy of the fold.

The line _, acc = model.evaluate(testX, testY, verbose=0) evaluates the performance of a trained model on a test dataset and assigns the accuracy value to the variable acc.

model: Refers to the trained neural network model that you want to evaluate.
testX: Represents the input features of the test dataset.
testY: Represents the corresponding target labels of the test dataset.
verbose=0: The verbose parameter controls the verbosity of the evaluation process. Setting it to 0 means no progress or log messages will be displayed during evaluation.
The model.evaluate() function computes the loss and metrics (in this case, accuracy) of the model on the provided test dataset. It returns a list containing the computed loss value and the values of the specified metrics. In this case, the accuracy value is extracted and assigned to the acc variable.
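
To show what the two returned numbers mean, here is a plain-Python sketch that computes categorical cross-entropy loss and accuracy from predicted probabilities and one-hot labels. The data are made-up values for three samples and three classes, and the function is a simplified stand-in rather than the Keras implementation.

```python
import math


def evaluate(probabilities, one_hot_labels):
    """Return [mean cross-entropy loss, accuracy] for a batch of predictions."""
    total_loss, correct = 0.0, 0
    for probs, target in zip(probabilities, one_hot_labels):
        true_class = target.index(1)
        total_loss += -math.log(probs[true_class])   # penalize low probability on the true class
        if probs.index(max(probs)) == true_class:    # argmax prediction
            correct += 1
    n = len(one_hot_labels)
    return [total_loss / n, correct / n]


predictions = [[0.1, 0.7, 0.2],
               [0.8, 0.1, 0.1],
               [0.3, 0.3, 0.4]]
labels = [[0, 1, 0],
          [1, 0, 0],
          [0, 1, 0]]

loss, acc = evaluate(predictions, labels)
print('loss=%.3f' % loss)
print('> %.3f' % (acc * 100.0))
```

Two of the three samples are classified correctly, so the accuracy is 2/3; the third sample also contributes the largest share of the loss.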

In [None]:
# ----- evaluate a model using k-fold cross-validation ----------------------
def train_evaluate_model(dataX, dataY, n_folds=5):       # n_folds is the number of cross-validation folds
 # dataX and dataY are expected to be already loaded and scaled by the caller
 scores, histories = list(), list()                      # per-fold accuracies and training histories
 kfold = KFold(n_folds, shuffle=True, random_state=1)    # prepare cross validation
 for train_ix, test_ix in kfold.split(dataX):            # yields (train indices, test indices) for each fold
    model = define_model()                               # define a fresh model for each fold
    trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]           # select rows for train and test
    history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=2)   # Trains the model for a fixed number of epochs (dataset iterations).
    #returns (history=)training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).
    _, acc = model.evaluate(testX, testY, verbose=0)     # evaluate model
    print('> %.3f' % (acc * 100.0))
    scores.append(acc)                                   # store accuracy value in a list called scores
    histories.append(history)                            # store history value in a list called histories
 model.save('final_model.h5')                            # saves the model trained on the last fold only
 files.download('final_model.h5')
# shutil.copy('/content/final_model.h5','/content/gdrive/My Drive/pruebas/') #save model to drive
 return scores, histories

In [None]:
# -------- plot diagnostic learning curves -----------------------------------
def summarize_model(histories,scores):
 for i in range(len(histories)):
  plt.subplot(2, 1, 1)                 # plot loss
  plt.title('Cross Entropy Loss')
  plt.plot(histories[i].history['loss'], color='blue', label='train')
  plt.plot(histories[i].history['val_loss'], color='orange', label='test')
  plt.subplot(2, 1, 2)                 # plot accuracy
  plt.title('Classification Accuracy')
  plt.plot(histories[i].history['accuracy'], color='blue', label='train')
  plt.plot(histories[i].history['val_accuracy'], color='orange', label='test')
 plt.show()
 print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))   # print summary
 plt.boxplot(scores)  # box and whisker plots of results
 plt.show()

In [None]:
#------ run the test for evaluating a model --------------------------
def run_evaluation():
 trainX, trainY, testX, testY = load_dataset()             # load dataset
 trainX, testX = prep_pixels(trainX, testX)   	           # prepare pixel data
 scores, histories = train_evaluate_model(trainX, trainY)  # evaluate model
 return scores, histories

# summarize_model(histories,scores)                         # summarize performance curves

In [None]:
# --------- load and prepare the new image -----------------------------------
def load_image(filename):
 #  data_folder=Path("sample_data/")    #filename=data_folder/filename
 img = load_img(filename, color_mode='grayscale', target_size=(28, 28)) # load the image as grayscale (the grayscale=True argument is deprecated)
 img = img_to_array(img)                                        # convert to array
 img = img.reshape(1, 28, 28, 1)                                # reshape into a single sample with 1 channel
 img = img.astype('float32')                                    # prepare pixel data
 img = img / 255.0
 return img

In [None]:
# ----- load an image and predict the class --------------------------------
def predict():
 model = load_model('final_model.h5')
 imagen = files.upload()                    # button for choosing image
 img = load_image(list(imagen.keys())[0])   # call load_image function
#  img = load_image('sample1.png')
 predict_value = model.predict(img)         # predict the class
 digit = argmax(predict_value)              # get the max probab class
 print('Predicted digit is:  %i' % (digit))

In [None]:
#-----------------------------------------------------------------------------
#--------------------------- MAIN CODE ---------------------------------------
#-----------------------------------------------------------------------------
#drive.mount('/content/gdrive')              # mount drive
if trai==1:                                  # if we decide to train the model again
    scores, histories = run_evaluation()     # fit and evaluate model accuracy
    summarize_model(histories, scores)       # learning curves exist only after training

try:
    model = load_model('final_model.h5')
except Exception:                            # saved model missing or unreadable: load it from a local file
    print('PLEASE SELECT TRAINED MODEL .H5')
    modelo = files.upload()                    # button for choosing model
    model = load_model(list(modelo.keys())[0]) # load the selected model
#model. summary()

#shutil.copy('/content/gdrive/MyDrive/pruebas/final_model.h5','/content/')     # load model from drive
#model = load_model('final_model.h5')

# --- new images in grayscale, handwritten digit centered aligned, size is 28×28 pixels (square) --------
if pred==1:
 print('PLEASE SELECT NEW IMAGE FOR PREDICT DIGIT')
 predict()                            # run example of 1 photo




Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Epoch 1/10
