## Example of TensorFlow Code

The first example is a sequential model, which is a linear pipeline (a stack) of neural network layers.

In [5]:
import tensorflow as tf
from tensorflow import keras

NB_CLASSES = 10
RESHAPED = 784

model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(NB_CLASSES,
                             input_shape=(RESHAPED,),
                             kernel_initializer='zeros', # how the layer's weights are initialized:
                                                         # zeros, all weights start at zero
                                                         # random_uniform, small uniform random values in the range (-0.05, 0.05)
                                                         # random_normal, values drawn from a Gaussian distribution with
                                                         #     0 as the mean and 0.05 as the standard deviation
                             name='dense_layer',         # Dense: every neuron in the layer is connected to every neuron in the previous layer
                             activation='softmax'))
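
To make the three initializer options concrete, here is a minimal NumPy sketch of the kind of values each one produces. It does not call Keras; the ranges assume the Keras defaults for `random_uniform` (-0.05 to 0.05) and `random_normal` (mean 0, standard deviation 0.05), and the names `w_zeros`, `w_uniform`, and `w_normal` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
shape = (784, 10)  # (inputs, units), matching RESHAPED and NB_CLASSES

# 'zeros': every weight starts at exactly 0
w_zeros = np.zeros(shape, dtype=np.float32)

# 'random_uniform': small values drawn uniformly (assumed default range)
w_uniform = rng.uniform(-0.05, 0.05, size=shape).astype(np.float32)

# 'random_normal': Gaussian values (assumed default mean 0, stddev 0.05)
w_normal = rng.normal(loc=0.0, scale=0.05, size=shape).astype(np.float32)

for name, w in [("zeros", w_zeros),
                ("random_uniform", w_uniform),
                ("random_normal", w_normal)]:
    print(f"{name:15s} min={w.min():+.3f} max={w.max():+.3f} std={w.std():.3f}")
```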

### Defining a simple neural net in TensorFlow

<ul>
    <li>epochs - how many passes over the training set the training should last</li>
    <li>batch_size - the number of samples fed to the network at a time</li>
    <li>validation split - the fraction of the training data reserved for checking how well the network generalizes</li>
</ul>
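
As a rough illustration of the validation split, the sketch below divides 60000 sample indices the way a 0.2 split would. It assumes the held-out portion is taken from the end of the training data, which is how Keras documents `validation_split`; the array `samples` is a stand-in for real data.

```python
import numpy as np

n_total = 60000            # size of the MNIST training set
validation_split = 0.2     # fraction reserved for validation

samples = np.arange(n_total)   # stand-in for the sample indices

# Hold out the last 20% of the samples for validation
n_val = round(n_total * validation_split)
split_at = n_total - n_val
train_part, val_part = samples[:split_at], samples[split_at:]

print(len(train_part), "samples used for weight updates")
print(len(val_part), "samples used for validation")
```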

#### One-hot encoding
Encode each label in categorical format: a vector with one entry per possible category, set to 1 at the label's category and 0 everywhere else.
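
A minimal NumPy sketch of one-hot encoding follows. The label values are made up for illustration, and indexing an identity matrix is just one convenient way to build the vectors; it is not necessarily how `to_categorical` is implemented.

```python
import numpy as np

labels = np.array([5, 0, 4, 1])   # hypothetical integer digit labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

print(one_hot.shape)   # one row per label, one column per class
print(one_hot[0])      # the vector for label 5
```

Taking `one_hot.argmax(axis=1)` recovers the original integer labels.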

In [13]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

# Network and training parameters
EPOCHS = 200
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10 # number of outputs = in this case the number of digits
VALIDATION_SPLIT = 0.2 # how much training data is reserved for validation

# Load the MNIST dataset
# Labels are loaded as integers 0-9; they are one-hot encoded further below
mnist = keras.datasets.mnist

(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# X_train is 60000 rows of 28x28 values; reshape it to 60000 x 784
RESHAPED = 784

X_train = X_train.reshape(60000, RESHAPED)
X_test = X_test.reshape(10000, RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs to be within [0,1]
X_train /= 255
X_test /= 255
print(X_train.shape[0], "train samples")
print(X_test.shape[0], "test samples")

# One-hot representation of the labels.
Y_train = tf.keras.utils.to_categorical(Y_train, NB_CLASSES)
Y_test = tf.keras.utils.to_categorical(Y_test, NB_CLASSES)

60000 train samples
10000 test samples


<ul>
    <li>Input layer - 28 x 28 = 784 neurons, one per pixel of the image</li>
    <li>Typically, the values associated with each pixel are normalized to the range [0, 1]</li>
    <li>Final layer - ten neurons (one per digit) with the activation function 'softmax', a generalization of the sigmoid function. Each output lies in the range (0, 1) for inputs in the range (-infinity, infinity)</li>
    <li>softmax "squashes" a K-dimensional vector of arbitrary real values into a K-dimensional vector of values in the range (0, 1) that add up to 1. Here K = 10: the ten values produced by the final layer's ten neurons</li>
</ul>
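
The "squashing" behaviour of softmax can be sketched in a few lines of NumPy. This is an illustrative implementation, not the Keras one; subtracting the maximum before exponentiating is a common trick to avoid overflow and does not change the result. The `scores` values are made up.

```python
import numpy as np

def softmax(z):
    """Map arbitrary real scores to values in (0, 1) that sum to 1."""
    z = np.asarray(z, dtype=np.float64)
    shifted = z - z.max(axis=-1, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([2.0, 1.0, 0.1, -3.0])   # hypothetical raw outputs
probs = softmax(scores)

print(probs)         # each entry lies strictly between 0 and 1
print(probs.sum())   # the entries add up to 1 (up to rounding)
```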

In [14]:
# Build the model 
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(NB_CLASSES,
                             input_shape=(RESHAPED,),
                             name='dense_layer',
                             activation='softmax'))

<ul>
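    <li><b>Optimizer function - algorithm that updates the weights while training the model</b></li>
    <li><b>Common loss (objective) functions - what the optimizer minimizes</b></li>
    <li>MSE (Mean Squared Error) - the average of the squared differences between predictions and true values</li>
    <li>binary_crossentropy - the binary logarithmic loss, suited to two-class label prediction</li>
    <li>categorical_crossentropy - the multiclass logarithmic loss, suited to multiclass label prediction</li>
    <li><b>Common Metrics</b></li>
    <li>Accuracy - the proportion of correct predictions out of all predictions</li>
    <li>Precision - the proportion of positive predictions that are truly positive</li>
    <li>Recall - the proportion of truly positive samples that are predicted positive</li>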
</ul>
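
The losses and metrics listed above can be worked through by hand on a tiny example. The NumPy sketch below uses made-up predictions for three samples and three classes; the clipping constant `eps` is an assumption added to keep the logarithm finite, and precision and recall are computed for class 0 only, treating it as the "positive" class.

```python
import numpy as np

# Hypothetical one-hot targets and predicted probabilities (3 samples, 3 classes)
y_true = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], dtype=np.float64)
y_pred = np.array([[0.1, 0.2, 0.7], [0.6, 0.3, 0.1], [0.5, 0.4, 0.1]])
eps = 1e-7  # assumed clipping value to avoid log(0)

# Mean squared error over all entries
mse = np.mean((y_true - y_pred) ** 2)

# Categorical cross-entropy: -log of the probability given to the true class
cce = -np.mean(np.sum(y_true * np.log(np.clip(y_pred, eps, 1.0)), axis=1))

# Binary cross-entropy on a separate two-class example
yb = np.array([1.0, 0.0, 1.0])
pb = np.array([0.9, 0.2, 0.6])
bce = -np.mean(yb * np.log(pb) + (1.0 - yb) * np.log(1.0 - pb))

# Accuracy: fraction of samples whose most likely class is the true class
pred_cls = y_pred.argmax(axis=1)
true_cls = y_true.argmax(axis=1)
accuracy = np.mean(pred_cls == true_cls)

# Precision and recall for class 0
tp = np.sum((pred_cls == 0) & (true_cls == 0))
fp = np.sum((pred_cls == 0) & (true_cls != 0))
fn = np.sum((pred_cls != 0) & (true_cls == 0))
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"mse={mse:.4f} cce={cce:.4f} bce={bce:.4f}")
print(f"accuracy={accuracy:.4f} precision={precision:.2f} recall={recall:.2f}")
```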

In [17]:
# Compiling the Model
model.compile(optimizer='SGD',                  # Stochastic Gradient Descent (SGD): updates the weights after each batch to reduce the loss
              loss='categorical_crossentropy',  # categorical-crossentropy - multiclass logarithmic loss
              metrics=['accuracy'])             # proportion of correct predictions with respect to total
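
To show what an SGD optimizer does in principle, here is a minimal mini-batch SGD loop in NumPy on a toy linear-regression problem. It is a sketch, not the Keras implementation: the data, the model, and `true_w` are invented for illustration, and the learning rate of 0.01 is assumed to match the Keras SGD default.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: targets are an exact linear function of the inputs
X = rng.normal(size=(256, 3))
true_w = np.array([1.5, -2.0, 0.5])   # hypothetical weights to recover
y = X @ true_w

w = np.zeros(3)         # start from zero weights
learning_rate = 0.01    # assumed to match the Keras SGD default
batch_size = 32

for epoch in range(200):
    order = rng.permutation(len(X))            # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        error = X[idx] @ w - y[idx]
        # Gradient of the mean squared error on this mini-batch
        grad = 2.0 * X[idx].T @ error / len(idx)
        w -= learning_rate * grad              # the SGD weight update

print(w)   # close to true_w after training
```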

<ul>
    <li>epochs - the number of times the model is exposed to the full training set</li>
    <li>batch size - the number of training instances observed before the optimizer performs a weight update</li>
</ul>
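
The relationship between epochs, batch size, and weight updates is simple arithmetic. The sketch below assumes 48000 samples are used for fitting (60000 MNIST training images minus a 20% validation split) together with the batch size of 128 and 200 epochs used in this notebook.

```python
import math

n_samples = 48000   # assumed: 60000 training images minus the 20% validation split
batch_size = 128
epochs = 200

# One weight update per batch; the last batch may be smaller than batch_size
updates_per_epoch = math.ceil(n_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, "weight updates per epoch")
print(total_updates, "weight updates over the whole training run")
```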

In [18]:
# Training the model.
model.fit(X_train, Y_train,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          verbose=VERBOSE, validation_split=VALIDATION_SPLIT)

Epoch 1/200


...

<keras.src.callbacks.History at 0x7f0aa5b17710>

In [19]:
# Evaluate the model
test_loss, test_acc = model.evaluate(X_test, Y_test)
print('Test accuracy: ', test_acc)

Test accuracy:  0.9229000210762024


## Improving the simple net in TensorFlow with hidden layers

An initial improvement is to add more layers to our network, because the additional neurons can intuitively help it learn more complex patterns in the data.

In [25]:
import tensorflow as tf
from tensorflow import keras

# Network and training parameters.
EPOCHS = 50
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10          # number of outputs = number of digits
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2   # how much TRAIN is reserved for VALIDATION

# Loading MNIST dataset.
# Labels are loaded as integers 0-9; they are one-hot encoded below.
mnist = keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# X_train is 60000 rows of 28x28 values; we reshape it to 60000 x 784.
RESHAPED = 784

X_train = X_train.reshape(60000, RESHAPED)
X_test = X_test.reshape(10000, RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs to be within [0, 1]
X_train, X_test = X_train / 255.0, X_test / 255.0
print(X_train.shape[0], ' = train samples')
print(X_test.shape[0], ' = test samples')

# Convert the labels to a one-hot representation
Y_train = tf.keras.utils.to_categorical(Y_train, NB_CLASSES)
Y_test = tf.keras.utils.to_categorical(Y_test, NB_CLASSES)

# Build the model.
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN,
                             input_shape=(RESHAPED,),
                             name='dense_layer', activation='relu'))
model.add(keras.layers.Dense(N_HIDDEN,
                             name='dense_layer_2', activation='relu'))
model.add(keras.layers.Dense(NB_CLASSES,
                             name='dense_layer_3', activation='softmax'))

# Summary of the model.
model.summary()

# Compiling the model.
model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Training the model.
model.fit(X_train, Y_train,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          verbose=VERBOSE, validation_split=VALIDATION_SPLIT)

# Evaluating the model
test_loss, test_acc = model.evaluate(X_test, Y_test)
print('Test accuracy: ', test_acc)

60000  = train samples
10000  = test samples
Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_layer (Dense)         (None, 128)               100480    
                                                                 
 dense_layer_2 (Dense)       (None, 128)               16512     
                                                                 
 dense_layer_3 (Dense)       (None, 10)                1290      
                                                                 
Total params: 118282 (462.04 KB)
Trainable params: 118282 (462.04 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Epoch 1/50


...
Epoch 50/50
Test accuracy:  0.9656000137329102
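
The parameter counts reported by `model.summary()` for this network can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name used only for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Input size followed by the width of each Dense layer in the model
layer_sizes = [784, 128, 128, 10]

counts = [dense_params(a, b)
          for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)        # per-layer parameter counts
print(sum(counts))   # total trainable parameters
```

The three values match the `Param #` column (100480, 16512, 1290) and the total of 118282.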


### Further improving the simple net in TensorFlow with dropout

- Add dropout to the model. This is a simple way of helping to regularize our neural network: during training, a random fraction of each hidden layer's outputs is set to zero.
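
As a rough sketch of what a dropout layer does during training, the NumPy function below zeroes a random fraction of activations and rescales the survivors so the expected value is unchanged ("inverted dropout", which is the behaviour Keras documents for its `Dropout` layer). The function name `dropout` and the all-ones input are illustrative assumptions.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations            # dropout is disabled at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob   # True = unit is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((1000, 128))         # stand-in for a hidden layer's outputs

dropped = dropout(hidden, rate=0.3, rng=rng)
print("fraction zeroed:", np.mean(dropped == 0.0))
print("mean activation:", dropped.mean())
```

With a rate of 0.3, roughly 30% of the values are zeroed and the mean stays close to the original value of 1.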

In [None]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

# Network and training parameters
EPOCHS = 200
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10          # number of outputs = 10 digits
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2   # how much of the dataset to reserve for validation
DROPOUT = 0.3            # fraction of units randomly dropped during training

# Loading our MNIST
# Labels are loaded as integers 0-9; they are one-hot encoded below
mnist = keras.datasets.mnist
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# X_train is 60000 rows of 28x28 values; we reshape it to 60000 x 784
RESHAPED = 784

X_train = X_train.reshape(60000, RESHAPED)
X_test = X_test.reshape(10000, RESHAPED)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

# Normalize inputs within [0, 1].
X_train, X_test = X_train / 255.0, X_test / 255.0
print(X_train.shape[0], ': train samples')
print(X_test.shape[0], ': test samples')

# One-hot representations for labels.
Y_train = tf.keras.utils.to_categorical(Y_train, NB_CLASSES)
Y_test = tf.keras.utils.to_categorical(Y_test, NB_CLASSES)

# Building the model.
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(N_HIDDEN,
                             input_shape=(RESHAPED,),
                             name='dense_layer', activation='relu'))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(N_HIDDEN,
                             name='dense_layer_2', activation='relu'))
model.add(keras.layers.Dropout(DROPOUT))
model.add(keras.layers.Dense(NB_CLASSES,
                             name='dense_layer_3', activation='softmax'))

# Summary of the model.
model.summary()

# Compiling the model.
model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Training the model.
model.fit(X_train, Y_train,
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          verbose=VERBOSE, validation_split=VALIDATION_SPLIT)


# Evaluating the model.
test_loss, test_acc = model.evaluate(X_test, Y_test)
print('Test accuracy:', test_acc)

60000 : train samples
10000 : test samples
Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_layer (Dense)         (None, 128)               100480    
                                                                 
 dropout_8 (Dropout)         (None, 128)               0         
                                                                 
 dense_layer_2 (Dense)       (None, 128)               16512     
                                                                 
 dropout_9 (Dropout)         (None, 128)               0         
                                                                 
 dense_layer_3 (Dense)       (None, 10)                1290      
                                                                 
Total params: 118282 (462.04 KB)
Trainable params: 118282 (462.04 KB)
Non-trainable params: 0 (0.00 Byte)
____________________________________