### Copy and Paste from TensorFlow Tutorial with Slight Modification

From this link: https://www.tensorflow.org/tutorials

Pure copy and paste, with comments added.

In [2]:
# this line imports the tensorflow module and uses the name 'tf' to reference it
import tensorflow as tf

# tensorflow comes with utilities to download common datasets used in introductory ML, such as MNIST

mnist = tf.keras.datasets.mnist

# this line tells the code to download the data and cache it
(x_train, y_train),(x_test, y_test) = mnist.load_data()

# this 'normalizes' the data from 0 - 255 integer values to 0.0 - 1.0 float values
x_train, x_test = x_train / 255.0, x_test / 255.0

# the model =>
# the first layer converts each 28x28 array into a single 'flattened' input of 784 values (28 * 28)
# the second layer - the hidden layer - has 512 fully connected nodes. It uses 'relu' activation. More on theory later
# the 'Dropout' layer randomly zeros 20% (the 0.2 rate) of the hidden layer's outputs during training,
# which helps reduce overfitting
# the final layer - the output layer - has 10 nodes and uses 'softmax' activation, where each node represents
# the probability of one digit. The sum of all the node values will equal 1.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# you tell keras/tf to compile the model for use
# adam is an optimizer that typically converges quickly
# sparse_categorical_crossentropy is the 'loss' function to minimize. Training adjusts the weights to reduce
# this loss, so that the output matches the integer 'label' tied to each digit
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# the 'fit' command starts the training; epochs=5 means 5 full passes over the training data
model.fit(x_train, y_train, epochs=5)

# the 'evaluate' command reports how well the model did against the test data. The first value is the loss
# and the second value is the accuracy

model.evaluate(x_test, y_test)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.05931941043589031, 0.9819]
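
To make the comments above more concrete, here is a rough sketch in plain Python (no TensorFlow) of what the normalize, flatten, softmax and loss steps compute for a single sample. The helper names and the sample values are made up for illustration; they are not the Keras implementation, which works on whole batches of tensors.

```python
import math


def normalize(pixels):
    """Scale 0-255 integer pixel values to 0.0-1.0 floats."""
    return [p / 255.0 for p in pixels]


def flatten(image):
    """Turn a 2-D list of rows into one flat list, like a Flatten layer."""
    return [value for row in image for value in row]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1.0."""
    shift = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log of the probability given to the true digit."""
    return -math.log(probs[label])


# A made-up 28x28 'image' with a single bright pixel
image = [[0] * 28 for _ in range(28)]
image[14][14] = 255
flat = normalize(flatten(image))
print(len(flat))  # 784 inputs, i.e. 28 * 28

# Made-up raw scores for the 10 output nodes, favouring digit 3
scores = [0.0] * 10
scores[3] = 5.0
probs = softmax(scores)

# The loss is small when the true label is the favoured digit, large otherwise
loss_right = sparse_categorical_crossentropy(probs, 3)
loss_wrong = sparse_categorical_crossentropy(probs, 7)
print(loss_right < loss_wrong)  # True
```

The label is passed as a plain integer (3, 7) rather than a one-hot vector, which is what the 'sparse' in sparse_categorical_crossentropy refers to.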

### Slightly Modified
Because Keras can leverage backends other than TensorFlow, you should try to make the "code" more generic, for example by avoiding TensorFlow-specific references such as `tf.nn.relu`.

See [keras_ann.ipynb](keras_ann.ipynb), which does the same but provides more information.
