In [1]:
import numpy as np
import tensorflow as tf

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

### ***Neurons for Vision***

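The first layer in the code cell below, Flatten, isn’t a layer of neurons, but an input
layer specification. Our inputs are 28 × 28 images, but we want them to be treated as a
single series of numeric values, so Flatten takes each two-dimensional image and unrolls
it into a one-dimensional array of 784 values.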
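As a rough illustration of what flattening does to the shape of one input, here is a minimal NumPy sketch. The `image` array is a made-up stand-in for a real picture, and `reshape` is used only to mimic the effect of Flatten on a single image, not to show how Keras implements it.

```python
import numpy as np

# Hypothetical stand-in for one 28 x 28 image: the values are just 0..783.
image = np.arange(28 * 28).reshape(28, 28)

# Flattening lays the rows end to end, giving one vector of 784 values.
flat = image.reshape(-1)

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)
```

The first value of the second row of the image ends up at position 28 of the flattened vector, right after the 28 values of the first row.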

The next one, Dense, is a layer of neurons, and we’re specifying that we want 128 of
them. This is the middle layer. You’ll often hear such layers described as hidden layers.
Layers that are between the inputs and the outputs aren’t seen by a caller, so the term
“hidden” is used to describe them. We’re asking for 128 neurons to have their internal
parameters randomly initialized. Often the question I’ll get asked at this point is
“Why 128?” This is entirely arbitrary; there’s no fixed rule for the number of neurons
to use. As you design the layers you want to pick an appropriate number of neurons to
enable your model to actually learn. More neurons means it will run more slowly, as it
has to learn more parameters. More neurons could also lead to a network that is great at
recognizing the training data, but not so good at recognizing data that it hasn’t previously
seen (this is known as overfitting). On the other hand, fewer neurons means that the model
might not have sufficient parameters to learn.
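To make the cost of extra neurons concrete, the sketch below counts the trainable parameters in a fully connected layer fed by the 784 flattened pixel values. It assumes the usual layout of a dense layer, one weight per input per neuron plus one bias per neuron; `dense_param_count` is a helper name invented for this illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (one per input per neuron) plus one bias per neuron."""
    return n_inputs * n_units + n_units

n_inputs = 28 * 28  # 784 values after flattening

# Compare a few hypothetical layer sizes, including the 128 used here.
for units in (32, 128, 512):
    print(units, dense_param_count(n_inputs, units))
```

With 128 neurons the layer has 100,480 parameters to learn. With 512 neurons it has 401,920, which is why a wider layer trains more slowly.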

***Note***

***The activation function is code that will execute on each neuron in the layer.
TensorFlow supports a number of them, but a very common one in middle layers is relu, which
stands for rectified linear unit. It’s a simple function that returns its input if that
input is greater than 0, and returns 0 otherwise. In this case, we don’t want negative
values being passed to the next layer to potentially impact the summing function, so
instead of writing a lot of if-then code, we can simply activate the layer with relu.***
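The behavior of relu can be sketched in a couple of lines of NumPy. This is an illustrative reimplementation, not the code TensorFlow runs internally, and the `values` array is made up for the example.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, everything else is unchanged.
    return np.maximum(0, x)

# Hypothetical neuron outputs before activation.
values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(values))
```

The two negative entries come out as 0, while 1.5 and 3.0 pass through untouched.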

In [None]:
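# Input specification: unroll each 28 x 28 image into 784 values
flatter = Flatten(input_shape=(28, 28))
# Hidden layer: 128 neurons with relu activation
layer_1 = Dense(128, activation=tf.nn.relu)
# Output layer: 10 neurons with softmax activation
layer_2 = Dense(10, activation=tf.nn.softmax)

# Stack the layers in order, from input to output
model = Sequential(layers=[flatter, layer_1, layer_2])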
