In [None]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

## Building the Model -

A Keras model is built using either the `Sequential` class, which represents a linear stack of layers, or the functional `Model` class, which is more customizable.
We will use the `Sequential` class. Its constructor takes a list of Keras layers. Since we are building a standard feedforward network, we only need the `Dense` layer.
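
To make "a linear stack of `Dense` layers" concrete, here is a minimal NumPy sketch of what such a stack computes. It is an illustration only, not how Keras is implemented: the helper names `dense` and `relu`, the random weights, and the tiny layer sizes are all hypothetical.

```python
import numpy as np

def relu(z):
    # ReLU keeps positive values and zeroes out negatives.
    return np.maximum(z, 0.0)

def dense(x, kernel, bias, activation):
    # A fully connected layer: activation(x @ kernel + bias).
    return activation(x @ kernel + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))  # hypothetical batch: 2 samples, 4 features

# A "linear stack": each layer's output is the next layer's input.
layers = [
    (rng.normal(size=(4, 3)), np.zeros(3), relu),
    (rng.normal(size=(3, 2)), np.zeros(2), relu),
]

out = x
for kernel, bias, activation in layers:
    out = dense(out, kernel, bias, activation)

print(out.shape)  # (2, 2)
```

`Sequential` handles this chaining for us, which is why a plain list of layers is enough to describe the network.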

In [None]:
model = Sequential([
    Dense(64, activation='relu', input_shape=(28 * 28, )),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])

The first two layers have 64 nodes each and use the `ReLU` activation function. The last layer is a softmax output layer with 10 nodes, one for each digit.
The first layer also has to be told the input shape, which here is `(28 * 28,)`: each 28 × 28 image flattened into a vector of 784 values.
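
As a quick check of the shapes involved, the sketch below flattens a batch of images to match the input shape and counts the trainable parameters of the three layers by hand. The `images` array is a hypothetical stand-in for real data, and the count assumes each `Dense` layer has one weight per input-unit pair plus one bias per unit.

```python
import numpy as np

# Hypothetical stand-in for a batch of 5 grayscale 28 x 28 images.
images = np.zeros((5, 28, 28), dtype="float32")

# Flatten each image into a 784-element vector to match input_shape=(28 * 28,).
flat = images.reshape(len(images), 28 * 28)
print(flat.shape)  # (5, 784)

# Each Dense layer holds (inputs * units) weights plus one bias per unit.
layer_sizes = [28 * 28, 64, 64, 10]
param_counts = [
    n_in * n_out + n_out
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
print(param_counts, sum(param_counts))  # [50240, 4160, 650] 55050
```

Running `model.summary()` on the model defined above should report the same per-layer counts.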

## Compiling the Model -

Before training, we have to configure the training process by choosing three things:

- **Optimizer** - We will use the `Adam` optimizer.
- **Loss Function** - Since we are using a `Softmax` output, we'll use the `cross-entropy loss`. Keras provides `binary_crossentropy` (2 classes) and `categorical_crossentropy` (>2 classes, with one-hot encoded labels). With 10 digit classes, we use `categorical_crossentropy`.
- **Metrics** - A list of metrics. Since this is a classification task, we can use `accuracy`.
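
To show what the loss measures, here is a minimal NumPy sketch of softmax followed by categorical cross-entropy for a single one-hot label. The function names and the clipping constant `eps` are assumptions made for illustration; Keras's own implementation may differ in such details.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); eps is an illustrative choice.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# One sample whose true digit is 3, as a one-hot vector of length 10.
y_true = np.eye(10)[[3]]

# Uninformative logits: every class gets probability 0.1.
probs = softmax(np.zeros((1, 10)))
loss = categorical_crossentropy(y_true, probs)
print(loss)  # about 2.3026, i.e. ln(10)

# Logits that favor the correct class give a smaller loss.
confident_logits = np.zeros((1, 10))
confident_logits[0, 3] = 5.0
confident_loss = categorical_crossentropy(y_true, softmax(confident_logits))
print(confident_loss < loss)  # [ True]
```

The loss only looks at the probability assigned to the true class, so it shrinks toward 0 as that probability approaches 1.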

In [None]:
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy'],
)