## First look at a neural network
The MNIST dataset comes preloaded in Keras, a machine-learning library, in the form of four NumPy arrays.

In [4]:
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

### Workflow
1. We'll feed the neural network the training data
2. We'll ask the network to produce predictions for test_images
3. We'll verify whether these predictions match the labels from test_labels

The core building block of neural networks is the *layer*. 

In [5]:
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(512, activation='relu'),
    layers.Dense(10, activation='softmax')
])
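
To make the two `Dense` layers less abstract, here is a minimal NumPy sketch of the computation they perform on a batch of flattened images. The weights below are hypothetical random values, not the ones Keras would initialise or learn, and the helper names (`relu`, `softmax`, `forward`) are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Element-wise max(x, 0).
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row-wise max first for numerical stability.
    shifted = x - x.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical random weights with the same shapes as the two Dense layers.
W1 = rng.normal(scale=0.01, size=(28 * 28, 512))
b1 = np.zeros(512)
W2 = rng.normal(scale=0.01, size=(512, 10))
b2 = np.zeros(10)

def forward(batch):
    hidden = relu(batch @ W1 + b1)      # Dense(512, activation='relu')
    return softmax(hidden @ W2 + b2)    # Dense(10, activation='softmax')

fake_batch = rng.random((4, 28 * 28))   # four fake flattened images in [0, 1)
probs = forward(fake_batch)
print(probs.shape)          # one row of 10 class scores per image
print(probs.sum(axis=1))    # each row sums to 1 (up to rounding)
```

The softmax output can be read as ten probabilities per image, one for each digit class.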

### Compilation
Before the model can be trained, it has to be configured in the *compilation* step, which requires three more choices:
1. An *optimizer*: how will the model update itself, based on the training data it sees?
2. A *loss function*: how will the model measure performance on the training data?
3. *Metrics* to monitor during training and testing: things like accuracy

In [6]:
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
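
As a rough illustration of what the `sparse_categorical_crossentropy` loss measures, the sketch below computes it by hand with NumPy for three made-up predictions. It is a simplification under stated assumptions: the real Keras implementation also clips probabilities to avoid `log(0)`, and the toy probabilities and labels are invented for the example.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    # "Sparse" means the labels are integer class indices, not one-hot vectors.
    # Pick the probability the model assigned to the true class of each sample.
    true_class_probs = probs[np.arange(len(labels)), labels]
    # Average negative log-likelihood over the batch.
    return -np.log(true_class_probs).mean()

# Made-up softmax outputs for three samples and three classes.
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.3, 0.4]])
toy_labels = np.array([0, 1, 0])

loss = sparse_categorical_crossentropy(toy_probs, toy_labels)
print(loss)
```

The loss is small when the true class receives a probability close to 1 and grows as that probability shrinks, which is why the third sample (true class probability 0.3) contributes the most here.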

### Preprocessing
We need to make sure the data has the shape the model expects and that all values are scaled to the [0, 1] interval.
What does the model expect?
- The model expects each image as a flat vector of 28*28 = 784 values, so the training array should have shape (60000, 28*28), with pixel values scaled to the [0, 1] interval.
- Right now, the training images are stored in an array of shape (60000, 28, 28), with integer values in the [0, 255] interval.

In [7]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255
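
The same reshape-and-scale step can be checked on a tiny stand-in array without downloading MNIST. This is a sketch on fake data: `fake_images` is a hypothetical array of random `uint8` pixels that only mimics the layout of the raw images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three fake 28x28 grayscale images with uint8 pixels in [0, 255].
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Flatten each image to a 784-element row, then scale to [0, 1] as float32.
flat = fake_images.reshape((len(fake_images), 28 * 28))
scaled = flat.astype('float32') / 255

print(flat.shape)
print(scaled.dtype, scaled.min(), scaled.max())
```

Reshaping only changes how the values are indexed, so reshaping a row back to `(28, 28)` recovers the original image.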

### Fitting the model
Now we're ready to train the model, which is done by calling the `fit()` method.

In [8]:
model.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.8709 - loss: 0.4453
Epoch 2/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.9654 - loss: 0.1230
Epoch 3/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.9788 - loss: 0.0724
Epoch 4/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.9847 - loss: 0.0524
Epoch 5/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 3s 6ms/step - accuracy: 0.9881 - loss: 0.0387


<keras.src.callbacks.history.History at 0x24c39ad0b90>
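
The `469/469` in the progress output is the number of batches per epoch. A short calculation, using only the sample count, batch size, and epoch count from the cells above, shows where that number comes from:

```python
import math

num_samples = 60000   # training images
batch_size = 128      # as passed to fit()
epochs = 5

# The last batch is smaller when the batch size does not divide the data evenly.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

So each epoch consists of 468 full batches of 128 images plus one final batch of 96, and the model weights are updated once per batch.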

### Evaluating the model

In [26]:
test_digit = test_images[0].reshape((1, 28 * 28))
print("Real label:", test_labels[0])
prediction = model.predict(test_digit)
print(prediction)
print("Predicted label:", prediction.argmax())

Real label: 7
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 25ms/step
[[3.3879812e-07 2.2193107e-09 1.7063499e-05 2.3700810e-05 7.1208581e-11
  1.6858342e-08 4.7437131e-11 9.9995625e-01 4.2810456e-07 2.1193653e-06]]
Predicted label: 7
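
To see how the predicted label follows from the printed vector, the sketch below copies the ten values from the output above into a NumPy array and inspects them. Nothing here calls the model; the variable names are illustrative.

```python
import numpy as np

# The softmax output printed above, one probability per digit class 0-9.
prediction = np.array([[3.3879812e-07, 2.2193107e-09, 1.7063499e-05,
                        2.3700810e-05, 7.1208581e-11, 1.6858342e-08,
                        4.7437131e-11, 9.9995625e-01, 4.2810456e-07,
                        2.1193653e-06]])

predicted_label = int(prediction.argmax())   # index of the largest probability
confidence = float(prediction.max())         # probability of that class
total = float(prediction.sum())              # softmax outputs sum to about 1

print(predicted_label, confidence, total)
```

Calling `argmax()` without an axis works here because there is a single row; for a batch of several images, `prediction.argmax(axis=1)` would give one label per row.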


In [27]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9740 - loss: 0.0850
Test accuracy: 0.9776999950408936
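
The accuracy metric reported here is the fraction of test images whose most probable class matches the true label. A minimal sketch of that calculation, on made-up two-class predictions rather than the real model output:

```python
import numpy as np

# Made-up class probabilities for four samples and their true labels.
toy_probs = np.array([[0.1, 0.9],
                      [0.8, 0.2],
                      [0.3, 0.7],
                      [0.6, 0.4]])
toy_labels = np.array([1, 0, 0, 0])

predicted = toy_probs.argmax(axis=1)          # most probable class per row
accuracy = (predicted == toy_labels).mean()   # fraction of correct predictions

print(predicted, accuracy)
```

Three of the four toy predictions are correct, so the accuracy is 0.75; the same calculation over all 10,000 test images gives the test accuracy.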


In [11]:
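import pickle
# Context managers make sure each file is closed once it has been written.
with open('nn_model.pkl', 'wb') as f:
    pickle.dump(model, f)
with open('standardize_model.pkl', 'wb') as f:
    pickle.dump(train_images, f)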
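
Reading a pickled object back follows the same pattern with `pickle.load`. The sketch below round-trips a small stand-in array through a temporary directory; the file name `example_array.pkl` is hypothetical, and the notebook's own files are not touched.

```python
import os
import pickle
import tempfile

import numpy as np

# A small stand-in for the preprocessed image array.
array = np.linspace(0.0, 1.0, 5, dtype='float32')

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example_array.pkl')   # hypothetical file name
    with open(path, 'wb') as f:
        pickle.dump(array, f)
    with open(path, 'rb') as f:
        restored = pickle.load(f)

print(np.array_equal(array, restored), restored.dtype)
```

Because `pickle.load` can execute arbitrary code, it should only be used on files from a trusted source.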