
In [37]:
from tensorflow import keras

In [38]:
keras.__version__

'3.8.0'

## The mathematical building blocks of neural networks

The problem we’re trying to solve here is to classify grayscale images of handwritten digits (28 × 28 pixels) into their 10 categories (0 through 9). We’ll use the MNIST
dataset, a classic in the machine learning community, which has been around almost
as long as the field itself and has been intensively studied. It’s a set of 60,000 training
images, plus 10,000 test images, assembled by the National Institute of Standards and
Technology (the NIST in MNIST) in the 1980s. You can think of “solving” MNIST as
the “Hello World” of deep learning—it’s what you do to verify that your algorithms are
working as expected.

In [17]:
from tensorflow.keras.datasets import mnist
# Importing mnist loads NumPy as a dependency of Keras, but it does not bind the name
# `numpy` (or `np`) in this namespace; run `import numpy as np` before calling NumPy directly.

In [18]:
((train_images, train_labels), (test_images, test_labels)) = mnist.load_data()

In [19]:
type(train_images), type(train_labels) # numpy arrays

(numpy.ndarray, numpy.ndarray)

In [20]:
print(train_images.shape)
print(train_labels.shape)
print(test_images.shape)
print(test_labels.shape)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)


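So each image array has shape (m, 28, 28), where `m` is the number of examples (60,000 for training, 10,000 for testing), and each label array has shape (m,).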
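To make the axis layout concrete without downloading anything, here is a minimal sketch that uses a synthetic stand-in for the dataset. The array names and the choice of 6 examples are assumptions for illustration only.

```python
import numpy as np

# Synthetic stand-in for MNIST (assumption): 6 blank 28 x 28 "images" and 6 labels.
images = np.zeros((6, 28, 28), dtype=np.uint8)
labels = np.zeros(6, dtype=np.uint8)

# Axis 0 indexes examples; axes 1 and 2 are pixel rows and columns.
m, height, width = images.shape
print(m, height, width)        # 6 28 28
print(images[0].shape)         # (28, 28): a single image
print(labels.shape == (m,))    # True: one label per image
```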

In [21]:
train_labels # dtype is uint8

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [22]:
train_images[0] # dtype is uint8

In [23]:
train_images[0].dtype

dtype('uint8')

**The network architecture**

In [24]:
from tensorflow.keras import layers
from tensorflow import keras

In [25]:
model = keras.Sequential(
    [layers.Dense(512, activation="relu"),
     layers.Dense(10, activation="softmax")]
)
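As a rough illustration of what these two `Dense` layers compute, the forward pass can be sketched in plain NumPy. This is not how Keras implements it internally; the random weights, the helper names (`dense`, `relu`, `softmax`), and the batch of 4 inputs are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # Element-wise max(z, 0).
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dense(x, W, b, activation):
    # A dense layer: affine transform followed by an activation.
    return activation(x @ W + b)

# Hypothetical batch of 4 flattened images and randomly initialized weights.
x = rng.random((4, 784)).astype("float32")
W1 = rng.normal(scale=0.05, size=(784, 512)).astype("float32")
b1 = np.zeros(512, dtype="float32")
W2 = rng.normal(scale=0.05, size=(512, 10)).astype("float32")
b2 = np.zeros(10, dtype="float32")

hidden = dense(x, W1, b1, relu)         # shape (4, 512)
probs = dense(hidden, W2, b2, softmax)  # shape (4, 10)
print(hidden.shape, probs.shape)        # (4, 512) (4, 10)
```

Each row of `probs` is non-negative and sums to 1 (up to floating-point error), which is why the 10 outputs can be read as class probabilities.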

In [26]:
model.compile(optimizer="rmsprop",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"]
              )
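The `sparse_categorical_crossentropy` loss takes integer class labels (rather than one-hot vectors) and penalizes the negative log of the probability assigned to the true class. A minimal NumPy sketch of that idea follows; the function below is a hypothetical re-implementation for illustration, and details such as the clipping value are assumptions, not the Keras source.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs, eps=1e-7):
    # Clip to avoid log(0); eps is an assumed value for this sketch.
    probs = np.clip(probs, eps, 1.0)
    # Pick, for each row, the probability assigned to the true class.
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked)

# Two toy predictions over 3 classes, with integer labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
labels = np.array([0, 1])

losses = sparse_categorical_crossentropy(labels, probs)
mean_loss = losses.mean()
print(losses.round(4))              # [0.3567 2.3026]
print(round(float(mean_loss), 4))   # 1.3296
```

The second example is penalized far more heavily because the model gave its true class a probability of only 0.1.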

In [27]:
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype("float32") / 255


In [28]:
train_images.shape, train_images.dtype

((60000, 784), dtype('float32'))
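The preprocessing step flattens each 28 × 28 image into a 784-element row and rescales pixel values from [0, 255] to [0, 1]. The sketch below checks both effects on a small synthetic array; the random data and the variable names are assumptions made so the example runs without the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in (assumption): 5 random uint8 images of 28 x 28 pixels.
raw = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

flat = raw.reshape((raw.shape[0], 28 * 28))   # (5, 784)
scaled = flat.astype("float32") / 255         # values now in [0, 1]

print(flat.shape)     # (5, 784)
print(scaled.dtype)   # float32

# Row-major flattening: pixel (row r, column c) lands at index r * 28 + c.
r, c = 3, 4
print(flat[0, r * 28 + c] == raw[0, r, c])    # True
```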

**"fitting" the model:**

In [29]:
model.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 6s 10ms/step - accuracy: 0.8763 - loss: 0.4362
Epoch 2/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 4s 8ms/step - accuracy: 0.9666 - loss: 0.1170
Epoch 3/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.9792 - loss: 0.0725
Epoch 4/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 6s 9ms/step - accuracy: 0.9850 - loss: 0.0505
Epoch 5/5
469/469 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step - accuracy: 0.9886 - loss: 0.0378


<keras.src.callbacks.history.History at 0x7cafaa2b30d0>
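The `469/469` in the log is the number of mini-batches per epoch. A quick arithmetic check, assuming the final partial batch is kept rather than dropped (which is consistent with the count shown in the log):

```python
import math

num_samples = 60000
batch_size = 128
epochs = 5

steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)           # 469
print(last_batch_size)           # 96
print(steps_per_epoch * epochs)  # 2345 gradient updates in total
```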

**using the model to make predictions**

In [32]:
test_digits = test_images[:15]
predictions = model.predict(test_digits)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 57ms/step


In [33]:
predictions.shape

(15, 10)

In [35]:
predictions[0]
# The value at index i of this array is the predicted probability that the digit image
# test_digits[0] belongs to class i.

array([2.3604947e-08, 2.0811903e-09, 1.1447105e-06, 1.1723975e-04,
       1.5974409e-11, 1.0248633e-07, 9.0721983e-13, 9.9987793e-01,
       1.4231695e-07, 3.3385757e-06], dtype=float32)

In [36]:
predictions[0].argmax()

np.int64(7)

In [40]:
predictions[0][7] # the model assigns class 7 a probability of about 99.99% for this digit image

np.float32(0.9998779)

In [42]:
test_labels[0] # test label agrees

np.uint8(7)
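The same argmax-and-compare check can be applied to a whole batch at once. The sketch below uses a small hand-written probability matrix in place of the model's output; the values and names are assumptions for illustration.

```python
import numpy as np

# Toy stand-in for model output: 3 examples, 3 classes (each row sums to 1).
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4]])
labels = np.array([1, 0, 1])

predicted = probs.argmax(axis=1)         # most probable class per row
confidence = probs.max(axis=1)           # probability of that class
accuracy = (predicted == labels).mean()  # fraction of correct predictions

print(predicted)    # [1 0 2]
print(confidence)   # [0.7 0.8 0.4]
```

Here two of the three predictions match their labels, so `accuracy` is 2/3; this is the same quantity that the `"accuracy"` metric reports over a dataset.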

In [43]:
test_loss, test_acc = model.evaluate(test_images, test_labels)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9782 - loss: 0.0727


In [44]:
print("test_loss:", test_loss)
print("test accuracy:", test_acc)

test_loss: 0.059472016990184784
test accuracy: 0.9819999933242798


In [50]:
model.metrics_names

['loss', 'compile_metrics']

**Data representations for neural networks:**

In [49]:
print(train_images.ndim) # 2D array or rank-2 tensor
print(train_labels.ndim) # 1D array

2
1


In [52]:
import numpy as np
a = np.array([12]) # 1-dimensional array with a single element, i.e. a rank-1 tensor
b = np.array(12) # 0 dimensional array, i.e. a scalar, rank-0 tensor

In [57]:
b

array(12)

In [55]:
5 + b

np.int64(17)

In [56]:
b + 5

np.int64(17)

In [58]:
b.ndim

0

In [60]:
b.shape # empty tuple

()
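Extending the scalar example, the rank of a tensor is simply its number of axes, which NumPy exposes as `ndim`. A short sketch with arbitrarily chosen example values:

```python
import numpy as np

scalar = np.array(12)                         # rank 0
vector = np.array([12, 3, 6, 14, 7])          # rank 1
matrix = np.array([[5, 78, 2], [6, 79, 3]])   # rank 2
cube = np.zeros((2, 3, 4))                    # rank 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, t.ndim, t.shape)
# scalar 0 ()
# vector 1 (5,)
# matrix 2 (2, 3)
# cube 3 (2, 3, 4)
```

Note that `vector` has 5 entries but only one axis, so its rank is 1; the length of `shape` always equals `ndim`.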