
# **A first look at a neural network**

**Note on classes and labels**

In machine learning:
* A category in a classification problem is called a class.
* Data points are called samples.
* The class associated with a specific sample is called a label.


In [71]:
# !pip install keras.datasets

Collecting keras.datasets
  Using cached https://files.pythonhosted.org/packages/07/20/9ed10cd3247cc29c362c77c52d820928ab4f955b7e1ba9e77a288b4c5f3c/keras_datasets-0.1.0-py2.py3-none-any.whl
Installing collected packages: keras.datasets
Successfully installed keras.datasets


In [0]:
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

In [0]:
from keras.datasets import mnist 
# The MNIST dataset comes preloaded in Keras
# as a set of four NumPy arrays

* `train_images` and `train_labels` form the training set, the data that the model will learn from. 
* The model will then be tested on the test set, `test_images` and `test_labels`.

In [0]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

In [0]:
# The images are encoded as NumPy arrays.
# The labels are an array of digits (ranging from 0 to 9).
# The images and labels have a one-to-one correspondence.

In [76]:
train_images.shape

(60000, 28, 28)

In [77]:
train_images[0].shape

(28, 28)

In [78]:
len(train_labels)

60000

In [79]:
train_labels  # display the array of training labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [80]:
test_images.shape

(10000, 28, 28)

In [81]:
len(test_labels)

10000

In [82]:
test_labels  # display the array of test labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)
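
The shapes and lengths inspected above can also be checked programmatically. The following is a minimal sketch, assuming only NumPy; `check_split`, `demo_images`, and `demo_labels` are hypothetical names, and the arrays are random stand-ins with the same layout as the MNIST arrays rather than the real data.

```python
import numpy as np

# Hypothetical stand-ins with the same layout as the MNIST arrays:
# images are (num_samples, 28, 28) uint8, labels are (num_samples,) uint8.
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
demo_labels = rng.integers(0, 10, size=(100,), dtype=np.uint8)


def check_split(images, labels, num_classes=10):
    """Return True if images and labels line up one-to-one and labels are valid."""
    same_length = len(images) == len(labels)
    labels_in_range = labels.min() >= 0 and labels.max() < num_classes
    return bool(same_length and labels_in_range)


print(check_split(demo_images, demo_labels))
```

With the real data, the same helper could be called as `check_split(train_images, train_labels)` right after loading.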

**The network architecture**

Let’s build the network—again, remember that you aren’t expected to understand
everything about this example yet.

In [0]:
from keras import models
from keras import layers

In [0]:
network = models.Sequential()

layer_one = layers.Dense(512, activation='relu', input_shape=(28 * 28,))
layer_two = layers.Dense(10, activation='softmax')

network.add(layer_one)
network.add(layer_two)

 * The core building block of neural networks is the layer: a data-processing module that you can think of as a filter for data. Some data goes in, and it comes out in a more useful form.
 * Here, our network consists of a sequence of two Dense layers, which are densely connected (also called fully connected) neural layers.
 * The second (and last) layer is a 10-way softmax layer, which means it will return an array of 10 probability scores (summing to 1). Each score will be the probability that the current digit image belongs to one of our 10 digit classes.
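
To make the "10 probability scores summing to 1" concrete, here is a rough NumPy sketch of what a forward pass through two densely connected layers computes. It is an illustration under stated assumptions, not how Keras implements `Dense`: the weights are random and untrained, and the names `relu`, `softmax`, `w1`, `b1`, `w2`, `b2`, and `batch` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(x):
    # Keep positive values, replace negative values with 0.
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = x - x.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)


# Randomly initialized, untrained weights (illustrative only).
w1 = rng.normal(scale=0.05, size=(28 * 28, 512))
b1 = np.zeros(512)
w2 = rng.normal(scale=0.05, size=(512, 10))
b2 = np.zeros(10)

# A hypothetical batch of 4 flattened images with values in [0, 1].
batch = rng.random((4, 28 * 28))

hidden = relu(batch @ w1 + b1)      # shape (4, 512)
probs = softmax(hidden @ w2 + b2)   # shape (4, 10)

print(probs.shape)
print(probs.sum(axis=1))  # each row sums to 1, up to floating-point error
```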

**The compilation step**

To make the network ready for training, we need to pick three more things, as part
of the compilation step:
* A loss function: how the network will be able to measure its performance on the training data, and thus how it will be able to steer itself in the right direction.
* An optimizer: the mechanism through which the network will update itself based on the data it sees and its loss function.
* Metrics to monitor during training and testing: here, we’ll only care about accuracy (the fraction of the images that were correctly classified).

The exact purpose of the loss function and the optimizer will be made clear throughout the next two chapters.
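
As a rough illustration of two of these ingredients, the sketch below computes a categorical cross-entropy loss and an accuracy metric by hand with NumPy. The function names, the clipping constant `eps`, and the tiny 3-class example are assumptions made for this sketch; Keras's own implementations may differ in detail.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(-np.mean(np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Fraction of samples whose highest-probability class matches the target."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))


# Hypothetical 3-class example: two samples, the second one misclassified.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.6, 0.3, 0.1]])

print(categorical_crossentropy(y_true, y_pred))
print(accuracy(y_true, y_pred))
```

The loss falls as the predicted probability of the correct class rises, which is what gives the optimizer a direction to move in. Accuracy only counts whether the top-scoring class is right.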



In [0]:
network.compile(
    optimizer='rmsprop', 
    loss='categorical_crossentropy',
    metrics=['accuracy'])

* Before training, we’ll preprocess the data by reshaping it into the shape the network expects and scaling it so that all values are in the [0, 1] interval.
* Previously, our training images, for instance, were stored in an array of shape (60000, 28, 28), that is (images, rows, columns), of type uint8 with values in the [0, 255] interval.
* We transform it into a float32 array of shape (60000, 28 * 28) with values between 0 and 1.

**Preparing the image data**

In [86]:
# Train Image
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

# Test Image
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

print(train_images.shape)

(60000, 784)
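
The cell above hard-codes the sample counts (60000 and 10000). One possible variation, sketched here under the assumption that the input is a `(n, height, width)` uint8 array, lets NumPy infer the flattened size with `-1`. The names `flatten_and_scale` and `demo` are hypothetical, and `demo` is a synthetic stand-in, not MNIST data.

```python
import numpy as np


def flatten_and_scale(images):
    """Flatten (n, h, w) uint8 images to (n, h * w) float32 values in [0, 1]."""
    flat = images.reshape((len(images), -1))
    return flat.astype("float32") / 255


# Hypothetical stand-in for a small batch of 28x28 grayscale images.
demo = (np.arange(2 * 28 * 28) % 256).astype(np.uint8).reshape((2, 28, 28))
prepared = flatten_and_scale(demo)

print(prepared.shape, prepared.dtype)
print(prepared.min(), prepared.max())
```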


**Preparing the labels**

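We also need to categorically encode the labels, that is, turn each integer label into a vector of 10 values with a 1 at the index of the label and 0 everywhere else.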

In [0]:
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
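
To see what categorical encoding produces, here is a minimal NumPy sketch of the same idea. The helper name `one_hot` is hypothetical, and this is only an illustration, not the implementation behind `to_categorical`.

```python
import numpy as np


def one_hot(labels, num_classes=10):
    """Encode integer labels as rows with a single 1 at the label's index."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


# The first three training labels shown earlier are 5, 0 and 4.
print(one_hot(np.array([5, 0, 4])))
```

Each row has exactly one nonzero entry, so taking `argmax` along each row recovers the original digit.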

We’re now ready to train the network, which in Keras is done via a call to the network’s fit method—we fit the model to its training data:


In [88]:
network.fit(train_images, train_labels, epochs=5, batch_size=128)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f1e8cee8be0>
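
The `epochs` and `batch_size` arguments determine how many weight updates the network performs. A quick back-of-the-envelope calculation, assuming the final, smaller batch of each epoch is also processed:

```python
import math

num_samples = 60000   # size of the training set
batch_size = 128      # samples per gradient update, as passed to fit
epochs = 5            # full passes over the training set

# The last batch of each epoch is smaller, hence the ceiling.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 469
print(total_updates)    # 2345
```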

Two quantities are displayed during training: the loss of the network over the training
data, and the accuracy of the network over the training data.
 We quickly reach an accuracy of 0.989 (98.9%) on the training data. Now let’s
check that the model performs well on the test set, too:



In [89]:
test_loss, test_acc = network.evaluate(test_images, test_labels)



In [90]:
print('test_acc:', test_acc)

test_acc: 0.9803


The **test-set accuracy** turns out to be **98.03%**—that’s quite a bit lower than the training
set accuracy. 

This **gap between training accuracy and test accuracy** is an example of
**overfitting**: the fact that machine-learning models tend to perform worse on new data
than on their training data. 
Overfitting is a central topic in chapter 3.
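
To put the gap in concrete terms, the short calculation below uses only the figures reported in this notebook (training accuracy 0.989, test accuracy 0.9803, 10,000 test images). The variable names are chosen for this illustration.

```python
train_acc = 0.989    # training accuracy quoted above
test_acc = 0.9803    # test accuracy printed above
num_test = 10000     # number of test images

gap = train_acc - test_acc
test_errors = round((1 - test_acc) * num_test)

print(f"accuracy gap: {gap:.4f}")                    # accuracy gap: 0.0087
print(f"misclassified test images: {test_errors}")   # misclassified test images: 197
```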
 
This concludes our first example. You just saw how you can build and train a neural network to classify handwritten digits in fewer than 20 lines of Python code.

In the
next chapter, I’ll go into detail about every moving piece we just previewed and clarify
what’s going on behind the scenes. You’ll learn about tensors, the data-storing objects
going into the network; tensor operations, which layers are made of; and gradient
descent, which allows your network to learn from its training examples. 
