
# Softmax and Sigmoid
In the previous Colab, we used the following CNN architecture:

In [0]:
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),

    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2, activation='softmax')
])

Notice that our last layer (our classifier) consists of a **Dense** layer with **2** output units and a **softmax** activation function, as seen below:

In [0]:
tf.keras.layers.Dense(2, activation='softmax')

Another popular approach for binary classification problems is to use a classifier that consists of a **Dense** layer with **1** output unit and a **sigmoid** activation function, as seen below:

In [0]:
tf.keras.layers.Dense(1, activation='sigmoid')

Either of these two options will work well in a binary classification problem. However, if you decide to use a **sigmoid** activation function in your classifier, you must also change the **loss** parameter in the **model.compile()** method from **'sparse_categorical_crossentropy'** to **'binary_crossentropy'**, as shown below:

In [0]:
model.compile(optimizer='adam', 
              loss='binary_crossentropy',
              metrics=['accuracy'])
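
To see why the two classifiers are interchangeable for two classes, it may help to note that a 2-unit softmax assigns class 1 the same probability as a sigmoid applied to the difference of the two logits, and that the two losses then agree as well. The sketch below checks this in plain Python. The logit values and the helper names `softmax` and `sigmoid` are made up for illustration and are not part of the Colab.

```python
import math

def softmax(logits):
    """Softmax over a list of logits (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic sigmoid of a single logit."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical logits from a 2-unit output layer: [class 0, class 1].
logits = [0.3, 1.7]
label = 1  # integer class label, as used by sparse categorical cross-entropy

# Probability of class 1 under each formulation.
p_softmax = softmax(logits)[1]
p_sigmoid = sigmoid(logits[1] - logits[0])

# Loss for this single example under each formulation.
sparse_cce = -math.log(softmax(logits)[label])
bce = -(label * math.log(p_sigmoid) + (1 - label) * math.log(1.0 - p_sigmoid))

print(p_softmax, p_sigmoid)
print(sparse_cce, bce)
```

The two printed pairs should match up to floating-point rounding, which is why the choice between the two output layers is mostly a matter of convention, as long as the loss matches the output layer.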

# Other Techniques to Prevent Overfitting
In this lesson we saw three different techniques to prevent overfitting:


*   **Early Stopping:** In this method, we track the loss on the validation set during the training phase and use it to determine when to stop training such that the model is accurate but not overfitting.
*   **Image Augmentation:** Artificially boosting the number of images in our training set by **applying random image transformations** to the existing images in the training set.
*   **Dropout:** Randomly removing a fraction of the neurons in a neural network at each training step.
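
As a rough illustration of the early stopping idea, the sketch below stops once the validation loss has failed to improve for a given number of epochs in a row. The function name `early_stopping_epoch`, the `patience` value, and the list of validation losses are all assumptions made up for this example.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses.

    Training stops once the loss has not improved for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Never triggered: training ran for every epoch.
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses: improving at first, then rising (overfitting).
val_losses = [0.69, 0.55, 0.48, 0.50, 0.53, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch, best_epoch)  # 4 2
```

In Keras, the `tf.keras.callbacks.EarlyStopping` callback with `monitor='val_loss'` and a `patience` argument follows a similar idea.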

However, these are not the only techniques available to prevent overfitting. You can read more about these and other techniques in the link below:
[Memorizing is not learning! — 6 tricks to prevent overfitting in machine learning](https://hackernoon.com/memorizing-is-not-learning-6-tricks-to-prevent-overfitting-in-machine-learning-820b091dc42)

# Summary
In this lesson we learned how Convolutional Neural Networks work with color images and saw various techniques that we can use to <u>avoid overfitting</u>. The key points of this lesson are:

CNNs with RGB Images of Different Sizes:
*   **Resizing:** When working with images of different sizes, you must resize all the images to the same size so that they can be fed into a CNN.
*   **Color Images:** Computers interpret color images as 3D arrays.
*   **RGB Image:** Color image composed of 3 color channels: Red, Green, and Blue.
*   **Convolutions:** When working with RGB images we convolve each color channel with its own convolutional filter. Convolutions on each color channel are performed in the same way as with grayscale images, i.e. by performing element-wise multiplication of the convolutional filter (kernel) and a section of the input array and summing the products. The results for the three channels are then added together with a bias value to get the convolved output.
*   **Max Pooling:** When working with RGB images we perform max pooling on each color channel using the same window size and stride. Max pooling on each color channel is performed in the same way as with grayscale images, i.e. by selecting the max value in each window.
*   **Validation Set:** We use a validation set to check how the model is doing during the training phase. Validation sets can be used to perform Early Stopping to prevent overfitting and can also be used to help us compare different models and choose the best one.
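
The convolution and max pooling steps described above can be sketched in plain Python on a tiny image. This is only an illustration under stated assumptions: the 3x3 image, the 2x2 kernels, the bias, and the function names `conv2d_rgb` and `max_pool` are made up here, and real layers such as `Conv2D` learn many such filters at once.

```python
def conv2d_rgb(image, kernels, bias):
    """Convolve an RGB image with one kernel per channel (no padding, stride 1).

    image and kernels are indexed as [channel][row][col]. The per-channel
    results are summed together with the bias into a single 2D output.
    """
    k = len(kernels[0])
    height, width = len(image[0]), len(image[0][0])
    output = []
    for r in range(height - k + 1):
        row = []
        for c in range(width - k + 1):
            total = bias
            for ch in range(len(image)):
                for i in range(k):
                    for j in range(k):
                        total += image[ch][r + i][c + j] * kernels[ch][i][j]
            row.append(total)
        output.append(row)
    return output

def max_pool(channel, size=2, stride=2):
    """Max pooling on a single 2D channel: keep the largest value in each window."""
    return [
        [max(channel[r + i][c + j] for i in range(size) for j in range(size))
         for c in range(0, len(channel[0]) - size + 1, stride)]
        for r in range(0, len(channel) - size + 1, stride)
    ]

# Hypothetical 3x3 RGB image, stored as three 2D arrays (red, green, blue).
image = [
    [[1, 0, 2], [0, 1, 0], [2, 0, 1]],  # red
    [[0, 1, 0], [1, 0, 1], [0, 1, 0]],  # green
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],  # blue
]
# One hypothetical 2x2 kernel per color channel.
kernels = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[1, 1], [1, 1]],
]
convolved = conv2d_rgb(image, kernels, bias=1)
print(convolved)  # [[9, 5], [5, 9]]

# Max pooling is applied to each channel independently with the same window.
channel = [[1, 3, 2, 0], [4, 2, 1, 1], [0, 1, 5, 6], [2, 2, 7, 8]]
pooled = max_pool(channel, size=2, stride=2)
print(pooled)  # [[4, 2], [2, 8]]
```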

Methods to Prevent Overfitting:

*   **Early Stopping:** In this method, we track the loss on the validation set during the training phase and use it to determine when to stop training such that the model is accurate but not overfitting.
*   **Image Augmentation:** Artificially boosting the number of images in our training set by applying random image transformations to the existing images in the training set.
*   **Dropout:** Randomly removing a fraction of the neurons in a neural network at each training step.
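
As a minimal sketch of what dropout does to a layer's activations during training, the example below zeroes each value with probability `rate` and rescales the survivors so the expected sum stays the same (often called inverted dropout). The function name `dropout`, the activation values, and the seed are assumptions chosen for illustration.

```python
import random

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the rest by 1 / (1 - rate)."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]

# Hypothetical activations from a layer with 8 neurons.
activations = [1.0] * 8
rng = random.Random(0)  # seeded so the example is repeatable
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

Dropout is only applied during training. In Keras, the `tf.keras.layers.Dropout` layer takes the drop rate as its first argument and is inactive at inference time.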

You also created and trained a Convolutional Neural Network to classify images of Dogs and Cats with and without Image Augmentation and Dropout. You were able to see that Image Augmentation and Dropout greatly reduce overfitting and improve accuracy. As an exercise, you were able to apply everything you learned in this lesson to create your own CNN to classify images of flowers.









