<a href="https://cognitiveclass.ai"><img src = "https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/Logos/organization_logo/organization_logo.png" width = 400> </a>

<h1 align=center><font size = 5>Peer Review Final Assignment</font></h1>

## Introduction


In this lab, you will build an image classifier using the VGG16 pre-trained model, and you will evaluate it and compare its performance to the model we built in the last module using the ResNet50 pre-trained model. Good luck!

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>    

1. <a href="#item41">Download Data</a>
2. <a href="#item42">Part 1</a>
3. <a href="#item43">Part 2</a>  
4. <a href="#item44">Part 3</a>  

</font>
    
</div>

<a id="item41"></a>

## Download Data

Use the <code>wget</code> command to download the data for this assignment from here: https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0321EN/data/concrete_data_week4.zip

Use the following cells to download the data.

In [1]:
# in Anaconda
import wget
data_file = wget.download('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0321EN/data/concrete_data_week4.zip')

In [2]:
# in Anaconda
from zipfile import ZipFile
with ZipFile(data_file, 'r') as zipObj:
   # Extract all the contents of zip file in current directory
   zipObj.extractall()

After you unzip the data, you will find that it has already been divided into train, validation, and test sets.
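Before training, it can help to confirm how many images each split contains. The sketch below assumes the archive unpacks into a `split/class/` layout (for example `train/negative`, `train/positive`); the helper name `count_images` is hypothetical, and the demonstration runs on a small throwaway directory rather than the real data.

```python
import os
import tempfile


def count_images(root):
    """Return {split: {class_name: file_count}} for a root/split/class/ layout."""
    counts = {}
    for split in sorted(os.listdir(root)):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue
        counts[split] = {}
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if os.path.isdir(class_dir):
                counts[split][class_name] = len(os.listdir(class_dir))
    return counts


# Demonstration on a tiny temporary tree with empty placeholder files.
with tempfile.TemporaryDirectory() as demo_root:
    layout = [("train", "negative", 3), ("train", "positive", 2), ("test", "negative", 1)]
    for split, class_name, n_files in layout:
        class_dir = os.path.join(demo_root, split, class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, f"{i}.jpg"), "w").close()
    demo_counts = count_images(demo_root)

print(demo_counts)
```

To inspect the real data, call `count_images('concrete_data_week4')` after the extraction cell has run.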

<a id="item42"></a>

## Part 1

In this part, you will design a classifier using the VGG16 pre-trained model. Just like the ResNet50 model, you can import the model <code>VGG16</code> from <code>keras.applications</code>.

You will essentially build your classifier as follows:
1. Import libraries, modules, and packages you will need. Make sure to import the *preprocess_input* function from <code>keras.applications.vgg16</code>.
2. Use a batch size of 100 images for both training and validation.
3. Construct an ImageDataGenerator for the training set and another one for the validation set. VGG16 was originally trained on 224 × 224 images, so make sure to address that when defining the ImageDataGenerator instances.
4. Create a sequential model using Keras. Add the VGG16 model to it, followed by a dense layer.
5. Compile the model using the adam optimizer and the categorical_crossentropy loss function.
6. Fit the model on the augmented data using the ImageDataGenerators.

Use the following cells to create your classifier.

In [6]:
# 1. Import Libraries and Packages
import keras
from keras.models import Sequential
#from keras.models import load_model
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input

In [7]:
# 2. Define Global Constants
num_classes = 2
image_resize = 224
batch_size_training = 100
batch_size_validation = 100

In [8]:
# 3. Construct an ImageDataGenerator 
data_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input,
)
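For intuition about what the `preprocess_input` function from `keras.applications.vgg16` does to each image: it reorders the channels from RGB to BGR and subtracts the per-channel ImageNet means, without rescaling. The NumPy sketch below is a hypothetical re-implementation for illustration only (the name `vgg16_style_preprocess` is not part of Keras); in the notebook, keep using the library function.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by Keras's "caffe"-style preprocessing.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg16_style_preprocess(rgb_batch):
    """Illustrative only: convert RGB to BGR, then zero-center each channel."""
    bgr = rgb_batch[..., ::-1].astype(np.float32)  # reverse the channel axis
    return bgr - IMAGENET_BGR_MEANS


# One 2x2 "image" whose pixels are all pure red (R=255, G=0, B=0).
demo = np.zeros((1, 2, 2, 3), dtype=np.float32)
demo[..., 0] = 255.0
processed = vgg16_style_preprocess(demo)
print(processed[0, 0, 0])  # channel order is now B, G, R
```

The red value ends up in the last channel, which is why the same generator should not be reused for a model that expects different preprocessing.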

In [9]:
# For training and validation, class_mode='categorical' works because negative and positive images sit in separate subdirectories
train_generator = data_generator.flow_from_directory(
    'concrete_data_week4/train',
    target_size=(image_resize, image_resize),
    batch_size=batch_size_training,
    class_mode='categorical')

Found 30001 images belonging to 2 classes.


In [10]:
validation_generator = data_generator.flow_from_directory(
    'concrete_data_week4/valid',
    target_size=(image_resize, image_resize),
    batch_size=batch_size_validation,
    class_mode='categorical')

Found 9501 images belonging to 2 classes.


In [11]:
# 4. Build model
model_vgg = Sequential()
model_vgg.add(VGG16(
    include_top=False,
    pooling='avg',
    weights='imagenet',
    ))

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


In [12]:
# 4 (continued). Add the classification head and freeze the pre-trained base
model_vgg.add(Dense(num_classes, activation='softmax'))
model_vgg.layers[0].trainable = False  # layers[0] is the pre-trained VGG16 base; only the new Dense layer is trained
model_vgg.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
dense (Dense)                (None, 2)                 1026      
Total params: 14,715,714
Trainable params: 1,026
Non-trainable params: 14,714,688
_________________________________________________________________
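The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output unit. A small sketch of that arithmetic follows; the helper name `dense_param_count` is hypothetical, and the VGG16 figure is copied from the summary rather than computed.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


head_params = dense_param_count(512, 2)  # 512 pooled VGG16 features -> 2 classes
vgg_base_params = 14_714_688             # copied from the model summary
total_params = vgg_base_params + head_params
print(head_params, total_params)
```

With the base frozen, only the `head_params` weights are updated during training, which matches the "Trainable params" line.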


In [13]:
# Compile with the Adam optimizer and categorical cross-entropy loss (step 5)
model_vgg.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [14]:
# 6. Fit model
# len(generator) is the number of batches per epoch: the number of images divided by the batch size, rounded up
steps_per_epoch_training = len(train_generator)
steps_per_epoch_validation = len(validation_generator)
num_epochs = 2

In [18]:
print(steps_per_epoch_training)
print(steps_per_epoch_validation)

301
96
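These step counts follow from rounding up the number of images divided by the batch size, so the final batch of each epoch may be smaller than the rest. A minimal sketch of that arithmetic, using the image counts reported by the generators (the helper name `steps_per_epoch` is hypothetical):

```python
import math


def steps_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)


train_steps = steps_per_epoch(30001, 100)
valid_steps = steps_per_epoch(9501, 100)
# Size of the final, partial training batch.
last_train_batch = 30001 - (train_steps - 1) * 100
print(train_steps, valid_steps, last_train_batch)
```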


In [19]:
# fit_generator() is deprecated; fit() accepts the generator directly in place of separate X and y arrays
fit_history_vgg = model_vgg.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch_training,
    epochs=num_epochs,
    validation_data=validation_generator,
    validation_steps=steps_per_epoch_validation,
    verbose=1,
)

Epoch 1/2
Epoch 2/2


In [20]:
# Returned "history" object holds a record of the loss values and metric values during training
fit_history_vgg.history

{'loss': [0.14675107598304749, 0.02152850665152073],
 'accuracy': [0.9462018013000488, 0.9950668215751648],
 'val_loss': [0.028304990381002426, 0.016339825466275215],
 'val_accuracy': [0.9933691024780273, 0.9961056709289551]}
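The history dictionary is easier to read as one line per epoch. The sketch below copies the values printed above into a plain dictionary so that it runs on its own; in the notebook you could pass `fit_history_vgg.history` instead.

```python
# Values copied from the training output above.
history = {
    'loss': [0.14675107598304749, 0.02152850665152073],
    'accuracy': [0.9462018013000488, 0.9950668215751648],
    'val_loss': [0.028304990381002426, 0.016339825466275215],
    'val_accuracy': [0.9933691024780273, 0.9961056709289551],
}

n_epochs = len(history['loss'])
for epoch in range(n_epochs):
    print(
        f"epoch {epoch + 1}: "
        f"loss={history['loss'][epoch]:.4f}, acc={history['accuracy'][epoch]:.4f}, "
        f"val_loss={history['val_loss'][epoch]:.4f}, val_acc={history['val_accuracy'][epoch]:.4f}"
    )

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(n_epochs), key=lambda e: history['val_loss'][e]) + 1
print("best epoch by val_loss:", best_epoch)
```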

In [21]:
model_vgg.save('classifier_vgg_model.h5')

<a id="item43"></a>

## Part 2

In this part, you will evaluate your deep learning models on the test data. For this part, you will need to do the following:

1. Load your saved model that was built using the ResNet50 model. 
2. Construct an ImageDataGenerator for the test set. For this ImageDataGenerator instance, you only need to pass the directory of the test images, target size, and the **shuffle** parameter and set it to False.
3. Use the **evaluate_generator** method to evaluate your models on the test data, by passing the above ImageDataGenerator as an argument. You can learn more about **evaluate_generator** [here](https://keras.io/models/sequential/).
4. Print the performance of the classifier using the VGG16 pre-trained model.
5. Print the performance of the classifier using the ResNet pre-trained model.


Use the following cells to evaluate your models.

In [39]:
# 1. Load ResNet50
#model_res = tf.keras.models.load_model('classifier_resnet_model.h5')
# or
#from keras.models import load_model
#model_res = load_model('classifier_resnet_model.h5')
model_res = keras.models.load_model('classifier_resnet_model.h5')

In [33]:
# 2. Test Generator

# Use class_mode='categorical' so the generator yields labels along with the images, which evaluate() needs
test_generator = data_generator.flow_from_directory(
    'concrete_data_week4/test',
    target_size=(image_resize, image_resize),
    batch_size=1,
    class_mode='categorical',
    shuffle=False)
# Alternative with class_mode=None (yields images only, no labels):
# test_generator = data_generator.flow_from_directory(
#     'concrete_data_week4/test',
#     target_size=(image_resize, image_resize),
#     batch_size=1,
#     class_mode=None,
#     shuffle=False)

Found 500 images belonging to 2 classes.

In [34]:
# 3. Evaluate
# len(generator) is the number of batches: the number of images divided by the batch size, rounded up
steps_testing = len(test_generator)

In [35]:
print(steps_testing)

500


In [36]:
# evaluate_generator() is deprecated; evaluate() accepts the generator directly in place of separate X and y arrays
eval_results_vgg = model_vgg.evaluate(
    test_generator,
    steps = steps_testing,
    verbose=1,
)



In [40]:
# for ResNet
data_generator_res = ImageDataGenerator(
    preprocessing_function=keras.applications.resnet50.preprocess_input,
)

In [41]:
test_generator_res = data_generator_res.flow_from_directory(
    'concrete_data_week4/test',
    target_size=(image_resize, image_resize),
    batch_size=1,
    class_mode='categorical',
    shuffle=False)

Found 500 images belonging to 2 classes.


In [42]:
eval_results_res = model_res.evaluate(
    test_generator_res,
    steps = steps_testing,
    verbose=1,
)



In [43]:
# 4. Results for VGG
print("test loss, test acc:", eval_results_vgg)

test loss, test acc: [0.01485713291913271, 0.9959999918937683]


In [44]:
# 5. Results for ResNet
print("test loss, test acc:", eval_results_res)

test loss, test acc: [0.001972020138055086, 1.0]
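With only 500 test images, accuracy translates directly into a count of misclassified images, which makes the two models easier to compare. A small sketch using the accuracies printed above (the helper name `misclassified` is hypothetical):

```python
def misclassified(accuracy, n_images):
    """Approximate number of wrongly classified images for a given accuracy."""
    return round((1.0 - accuracy) * n_images)


n_test_images = 500
vgg_errors = misclassified(0.9959999918937683, n_test_images)
resnet_errors = misclassified(1.0, n_test_images)
print("VGG16 errors:", vgg_errors)
print("ResNet50 errors:", resnet_errors)
```

On this test set the gap between the two models amounts to a handful of images, so the difference should be read with that sample size in mind.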


<a id="item44"></a>

## Part 3

In this part, you will predict whether the images in the test data are images of cracked concrete or not. You will do the following:

1. Use the **predict_generator** method to predict the class of the images in the test data, by passing the test data ImageDataGenerator instance defined in the previous part as an argument. You can learn more about the **predict_generator** method [here](https://keras.io/models/sequential/).
2. Report the class predictions of the first five images in the test set. You should print something like this:

<center>
    <ul style="list-style-type:none">
        <li>Positive</li>  
        <li>Negative</li> 
        <li>Positive</li>
        <li>Positive</li>
        <li>Negative</li>
    </ul>
</center>

Use the following cells to make your predictions.

In [45]:
# Let's create dedicated generator
predict_generator = data_generator.flow_from_directory(
    'concrete_data_week4/test',
    target_size=(image_resize, image_resize),
    batch_size=1,
    class_mode=None,
    shuffle=False)

Found 500 images belonging to 2 classes.


class_mode: 

One of "categorical", "binary", "sparse", "input", or None. Default: "categorical". 
Determines the type of label arrays that are returned:
- "categorical" will be 2D one-hot encoded labels,
- "binary" will be 1D binary labels,
- "sparse" will be 1D integer labels,
- "input" will be images identical to input images (mainly used to work with autoencoders).
- If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict_generator()).

Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly.
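To make the label formats concrete, the NumPy sketch below builds each one by hand for four hypothetical images. It illustrates the shapes only and does not call Keras; the exact dtypes a generator returns may differ, and the class IDs are made up for the example.

```python
import numpy as np

# Hypothetical class IDs for four images: 0 = negative, 1 = positive.
class_ids = np.array([0, 1, 1, 0])

# "sparse": one integer class ID per image, shape (4,).
sparse_labels = class_ids

# "binary": one 0/1 value per image, shape (4,); only meaningful with two classes.
binary_labels = class_ids.astype(np.float32)

# "categorical": one-hot rows, shape (4, 2).
categorical_labels = np.eye(2, dtype=np.float32)[class_ids]

print(sparse_labels.shape, binary_labels.shape, categorical_labels.shape)
print(categorical_labels)
```

The one-hot format is the one that pairs with the `categorical_crossentropy` loss and the two-unit softmax layer used in this notebook.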

In [46]:
# 1. Predictions

# we can use test_generator, just needs to be reset.
#test_generator.reset()

# in this example, using a dedicated generator

predict_generator.reset()
# len(generator) is the number of batches: the number of images divided by the batch size, rounded up
steps_predicting = len(predict_generator)

In [47]:
print(steps_predicting)

500


In [48]:
# predict_generator() is deprecated; predict() accepts the generator directly in place of an X array
predict_results_vgg = model_vgg.predict(
    predict_generator,
    steps = steps_predicting,
    verbose=1,
)



In [49]:
# 2. Report prediction results
# let's use NumPy and pandas for assigning labels and presenting the results
import numpy as np
import pandas as pd

In [50]:
# each prediction row holds one softmax probability per class; argmax picks the most likely class index
predict_class_indices=np.argmax(predict_results_vgg,axis=1)

In [51]:
# map class indices back to class labels
labels = (train_generator.class_indices)
labels = dict((v,k) for k,v in labels.items())
predict_lables = [labels[k] for k in predict_class_indices]
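The index-to-label step can be tried in isolation. The sketch below uses made-up probabilities and assumes the mapping `{'negative': 0, 'positive': 1}`, which is what alphabetical ordering of the two class folders would give; check `train_generator.class_indices` for the actual mapping. It also capitalizes the labels to match the output format requested in the task description.

```python
import numpy as np

# Hypothetical softmax outputs for three images (columns: class 0, class 1).
probabilities = np.array([
    [0.98, 0.02],
    [0.10, 0.90],
    [0.55, 0.45],
])

# Assumed mapping; in the notebook this comes from train_generator.class_indices.
class_indices = {'negative': 0, 'positive': 1}
index_to_label = {index: name for name, index in class_indices.items()}

predicted_indices = np.argmax(probabilities, axis=1)
predicted_labels = [index_to_label[int(i)].capitalize() for i in predicted_indices]

for label in predicted_labels:
    print(label)
```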

In [52]:
# take filenames from the generator that produced the predictions (same unshuffled order)
filenames = predict_generator.filenames
predict_results=pd.DataFrame({"Filename":filenames,
                      "Predictions":predict_lables})

In [54]:
predict_results.head()

| | Filename | Predictions |
|---|---|---|
| 0 | negative\19751.jpg | negative |
| 1 | negative\19752.jpg | negative |
| 2 | negative\19753.jpg | negative |
| 3 | negative\19754.jpg | negative |
| 4 | negative\19755.jpg | negative |


### Thank you for completing this lab!

This notebook was created by Alex Aklson.

This notebook is part of a course on **Coursera** called *AI Capstone Project with Deep Learning*. If you accessed this notebook outside the course, you can take this course online by clicking [here](https://cocl.us/DL0321EN_Coursera_Week4_LAB1).

<hr>

Copyright &copy; 2020 [IBM Developer Skills Network](https://cognitiveclass.ai/?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).