<a href="https://colab.research.google.com/github/shcho11/03.IBM_AIEngineering_TensorFlow/blob/main/202308_AIEngineering_finalproject.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>    

1. <a href="#item41">Download Data</a>
2. <a href="#item42">Part 1</a>
3. <a href="#item43">Part 2</a>  
4. <a href="#item44">Part 3</a>  

</font>
    
</div>

## Download Data

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import os
os.getcwd()

'/content'

In [None]:
!wget https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0321EN/data/concrete_data_week4.zip

--2023-08-22 14:45:53--  https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0321EN/data/concrete_data_week4.zip
Resolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196
Connecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 261483817 (249M) [application/zip]
Saving to: ‘concrete_data_week4.zip’


2023-08-22 14:46:00 (34.3 MB/s) - ‘concrete_data_week4.zip’ saved [261483817/261483817]



In [None]:
!unzip concrete_data_week4.zip

In [None]:
!ls -a

After you unzip the data, you will find that it has already been divided into train, validation, and test sets.

## Part 1

In this part, you will design a classifier using the VGG16 pre-trained model. Just like the ResNet50 model, you can import the model <code>VGG16</code> from <code>keras.applications</code>.

You will essentially build your classifier as follows:
1. Import libraries, modules, and packages you will need. Make sure to import the *preprocess_input* function from <code>keras.applications.vgg16</code>.
2. Use a batch size of 100 images for both training and validation.
3. Construct an ImageDataGenerator for the training set and another one for the validation set. VGG16 was originally trained on 224 × 224 images, so make sure to address that when defining the ImageDataGenerator instances.
4. Create a sequential model using Keras. Add the VGG16 model to it, followed by a dense layer.
5. Compile the model using the adam optimizer and the categorical_crossentropy loss function.
6. Fit the model on the augmented data using the ImageDataGenerators.

Use the following cells to create your classifier.

In [None]:
import os
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

#import skillsnetwork
from keras.preprocessing.image import ImageDataGenerator
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.models import load_model

### Define Global Constants
1. We are obviously dealing with two classes, so *num_classes* is 2.
2. The VGG16 model was built and trained using images of size (224 x 224). Therefore, we will have to resize our images from (227 x 227) to (224 x 224).
3. We will be training and validating the model using batches of 100 images.

In [None]:
num_classes = 2
image_resize = 224
batch_size_training = 100
batch_size_validation = 100

### Construct ImageDataGenerator Instances
- In order to instantiate an ImageDataGenerator instance, we will set the **preprocessing_function** argument to *preprocess_input*, which we imported from **keras.applications.vgg16**, in order to preprocess our images the same way the images used to train the VGG16 model were processed.

In [None]:
data_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input,
)
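
To make the preprocessing step more concrete, here is a minimal sketch of what the VGG16 *preprocess_input* does to a single pixel in its default mode: the channels are reordered from RGB to BGR and the ImageNet per-channel mean is subtracted, without any scaling. The helper name `caffe_style_pixel` and the constant name are hypothetical and exist only for illustration; the real function works on whole image batches.

```python
# ImageNet channel means in BGR order, as used by the default VGG16 preprocessing.
IMAGENET_MEAN_BGR = (103.939, 116.779, 123.68)


def caffe_style_pixel(rgb):
    """Convert one (R, G, B) pixel to zero-centered (B, G, R) values."""
    r, g, b = rgb
    bgr = (b, g, r)  # reorder channels RGB -> BGR
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_MEAN_BGR))


# A mid-gray pixel becomes a small positive value in each channel.
print(caffe_style_pixel((128, 128, 128)))
```

Because the same function is passed to the generator for training, validation, and test data, all three sets are shifted in the same way.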

- Next, we use the **flow_from_directory** method to get the training and validation images as follows:

In [None]:
train_generator = data_generator.flow_from_directory(
    'concrete_data_week4/train',
    target_size=(image_resize, image_resize),
    batch_size=batch_size_training,
    class_mode='categorical')

Found 30001 images belonging to 2 classes.


In [None]:
validation_generator = data_generator.flow_from_directory(
    'concrete_data_week4/valid',
    target_size=(image_resize, image_resize),
    batch_size=batch_size_validation,
    class_mode='categorical')

Found 9501 images belonging to 2 classes.


### Build, Compile and Fit Model
- In this section, we will start building our model. We will use the Sequential model class from Keras.

In [None]:
model = Sequential()

In [None]:
model.add(VGG16(
    include_top=False,
    pooling='avg',
    weights='imagenet',
    ))

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5


- Then, we will define our output layer as a Dense layer, that consists of two nodes and uses the Softmax function as the activation function.

In [None]:
model.add(Dense(num_classes, activation='softmax'))
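
As a rough illustration of what the Softmax activation does, the sketch below turns two raw scores into probabilities that sum to 1. The function name `softmax` and the example scores are assumptions made for this illustration; Keras applies its own implementation inside the Dense layer.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum keeps exp() from overflowing.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, -1.0])  # hypothetical raw scores of the two output nodes
print(probs)
```

The larger score always receives the larger probability, which is why each row of the model output can later be read as a pair of class probabilities.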

In [None]:
model.layers

[<keras.engine.functional.Functional at 0x79ea44996ad0>,
 <keras.layers.core.dense.Dense at 0x79ea44997850>]

- You can see that our model is composed of two sets of layers. The first set is the layers pertaining to VGG16 and the second set is a single layer, which is our Dense layer that we defined above.

In [None]:
model.layers[0].layers

[<keras.engine.input_layer.InputLayer at 0x79ea46b43f70>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea45791660>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea45791db0>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x79ea45792da0>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea45793b50>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea45793c10>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x79ea44968eb0>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea457937c0>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea4496a830>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea449697b0>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x79ea4496be80>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea4496bfa0>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea449693f0>,
 <keras.layers.convolutional.conv2d.Conv2D at 0x79ea44981b70>,
 <keras.layers.pooling.max_pooling2d.MaxPooling2D at 0x79ea44982c20>,
 <keras.layers.convolutional.con

- Since the VGG16 model has already been trained, we want to tell our model not to bother with training the VGG16 part, but to train only our dense output layer. To do that, we run the following.

In [None]:
model.layers[0].trainable = False

- And now, using the summary method of the model, we can see how many parameters we will need to optimize in order to train the output layer.

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 vgg16 (Functional)          (None, 512)               14714688  
                                                                 
 dense (Dense)               (None, 2)                 1026      
                                                                 
Total params: 14,715,714
Trainable params: 1,026
Non-trainable params: 14,714,688
_________________________________________________________________
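
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard VGG16 layout of thirteen 3 x 3 convolutional layers with biases; the helper name `conv3x3_params` and the channel list are written out for illustration and are not read from the model itself.

```python
def conv3x3_params(in_channels, out_channels):
    """Weights of a 3x3 convolution plus one bias per output channel."""
    return 3 * 3 * in_channels * out_channels + out_channels


# (input channels, output channels) of the 13 convolutional layers of VGG16.
VGG16_CONV_CHANNELS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

frozen_params = sum(conv3x3_params(i, o) for i, o in VGG16_CONV_CHANNELS)
dense_params = 512 * 2 + 2  # 512 pooled features x 2 classes, plus 2 biases

print(frozen_params)  # 14714688
print(dense_params)   # 1026
```

Only the 1,026 dense-layer parameters are updated during training; the convolutional weights stay frozen.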


- Next we compile our model using the adam optimizer.

In [None]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
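
For intuition about the loss, here is a minimal sketch of categorical cross-entropy for a single image with a one-hot label. The function name, the clipping constant `eps`, and the example probabilities are assumptions made for illustration; Keras computes the loss over whole batches with its own implementation.

```python
import math


def categorical_crossentropy(one_hot_label, probabilities, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(
        y * math.log(max(p, eps))  # clip to avoid log(0)
        for y, p in zip(one_hot_label, probabilities)
    )


confident = categorical_crossentropy([0, 1], [0.01, 0.99])
unsure = categorical_crossentropy([0, 1], [0.5, 0.5])
print(confident, unsure)
```

A confident correct prediction yields a loss close to zero, while a 50-50 prediction costs ln 2 (about 0.693).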

- Before we are able to start the training process with an ImageDataGenerator, we need to define how many steps compose an epoch. Typically, that is the number of images divided by the batch size, rounded up so that the final partial batch is included; this is the value that `len()` returns for a generator. Therefore, we define our steps per epoch as follows:

In [None]:
steps_per_epoch_training = len(train_generator)
steps_per_epoch_validation = len(validation_generator)
num_epochs = 2

In [None]:
print(steps_per_epoch_training)
print(steps_per_epoch_validation)

301
96
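
The two numbers above can be checked with a quick calculation. This is a minimal sketch using the image counts reported by the generators (30001 and 9501) and the batch size of 100; the helper name `steps_per_epoch` is hypothetical.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


print(steps_per_epoch(30001, 100))  # 301
print(steps_per_epoch(9501, 100))   # 96
```

In both cases the last batch holds a single image, which is why the step count is one higher than a plain integer division would give.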


- Finally, we are ready to start training our model. Unlike conventional deep learning training, where data is not streamed from a directory, with an ImageDataGenerator, where data is augmented in batches, we use the fit_generator method. (In recent Keras versions, fit_generator is deprecated and the fit method accepts generators directly.)

In [None]:
fit_history = model.fit_generator(
    train_generator,
    steps_per_epoch=steps_per_epoch_training,
    epochs=num_epochs,
    validation_data=validation_generator,
    validation_steps=steps_per_epoch_validation,
    verbose=1,
)

Epoch 1/2
Epoch 2/2


In [None]:
model.save('classifier_vgg16_model.h5')

In [None]:
!cp classifier_vgg16_model.h5 drive/MyDrive/classifier_vgg16_model2.h5

## Part 2

In this part, you will evaluate your deep learning models on the test data. For this part, you will need to do the following:

1. Load your saved model that was built using the ResNet50 model.
2. Construct an ImageDataGenerator for the test set. For this ImageDataGenerator instance, you only need to pass the directory of the test images, target size, and the **shuffle** parameter and set it to False.
3. Use the **evaluate_generator** method to evaluate your models on the test data, by passing the above ImageDataGenerator as an argument. You can learn more about **evaluate_generator** [here](https://keras.io/models/sequential/).
4. Print the performance of the classifier using the VGG16 pre-trained model.
5. Print the performance of the classifier using the ResNet pre-trained model.


Use the following cells to evaluate your models.

In [None]:
model_resnet = load_model('/content/drive/MyDrive/classifier_resnet_model2.h5')

In [None]:
batch_size_test = 100

test_generator = data_generator.flow_from_directory(
    'concrete_data_week4/test',
    target_size=(image_resize, image_resize),
    batch_size=batch_size_test,
    shuffle=False,  # keep a fixed image order so predictions line up with files
    )

Found 500 images belonging to 2 classes.


In [None]:
score = model.evaluate_generator(test_generator, verbose=1)

In [None]:
print("Loss VGG16: ", score[0])
print("Accuracy VGG16: ", score[1])

Loss VGG16:  0.021432356908917427
Accuracy VGG16:  0.9959999918937683


In [None]:
score_resnet = model_resnet.evaluate_generator(test_generator, verbose=1)

print("Loss ResNet50: ", score_resnet[0])
print("Accuracy ResNet50: ", score_resnet[1])

Loss ResNet50:  0.0026115465443581343
Accuracy ResNet50:  1.0


Wow, the accuracy score of the ResNet50 classifier is 1.0. This does not necessarily mean it is the "better" model, though: the test set contains only 500 images.
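
To put the two accuracy scores in perspective, they can be converted into a number of misclassified test images. This is a minimal sketch using the scores printed above and the 500 test images reported by the generator; the helper name `misclassified` is hypothetical.

```python
def misclassified(accuracy, num_images):
    """Approximate number of wrongly classified images for a given accuracy."""
    return round((1.0 - accuracy) * num_images)


print(misclassified(0.9959999918937683, 500))  # VGG16: 2
print(misclassified(1.0, 500))                 # ResNet50: 0
```

The difference between the two classifiers on this test set therefore comes down to two images.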

## Part 3

In this part, you will predict whether the images in the test data are images of cracked concrete or not. You will do the following:

1. Use the **predict_generator** method to predict the class of the images in the test data, by passing the test data ImageDataGenerator instance defined in the previous part as an argument. You can learn more about the **predict_generator** method [here](https://keras.io/models/sequential/).
2. Report the class predictions of the first five images in the test set. You should print something like this:

<center>
    <ul style="list-style-type:none">
        <li>Positive</li>  
        <li>Negative</li>
        <li>Positive</li>
        <li>Positive</li>
        <li>Negative</li>
    </ul>
</center>

Use the following cells to make your predictions.

In [None]:
predict = model.predict_generator(test_generator, verbose=1)

In [None]:
for i in range(5) :
  print(f"VGG16 Predict {i+1} : {predict[i]}")

VGG16 Predict 1 : [0.95831394 0.04168607]
VGG16 Predict 2 : [0.9955711  0.00442899]
VGG16 Predict 3 : [0.013183   0.98681706]
VGG16 Predict 4 : [1.5037456e-04 9.9984956e-01]
VGG16 Predict 5 : [4.7154215e-05 9.9995279e-01]


In [None]:
def predictionv1(predict, n):
  """Return a label for each of the first n predictions."""
  results = []

  for i in range(n):
    item = predict[i][1]  # probability assigned to class index 1
    if item > 0.5:
      results.append("Negative")
    else:
      results.append("Positive")

  return results

In [None]:
def predictionv2(predict, n):
  """Print a label and its certainty for each of the first n predictions."""
  for i in range(n):
    if predict[i][0] > predict[i][1]:
      print("Positive ({}% certainty)".format(round(predict[i][0] * 100, 1)))
    elif predict[i][0] < predict[i][1]:
      print("Negative ({}% certainty)".format(round(predict[i][1] * 100, 1)))
    else:
      print("Unsure (prediction split 50–50)")
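
Which column of a prediction row belongs to which class depends on the generator's `class_indices` attribute, which `flow_from_directory` derives from the alphanumeric order of the class subdirectory names. The sketch below shows one way to build labels from that mapping instead of hard-coding them. The function name `labels_from_predictions`, the mapping `{"negative": 0, "positive": 1}`, and the sample probabilities are all assumptions for illustration; the actual mapping should be read from `test_generator.class_indices`.

```python
def labels_from_predictions(predictions, class_indices, n):
    """Name the most probable class for each of the first n prediction rows."""
    index_to_name = {index: name for name, index in class_indices.items()}
    labels = []
    for row in predictions[:n]:
        best = max(range(len(row)), key=lambda k: row[k])  # argmax
        labels.append(index_to_name[best].capitalize())
    return labels


# Hypothetical mapping; in the notebook, inspect test_generator.class_indices.
assumed_class_indices = {"negative": 0, "positive": 1}
sample = [[0.9, 0.1], [0.2, 0.8]]  # made-up probabilities for illustration
print(labels_from_predictions(sample, assumed_class_indices, 2))
```

If the mapping in your run matches this assumption, column 1 would be the positive class, so it is worth checking the mapping before relying on hard-coded labels.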

In [None]:
predictionv1(predict,5)

['Positive', 'Positive', 'Negative', 'Negative', 'Negative']

In [None]:
predictionv2(predict,5)

Positive (95.8% certainty)
Positive (99.6% certainty)
Negative (98.7% certainty)
Negative (100.0% certainty)
Negative (100.0% certainty)


In [None]:
predict_resnet = model_resnet.predict(test_generator, verbose=1)

for i in range(5) :
  print(f"ResNet50 Predict {i+1} : {predict_resnet[i]}")

ResNet50 Predict 1 : [2.421712e-05 9.999758e-01]
ResNet50 Predict 2 : [2.3056782e-04 9.9976939e-01]
ResNet50 Predict 3 : [9.9981481e-01 1.8526161e-04]
ResNet50 Predict 4 : [9.9974674e-01 2.5328551e-04]
ResNet50 Predict 5 : [9.5287534e-05 9.9990475e-01]


In [None]:
predictionv1(predict_resnet,5)

['Negative', 'Negative', 'Positive', 'Positive', 'Negative']

In [None]:
predictionv2(predict_resnet,5)

Negative (100.0% certainty)
Negative (100.0% certainty)
Positive (100.0% certainty)
Positive (100.0% certainty)
Negative (100.0% certainty)
