
# Assignment 04 - Garbage Bin Classification problem 


This assignment is the continuation of assignment 01, where the teams were tasked with designing a garbage classification system.

For the first assignment, 4,068 images were collected, with the following distribution:

- **Blue bin** (recyclable): 2,398 images
- **Green bin** (compostable): 826 images
- **Black bin** (landfill): 844 images
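
The imbalance in this distribution matters when reading accuracy figures. As a rough sketch (not part of the original assignment text), the snippet below computes each bin's share of the 4,068 images, the accuracy of a trivial classifier that always answers "blue", and inverse-frequency weights of the kind that could be passed to the `class_weight` argument of Keras `Model.fit`. The counts come from the list above; the development set may have slightly different proportions, and the integer class order (black, blue, green) is an assumption based on alphabetical folder names.

```python
# Image counts per bin, taken from the distribution listed above
counts = {"black": 844, "blue": 2398, "green": 826}
total = sum(counts.values())

# Share of each class in the full collection
shares = {name: n / total for name, n in counts.items()}

# Accuracy of a trivial model that always predicts the largest class
majority_class = max(counts, key=counts.get)
baseline_accuracy = shares[majority_class]

# Inverse-frequency ("balanced") weights: total / (num_classes * count).
# Keys are integer class indices in alphabetical order (an assumption).
class_weight = {
    index: total / (len(counts) * counts[name])
    for index, name in enumerate(sorted(counts))
}

print(f"total images: {total}")
for name in sorted(counts):
    print(f"{name}: {shares[name]:.1%} of the images")
print(f"always predicting '{majority_class}' gives {baseline_accuracy:.1%} accuracy")
print("class weights:", {k: round(v, 3) for k, v in class_weight.items()})
```

Any model worth submitting should clearly beat the majority-class baseline; the weights are optional and only one of several ways to handle imbalance.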


For this assignment, your team needs to develop/implement/code the garbage classification model. You are free to use any technique seen in class that you want (*e.g.*, CNNs, transfer learning, etc.). You will have access only to the development set. The TAs will run your code on the test set to extract the accuracy and confusion matrix metrics.

The development set can be downloaded here:

- [OneDrive](https://uofc-my.sharepoint.com/:u:/g/personal/roberto_medeirosdeso_ucalgary_ca/EYEMTmqSm9RGodAIQDKB5lwBp2xyWtNm8qQ0wj7JV2XiPA?e=1xhhDh) - Link expires March 10th, 2021.
- [GDrive](https://drive.google.com/file/d/1-q56xKd4yEsFo5xwz5Rd1Zn_lyCXLaMU/view?usp=sharing)

The data has already been pre-processed for you. Images were resized to 512 x 400 pixels and converted to PNG. Be mindful that a considerable number of samples in the development set may have been incorrectly labelled. Your team is free to fix some of the labels if you think this will help to develop your model. [See what goes where](https://www.calgary.ca/uep/wrs/what-goes-where/default.html) to get information about the labels.


The Jupyter Notebook should be divided into two parts: 1. Model development; 2. Model testing. The model development will be run by you, while the model testing will be run by the TAs when grading the assignment.


The deliverables for this assignment are:

1. This Jupyter notebook, completed with your solution.
    - Name the notebook as enel645_assignment04_team_(team number).ipynb
2. The weights of your best model after training. 
    - Name the weights' file as team_(team number)_garbage.h5 


Submit the two files (notebook + models' weights) to your dropbox on the course D2L page.
    
You are free to add extra cells of text and code to this notebook. You are free to use the TALC cluster to train your model, but remember that your code should be submitted as a Jupyter Notebook and not as a ".py" file.

Please include a short description of what each team member did in the assignment at the end of the notebook. Also include the consensus score between 0 and 3 for each team member. This score will be used to adjust the final grade of each student. Students developing the project individually do not need this description and score.

You are being assessed based on:

1. Code execution - 20% 
2. Clarity of the code (e.g., easy to follow, has pertinent comments, etc.) - 20%
3. Proper usage of the techniques seen in class - 30% 
4. Accuracy of the models  - 30%


## 1. Model development

In [None]:
# Develop your model here; feel free to add additional cells.
# Comment and justify your choices as much as you can.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

In [3]:
# specify the paths to the training and validation datasets
train_path = '/content/drive/My Drive/Colab Notebooks/Dataset/Train'
val_path = '/content/drive/My Drive/Colab Notebooks/Dataset/Val'

In [4]:
# Import from tensorflow.keras so the generators match the tf.keras model used below
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# specify our model constant settings
batch_size = 32
seed = 17
input_size_3d = (512, 400, 3)
input_size_2d = (512, 400)

In [5]:
# augment, load, and encode our training data set classes
train_data_gen = ImageDataGenerator(rotation_range=20, 
                                    horizontal_flip=True, 
                                    height_shift_range=0.1)

train_generator = train_data_gen.flow_from_directory(train_path,
                                                    batch_size=batch_size,
                                                    seed=seed,
                                                    target_size=input_size_2d,
                                                    class_mode='categorical')

Found 3391 images belonging to 3 classes.


In [6]:
# load and encode our validation data set classes
# (no augmentation here, so val_loss is measured on unmodified images)
val_data_gen = ImageDataGenerator()

val_generator = val_data_gen.flow_from_directory(val_path,
                                                 batch_size=batch_size,
                                                 seed=seed,
                                                 target_size=input_size_2d,
                                                 class_mode='categorical')

Found 377 images belonging to 3 classes.


In [7]:
# let's look at our data in terms of image shape, number of channels, and dimensions
print('train classes:',train_generator.class_indices)
print('validation classes:',val_generator.class_indices)

h, w, r = train_generator.image_shape
print('There are', train_generator.samples, 'images for training the model')
print('~', round(train_generator.samples/train_generator.num_classes,0), 'images per category')
print('The shape of each image is', train_generator.image_shape)
print('The width is', w)
print('The height is', h)
print('And each pixel has a value for each component of RGB for a total of', r, 'channels')

train classes: {'black': 0, 'blue': 1, 'green': 2}
validation classes: {'black': 0, 'blue': 1, 'green': 2}
There are 3391 images for training the model
~ 1130.0 images per category
The shape of each image is (512, 400, 3)
The width is 400
The height is 512
And each pixel has a value for each component of RGB for a total of 3 channels


In [8]:
# let's look at our train/validation split
num_train_samples = int(train_generator.samples)
num_val_samples = int(val_generator.samples)

print(num_train_samples, 'images in the train data set')
print(num_val_samples, 'images in the validation data set')
print('Split is, train:', round(num_train_samples/(num_train_samples + num_val_samples), 2)*100, '% validation:', round(num_val_samples/(num_train_samples + num_val_samples), 2)*100, '%')

3391 images in the train data set
377 images in the validation data set
Split is, train: 90.0 % validation: 10.0 %
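
With these sample counts and `batch_size = 32`, the number of batches needed to see every image once per epoch follows from a ceiling division. The sketch below works that out with the values printed above; it is only an illustration of the arithmetic, not code from the original notebook.

```python
import math

batch_size = 32
num_train_samples = 3391  # from the output above
num_val_samples = 377     # from the output above

# Round up so the final, smaller batch is included
train_steps = math.ceil(num_train_samples / batch_size)
val_steps = math.ceil(num_val_samples / batch_size)

# Size of the last (partial) batch in each set
last_train_batch = num_train_samples - (train_steps - 1) * batch_size
last_val_batch = num_val_samples - (val_steps - 1) * batch_size

print("training steps per epoch:", train_steps)      # 106
print("validation steps per epoch:", val_steps)      # 12
print("last training batch size:", last_train_batch)  # 31
print("last validation batch size:", last_val_batch)  # 25
```

Passing integer step counts like these to `fit` avoids relying on how a fractional value such as `3391/32` is handled.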


### 1.2 Model Callbacks

In [9]:
model_name = "densenet_garbageBin_cnn.h5"
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience = 20)

monitor = tf.keras.callbacks.ModelCheckpoint(model_name, monitor='val_loss',
                                             verbose=0, save_best_only=True,
                                             save_weights_only=True,
                                             mode='min')
# Learning rate schedule
def scheduler(epoch, lr):
    if epoch%4 == 0 and epoch!= 0:
        lr = lr/2
    return lr

lr_schedule = tf.keras.callbacks.LearningRateScheduler(scheduler,verbose = 0)
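
To see what this schedule does over a long run, the sketch below replays the same `scheduler` function in plain Python. It assumes the callback passes a zero-based epoch index together with the current learning rate, and it starts from `1e-4`, the rate given to Adam in the training cell; the optimizer itself stores the rate in lower precision, so real values may differ slightly.

```python
def scheduler(epoch, lr):
    # Same rule as the notebook: halve the rate every 4th epoch, except epoch 0
    if epoch % 4 == 0 and epoch != 0:
        lr = lr / 2
    return lr


def lr_per_epoch(initial_lr, num_epochs):
    """Replay the schedule and return the learning rate used in each epoch."""
    rates = []
    lr = initial_lr
    for epoch in range(num_epochs):
        lr = scheduler(epoch, lr)
        rates.append(lr)
    return rates


rates = lr_per_epoch(1e-4, 63)
for epoch in (0, 3, 4, 8, 62):
    print(f"epoch index {epoch:2d}: lr = {rates[epoch]:.3e}")
```

After 63 epochs the rate has been halved 15 times, leaving it near 3e-9, so late epochs change the weights very little.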

### 1.3 Transfer Learning with Keras

In [10]:
# Transfer learning using DenseNet121 model
base_model = tf.keras.applications.DenseNet121(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=input_size_3d,
    include_top=False) 

base_model.trainable = False

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/densenet/densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5


### 1.4 Adding Prediction Layer

In [11]:
input_image = tf.keras.Input(shape=input_size_3d)
x1 = base_model(input_image, training=False)
x2 = tf.keras.layers.Flatten()(x1)
out = tf.keras.layers.Dense(3, activation='softmax')(x2)
model = tf.keras.Model(inputs = input_image, outputs = out)

print(model.summary())

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 512, 400, 3)]     0         
_________________________________________________________________
densenet121 (Functional)     (None, 16, 12, 1024)      7037504   
_________________________________________________________________
flatten (Flatten)            (None, 196608)            0         
_________________________________________________________________
dense (Dense)                (None, 3)                 589827    
Total params: 7,627,331
Trainable params: 589,827
Non-trainable params: 7,037,504
_________________________________________________________________
None
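
The numbers in this summary can be checked by hand. The sketch below treats each of DenseNet121's five stride-2 stages as an integer halving, an approximation that reproduces the reported `(16, 12)` feature map for a 512 x 400 input, and then recomputes the size of the flattened vector and the parameter count of the 3-unit dense layer. The base-model parameter count is copied from the summary, not derived.

```python
def halve(size, times):
    """Apply integer halving `times` times (approximates stride-2 downsampling)."""
    for _ in range(times):
        size //= 2
    return size


height, width = 512, 400
channels = 1024            # output channels reported for densenet121
base_params = 7_037_504    # copied from the summary above
num_classes = 3

feat_h = halve(height, 5)  # 512 -> 16
feat_w = halve(width, 5)   # 400 -> 12
flattened = feat_h * feat_w * channels

# Dense layer: one weight per (input, class) pair plus one bias per class
dense_params = flattened * num_classes + num_classes
total_params = base_params + dense_params

print("feature map:", (feat_h, feat_w, channels))
print("flattened length:", flattened)
print("dense parameters:", dense_params)
print("total parameters:", total_params)
```

Almost all trainable parameters at this stage sit in the single dense layer because `Flatten` keeps every spatial position of the feature map.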


### 1.5 Training

In [12]:
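# `learning_rate` replaces the deprecated `lr` argument
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 1e-4), loss='categorical_crossentropy', metrics=['accuracy'])

# Step counts must be whole numbers; round up so the last partial batch is used
model.fit(train_generator, steps_per_epoch=int(np.ceil(num_train_samples/batch_size)),
          epochs=100, verbose=1,
          callbacks=[early_stop, monitor, lr_schedule],
          validation_data=val_generator, validation_steps=int(np.ceil(num_val_samples/batch_size)))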

Epoch 1/100
...
Epoch 63/100


<tensorflow.python.keras.callbacks.History at 0x7fca9532d050>
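
Training ended at epoch 63 of 100, which is what the `EarlyStopping` callback with `patience = 20` would produce if the validation loss last improved around epoch 43. The sketch below mimics that patience rule in plain Python on made-up loss values; it is a simplified illustration, not the Keras implementation, and the loss curve is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch after which training stops, or None if it never does."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset the patience counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical curve: the loss falls for 43 epochs, then stays worse than its best value
val_losses = [1.0 - 0.01 * e for e in range(43)] + [0.9] * 57

stopped_at = early_stopping_epoch(val_losses, patience=20)
print("training stops after epoch", stopped_at)
```

Because `ModelCheckpoint` uses `save_best_only=True`, the weights file holds the best epoch's weights, not those from the final epoch.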

### 1.6 Unfreeze base model and train

In [13]:
# Rebuild the same architecture that produced the saved checkpoint
base_model = tf.keras.applications.DenseNet121(
    weights='imagenet',  # Load weights pre-trained on ImageNet.
    input_shape=input_size_3d,
    include_top=False) 

# Keep the base frozen while the checkpoint is restored. The weights were saved
# with a frozen base, and Keras orders a nested model's weights by trainable
# status, so loading them into an already unfrozen base can raise a ValueError.
base_model.trainable = False

input_image = tf.keras.Input(shape=input_size_3d)
x1 = base_model(input_image, training=True)
x2 = tf.keras.layers.Flatten()(x1)
out = tf.keras.layers.Dense(3, activation='softmax')(x2)
model = tf.keras.Model(inputs = input_image, outputs = out)
model.load_weights(model_name)

# Unfreeze the base only once the weights are in place, then compile with a
# very low learning rate so fine-tuning adjusts the pre-trained features gently
base_model.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate = 1e-6), loss='categorical_crossentropy', metrics=['accuracy'])

print(model.summary())


Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         [(None, 512, 400, 3)]     0         
_________________________________________________________________
densenet121 (Functional)     (None, 16, 12, 1024)      7037504   
_________________________________________________________________
flatten_1 (Flatten)          (None, 196608)            0         
_________________________________________________________________
dense_1 (Dense)              (None, 3)                 589827    
Total params: 7,627,331
Trainable params: 7,543,683
Non-trainable params: 83,648
_________________________________________________________________
None


In [None]:
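# Step counts must be whole numbers; round up so the last partial batch is used
model.fit(train_generator, steps_per_epoch=int(np.ceil(num_train_samples/batch_size)),
          epochs=100, verbose=1,
          callbacks=[early_stop, monitor, lr_schedule],
          validation_data=val_generator, validation_steps=int(np.ceil(num_val_samples/batch_size)))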

## 2. Model Testing

In [None]:
# You are free to adapt this portion of the code, but you should 
# compute the test accuracy and show the images that 
# were classified incorrectly
test_data_dir = "/media/roberto/f5da97cf-b92d-484c-96e9-15766931cebe/Garbage-classification/Dataset-curated/Resized/Test/"

model.load_weights(model_name)

# No rescaling: the model was trained on raw 0-255 pixel values, so the test
# images must be fed in the same range
test_datagen = tf.keras.preprocessing.image.ImageDataGenerator() 
test_generator = test_datagen.flow_from_directory(test_data_dir,
                                                    target_size=input_size_2d,
                                                    batch_size= batch_size,shuffle = False)
# Class names ordered by their integer index, e.g. ['black', 'blue', 'green']
class_names = sorted(test_generator.class_indices, key=test_generator.class_indices.get)
model.evaluate(test_generator)

In [None]:
img = []
true_label = []
pred_label = []
# len(test_generator) is the number of batches, including the last partial one
for batch_idx in range(len(test_generator)):
    Xbatch,Ybatch = test_generator[batch_idx]
    Ybatch = Ybatch.argmax(axis = 1)
    Ypred = model.predict(Xbatch).argmax(axis = 1)
    wrong_indexes = np.where(Ypred != Ybatch)[0]
    # collect every misclassified image with its true and predicted label
    for jj in wrong_indexes:
        img.append(Xbatch[jj])
        true_label.append(Ybatch[jj])
        pred_label.append(Ypred[jj])

columns = 4
rows = int(np.ceil(len(img) / columns))
plt.figure(figsize = (32,64))
for ii in range(len(img)):
    plt.subplot(rows,columns,ii+1)
    # pixel values are floats in 0-255, so cast to uint8 for display
    plt.imshow(img[ii].astype("uint8"))
    plt.axis("off")
    plt.title("Label: %s, predicted: %s" %(class_names[true_label[ii]]\
                                            ,class_names[pred_label[ii]]))
plt.show()
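
The assignment states that the TAs will also extract a confusion matrix. As a minimal sketch of how one could be built from integer labels like those collected above, the snippet below counts (true, predicted) pairs in plain Python. The label lists are hypothetical examples, and the class order is taken from the `class_indices` output printed earlier in the notebook.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


class_names = ["black", "blue", "green"]

# Hypothetical labels for illustration only
true_labels = [0, 0, 1, 1, 1, 2, 2, 1]
pred_labels = [0, 1, 1, 1, 2, 2, 2, 1]

matrix = confusion_matrix(true_labels, pred_labels, len(class_names))
for name, row in zip(class_names, matrix):
    print(f"{name:>5}: {row}")
print("accuracy:", accuracy(matrix))
```

Off-diagonal entries show which bins are confused with each other, which is more informative than accuracy alone on an imbalanced data set.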

## Team members participation
(include the description of what each team member did and the consensus score for each team member)

- **Arya Stark** helped design the model and write the code for the fully connected model (**score 3**)
- **Luke Skywalker** helped to implement the data augmentation module (**score 3**)
- ...