 - Deep Learning 

- Code for all the tasks must be written in this notebook (you do not need to submit any other files).
- The output of all cells must be present in the version of the notebook you submit.
- The university honor code should be maintained. Any violation, if found, will result in disciplinary action. 

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!ls "/content/drive/My Drive"

In [None]:
# !unzip -e "/content/drive/My Drive/UCMerced_LandUse.zip" -d "/content/drive/My Drive"

In [1]:
import numpy as np
from numpy import array
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('seaborn')
import seaborn as sns
from keras.models import load_model
from keras.applications import vgg16
import glob
import cv2
import os
import seaborn
from keras.models import Model
from keras.optimizers import Adam
from sklearn.metrics import confusion_matrix
from keras.layers import Flatten, Dense, Dropout, Input,Conv1D,Reshape
from keras import backend as K
from keras.callbacks import ModelCheckpoint, EarlyStopping, Callback

Using TensorFlow backend.


## Overview

In this assignment you will be exploring a few important concepts used in the deep learning projects:
- Working with satellite imagery data
- Dataset annotation
- Fine-tuning / Transfer Learning
- Unsupervised feature representation with Autoencoder
- Comparison of end-to-end trained model with finetuned model

We will be using two datasets, the links are provided to you. You will also be working with three pretrained models, which have been provided to you. You are **highly** encouraged to explore the datasets and model architectures in order to get the most out of this assignment. 

**_Datasets:_**
- Brick Kiln (Nepal) - available [here](https://drive.google.com/drive/folders/1dQEA0fxepVnELPnz-gAAYFb9hJSV4Azc)
- UC Merced Land Use - available [here](http://weegee.vision.ucmerced.edu/datasets/landuse.html)

**_Pretrained Models:_** 
Can be found [here](https://drive.google.com/open?id=1Ekvk3JUW3eI5sgITxzNklXrJXnCfTKBu)
- ResNet18 pretrained on Brick Kiln (Lahore) - available as `InceptionResNet-v2-2classes`
- Autoencoder pretrained on GT Cross View and fine tuned on UC Merced - available as `encoder_gt`
- VGG16 pretrained on ImageNet - available in `keras.applications` (consult relevant documentation)

## Task 1

Let's start with a binary classification problem. 

The Brick Kiln (Nepal) dataset you have been given consists of 100 tiles at zoom level 17. A script to break up these tiles into 64 sub-tiles of zoom 20 has also been given to you. Your job is to:
- Split 100 images into 6400 images using the script
- Manually annotate the dataset by moving the kiln pictures into one folder and non-kiln picutures into other folder.
- Code up a generator to properly load the images and corresponding binary labels into a model. You have to resize images into 224X224X3

*Scale images between 0 and 1 and apply mean subtraction in the generator*

*Each of you has been given unique 100 tiles, so for the love of God do not get annotated data from someone else.*

In [2]:
def preprocessing_meanShift (images):
    #Add code here
    processed_images=images
    imgs=np.stack(images)
    mean = np.mean(images[:,:,:,0]),np.mean(images[:,:,:,1]),np.mean(images[:,:,:,2])
    std = np.std(images[:,:,:,0]),np.std(images[:,:,:,1]),np.std(images[:,:,:,2])
#     print ("mean is : ",mean)
    processed_images=(processed_images-mean)/std
    return processed_images


In [None]:
kiln_path = 'kilns/'
non_kiln_path = 'non_kilns/'
batch_size=50
input_shape=(256,256,3)

kiln_paths =  glob.glob(kiln_path +'*'+'.jpg')
kiln_labels = [1 for i in range(len(kiln_paths))]     #Assigned 1 to kiln images
non_kiln_paths =  glob.glob(non_kiln_path +'*'+'.jpg')
non_kiln_labels = [0 for i in range(len(non_kiln_paths))]   # Assigned 0 to the non_kiln images
print(non_kiln_paths)
num_classes=2
all_paths = non_kiln_paths + kiln_paths
all_labels =  non_kiln_labels + kiln_labels 
# print(all_labels)

y_true = np.zeros((len(all_labels), 2))
# print(len(all_labels))


def brick_kiln_generator(all_paths,all_labels, y_true,bath_size=128):
    batch_start = 0
    num_classes=2
    batch_end = batch_size
    n=len(all_labels)
    indexes=np.arange(0,n,batch_size)
    np.random.shuffle(indexes)
#     print("indexes: ",indexes)
    
    if n % batch_size != 0:
        indexes = indexes[:-1] 
    
    
    while True:
        index=0
        for b_start in indexes:
                
            batch_image_paths = all_paths[b_start: b_start+batch_size]
            batch_labels =  all_labels[b_start: b_start+batch_size]
            
            batch_images=np.zeros((batch_size, *input_shape))
            batch_labels_enc = np.zeros((batch_size , num_classes))
            
            count=0
            for path in batch_image_paths:
                
                img=cv2.cvtColor(cv2.imread(path),cv2.COLOR_BGR2RGB)
#                 img = cv2.resize(img,(224,224))        #Image resized as 224x224x3
        
#                 print("image is: ",img.shape)

                img.astype(float)
                batch_images[count,:,:,:]=img
                
                labels= np.zeros((num_classes,),dtype='int')
                labels[batch_labels[count]]=1
                batch_labels_enc[count,:]=labels
#                 print(index)
                y_true[index,:]=labels      # This appends the true labels
                count+=1
                index+=1
                
            batch_images = preprocessing_meanShift(batch_images)
#             y_true += batch_labels_enc
            yield(batch_images,batch_labels_enc)
            print("batch has been yielded")

# brick_kiln_generator(all_paths,all_labels,batch_size)
        

## Task 2

Now you will evaluate performance of a pretrained (on Brick Kiln Lahore dataset) ResNet18 model using the generator made in Task1. You will:
- Obtain predictions for the entire dataset
- Construct a binary confusion matrix and visualize it as a heatmap

*You can use scikit-learn's `metrics.confusion_matrix` function. Consult the relevant documentation.*

In [None]:
# get confusion matrix usign built-in function 
model = load_model("/content/drive/My Drive/InceptionResNet-v2-2classes.h5")
model.summary()


In [None]:
testGen = brick_kiln_generator(all_paths,all_labels,y_true,batch_size)
y_pred=model.predict_generator(testGen, steps=len(all_paths)//batch_size,verbose=1)
# result=model.predict(testGen, batch_size, verbose=1, steps=len(all_paths)//batch_size)


In [None]:
print(len(y_pred),len(y_true))
print(len(all_paths))

In [None]:
def decodeArray(y_true,y_pred):    #Decode y_true and y_pred into a 1d array
    dy_true=[]    #Decoding into a id array
    dy_pred=[]  #Decoding into a id array
    
    for i in range(len(y_pred)):
        if(y_pred[i][0]==True):
            dy_pred.append(0)
        else:
            dy_pred.append(1)
        
    for i in range(len(y_true)):
        if(y_true[i][0]==True):
            dy_true.append(0)
        else:
            dy_true.append(1)
    
    return dy_true,dy_pred

In [None]:
y_pred = (y_pred > 0.5) 
        
dy_true,dy_pred = decodeArray(y_true,y_pred)

# print("result achieved is: ", y_true[6500:6600])
cm=confusion_matrix(dy_true[0:6586],dy_pred[0:6586])
seaborn.heatmap(cm,annot=True)
print(cm)

## Task 3

Next you will employ Transfer Learning and finetune the pretrained ResNet18 model you used in Task2 to better fit the Brick Kiln (Nepal) dataset. You will:
- Freeze everything except the FC layers and train it using the generator from Task1 (using appropriate hyperparameters)
- Construct a binary confusion matrix and visualize it as a heatmap
- Compare this confusion matrix with the one made in Task2

In [None]:
class EarlyStoppingByLossVal(Callback):
    def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
        super(Callback, self).__init__()
        self.monitor = monitor
        self.value = value
        self.verbose = verbose

    def on_epoch_end(self, epoch, logs={}):
        current = logs.get(self.monitor)
        if current is None:
            warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)

        if current < self.value:
            if self.verbose > 0:
                print("Epoch %05d: early stopping THR" % epoch)
            self.model.stop_training = True
callbacks = [
    EarlyStoppingByLossVal(monitor='val_loss', value=0.002, verbose=1),
    # EarlyStopping(monitor='val_loss', patience=2, verbose=0),
]

In [None]:
#fine tuning. Dense layers are task related(classification). Freeze conv layer. Lr and parameters shall be less
#Train till convergence
resnet18_pretrained = model
# resnet18_pretrained.layers.pop()
# task3=resnet18_pretrained(weights=None, include_top=False,input_shape=input_shape, classes=2)
# resnet18_pretrained.summary()
for layer in resnet18_pretrained.layers[0:-1]:
    layer.trainable = False

In [None]:
im = Input(shape=input_shape)
l = resnet18_pretrained(im)
# print(l.shape)
# l = Flatten()(l)
# l = Dense(1024, activation='relu')(l)
# l = Dropout(0.5)(l)
# output = Dense(num_classes, activation='softmax')(l)

custom_model = Model(im, l)
# resnet18_pretrained.compile(...)

In [None]:
custom_model.summary()
adam=Adam(lr=0.001)
custom_model.compile(optimizer=adam,
              loss='mse')

In [None]:
epochs=2
train_gen=brick_kiln_generator(all_paths,all_labels,y_true,batch_size)
hist1 = custom_model.fit_generator(train_gen, epochs=epochs, steps_per_epoch=len(all_paths)//batch_size,verbose=1, callbacks=callbacks)


In [None]:
testGen = brick_kiln_generator(all_paths,all_labels,y_true,batch_size)
y_pred_cust=custom_model.predict_generator(testGen, steps=len(all_paths)//batch_size,verbose=1)

In [None]:
y_pred_cust = (y_pred_cust > 0.5) 
        
dy_true,dy_pred = decodeArray(y_true,y_pred_cust)

# print("result achieved is: ",result.shape, y_true.shape)
cm=confusion_matrix(dy_true,dy_pred)
seaborn.heatmap(cm,annot=True)
print(cm)

## Task 4

Now we will look at a multiclass classification problem.

The UC Merced Land Use dataset consists of 21 classes, ranging from airplanes to forests to tennis courts. Let's add kilns to it since you worked so hard to annotate the dataset in Task1. You will:
- Download the dataset and add a new folder (following the already existing folder structure) corresponding to brick kilns
- Code up a generator to properly load the images and corresponding 22-class labels into a model. You have to resize images into 224X224X3 for VGG16

*Scale images between 0 and 1 and apply mean subtraction in the generator*

In [None]:
# classesDir = "drive/My Drive/UCMerced_LandUse/Images"
# cdir = os.getcwd()
# os.chdir(cdir+"/"+classesDir)

In [7]:
# print(cdir)
classesDir = "drive/My Drive/UCMerced_LandUse/Images/"
# os.chdir(cdir+classesDir)
classes= glob.glob(classesDir+ '*')
print("classes: ",classes)
# os.chdir(cdir)
num_classes=len(classes)
inputShape = (224,224,3)
batch_size=79

allPaths = []
allLabels = []
count=0
for cdir in classes:
    path = glob.glob(cdir+"/"+'*')
    allPaths+=path
    labels= [count for i in range(len(path))]
    allLabels+=labels
    count+=1;
# print(allPaths)
# print(allLabels)
y_true = np.zeros((len(allLabels), num_classes))

def land_use_generator(allPaths,allLabels, y_true,num_classes,inpShape,resize=True,bath_size=128):
    batch_start = 0
    batch_end = batch_size
    n=len(allLabels)
    indexes=np.arange(0,n,batch_size)
    np.random.shuffle(indexes)
#     print("indexes: ",indexes)
    
    if n % batch_size != 0:
        indexes = indexes[:-1] 
    
    
    while True:
        index=0
        for b_start in indexes:
                
            batch_image_paths = allPaths[b_start: b_start+batch_size]
            batch_labels =  allLabels[b_start: b_start+batch_size]
            
            batch_images=np.zeros((batch_size, *inpShape))
            batch_labels_enc = np.zeros((batch_size , num_classes))
            
            count=0
            for path in batch_image_paths:
#                 print(path)
                img=cv2.cvtColor(cv2.imread(path),cv2.COLOR_BGR2RGB)
    
                if(resize==True):                          # Will only do resize for part 4 not 6
                    img = cv2.resize(img,(inpShape[0],inpShape[1]))        #Image resized as 224x224x3
        
#                 print("image is: ",img.shape)

#                 img.astype(float)
#                 print(img.shape)
                batch_images[count,:,:,:]=img
                
                labels= np.zeros((num_classes,),dtype='int')
                labels[batch_labels[count]]=1
                batch_labels_enc[count,:]=labels
#                 print(index)
                y_true[index,:]=labels      # This appends the true labels
                count+=1
                index+=1
                
            batch_images = preprocessing_meanShift(batch_images)
#             y_true += batch_labels_enc
            yield(batch_images,batch_labels_enc)
            print("batch has been yielded")
        

classes:  []


## Task 5

Next you will again employ Transfer Learning and finetune the pretrained (on ImageNet) VGG16 to better fit the modified Land Use dataset. You will:
- Change the number of nodes in the last FC layer according to the number of classes i.e. 22 
- Freeze everything except the FC layers and train it using the generator from Task4 (using appropriate hyperparameters)
- Construct a multi-class confusion matrix and visualize it as a heatmap

In [None]:
#Now different task plus data set(unlike perv). Remove last dense layer and add new layer. Activation softmax and out 
#nuber of classes. 


vgg_imagenet = vgg16.VGG16(include_top=False, weights='imagenet')
for l in vgg_imagenet.layers:
#     l.freeze = True
    l.trainable = False
# vgg_imagenet.summary()
# add new FC layers here


im = Input(shape=inputShape)
l = vgg_imagenet(im)
l = Flatten()(l)
l = Dense(512, activation='relu')(l)
l = Dropout(0.2)(l)
output = Dense(num_classes, activation='softmax')(l)

vggModel = Model(im, output)
print("model created")
# print summary and compile
vggModel.summary()
vggModel.compile(optimizer='adam',loss='mse')



In [None]:
epochs=2
train_gen2=land_use_generator(allPaths,allLabels,y_true,num_classes,(224,224,3),True,batch_size)
print(len(allPaths))
hist2 = vggModel.fit_generator(train_gen2, epochs=epochs, steps_per_epoch=len(allPaths)//batch_size,verbose=1)


In [None]:
pred_gen=land_use_generator(allPaths,allLabels,y_true,num_classes,(224,224,3),True,batch_size)
y_pred=vggModel.predict_generator(pred_gen, steps=len(allPaths)//batch_size,verbose=1)

# y_pred_cust = (y_pred_cust > 0.5) 
        
# dy_true,dy_pred = decodeArray(y_true,y_pred_cust)

# # print("result achieved is: ",result.shape, y_true.shape)
# cm=confusion_matrix(dy_true,dy_pred)
# seaborn.heatmap(cm,annot=True)
# print(cm)

In [None]:
# print(y_pred[1900:1910])

x= [[0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0. ,0.],
   [0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0. ,0.]]

# x=array(x)

# x=x.flatten()
# print(x.shape)

# cm=confusion_matrix(x,[0,1],[1,2,3,4,5,6,7,8,9,10])
# seaborn.heatmap(cm,annot=True)
# print(cm)

## Task 6

Now you will make use of Unsupervised Representation Learning as studied in class. You have been provided with a pretrained autoencoder (just the encoder part) and you will use it to obtain deep features for the modified UC Merced Land Use dataset. You will have to:
- Obtain predictions for the entire dataset
- Save then in an appropriate fashion

*Keep in mind that this model takes input of shape 256X256X3 so you need to resize the images before feeding them into this model*

*Try to think about how you could use the generator from Task4 to create another generator which would yield encoded features along with labels instead of raw images*

In [4]:
#Encoder != auto-encoder. 
#Prediction is the deep features.

encoder_model = load_model("encoder_gt.h5")
encoder_model.summary()




__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 128, 128, 20) 2960        input_2[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 128, 128, 20) 80          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 128, 128, 20) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_2 (



In [8]:
encoder_model.compile(optimizer="adam",loss='mse')       # This will create predictions on all land
pred_gen=land_use_generator(allPaths,allLabels,y_true,num_classes,(256,256,3),True,batch_size)
enc_output=encoder_model.predict_generator(pred_gen, steps=len(allPaths)//batch_size,verbose=1)


In [9]:
# Same generator but only difference is it also produce 
classesDir = "UCMerced_LandUse/Images/"       
# os.chdir(cdir+classesDir)
classes= glob.glob(classesDir+ '*')
print("classes: ",classes)
# os.chdir(cdir)
num_classes=len(classes)
print(num_classes)
inputShape = (224,224,3)
batch_size=79

allPaths = []
allLabels = []
count=0
for cdir in classes:
    path = glob.glob(cdir+"/"+'*')
    allPaths+=path
    labels= [count for i in range(len(path))]
    allLabels+=labels
    count+=1;
# print(allPaths)
# print(allLabels)
y_true = np.zeros((len(allLabels), num_classes))

def land_use_generator2(allPaths,allLabels, y_true,num_classes,inpShape,encoder_model,resize=True,bath_size=128):
    batch_start = 0
    batch_end = batch_size
    n=len(allLabels)
    indexes=np.arange(0,n,batch_size)
    np.random.shuffle(indexes)
#     print("indexes: ",indexes)
    
    if n % batch_size != 0:
        indexes = indexes[:-1] 
    
    
    while True:
        index=0
        for b_start in indexes:
                
            batch_image_paths = allPaths[b_start: b_start+batch_size]
            batch_labels =  allLabels[b_start: b_start+batch_size]
            
            batch_images=np.zeros((batch_size, *inpShape))
            batch_labels_enc = np.zeros((batch_size , num_classes))
            
            count=0
            for path in batch_image_paths:
#                 print(path)
                img=cv2.cvtColor(cv2.imread(path),cv2.COLOR_BGR2RGB)
    
                if(resize==True):                          # Will only do resize for part 4 not 6
                    img = cv2.resize(img,(inpShape[0],inpShape[1]))        #Image resized as 224x224x3
        
#                 print("image is: ",img.shape)

                img.astype(float)
#                 print(img.shape)
                batch_images[count,:,:,:]=img
                
                labels= np.zeros((num_classes,),dtype='int')
                labels[batch_labels[count]]=1
                batch_labels_enc[count,:]=labels
#                 print(index)
                y_true[index,:]=labels      # This appends the true labels
                count+=1
                index+=1
                
            batch_images = preprocessing_meanShift(batch_images)
            print(batch_images.shape)
            encoded_images=encoder_model.predict_on_batch(batch_images)
            print(encoded_images.shape)
#             y_true += batch_labels_enc
            yield(encoded_images,batch_labels_enc)
            print("batch has been yielded")

classes:  []
0


## Task 7

Now you will train a classifier from scratch to discriminate the 22 classes based on the deep features you extracted in Task6. You will:
- Train a classifier with the following architecture
> 1D conv 3x1 -> 1D conv 3x1 -> FC 256 -> FC 22
- Construct a multiclass confusion matrix and visualize it as a heatmap
- Compare this confusion matrix with the one made in Task5

*The input to this model will be the deep feature tensor obtained in Task6, so use appropriate input shape*

In [3]:
# Those features are 
# Compare with task 5

input_im = Input(shape=(8, 8, 20), name='input_im')

reshaped=Reshape((64,20))(input_im)

conv1 = Conv1D(20, kernel_size=(3), strides = (1),  activation='relu',name='my_conv1')(reshaped)
conv2 = Conv1D(20, kernel_size=(3), strides = (1), activation='relu',name='my_conv2')(conv1)
dense1 = Dense(256, activation='relu',name='mydense')(conv2)
print(dense1.shape)
# reshaped=Reshape((num_classes,20))(input_im)
output_class = Dense(num_classes, activation='softmax',name='mydense_2')(dense1)
print(output_class.shape)
myModel = Model(inputs=input_im, outputs=output_class)
myModel.summary()


(?, 60, 256)


NameError: name 'num_classes' is not defined

In [7]:
epochs = 2
encoder_model.compile(optimizer='adam',loss='mse')       # This will create predictions on all land
myModel.compile(loss='categorical_crossentropy',optimizer='adam',metrics=['accuracy'])
encoded_gen=land_use_generator2(allPaths,allLabels,y_true,num_classes,(256,256,3),encoder_model,True,batch_size)
hist3 = myModel.fit_generator(encoded_gen, epochs=epochs, steps_per_epoch=len(allPaths)//batch_size,verbose=1)


Epoch 1/2
(79, 256, 256, 3)


ValueError: Tensor Tensor("activation_21/Relu:0", shape=(?, 8, 8, 20), dtype=float32) is not an element of this graph.

## Task 8

Now you will explore another use of the deep features extracted in Task6. Content Based Image Retrieval (CBIR) is the task of searching for visually similar images from a dataset. *Think Google image search.* This concept can obviously be applied on other forms of data like text, audio or video as well. In this task you will:
- Implement a function which will take three inputs and returns a list of visually similar images. The inputs would be
> An image from the dataset `im` <br />
The number of search results to return `n` (no more than 5) <br />
A string representing the distance metric used for comparisons `dist`
- Visualize search result images, by looking for the appropriate image
- Use some images to compare the effects of these distance metrics on the output
> Euclidean <br />
Cosine <br />
Mahalanobis

*Make sure the query image is the first search result for your function*

*Look up the documentation for Scipy's `spatial.distance` module. It is your best friend in this task.*

*If you made a generator in Task6, you can very easily use it in this task as well*

In [None]:
#

def euc_dist(y_true, y_pred):
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1, keepdims=True))

def cosine_distance(vests):
    x, y = vests
    x = K.l2_normalize(x, axis=-1)
    y = K.l2_normalize(y, axis=-1)
    return -K.mean(x * y, axis=-1, keepdims=True)




def cbir(im, n, dist):
    pass