# 911-Crime detection using ResNet50 Model

###### By Project ATMA Team

## Introduction

For this model, we decided to use the ResNet50 (Deep Residual Networks with 50 layers) to help us determine whether a given short video input can be classified as an active crime.

In this Python notebook, we explore transfer learning: we reuse the features learned by a pre-trained model and re-train only some of its weights (the new dense layers and a few layers of the original model) to recognize crimes, without training the full model from scratch (which usually requires a large amount of data).

ResNet has an exotic architecture, also called a "network-in-network architecture". It relies on micro-architecture modules, the building blocks that make up the network. Together with the standard layers, these modules form the macro-architecture, and they introduce "residual learning". Since their introduction by He et al., ResNets have demonstrated that very deep networks can be trained with a standard SGD (Stochastic Gradient Descent) optimizer.

"Deep convolutional neural networks have led to a series of breakthroughs for image classification. Many other visual recognition tasks have also greatly benefited from very deep models. Over the years there is a trend to go more deeper to solve more complex tasks and to also increase/improve the classification/recognition accuracy. But, as we go deeper, the training of neural network becomes difficult; the accuracy saturates and even degrades. Residual Learning tries to solve both these problems.
In general, in a deep convolutional neural network, several layers are stacked and are trained to the task at hand. The network learns several low/mid/high level features at the end of its layers. In residual learning, instead of trying to learn some features, we try to learn some residual. Residual can be simply understood as subtraction of feature learned from input of that layer. ResNet does this using shortcut connections (directly connecting input of nth layer to some (n+x)th layer. It has proved that training this form of networks is easier than training simple deep convolutional neural networks and also the problem of degrading accuracy is resolved." Written by Kartik Ordugo, https://www.quora.com/What-is-the-deep-neural-network-known-as-%E2%80%9CResNet-50%E2%80%9D

ResNets take the activations from one layer and feed them into another layer deeper in the network. These are called "skip connections". They work because the identity function is easy for a residual block to learn: the block's input is added to the output of its weighted layers, so if the weights and biases shrink toward zero (for example under weight decay), the output after the ReLU is simply the skipped input. At worst the block reproduces its input, and it only has to learn a residual on top of it.
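
The identity behaviour described above can be checked with a small sketch. This is a plain-Python toy, not the Keras implementation: the helper names `matvec`, `relu` and `residual_block`, the two-layer form of F(x), and the input values are all assumptions made for illustration.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    """y = relu(F(x) + x), with F(x) = W2 * relu(W1 * x)."""
    fx = matvec(W2, relu(matvec(W1, x)))
    return relu([f + a for f, a in zip(fx, x)])

# Non-negative input, as it would be after a previous ReLU.
x = [0.5, 1.0, 2.0]

# With all weights at zero, F(x) is zero and the block passes x through.
zeros = [[0.0] * 3 for _ in range(3)]
y_identity = residual_block(x, zeros, zeros)
print(y_identity)
```

With non-zero weights the same block outputs the input plus a learned correction, which is the "residual" the network is trained to find.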

* Deep Residual Learning for Image Recognition by He et al.
    - https://arxiv.org/abs/1512.03385
* Identity Mappings in Deep Residual Networks by He et al.
    - https://arxiv.org/abs/1603.05027
* Youtube videos explaining Residual Networks by Andrew Ng
    - ResNets https://www.youtube.com/watch?time_continue=1&v=K0uoBKBQ1gA
    - Why ResNets work? https://www.youtube.com/watch?v=GSsKdtoatm8
    - Network in Network architecture https://www.youtube.com/watch?v=9EZVpLTPGz8

###### Below is an image of a residual module (Left) next to an updated residual module (Right) that uses pre-activation.

Demonstrated in 2016 in a follow-up paper (see above), identity mappings help ResNets achieve higher accuracy.

In [1]:
# Modules to display images
from IPython.display import Image, HTML, display
# Display two images
# display(HTML("<table><tr><td><img src='images/imagenet_resnet_residual.png'></td><td><img src='images/imagenet_resnet_residual_identity.png'></td></tr></table>"))
Image(url= "images/imagenet_resnet_residual_identity.png")

###### ResNet50 Architecture Graph

Click the link below for a detailed graph of the ResNet50 architecture

http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006

###### Flowchart

In Progress

##### Data directory structure

##### ImageNet

What is ImageNet?

ImageNet is formally a project aimed at (manually) labeling and categorizing images into almost 22,000 separate object categories for the purpose of computer vision research.

When it comes to image classification, the ImageNet challenge is the de facto benchmark for computer vision classification algorithms — and the leaderboard for this challenge has been dominated by Convolutional Neural Networks and deep learning techniques since 2012. The state-of-the-art pre-trained networks included in the Keras core library represent some of the highest performing Convolutional Neural Networks on the ImageNet challenge over the past few years. These networks also demonstrate a strong ability to generalize to images outside the ImageNet dataset via transfer learning, such as feature extraction and fine-tuning.

The goal of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is to train a model that can correctly classify an input image into 1,000 separate object categories. Models are trained on ~1.2 million training images with another 50,000 images for validation and 100,000 images for testing.

These 1,000 image categories represent object classes that we encounter in our day-to-day lives, such as species of dogs, cats, various household objects, types of vehicles, and much more. You can find the full list of object categories in the ILSVRC challenge here.
http://image-net.org/challenges/LSVRC/2014/browse-synsets

###### This dataset is what the ResNet50 model is trained on. It is good to know the classification labels from that dataset in order for us to work with the pre-trained transfer values. Our primary goal is to take note of the most prominent labels that come with our data type and get some insights into what the model is noticing.

###### ImageNet classified synsets useful for ATMA

* Letter opener, paper knife, paperknife - 1170 images
* Assault rifle, assault gun - 1172 images
* Revolver, six-gun, six-shooter - 1223 images
* Sweatshirt - 1174 images
* Jersey, T-shirt, tee shirt - 1331 images
* hatchet
* cleaver, meat cleaver, chopper
* guillotine
* rifle
* lighter, light, igniter, ignitor
* holster
* matchstick

###### Imports

In [None]:
# Utilities
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import keras
import pandas
import os
import sys

# keras.layers
from keras.layers import (
    Input,
    Activation,
    Dense,
    Flatten,
    GlobalAveragePooling2D  # used when completing the model
)
from keras.layers.convolutional import (
    Conv2D,
    MaxPooling2D,
    AveragePooling2D
)
from keras.layers.merge import add
from keras.layers.normalization import BatchNormalization

#others
from keras.regularizers import l2
from keras.optimizers import *
from keras.applications import *
from keras.models import (
    Model,
    load_model
)
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.preprocessing import image
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.applications.resnet50 import preprocess_input, decode_predictions



###### Hyperparameters - for Fine-Tuning

In [None]:
num_classes = 2 # Crime, No Crime
last_block_layer_of_base_model = 126
img_width, img_height = 224, 224 # default parameters for ResNet50 is 224x224
num_channels = 3 # 3 color channels for the frames (RGB)
batch_size = 32 # we can try 4,8,32,64,128,256,..
num_epochs = 50 # number of full passes over the training data
nadam_lr = 1e-5 # for nadam optimizer
learning_rate = 0.045 # for sgd optimizer
learning_rate_decay = 0.94 # decay factor, applied every two epochs
momentum = 0.9 # momentum used for the sgd optimizer
transformation_ratio = .05 # how aggressive will the data augmentation/transformation be
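
To see what the SGD settings above imply, here is a minimal sketch of a step-decay schedule. It assumes the decay factor is applied once every two epochs; the name `decay_every` and the function `lr_at_epoch` are hypothetical and not part of the notebook.

```python
learning_rate = 0.045        # initial SGD learning rate (value from the cell above)
learning_rate_decay = 0.94   # multiplicative decay factor (value from the cell above)
decay_every = 2              # assumption: decay is applied once every two epochs

def lr_at_epoch(epoch):
    """Learning rate in effect during the given 0-based epoch."""
    return learning_rate * learning_rate_decay ** (epoch // decay_every)

for epoch in (0, 1, 2, 10, 49):
    print(epoch, round(lr_at_epoch(epoch), 6))
```

Under this assumption the rate stays at 0.045 for epochs 0 and 1, then shrinks by 6% every second epoch.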

# Original ResNet50 Model - we will use it for testing purposes

In [None]:
original_model = keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet')

###### Predicts an image using the original model. Prints out the predicted results

In [None]:
def predict(img_path):
    img = image.load_img(img_path, target_size=(img_height, img_width)) # target_size is (height, width)
    plt.imshow(img)
    plt.show()
    img = image.img_to_array(img) # converts the image to a numpy array
    img = np.expand_dims(img, axis=0) # adds a dimension to the image (s1,s2,channels) -> (samples,s1,s2,ch)
                                      # this is bcus Keras works with batches of images. The first added dimension is used for that.
    img = preprocess_input(img) # sets image to the format the model requires
    
    predictions = original_model.predict(img)
    decoded_labels = decode_predictions(predictions)[0]
    # decode_predictions returns, for each image, a list of
    # (class_id, class_name, probability) tuples for the top predictions (top 5 by default)
    for image_id, class_name, score in decoded_labels:
        print("{2:>6.2%} : {1}({0})".format(image_id, class_name, score))

## Test the model with many different crime images. Any patterns that ResNet50 specifically recognizes? Anything that it easily identifies? Anything that it misses? How many predicted labels should we consider? Why?

# Stopping point for the task given to students

###### Helper Functions

Functions from Hvass-labs Tutorial #10: Fine-tuning https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/10_Fine-Tuning.ipynb

In [None]:
# Function used to plot at most 9 images in a 3x3 grid
# with the corresponding true and predicted classes below

def plot_images(images, cls_true, cls_pred=None, smooth=True):

    assert len(images) == len(cls_true)

    # Create figure with sub-plots.
    fig, axes = plt.subplots(3, 3)
    # axes becomes a 3x3 matrix with one axes in each element

    # Adjust vertical spacing.
    if cls_pred is None:
        hspace = 0.3
    else:
        hspace = 0.6 # extra spacing for the class predicted values
    fig.subplots_adjust(hspace=hspace, wspace=0.3)

    # Interpolation type.
    if smooth:
        interpolation = 'spline16'
    else:
        interpolation = 'nearest'

    for i, ax in enumerate(axes.flat): # flattens the 3x3 matrix into a 9x1 vector
        # There may be less than 9 images, ensure it doesn't crash.
        if i < len(images):
            # Plot image.
            ax.imshow(images[i],
                      interpolation=interpolation)

            # Name of the true class.
            cls_true_name = class_names[cls_true[i]]

            # Show true and predicted classes. If predicted value doesnt exist, it doesn't add it on the xlabel
            if cls_pred is None:
                xlabel = "True: {0}".format(cls_true_name)
            else:
                # Name of the predicted class.
                cls_pred_name = class_names[cls_pred[i]]

                xlabel = "True: {0}\nPred: {1}".format(cls_true_name, cls_pred_name)

            # Show the classes as the label on the x-axis.
            ax.set_xlabel(xlabel)
        
        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])
    
    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

In [None]:
# Plots the example errors (images) that were mis-classified
# (uses plot_images)

def plot_example_errors(cls_pred):
    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # Boolean array whether the predicted class is incorrect.
    incorrect = (cls_pred != cls_test)

    # Get the file-paths for images that were incorrectly classified.
    image_paths = np.array(image_paths_test)[incorrect]

    # Load the first 9 images.
    images = load_images(image_paths=image_paths[0:9])
    
    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect]

    # Get the true classes for those images.
    cls_true = cls_test[incorrect]
    
    # Plot the 9 images we have loaded and their corresponding classes.
    # We have only loaded 9 images so there is no need to slice those again.
    plot_images(images=images,
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])

In [None]:
# prints the confusion matrix

# Import a function from sklearn to calculate the confusion-matrix.
from sklearn.metrics import confusion_matrix

def print_confusion_matrix(cls_pred):
    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_test,  # True class for test-set.
                          y_pred=cls_pred)  # Predicted class.

    print("Confusion matrix:")
    
    # Print the confusion matrix as text.
    print(cm)
    
    # Print the class-names for easy reference.
    for i, class_name in enumerate(class_names):
        print("({0}) {1}".format(i, class_name))
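
For a two-class problem such as Crime / No Crime, the confusion matrix can be reduced to precision and recall. The sketch below assumes the sklearn layout (rows are true classes, columns are predicted classes). The function name `binary_metrics` and the counts are made up for illustration.

```python
def binary_metrics(cm, positive=0):
    """Precision and recall for one class of a 2x2 confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    negative = 1 - positive
    tp = cm[positive][positive]   # positives predicted as positive
    fn = cm[positive][negative]   # positives that were missed
    fp = cm[negative][positive]   # negatives flagged as positive
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up counts: class 0 = crime, class 1 = no crime.
cm = [[40, 10],
      [5, 45]]
precision, recall = binary_metrics(cm, positive=0)
print("precision={:.3f} recall={:.3f}".format(precision, recall))
```

Recall on the crime class is probably the number to watch for a 911 use case, since a missed crime is usually worse than a false alarm.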

In [None]:
# Plots the example errors and the confusion matrix 
# (uses plot_example_errors and print_confusion_matrix)

def example_errors():
    # The Keras data-generator for the test-set must be reset
    # before processing. This is because the generator will loop
    # infinitely and keep an internal index into the dataset.
    # So it might start in the middle of the test-set if we do
    # not reset it first. This makes it impossible to match the
    # predicted classes with the input images.
    # If we reset the generator, then it always starts at the
    # beginning so we know exactly which input-images were used.
    test_generator.reset()
    
    # Predict the classes for all images in the test-set.
    y_pred = model.predict_generator(test_generator,
                                     steps=test_iterations)

    # Convert the predicted classes from arrays to integers. (picks the highest score (class prediction) for each image)
    cls_pred = np.argmax(y_pred,axis=1)

    # Plot examples of mis-classified images.
    plot_example_errors(cls_pred)
    
    # Print the confusion matrix.
    print_confusion_matrix(cls_pred)

In [None]:
# Loads the images (as numpy arrays) from the directory into memory

def load_images(image_paths):
    # Load the images from disk.
    images = [plt.imread(path) for path in image_paths]

    # Convert to a numpy array and return it.
    return np.asarray(images)

In [None]:
# plots the history of the recorded accuracy and loss from the training iterations

def plot_training_history(history):
    # Get the classification accuracy and loss-value
    # for the training-set.
    acc = history.history['categorical_accuracy']
    loss = history.history['loss']

    # Get it for the validation-set (we only use the test-set).
    val_acc = history.history['val_categorical_accuracy']
    val_loss = history.history['val_loss']

    # Plot the accuracy and loss-values for the training-set.
    plt.plot(acc, linestyle='-', color='b', label='Training Acc.')
    plt.plot(loss, 'o', color='b', label='Training Loss')
    
    # Plot it for the test-set.
    plt.plot(val_acc, linestyle='--', color='r', label='Test Acc.')
    plt.plot(val_loss, 'o', color='r', label='Test Loss')

    # Plot title and legend.
    plt.title('Training and Test Accuracy')
    plt.legend()

    # Ensure the plot shows correctly.
    plt.show()

###### Acquiring the data from a zip file

In [1]:
# pwd
# ^ gets us the home directory

'''# Unzip data files into directory path given
import zipfile
# 'pwd' gets home folder where notebook opened. Very useful to get paths
import zipfile
zip_ref = zipfile.ZipFile('/home/ivargaswhs88/sdata.zip','r')
# extracts what is in the zip file, which is already a folder called sdata
# so there is no need to create a new directory
zip_ref.extractall('/home/ivargaswhs88')
zip_ref.close()
'''

###### Acquiring the data path directories for each set (training, validation, test)

In [None]:
data_dir_path = os.path.abspath('/home/ivargaswhs88/sdata')
train_dir_path = os.path.join(os.path.abspath(data_dir_path), 'train')
validation_dir_path = os.path.join(os.path.abspath(data_dir_path), 'validation')
test_dir_path = os.path.join(os.path.abspath(data_dir_path), 'test')

# Note on validation: for the real model we could instead keep one full training set
# and hold out a random block of it for validation

In [None]:
'''
Knifey Dataset
just in case we need to use it for secondary testing
Importing the knifey dataset used in the Hvass-Labs tutorials (8,9) 
import knifey
knifey.maybe_download_and_extract()
knifey.copy_files()
train_dir = knifey.train_dir
test_dir = knifey.test_dir
'''

###### Preprocessing the Data 
Training Set

In [None]:
# generator that rescales the pixel values and randomly flips the training images
# (the flip options belong to ImageDataGenerator, not to flow_from_directory)
train_genFunction = ImageDataGenerator(rescale=1. / 255,
                                       horizontal_flip=True,
                                       vertical_flip=True)
# data generator that uses above function and applies it to the training files
train_generator = train_genFunction.flow_from_directory(train_dir_path,
                                                        target_size=(img_width, img_height),
                                                        batch_size=batch_size,
                                                        color_mode='rgb',
                                                        class_mode='categorical',
                                                        shuffle=True)
# Additional ImageDataGenerator arguments we could use
# rotation_range=transformation_ratio,
# shear_range=transformation_ratio,
# zoom_range=transformation_ratio,
# cval=transformation_ratio,

train_iterations = int(np.ceil(train_generator.n / batch_size)) # step counts must be integers
# one epoch is when the entire dataset is passed through the NN exactly once
# batch size is the number of training examples in a single batch
# iterations are the number of batches needed to complete one epoch

# the data generator takes in:
    # The directory of the data
    # gets a small batch size of files
    # resizes them to the target_size
# it spits out a batch of images with different parameters
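
The relation between epochs, batch size and iterations noted above can be worked through with a small example. The dataset size of 1000 images is hypothetical, and `steps_per_epoch` here is an illustrative helper, not the Keras argument of the same name.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

num_samples = 1000   # hypothetical number of training images
batch_size = 32      # same value as in the hyperparameter cell
steps = steps_per_epoch(num_samples, batch_size)

print(steps)                              # batches in one full epoch
print(steps * batch_size - num_samples)   # empty slots in the final, partial batch
```

Rounding up matters: plain division gives a float, and rounding down would skip the images in the last partial batch.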

Validation Set

In [None]:
validation_genFunction = ImageDataGenerator(rescale=1. / 255)
validation_generator = validation_genFunction.flow_from_directory(validation_dir_path,
                                                             target_size=(img_width, img_height),
                                                             batch_size=batch_size,
                                                             color_mode='rgb',          
                                                             class_mode='categorical',
                                                             shuffle=False)
validation_iterations = int(np.ceil(validation_generator.n / batch_size)) # step counts must be integers

Test Set

In [None]:
test_genFunction = ImageDataGenerator(rescale=1. / 255)
test_generator = test_genFunction.flow_from_directory(test_dir_path,
                                                     target_size=(img_width, img_height),
                                                     batch_size=1,
                                                     shuffle=False)
test_iterations = test_generator.n # the test generator uses batch_size=1, so one step per image

In [None]:
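# gets the class numbers for all 3 datasets
cls_train = train_generator.classes
cls_validation = validation_generator.classes
cls_test = test_generator.classes

# full file paths of the images, needed by the plotting helper functions
image_paths_train = [os.path.join(train_dir_path, f) for f in train_generator.filenames]
image_paths_test = [os.path.join(test_dir_path, f) for f in test_generator.filenames]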

In [None]:
# list of class names exported from the directory
# this is why it is important to name the directories carefully
class_names = list(train_generator.class_indices.keys())
class_names
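
The sketch below shows why the directory names matter. It assumes the behaviour of `flow_from_directory`, which sorts the sub-directory names and numbers them from 0. The folder names `crime` and `no_crime` and the helper `one_hot` are hypothetical.

```python
# Hypothetical sub-directory names inside the training folder.
subdirs = ["no_crime", "crime"]

# Sort the names, then number them from 0 (mirrors the generator's class_indices).
class_indices = {name: idx for idx, name in enumerate(sorted(subdirs))}
class_names = list(class_indices.keys())

def one_hot(idx, num_classes):
    """Label vector in the style produced by class_mode='categorical'."""
    return [1.0 if i == idx else 0.0 for i in range(num_classes)]

print(class_indices)
print(one_hot(class_indices["no_crime"], len(class_names)))
```

Renaming a folder can therefore change which index means "crime", so the mapping should be checked before reading a confusion matrix.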

In [None]:
num_classes = train_generator.num_classes
num_classes

###### Plot a few images to see that the data is exported well

In [None]:
# Load the first images from the train-set.
images = load_images(image_paths=image_paths_train[0:9])

# Get the true classes for those images.
cls_true = cls_train[0:9]

# Plot the images and labels using our helper-function above.
plot_images(images=images, cls_true=cls_true, smooth=True)

# ATMA Model

### Fine Tuning Part 1 - Dense Layers weights (Transfer Learning)

###### Loading the model (Incomplete)

In [5]:
base_model = keras.applications.resnet50.ResNet50(include_top=False, weights='imagenet')
# ARGS:
# include_top = False -> we will not get the final 1000-class fully connected (classification) layer
# weights = 'imagenet' -> we will get the weights pre-trained on the ImageNet dataset

# Show the model's architecture summary. The name and types of its layers
# base_model.summary()

# the output shape of the Base model.
# base_model.output_shape



In [None]:
base_model.summary()
# get a summary. Get the name of last layer and put it below
last_conv_layer = base_model.get_layer('nameOfLayer') # 'nameOfLayer' is a placeholder
# last_conv_layer.output to see what it outputs 
    # has to be a 4D tensor (AllInputs, height, width, # of channels)
    # (AllInputs, 7, 7, 2048)
    # (AllInputs, 14, 14, 2048)

Freeze all ResNet50 convolutional model layers so we only fine-tune the weights of the last added layers we will create. Once those are fine-tuned, then we can fine-tune some of the deeper convolutional layers.

This keeps the large, noisy gradients produced by the randomly initialized new layers from propagating back and damaging the pre-trained weights.

In [None]:
# Freezing the layers
base_model.trainable = False

for layer in base_model.layers:
    layer.trainable = False

# We don't know if the 'trainable' boolean in the meta-layer 'base_model.trainable' overrides all the layer trainable booleans.
# Therefore we change both

###### Model Completion with Keras Functional model

In [6]:
# Finishing up the architecture
x = base_model.output
x = GlobalAveragePooling2D()(x) # averages each feature map over its spatial dimensions: one value per channel (2048 features)
'''
We do not need this global average pooling if the last layer of the conv. network we get is already pooled.
'''
# No Flatten() is needed here: the pooled output is already 2-D (samples, features)
x = Dense(1024, activation='relu')(x) # a fc layer with relu non-linear activation, turning 2048 features into 1024
# here is where we would add a dropout layer. But ResNet50 does not really need dropout layers.
predictions = Dense(num_classes, activation='softmax')(x) # output layer with one unit per class; softmax normalizes the outputs into probabilities
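
Global average pooling, as used in the cell above, collapses each feature map to a single number. The sketch below is a plain-Python illustration on nested lists, not the Keras layer; the function name `global_average_pool` and the tiny 2 x 2 x 3 input are made up.

```python
def global_average_pool(feature_map):
    """Average an H x W x C nested list over H and W, returning C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    totals = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [t / (height * width) for t in totals]

# Tiny 2 x 2 map with 3 channels. ResNet50's last convolutional block
# would give 7 x 7 x 2048 for a 224 x 224 input.
fmap = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 0.0, 2.0], [7.0, 4.0, 2.0]],
]
pooled = global_average_pool(fmap)
print(pooled)
```

The number of channels is unchanged by the pooling; only the spatial dimensions disappear.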

In [None]:
# Defining the model start and end points. Basically creates the model base architecture
model = Model(inputs=base_model.input, outputs=predictions)

In [8]:
# Model Compilation --- must be done after freezing the layers
# Compiles the model for all the changes to take effect
# this connects the whole model together and makes it ready for use
optimizer_transfer = Nadam(lr=nadam_lr)
loss = 'categorical_crossentropy'
metrics = ['categorical_accuracy'] # named explicitly so the history keys match plot_training_history
model.compile(optimizer=optimizer_transfer,
              loss=loss,
              metrics=metrics)
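
To make the softmax output and the `categorical_crossentropy` loss concrete, here is a minimal plain-Python sketch. It is not Keras' implementation: the clipping constant `eps` and the example scores are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probs, eps=1e-7):
    """-sum(t * log(p)); eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probs))

probs = softmax([2.0, 0.0])   # hypothetical scores for (crime, no crime)
loss_right = categorical_crossentropy([1.0, 0.0], probs)   # true class is crime
loss_wrong = categorical_crossentropy([0.0, 1.0], probs)   # true class is no crime
print(probs, loss_right, loss_wrong)
```

The loss is small when the probability of the true class is high and grows quickly when the model is confidently wrong.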

In [None]:
# train the transfer weights on the new data for a few epochs
model.fit_generator(generator=train_generator,
                    epochs=num_epochs,
                    steps_per_epoch=100, # each "epoch" here is 100 batches, not necessarily one full pass over the data
                    validation_data=validation_generator,
                    validation_steps=validation_iterations)

In [None]:
# evaluate the model's performance
result = model.evaluate_generator(test_generator, steps=test_iterations)

# Prints the accuracy
print("{0:.2%}".format(result[1]))
# What happens if we print out [0]? 


# Create a function that plots the results for analysis

### Fine Tuning Part 2 - Resnet50 Convolutional Layers weights

At this point the weights of the dense layers have been fine-tuned, so we can now unfreeze some of the top convolutional layers and fine-tune their weights as well.

In [None]:
# Unfreezing the layers
base_model.trainable = True

# Keep the early layers frozen and fine-tune only from this layer index onward
for layer in base_model.layers[:last_block_layer_of_base_model]:
    layer.trainable = False
for layer in base_model.layers[last_block_layer_of_base_model:]:
    layer.trainable = True

In [16]:
# Creates new directory if it does not exist, in the joined path of the train_data_dir path
# os.makedirs(os.path.join(os.path.abspath(train_data_dir), '../preview'), exist_ok=True)

'''train_datagen = ImageDataGenerator(rescale=1. / 255,
                   rotation_range=transformation_ratio,
                   shear_range=transformation_ratio,
                   zoom_range=transformation_ratio,
                   cval=transformation_ratio,
                   horizontal_flip=True,
                   vertical_flip=True)

validation_datagen = ImageDataGenerator(rescale=1. / 255)

validation_generator = validation_datagen.flow_from_directory(validation_data_dir,
                          target_size=(img_width, img_height),
                          batch_size=batch_size,
                          class_mode='categorical')

# Creates new directory if it does not exist, in the joined path of the train_data_dir path
os.makedirs(os.path.join(os.path.abspath(train_data_dir), '../preview'), exist_ok=True)

# the data generator takes in:
    # The directory of the data
    # gets a small batch size of files
    # resizes them to the target_size
'''

In [None]:
optimizer_finetune = Nadam(lr=1e-7)
# the learning rate is much smaller. The weights must change much more carefully

# recompile the model for the changes to take effect
model.compile(optimizer=optimizer_finetune,
              loss=loss,
              metrics=metrics)

In [None]:
# continue training, now with the unfrozen layers, on the new data for a few epochs
model.fit_generator(generator=train_generator,
                    epochs=num_epochs,
                    steps_per_epoch=100, # each "epoch" here is 100 batches, not necessarily one full pass over the data
                    validation_data=validation_generator,
                    validation_steps=validation_iterations)

In [None]:
result = model.evaluate_generator(test_generator, steps=test_iterations)
print("{0:.2%}".format(result[1]))

In [None]:
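# save the model architecture as a JSON file (to_json does not store the weights)
# model_path must be set to an existing output directory
model_json = model.to_json()
with open(os.path.join(os.path.abspath(model_path), 'model.json'), 'w') as json_file:
    json_file.write(model_json)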

# ResNet50 additional resources

* Deep Residual Networks https://github.com/KaimingHe/deep-residual-networks
* Graph: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006
* Keras ResNet50 Implementation https://github.com/raghakot/keras-resnet
* https://www.pyimagesearch.com/2017/03/20/imagenet-vggnet-resnet-inception-xception-keras/
* https://www.quora.com/What-is-the-deep-neural-network-known-as-%E2%80%9CResNet-50%E2%80%9D

# Pickling

In [17]:
'''import dill

# Save session
dill.dump_session('saved_sessions/testPickle.db')

# Load session
# dill.load_session('saved_sessions/testPickle.db')'''