# NOTE: Fully annotate any changes you make so that we can keep track of what has changed. Code that is no longer used should be commented out, not deleted.


# 911-Crime detection using ResNet50 Model


###### By Project ATMA Team

# Intro

For this model, we decided to use ResNet50 (a deep residual network with 50 layers) to help us determine whether a given short video input can be classified as an active crime.

In this Python notebook, we explore the use of transfer values (features extracted by a pre-trained model) to adapt the model's trained weights to recognize crimes, without training the full model from scratch (which usually requires a large amount of data).

##### More about ResNet50

ResNet has an exotic architecture that relies on micro-architecture modules, also called a "network-in-network architecture". Micro-architecture modules are the building blocks that make up the network. Together with the standard layers, they form the macro-architecture, and "residual learning" is introduced. Since their introduction by He et al., ResNets have demonstrated that very deep networks can be trained with a standard SGD (Stochastic Gradient Descent) optimizer.

Written by Kartik Ordugo,
https://www.quora.com/What-is-the-deep-neural-network-known-as-%E2%80%9CResNet-50%E2%80%9D

"Deep convolutional neural networks have led to a series of breakthroughs for image classification. Many other visual recognition tasks have also greatly benefited from very deep models. So, over the years there is a trend to go more deeper, to solve more complex tasks and to also increase /improve the classification/recognition accuracy. But, as we go deeper; the training of neural network becomes difficult and also the accuracy starts saturating and then degrades also. Residual Learning tries to solve both these problems.

In general, in a deep convolutional neural network, several layers are stacked and are trained to the task at hand. The network learns several low/mid/high level features at the end of its layers. In residual learning, instead of trying to learn some features, we try to learn some residual. Residual can be simply understood as subtraction of feature learned from input of that layer. ResNet does this using shortcut connections (directly connecting input of nth layer to some (n+x)th layer. It has proved that training this form of networks is easier than training simple deep convolutional neural networks and also the problem of degrading accuracy is resolved."

ResNets take the activations from one layer and feed them into a layer deeper in the network. These are called "skip connections". They work because the identity function is easy for a residual block to learn: the block's input is added back to its output before the final ReLU, so if the block's weights and biases shrink toward zero (for example, under weight decay), the ReLU simply returns the skipped input. The block then computes the identity function instead of hurting the network's accuracy.
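
To make the identity argument concrete, here is a minimal NumPy sketch. It assumes a simplified residual block built from two fully connected layers rather than the convolutional blocks ResNet50 actually uses, and the names `residual_block`, `W1`, `b1`, `W2`, and `b2` are illustrative only. With all weights and biases set to zero, the block returns its (non-negative) input unchanged.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU."""
    return np.maximum(z, 0.0)

def residual_block(a, W1, b1, W2, b2):
    """Simplified two-layer residual block: relu(F(a) + a)."""
    h = relu(a @ W1 + b1)   # first layer of the residual branch
    f = h @ W2 + b2         # second layer, before the final activation
    return relu(f + a)      # the skip connection adds the input back

rng = np.random.default_rng(0)
# Non-negative input, as if it came out of an earlier ReLU layer
a = relu(rng.normal(size=(4, 8)))

zero_w = np.zeros((8, 8))
zero_b = np.zeros(8)
out = residual_block(a, zero_w, zero_b, zero_w, zero_b)

# With zeroed weights the residual branch contributes nothing,
# so the block reduces to the identity on its input.
identity_recovered = np.allclose(out, a)
```

The point of the sketch is only that the "do nothing" solution costs the block nothing to represent, which is why adding such blocks should not make a deeper network worse than a shallower one.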

* Deep Residual Learning for Image Recognition by He et al.
    - https://arxiv.org/abs/1512.03385
* Identity Mappings in Deep Residual Networks by He et al.
    - https://arxiv.org/abs/1603.05027
* Youtube videos explaining Residual Networks by Andrew Ng
    - ResNets https://www.youtube.com/watch?time_continue=1&v=K0uoBKBQ1gA
    - Why ResNets work? https://www.youtube.com/watch?v=GSsKdtoatm8
    - Network in Network architecture https://www.youtube.com/watch?v=9EZVpLTPGz8

###### Below is an image of a residual module (Left) next to an updated residual module (Right) that uses pre-activation.

Demonstrated in 2016 in a follow-up paper (also given above), identity mappings help ResNets achieve higher accuracy.

In [1]:
# Modules to display images (all three names are available from IPython.display)
from IPython.display import Image, HTML, display
# Display two images
# display(HTML("<table><tr><td><img src='images/imagenet_resnet_residual.png'></td><td><img src='images/imagenet_resnet_residual_identity.png'></td></tr></table>"))
Image(url="images/imagenet_resnet_residual_identity.png")

###### ResNet50 Architecture Graph

Click the link below for a detailed graph of the ResNet50 architecture

http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006

##### Data directory structure

###### We can also collect videos of spoons, forks, and knives and use that as our data
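
The layout itself is not spelled out in this notebook. As a sketch, assuming the `train`, `validation`, and `test` folders used later and one subfolder per class inside each of them (the class names passed in are hypothetical placeholders), the skeleton could be created like this:

```python
import os

def make_data_skeleton(root, class_names, splits=("train", "validation", "test")):
    """Create root/<split>/<class_name>/ for every split and class.

    Returns the list of directories that now exist.
    """
    created = []
    for split in splits:
        for class_name in class_names:
            path = os.path.join(root, split, class_name)
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created
```

For example, `make_data_skeleton('sdata', ['crime', 'no_crime'])` would create six folders such as `sdata/train/crime`. The frames extracted from each video would then be placed in the folder for their class.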

##### ImageNet

What is ImageNet?

ImageNet is formally a project aimed at (manually) labeling and categorizing images into almost 22,000 separate object categories for the purpose of computer vision research.


The goal of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is to train a model that can correctly classify an input image into 1,000 separate object categories. Models are trained on ~1.2 million training images with another 50,000 images for validation and 100,000 images for testing.

These 1,000 image categories represent object classes that we encounter in our day-to-day lives, such as species of dogs, cats, various household objects, vehicle types, and much more. You can find the full list of object categories in the ILSVRC challenge here.
http://image-net.org/challenges/LSVRC/2014/browse-synsets

###### This dataset is what the ResNet50 model is trained on. Therefore, it is good to know the classification labels from that dataset in order for us to work with the pre-trained transfer values.

When it comes to image classification, the ImageNet challenge is the de facto benchmark for computer vision classification algorithms — and the leaderboard for this challenge has been dominated by Convolutional Neural Networks and deep learning techniques since 2012. The state-of-the-art pre-trained networks included in the Keras core library represent some of the highest performing Convolutional Neural Networks on the ImageNet challenge over the past few years. These networks also demonstrate a strong ability to generalize to images outside the ImageNet dataset via transfer learning, such as feature extraction and fine-tuning.

###### ImageNet Classification classes

* Letter opener, paper knife, paperknife - 1170 images
* Assault rifle, assault gun - 1172 images
* Revolver, six-gun, six-shooter - 1223 images
* Sweatshirt - 1174 images
* Jersey, T-shirt, tee shirt - 1331 images

TODO: all synsets below still need their number of images.

1. revolver, six-gun, six-shooter
2. hatchet
3. cleaver, meat cleaver, chopper
4. guillotine
5. rifle
6. lighter, light, igniter, ignitor
7. holster
8. matchstick

# Tasks
* Find a way to connect GCP VM to Git
* Create a python module that converts videos to frames (recommended to use ffmpeg)
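
A possible starting point for the video-to-frames module is sketched below. It assumes the `ffmpeg` command-line tool is installed and on the PATH, and it uses ffmpeg's `fps` video filter to control how many frames per second are written. The function names, the output file pattern, and the default of 24 frames per second are illustrative choices, not settled decisions.

```python
import subprocess
from pathlib import Path

def build_ffmpeg_command(video_path, out_dir, fps=24):
    """Return the ffmpeg argument list that writes `fps` frames per second as JPEGs."""
    pattern = str(Path(out_dir) / "frame_%05d.jpg")  # frame_00001.jpg, frame_00002.jpg, ...
    return ["ffmpeg", "-i", str(video_path), "-vf", f"fps={fps}", pattern]

def video_to_frames(video_path, out_dir, fps=24):
    """Create out_dir if needed and run ffmpeg.

    Raises subprocess.CalledProcessError if ffmpeg exits with an error.
    """
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, out_dir, fps), check=True)
```

Keeping the command construction in its own function makes it possible to check the arguments without actually running ffmpeg.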


# Bugs / Issues
## 1. 
test_generator = test_genFunction.flow_from_directory(test_dir_path,
                                                 target_size=(img_width, img_height),
                                                 batch_size=1,
                                                 shuffle=False)
                                                 
Output: Found 9 images belonging to 3 classes.
###### However, there are only two classes inside the 'test' directory. The issue does not arise with the training or the validation directory. What might the problem be?
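
One possible explanation, which is an assumption and has not been verified on this VM: `flow_from_directory` treats every subdirectory of the given path as a class, so a hidden folder inside 'test' (for example a `.ipynb_checkpoints` folder created by Jupyter) would be counted as a third class even if it holds no images. The helper below (hypothetical names) lists every subdirectory, hidden ones included, so this can be checked.

```python
import os

def list_class_dirs(directory):
    """Return every subdirectory of `directory`, hidden ones included, sorted by name."""
    return sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )

def hidden_class_dirs(directory):
    """Return the subdirectories whose names start with '.', e.g. '.ipynb_checkpoints'."""
    return [name for name in list_class_dirs(directory) if name.startswith(".")]
```

Calling `list_class_dirs(test_dir_path)` should show exactly the folders that the generator counts as classes. If an unexpected folder appears, removing it or passing an explicit `classes=[...]` list to `flow_from_directory` should bring the count back to two.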
 

# Model

###### Imports

In [2]:
# Utilities
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import keras
import pandas
import os
import sys

from keras.layers import *
from keras.optimizers import *
from keras.applications import *
from keras.models import Model
from keras.models import load_model
from keras import backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint, EarlyStopping

# preprocess_input is a fn used to process the image for ResNet50
# decode_predictions gives the top predictions (given as an arg.)
# Given as a tuple (class, description, probability)
# preds = model.predict(x)
# print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), ... (other two predictions)
from keras.applications.resnet50 import preprocess_input, decode_predictions


'''
# 
from keras.applications import ResNet50
from keras.preprocessing import image
--- to use in : 
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
'''

'''
from __future__ import division

import six
from keras.models import Model
from keras.layers import (
    Input,
    Activation,
    Dense,
    Flatten
)
from keras.layers.convolutional import (
    Conv2D,
    MaxPooling2D,
    AveragePooling2D
)
from keras.layers.merge import add
from keras.layers.normalization import BatchNormalization
from keras.regularizers import l2
from keras import backend as K
'''

Using TensorFlow backend.



In [3]:
# check version of anything. In this case, we check tensorflow
tf.__version__

'1.11.0'

##### Hyperparameters

In [4]:
num_classes = 10 # NOTE: must match the number of class folders found by the data generators below (currently 2)
last_block_layer_of_base_model = 126
img_width, img_height = 299, 299 # the default input size for ResNet50 is 224x224
num_channels = 3 # 3 color channels for the frames (RGB)
batch_size = 32 # we can try 4,8,32,64,128,256,..
num_epochs = 50 # number of full passes over the training set
learning_rate = 0.045 # for sgd optimizer
learning_rate_decay = 0.94 # applied every two epochs
momentum = 0.9 # momentum used for the sgd optimizer
transformation_ratio = .05 # how aggressive will the data augmentation/transformation be
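
To show what the learning rate settings above would mean in practice, here is a small sketch of a staircase exponential decay. It assumes the decay factor is meant to be applied once every two epochs, and the function name and its parameters are illustrative only. Note that the model below is compiled with 'nadam', so this schedule only matters if the SGD settings end up being used.

```python
def decayed_learning_rate(epoch, base_lr=0.045, decay=0.94, epochs_per_decay=2):
    """Staircase exponential decay.

    The rate is multiplied by `decay` once every `epochs_per_decay` epochs,
    with epochs counted from 0.
    """
    return base_lr * decay ** (epoch // epochs_per_decay)

# Learning rate for the first six epochs: it changes at epochs 2 and 4
schedule_preview = [decayed_learning_rate(e) for e in range(6)]
```

Under these assumptions, epochs 0 and 1 use 0.045, epochs 2 and 3 use 0.045 * 0.94, and epochs 4 and 5 use 0.045 * 0.94 ** 2.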

###### Loading the model (Incomplete)

In [5]:
base_model = keras.applications.resnet50.ResNet50(include_top=False, weights='imagenet')
# Add "input_shape=(img_width, img_height, num_channels)" when we want to specify the shape of the frame input.
# The default input size for this model is 224x224.

# ARGS:
# include_top = False -> leave out the final fully connected classification layer (the 1,000-class ImageNet "top")
# weights = 'imagenet' -> load the weights obtained by pre-training the model on ImageNet

# Show the model's architecture
# base_model.summary()

# the output shape of the Base model.
# base_model.output_shape



###### Model Completion with Keras Functional model

In [6]:
# Finishing up the architecture
x = base_model.output
x = GlobalAveragePooling2D()(x) # average each of the 2048 feature maps over its spatial dimensions, giving a 2048-vector
x = Dense(1024, activation='relu')(x) # a fully connected layer (2048 -> 1024) with a ReLU non-linearity
predictions = Dense(num_classes, activation='softmax')(x) # output layer with one unit per class; softmax normalizes the outputs into probabilities
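
The shapes flowing through this new head can be traced with a small NumPy sketch. It uses random stand-in values rather than real ResNet50 features, an arbitrary 7x7 spatial size, and 10 output classes. All names are illustrative; it mimics the layers above and does not call Keras.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average over the height and width axes of a (batch, height, width, channels) array."""
    return feature_maps.mean(axis=(1, 2))

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
features = rng.normal(size=(2, 7, 7, 2048))       # stand-in for base_model.output
pooled = global_average_pool(features)            # shape (2, 2048)
w_fc = rng.normal(scale=0.01, size=(2048, 1024))  # stand-in for the Dense(1024) weights
hidden = np.maximum(pooled @ w_fc, 0.0)           # ReLU, shape (2, 1024)
w_out = rng.normal(scale=0.01, size=(1024, 10))   # stand-in for the output layer weights
probs = softmax(hidden @ w_out)                   # shape (2, 10), each row sums to 1
```

The pooling step removes the spatial dimensions without changing the number of channels. The reduction from 2048 to 1024 features happens in the fully connected layer.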


In [7]:
# Defining the model start and end points
model = Model(inputs=base_model.input, outputs=predictions)

In [8]:
# Model Compilation
model.compile(optimizer='nadam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Data

###### Acquiring the data from the zip file

In [9]:
# pwd
# ^ gets us the home directory

In [10]:
'''# Unzip data files into directory path given
import zipfile
# 'pwd' gets home folder where notebook opened. Very useful to get paths
zip_ref = zipfile.ZipFile('/home/ivargaswhs88/sdata.zip','r')
# extracts what is in the zip file, which is already a folder called sdata
# so there is no need to create a new directory
zip_ref.extractall('/home/ivargaswhs88')
zip_ref.close()'''

###### Acquiring the data path directories for each set (training, validation, test)

In [11]:
data_dir_path = os.path.abspath('/home/ivargaswhs88/sdata')
train_dir_path = os.path.join(data_dir_path, 'train')
validation_dir_path = os.path.join(data_dir_path, 'validation')
test_dir_path = os.path.join(data_dir_path, 'test')

# For the real model, we could instead keep one full training set and hold out a random block of it for validation

###### Preprocessing the Data 
###### In this case, no augmentation arguments are enabled (only rescaling is applied), so that the data we use for testing purposes is not altered.

###### Training Set

In [12]:
# generator function that only rescales pixel values to [0, 1]; no random augmentation is applied here
train_genFunction = ImageDataGenerator(rescale=1. / 255)
# data generator that uses the above function and applies it to the training files
train_generator = train_genFunction.flow_from_directory(train_dir_path,
                                                              target_size=(img_width, img_height),
                                                              batch_size=batch_size,
                                                              color_mode='rgb',
                                                              class_mode='categorical',
                                                              shuffle=False)

Found 1500 images belonging to 2 classes.


###### Validation Set

In [13]:
validation_genFunction = ImageDataGenerator(rescale=1. / 255)

validation_generator = validation_genFunction.flow_from_directory(validation_dir_path,
                                                             target_size=(img_width, img_height),
                                                             batch_size=batch_size,
                                                             color_mode='rgb',          
                                                             class_mode='categorical',
                                                             shuffle=False)

Found 500 images belonging to 2 classes.


###### Test Set

In [14]:
test_genFunction = ImageDataGenerator(rescale=1. / 255)
test_generator = test_genFunction.flow_from_directory(test_dir_path,
                                                     target_size=(img_width, img_height),
                                                     batch_size=1,
                                                     shuffle=False)


Found 9 images belonging to 3 classes.


In [15]:
'''
Importing the knifey dataset used in the Hvass-Labs tutorials (8,9) 
import knifey
knifey.maybe_download_and_extract()
knifey.copy_files()
train_dir = knifey.train_dir
test_dir = knifey.test_dir'''


###### fine-tune model...

###### Data generation / parameter randomization...
Still unsure whether such pre-processing of the frames will ultimately help our model. How would we go about it?

Ideas:
    - Every video input must be kept separate from the others
    - Preprocessing won't be applied per frame, but per set of frames that corresponds to a full video. Basically, the video is preprocessed as a whole, not its individual frames.
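
One way the per-video idea could look is sketched below. It assumes frame files are named like `video03_frame0007.jpg`, which is a hypothetical convention, and the parameter names and ranges are illustrative. One set of random transformation parameters is drawn per video and would then be reused for every frame of that video.

```python
import random
from collections import defaultdict

def group_frames_by_video(frame_names):
    """Group frame file names by the part before '_frame' (hypothetical naming scheme)."""
    groups = defaultdict(list)
    for name in sorted(frame_names):
        video_id = name.split("_frame")[0]
        groups[video_id].append(name)
    return dict(groups)

def sample_video_transforms(video_ids, max_rotation_deg=5.0, max_zoom=0.05, seed=0):
    """Draw ONE set of augmentation parameters per video.

    Every frame of a video is meant to receive the same parameters,
    so the video stays consistent from frame to frame.
    """
    rng = random.Random(seed)
    return {
        video_id: {
            "rotation_deg": rng.uniform(-max_rotation_deg, max_rotation_deg),
            "zoom": 1.0 + rng.uniform(-max_zoom, max_zoom),
            "horizontal_flip": rng.random() < 0.5,
        }
        for video_id in sorted(video_ids)
    }
```

The sketch only decides which parameters each video gets. Applying them to the pixels is a separate step and is not shown here.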
        
###### Functions to (randomly) change image parameters

###### Not sure if this will help, unless we have a lot of data

In [16]:
# functions that will be used to change the image parameters for randomness
# Note: the validation data generator only rescales the images for obvious reasons
'''train_datagen = ImageDataGenerator(rescale=1. / 255,
                   rotation_range=transformation_ratio,
                   shear_range=transformation_ratio,
                   zoom_range=transformation_ratio,
                   cval=transformation_ratio,
                   horizontal_flip=True,
                   vertical_flip=True)

validation_datagen = ImageDataGenerator(rescale=1. / 255)

validation_generator = validation_datagen.flow_from_directory(validation_data_dir,
                          target_size=(img_width, img_height),
                          batch_size=batch_size,
                          class_mode='categorical')


# Creates new directory if it does not exist, in the joined path of the train_data_dir path

os.makedirs(os.path.join(os.path.abspath(train_data_dir), '../preview'), exist_ok=True)


# the data generator takes in:
    # The directory of the data
    # gets a small batch size of files
    # resizes them to the target_size
# it spits out a batch of images with different parameters

train_generator = train_datagen.flow_from_directory(train_data_dir,
                    target_size=(img_width, img_height),
                    batch_size=batch_size,
                    class_mode='categorical')
'''

# ResNet50 additional resources

* Deep Residual Networks https://github.com/KaimingHe/deep-residual-networks
* Graph: http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006
* Keras ResNet50 Implementation https://github.com/raghakot/keras-resnet
* https://www.pyimagesearch.com/2017/03/20/imagenet-vggnet-resnet-inception-xception-keras/
* https://www.quora.com/What-is-the-deep-neural-network-known-as-%E2%80%9CResNet-50%E2%80%9D

### Notes

* ffmpeg to convert videos to frames
* Fine tuning of top layers (in keras.io website)
* 24-30 frames per second is eye friendly

# Pickling

In [17]:
'''import dill

# Save session
dill.dump_session('saved_sessions/testPickle.db')

# Load session
# dill.load_session('saved_sessions/testPickle.db')'''