# Keras RetinaNet implementation with the AVA dataset

Jupyter notebook with the steps to train a Keras/TensorFlow object detection model on a custom dataset.

The network code comes from the [Fizyr implementation](https://github.com/fizyr/keras-retinanet) of RetinaNet in Keras.

An annotations CSV file and a classes CSV file are required to run this code.


# Environment Setup
Download and install the required packages in Colab, then import the libraries.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
!git clone https://github.com/fizyr/keras-retinanet.git

Cloning into 'keras-retinanet'...
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 6205 (delta 1), reused 3 (delta 1), pack-reused 6196
Receiving objects: 100% (6205/6205), 13.48 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (4196/4196), done.


In [3]:
%cd keras-retinanet/

!pip install .

/content/keras-retinanet
Processing /content/keras-retinanet
Collecting keras-resnet==0.2.0
  Downloading https://files.pythonhosted.org/packages/76/d4/a35cbd07381139dda4db42c81b88c59254faac026109022727b45b31bcad/keras-resnet-0.2.0.tar.gz
Building wheels for collected packages: keras-retinanet, keras-resnet
  Building wheel for keras-retinanet (setup.py) ... done
  Created wheel for keras-retinanet: filename=keras_retinanet-1.0.0-cp36-cp36m-linux_x86_64.whl size=168069 sha256=3e22008147bb2e9ecd7d207364994e3d9d8f6ba575faed861f6b15dbf64e9202
  Stored in directory: /root/.cache/pip/wheels/b2/9f/57/cb0305f6f5a41fc3c11ad67b8cedfbe9127775b563337827ba
  Building wheel for keras-resnet (setup.py) ... done
  Created wheel for keras-resnet: filename=keras_resnet-0.2.0-py2.py3-none-any.whl size=20487 sha256=f059773422cabc214cd62c1187ce71ebf53646d05472464bfc9971f3e09d050a
  Stored in directory: /root/.cache/pip/wheels/5f/09/a5/497a30fd9ad9964e98a1254d1e164bcd1b8a5eda36197ec

In [4]:
!python setup.py build_ext --inplace

running build_ext
cythoning keras_retinanet/utils/compute_overlap.pyx to keras_retinanet/utils/compute_overlap.c
  tree = Parsing.p_module(s, pxd, full_module_name)
building 'keras_retinanet.utils.compute_overlap' extension
creating build
creating build/temp.linux-x86_64-3.6
creating build/temp.linux-x86_64-3.6/keras_retinanet
creating build/temp.linux-x86_64-3.6/keras_retinanet/utils
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.6m -I/usr/local/lib/python3.6/dist-packages/numpy/core/include -c keras_retinanet/utils/compute_overlap.c -o build/temp.linux-x86_64-3.6/keras_retinanet/utils/compute_overlap.o
In file included from /usr/local/lib/python3.6/dist-packages/numpy/core/include/numpy/ndarraytypes.h:1832:0,
                 from /usr/local/lib/python3.6/dist-packages/numpy/core/include/numpy/ndarrayobject.h:12,

In [5]:
import os
import shutil
import zipfile
import urllib
import xml.etree.ElementTree as ET
import numpy as np
import csv
import pandas
from google.colab import drive
from google.colab import files

# Making Dataset

The dataset files can be uploaded from the local machine or copied from another location, for example Google Drive.

The format of the classes file must be:

```

class name, class id
class name, class id
class name, class id

...
```

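The format of the annotations file must be as follows, where the action name in the last column is one of the class names from the classes file: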

```

path/to/image1.jpg, x min, y min, x max, y max, action name
path/to/image2.jpg, x min, y min, x max, y max, action name
path/to/image3.jpg, x min, y min, x max, y max, action name

...
```
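
Malformed rows in either file tend to surface only once training starts, so it can help to check them first. The following is a minimal sketch of such a check, not part of the Fizyr code: the helper names `load_classes` and `check_annotations` are hypothetical, and it assumes every annotation row carries a box with integer pixel coordinates.

```python
import csv
import io


def load_classes(lines):
    """Parse 'class name, class id' rows into a {name: id} dict."""
    classes = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        name, class_id = row
        classes[name.strip()] = int(class_id)
    return classes


def check_annotations(lines, classes):
    """Return (line_number, problem) pairs for rows that look malformed."""
    problems = []
    for line_no, row in enumerate(csv.reader(lines), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 6:
            problems.append((line_no, "expected 6 columns"))
            continue
        path, x1, y1, x2, y2, name = (field.strip() for field in row)
        try:
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)
        except ValueError:
            problems.append((line_no, "coordinates are not integers"))
            continue
        if x2 <= x1 or y2 <= y1:
            problems.append((line_no, "box has non-positive width or height"))
        if name not in classes:
            problems.append((line_no, "unknown class name"))
    return problems


# Illustrative in-memory data; real use would pass open file objects instead.
example_classes = load_classes(io.StringIO("stand,0\nsit,1\n"))
example_rows = io.StringIO(
    "a.jpg,10,10,50,60,stand\n"
    "b.jpg,30,5,20,40,sit\n"
    "c.jpg,1,2,3,4,run\n"
)
for line_no, problem in check_annotations(example_rows, example_classes):
    print(line_no, problem)
```

With the real files, the same functions could be called on `open(CLASSES_FILE)` and `open(ANNOTATIONS_FILE)` before launching training.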

In [6]:
DATASET_DIR = 'dataset'
ANNOTATIONS_FILE = 'AVA_annotations_v2.csv'
CLASSES_FILE = 'ava_actions_v3.csv'

# Training Model

Download a pretrained model and run training.

The next cell offers a choice between two options:

1.   download the ResNet50 pretrained model
2.   download a custom pretrained model from Google Drive, to continue a previous training run

After training, the model can also be exported and saved to Drive.


In [7]:
import urllib.request

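PRETRAINED_MODEL = './snapshots/_pretrained_model.h5'

# make sure the target folder exists before writing the model into it
os.makedirs(os.path.dirname(PRETRAINED_MODEL), exist_ok=True)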

#### OPTION 1: DOWNLOAD INITIAL PRETRAINED MODEL FROM FIZYR ####
URL_MODEL = 'https://github.com/fizyr/keras-retinanet/releases/download/0.5.1/resnet50_coco_best_v2.1.0.h5'
urllib.request.urlretrieve(URL_MODEL, PRETRAINED_MODEL)

#### OPTION 2: DOWNLOAD CUSTOM PRETRAINED MODEL FROM GOOGLE DRIVE. CHANGE DRIVE_MODEL VALUE. USE THIS TO CONTINUE PREVIOUS TRAINING EPOCHS ####
#drive.mount('/content/drive')
#DRIVE_MODEL = '/content/drive/My Drive/Sapienza/Elective_Pirri/keras-retinanet/snapshots/resnet50_pascal_02.h5'
#shutil.copy(DRIVE_MODEL, PRETRAINED_MODEL)


print('Downloaded pretrained model to ' + PRETRAINED_MODEL)

Downloaded pretrained model to ./snapshots/_pretrained_model.h5


In [8]:
annotation_file = "/content/AVA_annotations_v2.csv"
class_file = "/content/ava_actions_v3.csv"
dataset_file = "/content/dataset_v2.zip"
correct_folder = "/content/keras-retinanet"

shutil.copy(annotation_file, correct_folder)
shutil.copy(class_file, correct_folder)
shutil.copy(dataset_file, correct_folder)

'/content/keras-retinanet/dataset_v2.zip'

In [9]:
with zipfile.ZipFile('dataset_v2.zip', 'r') as zip_ref:
  zip_ref.extractall()

In [10]:
!keras_retinanet/bin/train.py --random-transform --weights {PRETRAINED_MODEL} --epochs 50 csv AVA_annotations_v2.csv ava_actions_v3.csv

2020-10-19 15:19:15.167099: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
Creating model, this may take a second...
2020-10-19 15:19:17.673962: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-19 15:19:17.742175: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-19 15:19:17.742989: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla K80 computeCapability: 3.7
coreClock: 0.8235GHz coreCount: 13 deviceMemorySize: 11.17GiB deviceMemoryBandwidth: 223.96GiB/s
2020-10-19 15:19:17.743045: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-19 15:19:18.049873: 

In [None]:
#!rm -rf logs

In [11]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [None]:
%tensorboard --logdir logs

In [None]:
COLAB_MODEL = './snapshots/resnet50_csv_50.h5'
DRIVE_DIR = '/content/drive/My Drive/Sapienza/Retinanet_Project/pretrained models/AVA_models/'
shutil.copy(COLAB_MODEL, DRIVE_DIR)

# Inference
Run inference on an uploaded image with the trained model.

In [None]:
THRES_SCORE = 0.7

In [None]:
# show images inline
%matplotlib inline

# automatically reload modules when they have changed
%reload_ext autoreload
%autoreload 2

# import keras
import keras

# import keras_retinanet
from keras_retinanet import models
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image
from keras_retinanet.utils.visualization import draw_box, draw_caption
from keras_retinanet.utils.colors import label_color

# import miscellaneous modules
import matplotlib.pyplot as plt
import cv2
import os
import numpy as np
import time

# set tf backend to allow memory to grow, instead of claiming everything
import tensorflow as tf

def get_session():
    config = tf.compat.v1.ConfigProto()
    config.gpu_options.allow_growth = True
    return tf.compat.v1.Session(config=config)

# use this environment flag to change which GPU to use
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# set the modified tf session as backend in keras
tf.compat.v1.keras.backend.set_session(get_session())

In [None]:
model_path = os.path.join('snapshots', sorted(os.listdir('snapshots'), reverse=True)[0])
print(model_path)

# load retinanet model
model = models.load_model(model_path, backbone_name='resnet50')
model = models.convert_model(model)

# load label to names mapping for visualization purposes
labels_to_names = pandas.read_csv(CLASSES_FILE,header=None).T.loc[0].to_dict()

snapshots/resnet50_csv_50.h5
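
The snapshot lookup above takes the lexicographically last file name, which matches the latest epoch only while all epoch numbers have the same number of digits. A minimal sketch of an alternative that compares the epoch numbers numerically is given below. The helper `latest_snapshot` is hypothetical, and it assumes snapshot names end in `_<epoch>.h5`, as `resnet50_csv_50.h5` does.

```python
import re


def latest_snapshot(filenames):
    """Return the file name with the highest trailing epoch number."""
    pattern = re.compile(r"_(\d+)\.h5$")
    numbered = []
    for name in filenames:
        match = pattern.search(name)
        if match:
            # keep (epoch, name) so max() compares epochs as integers
            numbered.append((int(match.group(1)), name))
    if not numbered:
        raise ValueError("no numbered snapshot found")
    return max(numbered)[1]


# Illustrative listing; in the notebook this would be os.listdir('snapshots').
names = [
    "_pretrained_model.h5",
    "resnet50_csv_09.h5",
    "resnet50_csv_50.h5",
    "resnet50_csv_100.h5",
]
print(latest_snapshot(names))
```

Files without a trailing epoch number, such as `_pretrained_model.h5`, are ignored by the pattern.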


In [None]:
print(labels_to_names[9])

listen to (a person)
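
The `labels_to_names` expression above keys each class name by its row position, so it is correct only when the rows of the classes file are ordered by class id starting at 0. The sketch below builds the same mapping from the id column instead. The function name `labels_from_classes_csv` and the column names are assumptions for illustration, and the file is assumed to have no header row.

```python
import io

import pandas


def labels_from_classes_csv(source):
    """Map class id -> class name using the id column of the classes file."""
    df = pandas.read_csv(source, header=None, names=["name", "id"])
    return {int(class_id): name for class_id, name in zip(df["id"], df["name"])}


# Illustrative rows that are deliberately not sorted by id.
sample = io.StringIO("sit,1\nstand,0\n")
print(labels_from_classes_csv(sample))
```

In the notebook, `source` would be `CLASSES_FILE`.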


In [None]:
def img_inference(img_path):
  # read the image given as argument (not the global img_infer)
  image = read_image_bgr(img_path)

  # copy to draw on
  draw = image.copy()
  draw = cv2.cvtColor(draw, cv2.COLOR_BGR2RGB)

  # preprocess image for network
  image = preprocess_image(image)
  image, scale = resize_image(image)

  # process image
  start = time.time()
  boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
  print("processing time: ", time.time() - start)

  # correct for image scale
  boxes /= scale

  # visualize detections
  for box, score, label in zip(boxes[0], scores[0], labels[0]):
      # scores are sorted so we can break
      if score < THRES_SCORE:
          break

      color = label_color(label)

      b = box.astype(int)
      draw_box(draw, b, color=color)

      caption = "{} {:.3f}".format(labels_to_names[label], score)
      draw_caption(draw, b, caption)

  plt.figure(figsize=(10, 10))
  plt.axis('off')
  plt.imshow(draw)
  plt.show()
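
The post-processing inside `img_inference` comes down to two steps: dividing the boxes by the resize scale to get back to original-image coordinates, and dropping detections below `THRES_SCORE`. A minimal NumPy sketch of just those two steps follows. The function `filter_detections_by_score` is a hypothetical name, and the arrays are made-up values rather than model output.

```python
import numpy as np


def filter_detections_by_score(boxes, scores, labels, scale, threshold):
    """Rescale boxes to original-image coordinates and keep confident ones.

    boxes:  (N, 4) array of x1, y1, x2, y2 in resized-image coordinates
    scores: (N,) array of detection scores
    labels: (N,) array of class ids
    """
    keep = scores >= threshold
    return boxes[keep] / scale, scores[keep], labels[keep]


# Made-up detections for one image that was resized by a factor of 2.
boxes = np.array([[20.0, 20.0, 200.0, 100.0], [0.0, 0.0, 10.0, 10.0]])
scores = np.array([0.9, 0.3])
labels = np.array([9, 2])

kept_boxes, kept_scores, kept_labels = filter_detections_by_score(
    boxes, scores, labels, scale=2.0, threshold=0.7
)
print(kept_boxes, kept_scores, kept_labels)
```

Unlike the loop in `img_inference`, the boolean mask does not depend on the scores being sorted.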

In [None]:

uploaded = files.upload()
img_infer = list(uploaded)[0]

print('Running inference on: ' + img_infer)
img_inference(img_infer)