## TensorFlow Object Detection API

This approach uses transfer learning from the `faster_rcnn_resnet50_coco` model (see the TensorFlow [model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md)), which is pretrained on the [COCO dataset](http://cocodataset.org).

- To install the TensorFlow Object Detection API, clone the [repo](https://github.com/tensorflow/models/) (we'll use code from the `research/object_detection` folder) and follow [these](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/installation.md) installation instructions.
- I used Windows, which is something of a pain but possible. These instructions helped me install the API on Windows (beforehand I already had an Anaconda environment with Python 3 and tensorflow-gpu configured): https://medium.com/@rohitrpatil/how-to-use-tensorflow-object-detection-api-on-windows-102ec8097699

- This is based heavily on this blog post: https://medium.com/practical-deep-learning/a-complete-transfer-learning-toolchain-for-semantic-segmentation-3892d722b604. Code [here](https://github.com/fera0013/TransferLearningToolchain).

- This blog post was also useful: https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9.


This notebook uses the [faster_rcnn_resnet50_coco](http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet50_coco_2018_01_28.tar.gz) model. The model binaries are not committed. To run this notebook you will have to download the model, extract it and copy the following files to the `src/models/faster_rcnn_resnet50_coco_2018_01_28` folder:

- `model.ckpt.data-00000-of-00001`
- `model.ckpt.index`
- `model.ckpt.meta`

TODO: Automate this process within the notebook :-) 
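
One way the TODO above could be tackled is sketched below. It uses only the standard library and the download URL linked above; the function names `extract_checkpoint` and `fetch_model` are hypothetical, and the sketch assumes the archive stores the three checkpoint files under their usual names.

```python
import os
import tarfile
import urllib.request

MODEL_NAME = "faster_rcnn_resnet50_coco_2018_01_28"
MODEL_URL = "http://download.tensorflow.org/models/object_detection/%s.tar.gz" % MODEL_NAME
CHECKPOINT_FILES = (
    "model.ckpt.data-00000-of-00001",
    "model.ckpt.index",
    "model.ckpt.meta",
)


def extract_checkpoint(archive_path, target_dir, wanted=CHECKPOINT_FILES):
    """Copy only the wanted checkpoint files out of the archive into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            name = os.path.basename(member.name)
            if member.isfile() and name in wanted:
                source = archive.extractfile(member)
                with open(os.path.join(target_dir, name), "wb") as out:
                    out.write(source.read())
                extracted.append(name)
    return sorted(extracted)


def fetch_model(target_dir, url=MODEL_URL):
    """Download the archive (unless it is already there) and extract the checkpoint."""
    os.makedirs(target_dir, exist_ok=True)
    archive_path = os.path.join(target_dir, os.path.basename(url))
    if not os.path.isfile(archive_path):
        urllib.request.urlretrieve(url, archive_path)
    return extract_checkpoint(archive_path, target_dir)
```

Calling `fetch_model(target_dir)` with the model folder named above would leave the three checkpoint files in place and skip the download on later runs.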


In [1]:
import sys


# This is the path of the tf models repo.
# I had to do this since my Anaconda installation on Windows didn't use
# PYTHONPATH. If your Python environment uses PYTHONPATH, add these
# paths to it (using the appropriate models path).
sys.path.append('C:\\Users\\bones\\models')
sys.path.append('C:\\Users\\bones\\models\\research')
sys.path.append('C:\\Users\\bones\\models\\research\\slim')


import glob, pylab, pandas as pd
import pydicom, numpy as np
import matplotlib.pyplot as plt
from collections import namedtuple, OrderedDict
import io
import os
import cv2
from PIL import Image
import tensorflow as tf
from sklearn.model_selection import train_test_split
from object_detection.utils import config_util
from object_detection.utils import label_map_util
from object_detection.legacy import evaluator
from offline_eval_map_corloc import _generate_filenames
from object_detection.metrics import tf_example_parser
from object_detection.utils import dataset_util
from object_detection.legacy import trainer
import utils

ModuleNotFoundError: No module named 'pydicom'

### Converting data to TFRecords

To use the Object Detection API it is necessary to convert the data to TFRecords, a somewhat obscure TensorFlow format. It is very useful to read this first to understand it: https://planspace.org/20170323-tfrecords_for_humans/.

In [2]:
# Should we move this to a common file?
def parse_data(df):
    """
    Method to read a CSV file (Pandas dataframe) and parse the 
    data into the following nested dictionary:

      parsed = {
        
        'patientId-00': {
            'dicom': path/to/dicom/file,
            'label': either 0 or 1 for normal or pneumonia,
            'boxes': list of box(es)
        },
        'patientId-01': {
            'dicom': path/to/dicom/file,
            'label': either 0 or 1 for normal or pneumonia,
            'boxes': list of box(es)
        }, ...

      }

    """
    # --- Define lambda to extract coords in list [y, x, height, width]
    extract_box = lambda row: [row['y'], row['x'], row['height'], row['width']]

    parsed = {}
    for n, row in df.iterrows():
        # --- Initialize patient entry into parsed 
        pid = row['patientId']
        if pid not in parsed:
            parsed[pid] = {
                'dicom': '../input/stage_1_train_images/%s.dcm' % pid,
                'label': row['Target'],
                'boxes': []}

        # --- Add box if opacity is present
        if parsed[pid]['label'] == 1:
            parsed[pid]['boxes'].append(extract_box(row))

    return parsed

In [3]:
'''
    data -> a data entry generated by parse_data
    filename -> the image filename that will be stored (inside the tf record). 
                This must have a jpg extension since we're encoding to jpg in the tfrecord.
'''
def create_tf_example(data, filename):
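    # dcmread is the current name of pydicom's reader (read_file is the older alias)
    dcm_data = pydicom.dcmread(data['dicom'])
    # tobytes() replaces the deprecated ndarray.tostring()
    encoded_jpg = cv2.imencode('.jpg', dcm_data.pixel_array)[1].tobytes()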
    
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for box in data['boxes']:
        # parse_data stores each box as [y, x, height, width]
        ymin, xmin, h, w = box
        xmax, ymax = xmin + w, ymin + h
        xmins.append(xmin / width)
        xmaxs.append(xmax / width)
        ymins.append(ymin / height)
        ymaxs.append(ymax / height)
        
        # We only have one class..
        classes_text.append('pneumonia'.encode('utf8'))
        classes.append(1)

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    
    return tf_example
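
The `image/object/bbox/*` features hold box corners normalised to the `[0, 1]` range rather than pixel values. A minimal standalone sketch of that conversion, assuming a box given as `[x, y, width, height]` in pixels (note that `extract_box` in `parse_data` stores `[y, x, height, width]`, so the order has to be adapted) and a hypothetical helper name `to_normalized_corners`:

```python
def to_normalized_corners(box, image_width, image_height):
    """Convert a pixel box [x, y, width, height] to normalized (xmin, ymin, xmax, ymax)."""
    x, y, w, h = box
    return (
        x / image_width,
        y / image_height,
        (x + w) / image_width,
        (y + h) / image_height,
    )


# A 256 x 512 (width x height) box with its top-left corner at (256, 128),
# inside an image assumed to be 1024 x 1024 pixels.
print(to_normalized_corners([256, 128, 256, 512], 1024, 1024))
# → (0.25, 0.125, 0.5, 0.625)
```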

In [31]:
'''
    images_metadata -> dictionary generated by parse_data
    base_path -> the path of the folder where the records will be stored.
    filename -> the name of the .record file
'''
def create_tf_records(images_metadata, base_path, filename, force=False):
    
    if not os.path.exists(base_path):
        os.mkdir(base_path)
    
    output_path = os.path.join(base_path, filename)
    if os.path.isfile(output_path):
        print("%s already exists" % output_path)
        
        if not force:
            # Keep the existing record unless the caller asked to overwrite it
            print("Skipping creation of the TFRecord")
            return
        
        print("Overwriting existing file")
            
    else:
        print("%s does not exist. Proceeding to create" % output_path)
        
    with tf.python_io.TFRecordWriter(output_path) as writer:
        for idx, patientId in enumerate(images_metadata):
            tf_example = create_tf_example(images_metadata[patientId], '%s.jpg' % patientId)
            writer.write(tf_example.SerializeToString())
            
            if (idx + 1) % 100 == 0:
                print("%d proccessed so far" % (idx + 1))
            
        print("Created in %s" % output_path)

Reading the data and splitting into train and test. Both datasets will be stored in different tfrecords.

In [32]:
df = pd.read_csv('../input/stage_1_train_labels.csv')
train_df, test_df = train_test_split(df, test_size=0.33, random_state=42)
print("Train df shape", train_df.shape)
print("Test df shape", test_df.shape)

Train df shape (19422, 6)
Test df shape (9567, 6)
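
One thing to keep in mind: `parse_data` allows several rows (boxes) per `patientId`, so a row-wise split can place boxes of the same patient in both sets. A minimal sketch of splitting by patient instead, using only the standard library; `split_patient_ids` is a hypothetical helper, not part of this notebook:

```python
import random


def split_patient_ids(patient_ids, test_size=0.33, seed=42):
    """Split the unique patient ids into (train_ids, test_ids) sets."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = int(round(len(unique_ids) * test_size))
    return set(unique_ids[n_test:]), set(unique_ids[:n_test])


# Toy ids: patient "a" has two boxes, so it appears twice.
ids = ["a", "a", "b", "c", "d", "e", "f"]
train_ids, test_ids = split_patient_ids(ids)

# With the dataframe from the cell above, the rows could then be selected with
# df[df['patientId'].isin(train_ids)] and df[df['patientId'].isin(test_ids)].
```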


Now it is time to create the TFRecords. This will create two files, `validation.record` and `train.record`, under the `input/stage_1_train_images/tf_records/` directory. Note that this can take some time (~30 min on my laptop), but it only has to be run once.

If for whatever reason you need to overwrite existing records, set `force=True`.

In [33]:
%%time

# To overwrite use force=True
force = False
base_path = '../input/stage_1_train_images/tf_records/'

# Load info from csv to a dictionary
train_images_metadata = parse_data(train_df)
test_images_metadata = parse_data(test_df)

# Create TF records in disk
create_tf_records(test_images_metadata, base_path, 'validation.record', force=force)
create_tf_records(train_images_metadata, base_path, 'train.record', force=force)

../input/stage_1_train_images/tf_records/validation.record does not exist. Proceeding to create
100 proccessed so far
200 proccessed so far
300 proccessed so far
400 proccessed so far
500 proccessed so far
600 proccessed so far
700 proccessed so far
800 proccessed so far
900 proccessed so far
1000 proccessed so far
1100 proccessed so far
1200 proccessed so far
1300 proccessed so far
1400 proccessed so far
1500 proccessed so far
1600 proccessed so far
1700 proccessed so far
1800 proccessed so far
1900 proccessed so far
2000 proccessed so far
2100 proccessed so far
2200 proccessed so far
2300 proccessed so far
2400 proccessed so far
2500 proccessed so far
2600 proccessed so far
2700 proccessed so far
2800 proccessed so far
2900 proccessed so far
3000 proccessed so far
3100 proccessed so far
3200 proccessed so far
3300 proccessed so far
3400 proccessed so far
3500 proccessed so far
3600 proccessed so far
3700 proccessed so far
3800 proccessed so far
3900 proccessed so far
4000 proccessed 

## Train Model

It seems the Object Detection API has some bugs under Python 3. We have to make the following changes in the models repo to get it working with Python 3.

--------------

Go to `research/object_detection/inference/detection_inference.py` and change 

`with tf.gfile.Open(inference_graph_path, 'r')` to

`with tf.gfile.Open(inference_graph_path, 'rb') as graph_def_file:`

This is waiting to be fixed [here](https://github.com/tensorflow/models/pull/5065)

--------------


Go to `research/object_detection/metrics/tf_example_parser.py` and, in the `parse` method of the `StringParser` class, change

`return "".join(tf_example.features.feature[self.field_name]` to

`return b"".join(tf_example.features.feature[self.field_name]`

--------------

Go to `research/object_detection/utils/object_detection_evaluation.py` and add this condition at line 476:
`groundtruth_dict[standard_fields.InputDataFields.groundtruth_group_of] != None and`

--------------

Go to `research/object_detection/utils/object_detection_evaluation.py` and change

`category_name = unicode(category_name, 'utf-8')` to

`category_name = str(category_name, 'utf-8')`

--------------
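
Most of the patches above come down to Python 3 keeping `bytes` and `str` apart. A small standalone illustration with made-up values (nothing here touches the models repo):

```python
# Values stored in a tf.train.Example bytes_list come back as bytes in Python 3.
values = [b"pneu", b"monia"]

# Joining bytes with a str separator is what fails under Python 3 ...
try:
    "".join(values)
except TypeError as error:
    print("str join failed:", type(error).__name__)  # str join failed: TypeError

# ... while a bytes separator works.
joined = b"".join(values)
print(joined)  # b'pneumonia'

# `unicode` no longer exists in Python 3; str(bytes, encoding) decodes instead.
category_name = str(joined, "utf-8")
print(category_name)  # pneumonia
```

The `'r'` to `'rb'` change follows the same logic: a serialized graph is binary data, so it has to be read as bytes.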

Now, to train the model, we call the `train.py` script with the training directory and the `pipeline.config` file. The pipeline config file defines parameters for training/optimization and feature extraction. It also defines the paths of the training and validation datasets.

In [None]:
!python train.py --logtostderr --train_dir=models\faster_rcnn_resnet50_coco_2018_01_28\train_dir --pipeline_config_path=models\faster_rcnn_resnet50_coco_2018_01_28\pipeline.config
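
The input configs also point to a label map (the same kind of file read later through `input_config.label_map_path`). A minimal sketch of generating one for the single `pneumonia` class used in `create_tf_example`; the helper name `label_map_text` is hypothetical, and where the text gets written depends on the path set in your own config:

```python
def label_map_text(class_names):
    """Render class names as label map items; ids start at 1 (0 is reserved for background)."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


# The class id and name must match the ones written into the TFRecords.
print(label_map_text(["pneumonia"]))
```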

After training we have to create the inference graph to make inferences:

In [None]:
!python export_inference_graph.py --input_type image_tensor --pipeline_config_path models\faster_rcnn_resnet50_coco_2018_01_28\pipeline.config --trained_checkpoint_prefix models\faster_rcnn_resnet50_coco_2018_01_28\train_dir\model.ckpt-0 --output_directory models\faster_rcnn_resnet50_coco_2018_01_28\fine_tuned_model
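
The command above exports `model.ckpt-0`. Training writes checkpoints named `model.ckpt-<step>`, so you will usually want to point `--trained_checkpoint_prefix` at a later one. A minimal sketch for picking the highest step in a training directory, assuming each checkpoint has a `model.ckpt-<step>.index` file (`latest_checkpoint_prefix` is a hypothetical helper; `tf.train.latest_checkpoint` is the TensorFlow alternative):

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-<step>' prefix with the highest step, or None."""
    pattern = re.compile(r"^(model\.ckpt-(\d+))\.index$")
    best_step, best_prefix = -1, None
    for filename in os.listdir(train_dir):
        match = pattern.match(filename)
        if match and int(match.group(2)) > best_step:
            best_step = int(match.group(2))
            best_prefix = os.path.join(train_dir, match.group(1))
    return best_prefix
```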

Now we'll use the inference graph to run inference on the validation set. You'll need to install pycocotools:

On Linux, run `pip install "git+https://github.com/waleedka/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI"`

On Windows, run `pip install git+https://github.com/philferriere/cocoapi.git#egg=pycocotools^&subdirectory=PythonAPI`

In [None]:
!python -m infer_detections --input_tfrecord_paths=../input/stage_1_train_images/tf_records/validation.record --output_tfrecord_path=models\faster_rcnn_resnet50_coco_2018_01_28\inference --inference_graph=models\faster_rcnn_resnet50_coco_2018_01_28\fine_tuned_model\frozen_inference_graph.pb --discard_image_pixels

OPTIONAL: Run metrics against the validation set

This will compute the [Open Images V2 detection metric](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/evaluation_protocols.md) against the validation set.

In [None]:
!python offline_eval_map_corloc.py --eval_dir=models\faster_rcnn_resnet50_coco_2018_01_28\validation_eval_metrics --eval_config_path=models\faster_rcnn_resnet50_coco_2018_01_28\validation_eval_metrics\validation_eval_config.pbtxt --input_config_path=models\faster_rcnn_resnet50_coco_2018_01_28\validation_eval_metrics\validation_input_config.pbtxt

In [48]:
df = pd.read_csv('models/faster_rcnn_resnet50_coco_2018_01_28/validation_eval_metrics/metrics.csv')
print(df['OpenImagesV2_Precision/mAP@0.5IOU'][0])
print(df)

                   OpenImagesV2_Precision/mAP@0.5IOU  0.0
0  OpenImagesV2_PerformanceByCategory/AP@0.5IOU/b...  0.0


## Computing RSNA score

Finally, we'll use the inferences on the validation set to compute the RSNA score that is used in the Kaggle leaderboard.
Interestingly, there are **a lot** of samples without boxes to predict; no boxes were added for them because their Target column is 0, which means no pneumonia.

If we include the samples without pneumonia when computing the score we get a pretty high value (~0.7), but this is because the implementation of the RSNA score gives 1.0 if both the prediction and the ground truth have an empty list of bounding boxes. This is an ambiguous case that has to be defined explicitly, since otherwise it would produce a division by zero.
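
To make the empty-box case concrete, here is a simplified per-image score at a single IoU threshold. It is only a sketch of the idea: it is not the code in `utils.mean_rsna_metric`, it ignores confidences, and the function names and the `[x, y, width, height]` box layout are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union


def image_score(predicted, truth, threshold=0.5):
    """TP / (TP + FP + FN) for one image at a single IoU threshold."""
    if len(predicted) == 0 and len(truth) == 0:
        # The ambiguous case: the ratio would be 0 / 0, so a value has to be chosen.
        return 1.0
    matched = set()
    true_positives = 0
    for pred in predicted:
        for index, gt in enumerate(truth):
            if index not in matched and iou(pred, gt) > threshold:
                matched.add(index)
                true_positives += 1
                break
    false_positives = len(predicted) - true_positives
    false_negatives = len(truth) - true_positives
    return true_positives / (true_positives + false_positives + false_negatives)


print(image_score([], []))                # 1.0, the case that inflates the mean
print(image_score([[0, 0, 10, 10]], []))  # 0.0, a false positive on a healthy image
```

With many images that have no ground-truth boxes, every empty prediction contributes a full 1.0, which is how the mean can look high even when detections are poor.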

In [5]:
configs = config_util.get_configs_from_multiple_files(
  eval_input_config_path="models/faster_rcnn_resnet50_coco_2018_01_28/validation_eval_metrics/validation_input_config.pbtxt",
  eval_config_path="models/faster_rcnn_resnet50_coco_2018_01_28/validation_eval_metrics/validation_eval_config.pbtxt")

eval_config = configs['eval_config']
input_config = configs['eval_input_configs'][0]

# Set to True to include images from the dataset that have no ground-truth bounding boxes
count_empty_ground_truth = True

input_paths = input_config.tf_record_input_reader.input_path

categories = label_map_util.create_categories_from_labelmap(
    input_config.label_map_path)

skipped_images = 0
processed_images = 0

predictions = []
confidences = []
ground_truth = []

for input_path in _generate_filenames(input_paths):
    tf.logging.info('Processing file: {0}'.format(input_path))
    
    record_iterator = tf.python_io.tf_record_iterator(path=input_path)
    data_parser = tf_example_parser.TfExampleDetectionAndGTParser()
    
    for string_record in record_iterator:

        tf.logging.log_every_n(tf.logging.INFO, 'Processed %d images...', 1000, processed_images)
        processed_images += 1

        example = tf.train.Example()
        example.ParseFromString(string_record)
        decoded_dict = data_parser.parse(example)     
        
        if decoded_dict:
            if not count_empty_ground_truth and len(decoded_dict['groundtruth_boxes']) == 0:
                continue
                
            record_confidences = np.array(decoded_dict['detection_scores']).reshape(-1,)
            records_ground_truths = []
            record_predictions = []

            for x1, y1, x2, y2 in decoded_dict['detection_boxes']:
                w, h = x2 - x1, y2 - y1
                assert(w >= 0 and h >= 0)

                record_predictions.append([x1, y1, w, h])

            for x1, y1, x2, y2 in decoded_dict['groundtruth_boxes']:
                w, h = x2 - x1, y2 - y1
                assert(w >= 0 and h >= 0)

                records_ground_truths.append([x1, y1, w, h])
            
            ground_truth.append(np.array(records_ground_truths))
            predictions.append(np.array(record_predictions))
            confidences.append(record_confidences)
            
        else:
            skipped_images += 1
            tf.logging.info('Skipped images: {0}'.format(skipped_images))

INFO:tensorflow:Processing file: C:/Users/bones/kaggle/rsna/src/models/faster_rcnn_resnet50_coco_2018_01_28/inference
INFO:tensorflow:Processed 0 images...
INFO:tensorflow:Processed 1000 images...
INFO:tensorflow:Processed 2000 images...
INFO:tensorflow:Processed 3000 images...
INFO:tensorflow:Processed 4000 images...
INFO:tensorflow:Processed 5000 images...
INFO:tensorflow:Processed 6000 images...
INFO:tensorflow:Processed 7000 images...
INFO:tensorflow:Processed 8000 images...
INFO:tensorflow:Processed 9000 images...


In [6]:
print(utils.mean_rsna_metric(predictions, confidences, ground_truth))

0.7184064438881027
