<a href="https://colab.research.google.com/github/victor-roris/ML-learning/blob/master/ComputerVision/DeepLearning_ComputerVision_AutoML_EfficientDet_Custom_DataSpartan_Dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# EfficientDet - Custom DataSpartan dataset

Notebook to train an EfficientDet model ([link](https://github.com/google/automl/tree/master/efficientdet)) on a custom dataset generated in the AIDocs project of DataSpartan. The dataset covers document segmentation and is accessible from my professional Drive account (it is private and not publicly available).

This notebook is adapted from the [EfficientDet tutorial notebook](https://github.com/google/automl/blob/master/efficientdet/tutorial.ipynb) in the [google/automl repository on GitHub](https://github.com/google/automl/tree/master/efficientdet).

In [1]:
from IPython.display import Image, clear_output  # to display images

## Install EfficientDet

In [2]:
%%capture
#@title
import os
import sys
import tensorflow.compat.v1 as tf

# Download source code.
if "efficientdet" not in os.getcwd():

  # Clone the efficientdet project
  !git clone --depth 1 https://github.com/google/automl

  # Change the working directory to the efficientdet root folder
  os.chdir('automl/efficientdet')
  sys.path.append('.')

  # Install requirements
  !pip install -r requirements.txt
  !pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
else:
  !git pull

## Download dataset

In [3]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
# Folder where the exported dataset archive is extracted
dataset_folderpath = "/content/dataset/"

In [5]:
!rm -R {dataset_folderpath}

rm: cannot remove '/content/dataset/': No such file or directory


In [6]:
from pathlib import Path
def mkdir(sfolderpath):
  folderpath = Path(sfolderpath)
  folderpath.mkdir(parents=True, exist_ok=True)

mkdir(dataset_folderpath)

In [7]:
!tar -xvf /content/drive/My\ Drive/DATASPARTAN/PROYECTOS/AI-DOCUMENTS/SEGMENTATION/export..segmentation..2020.10.05.tar.gz -C {dataset_folderpath}

clear_output()

!ls {dataset_folderpath}

export_coco.json  segmentation	  yolo_data.yaml
image		  sub_annotation  yolo_labels
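
Before converting the data, it can help to check what `export_coco.json` contains. The sketch below assumes the file follows the usual COCO layout (top-level `images`, `annotations` and `categories` lists); `summarize_coco` and `summarize_coco_file` are hypothetical helpers, not part of the EfficientDet repository, and the example data is made up.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Return image/annotation counts and per-category annotation counts."""
    # Map category ids to names so the counts are readable.
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], a["category_id"])
        for a in coco.get("annotations", [])
    )
    return {
        "num_images": len(coco.get("images", [])),
        "num_annotations": len(coco.get("annotations", [])),
        "per_category": dict(per_category),
    }


def summarize_coco_file(path):
    """Load a COCO JSON file from disk and summarize it."""
    with open(path) as f:
        return summarize_coco(json.load(f))


# Tiny in-memory example (made-up data, not the real dataset).
example = {
    "images": [{"id": 1}, {"id": 2}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 2},
    ],
    "categories": [{"id": 1, "name": "text"}, {"id": 2, "name": "table"}],
}
summary = summarize_coco(example)
print(summary)
```

On the real data this would be called as `summarize_coco_file(dataset_folderpath + "export_coco.json")`.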


## Train

### Convert dataset to tfrecord

https://github.com/google/automl/tree/master/efficientdet/dataset

In [8]:
!rm tfrecord -R

rm: cannot remove 'tfrecord': No such file or directory


In [9]:
!mkdir tfrecord
!PYTHONPATH=".:$PYTHONPATH"  python dataset/create_coco_tfrecord.py  \
  --image_dir=/content/dataset/image/ \
  --image_info_file=/content/dataset/export_coco.json \
  --object_annotations_file=/content/dataset/export_coco.json \
  --output_file_prefix=tfrecord/ds --num_shards=100

2020-10-07 15:44:30.371348: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
I1007 15:44:32.027569 140179327129472 create_coco_tfrecord.py:285] writing to output path: tfrecord/ds
I1007 15:44:35.018472 140179327129472 create_coco_tfrecord.py:215] Building bounding box index.
I1007 15:44:35.034357 140179327129472 create_coco_tfrecord.py:226] 25 images are missing bboxes.
I1007 15:44:35.106595 140179327129472 create_coco_tfrecord.py:323] On image 0 of 3968
I1007 15:44:35.358397 140179327129472 create_coco_tfrecord.py:323] On image 100 of 3968
I1007 15:44:35.583385 140179327129472 create_coco_tfrecord.py:323] On image 200 of 3968
I1007 15:44:35.833041 140179327129472 create_coco_tfrecord.py:323] On image 300 of 3968
I1007 15:44:36.055526 140179327129472 create_coco_tfrecord.py:323] On image 400 of 3968
I1007 15:44:36.390229 140179327129472 create_coco_tfrecord.py:323] On image 500 of 3968
I1007 15:44:36.635937 1401793271
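
The log above reports that 25 images are missing bboxes. A minimal sketch of how such images could be listed from the COCO file, assuming the standard `images` / `annotations` layout; `images_without_boxes` is a hypothetical helper, and the conversion script may compute its own count differently.

```python
def images_without_boxes(coco):
    """Return the ids of images that have no annotation entry."""
    # Collect every image id that is referenced by at least one annotation.
    annotated = {a["image_id"] for a in coco.get("annotations", [])}
    return [img["id"] for img in coco.get("images", []) if img["id"] not in annotated]


# Made-up example: image 2 has no annotations.
example = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 3, "category_id": 2},
    ],
}
missing = images_without_boxes(example)
print(missing)
```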

Calculate the number of images per epoch:

In [10]:
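# Use all shards: ds-*-of-00100.tfrecord.
file_pattern = 'ds-*-of-00100.tfrecord'
# file_pattern = 'ds-00000-of-00100.tfrecord'
# NOTE: 57 is a hard-coded images-per-shard estimate. The conversion log above
# reports 3968 images in total, so this value overestimates the dataset size.
images_per_epoch = 57 * len(tf.io.gfile.glob('tfrecord/' + file_pattern))
images_per_epoch = images_per_epoch // 8 * 8  # round down to a multiple of 8.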
print('images_per_epoch = {}'.format(images_per_epoch))

images_per_epoch = 5696
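
The arithmetic behind this value can be checked by hand. The sketch below compares the hard-coded estimate (57 images × 100 shards) with the image count reported in the conversion log (3968); `round_down_to_multiple` is a hypothetical helper that mirrors the `// 8 * 8` step.

```python
def round_down_to_multiple(n, multiple=8):
    """Round n down to the nearest multiple of `multiple`."""
    return n // multiple * multiple


# Value used in the cell above: 57 images per shard times 100 shards.
from_constant = round_down_to_multiple(57 * 100)
# Value based on the "On image ... of 3968" lines in the conversion log.
from_log = round_down_to_multiple(3968)

print(from_constant, from_log)  # 5696 3968
```

Using the count from the log would probably be closer to the real dataset size, but the rest of this notebook was run with 5696.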


## Training

In [11]:
MODEL = 'efficientdet-d0'

In [12]:
# Train efficientdet from scratch with backbone checkpoint.
backbone_name = {
    'efficientdet-d0': 'efficientnet-b0',
    'efficientdet-d1': 'efficientnet-b1',
    'efficientdet-d2': 'efficientnet-b2',
    'efficientdet-d3': 'efficientnet-b3',
    'efficientdet-d4': 'efficientnet-b4',
    'efficientdet-d5': 'efficientnet-b5',
    'efficientdet-d6': 'efficientnet-b6',
    'efficientdet-d7': 'efficientnet-b6',
}[MODEL]


# Download the pretrained backbone checkpoint if it is not already present.
import os
if backbone_name not in os.listdir():
  !wget https://storage.googleapis.com/cloud-tpu-checkpoints/efficientnet/ckptsaug/{backbone_name}.tar.gz
  !tar xf {backbone_name}.tar.gz

!rm /tmp/model_dir -R
!mkdir /tmp/model_dir
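# Key option: use --backbone_ckpt rather than --ckpt.
# Don't use EMA since we only train a few steps.
# Note: validation reuses the training shards here, so the eval metrics
# are not measured on held-out data.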
!python main.py --mode=train_and_eval \
    --training_file_pattern=tfrecord/{file_pattern} \
    --validation_file_pattern=tfrecord/{file_pattern} \
    --model_name={MODEL} \
    --model_dir=/tmp/model_dir/{MODEL}-scratch  \
    --backbone_ckpt={backbone_name} \
    --train_batch_size=4 \
    --eval_batch_size=4 --eval_samples={images_per_epoch}  \
    --num_examples_per_epoch={images_per_epoch}  --num_epochs=1  \
    --hparams="num_classes=6,moving_average_decay=0,mixed_precision=true,max_instances_per_image=1000"

--2020-10-07 15:45:00--  https://storage.googleapis.com/cloud-tpu-checkpoints/efficientnet/ckptsaug/efficientnet-b0.tar.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 64.233.189.128, 108.177.97.128, 108.177.125.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|64.233.189.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 39302973 (37M) [application/gzip]
Saving to: ‘efficientnet-b0.tar.gz’


2020-10-07 15:45:01 (54.9 MB/s) - ‘efficientnet-b0.tar.gz’ saved [39302973/39302973]

rm: cannot remove '/tmp/model_dir': No such file or directory
2020-10-07 15:45:02.871415: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
I1007 15:45:05.011347 140371753760640 main.py:228] {'name': 'efficientdet-d0', 'act_type': 'swish', 'image_size': (512, 512), 'target_size': None, 'input_rand_hflip': True, 'jitter_min': 0.1, 'jitter_max': 2.0, 'autoaugment_policy': None, 'use_au
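
As a rough check on how long this run lasts, the number of training steps can be estimated from the flags above. This sketch assumes the trainer derives the step count as examples per epoch × epochs / batch size; `total_train_steps` is a hypothetical helper, not a function from the repository.

```python
def total_train_steps(num_examples_per_epoch, num_epochs, train_batch_size):
    """Estimate the number of optimizer steps for a training run."""
    return num_examples_per_epoch * num_epochs // train_batch_size


# Values taken from the command above.
steps = total_train_steps(num_examples_per_epoch=5696, num_epochs=1, train_batch_size=4)
print(steps)  # 1424
```

The result matches the `model.ckpt-1424` files listed in the "Save trained model" section.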

### Save trained model

In [13]:
!ls /tmp/model_dir/efficientdet-d0-scratch/

archive
best_objective.txt
checkpoint
config.yaml
eval
events.out.tfevents.1602085529.e9931d098751
graph.pbtxt
model.ckpt-1100.data-00000-of-00001
model.ckpt-1100.index
model.ckpt-1100.meta
model.ckpt-1200.data-00000-of-00001
model.ckpt-1200.index
model.ckpt-1200.meta
model.ckpt-1300.data-00000-of-00001
model.ckpt-1300.index
model.ckpt-1300.meta
model.ckpt-1400.data-00000-of-00001
model.ckpt-1400.index
model.ckpt-1400.meta
model.ckpt-1424.data-00000-of-00001
model.ckpt-1424.index
model.ckpt-1424.meta
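
To find the most recent checkpoint programmatically, the step number can be parsed from the file names. This is a minimal sketch that relies only on the `model.ckpt-<step>.index` naming visible in the listing above; `latest_checkpoint_step` is a hypothetical helper.

```python
import re


def latest_checkpoint_step(filenames):
    """Return the highest step found in model.ckpt-<step>.index names, or None."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# A subset of the file names from the listing above.
listing = [
    "checkpoint",
    "config.yaml",
    "model.ckpt-1100.index",
    "model.ckpt-1400.index",
    "model.ckpt-1424.index",
    "model.ckpt-1424.meta",
]
latest = latest_checkpoint_step(listing)
print(latest)  # 1424
```

With TensorFlow available, `tf.train.latest_checkpoint(model_dir)` returns the latest checkpoint prefix recorded in the `checkpoint` file, which is usually the simpler option.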


In [14]:
savedmodeldir = '/content/savedmodel'
!rm -rf {savedmodeldir}
!python model_inspect.py --runmode=saved_model --model_name=efficientdet-d0 \
  --hparams="num_classes=6,moving_average_decay=0"   \
  --ckpt_path=/tmp/model_dir/efficientdet-d0-scratch --saved_model_dir={savedmodeldir}

2020-10-07 17:13:40.744184: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-07 17:13:42.932123: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2200000000 Hz
2020-10-07 17:13:42.932364: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1eb8bc0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-07 17:13:42.932412: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-10-07 17:13:42.935723: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-07 17:13:43.010349: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-07 17:13:43.011245: I tensorflow/compiler/xla/servic

## Inference

In [15]:
# Prepare image
image_url = 'https://github.com/ibm-aur-nlp/PubLayNet/raw/master/examples/PMC3576793_00004.jpg'
# Note: the downloaded file is a JPEG even though it is saved with a .png name.
img_path = '/content/img.png'
!wget {image_url} -O {img_path}

--2020-10-07 17:14:11--  https://github.com/ibm-aur-nlp/PubLayNet/raw/master/examples/PMC3576793_00004.jpg
Resolving github.com (github.com)... 52.192.72.89
Connecting to github.com (github.com)|52.192.72.89|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/ibm-aur-nlp/PubLayNet/master/examples/PMC3576793_00004.jpg [following]
--2020-10-07 17:14:11--  https://raw.githubusercontent.com/ibm-aur-nlp/PubLayNet/master/examples/PMC3576793_00004.jpg
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 327048 (319K) [image/jpeg]
Saving to: ‘/content/img.png’


2020-10-07 17:14:12 (5.05 MB/s) - ‘/content/img.png’ saved [327048/327048]



In [16]:
# Prepare visualization settings
min_score_thresh = 0.35
max_boxes_to_draw = 200 

In [17]:
# Run saved_model_infer to do inference with the exported model.
# Note: batch_size and image_size must match the values used at export time.
serve_image_out = '/content/serve_image_out'
!mkdir {serve_image_out}

!python model_inspect.py --runmode=saved_model_infer \
  --saved_model_dir={savedmodeldir} \
  --model_name={MODEL}  --input_image={img_path}  \
  --output_image_dir={serve_image_out} \
  --min_score_thresh={min_score_thresh}  --max_boxes_to_draw={max_boxes_to_draw}

2020-10-07 17:14:17.360807: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-10-07 17:14:19.510334: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2200000000 Hz
2020-10-07 17:14:19.510631: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x14a4d80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-07 17:14:19.510679: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-10-07 17:14:19.512883: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-10-07 17:14:19.583271: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-10-07 17:14:19.584240: I tensorflow/compiler/xla/servic