In [None]:
# Author  : Abhishek Dutta <adutta@robots.ox.ac.uk>
# Date    : 2022-07-14
#
# Version History
#     2022-07-25 : Workshop at ADHO Digital Humanities - 2022 (Tokyo) https://dh2022.adho.org/workshops-and-tutorials/wt-07
#     2022-08-16 : JPNP: Adjustments to run in SageMaker Studio Labs for DH + BH conference 2022 https://dcsco-op.org/dhbh/

# Early Printed Book Illustration Detection Using Object Detectors

In this tutorial, we describe the process to create a book illustration detector that can automatically detect illustrations in images containing early printed book pages. Such an illustration detector has enabled the [visual analysis of chapbooks printed in Scotland](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/). The book illustration detector presented in this tutorial is trained using the [chapbooks dataset](https://data.nls.uk/data/digitised-collections/chapbooks-printed-in-scotland/) published in the public domain by the National Library of Scotland (NLS).

This tutorial is organised as follows. First, we download and install all the required tools in this interactive Python notebook. Next, we demonstrate an existing (i.e. pre-trained) illustration detector taken from the [VGG Chapbooks](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/) project. We then describe the process of creating such an automatic illustration detector. Finally, we describe an advanced, but optional, learning exercise which demonstrates the impact of the number of training samples on the performance of automatic illustration detectors.


## 1. Download and Install the Required Tools
The illustration detector developed in the [VGG Chapbooks Project](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/) is based on the [EfficientDet](https://github.com/google/automl/tree/master/efficientdet) object detector. The VGG Chapbooks Project code repository contains all the data, the pre-trained object detector and the tools required in this tutorial. Therefore, we download the [code repository](https://gitlab.com/vgg/nls-chapbooks-illustrations/) and set up the environment in this Jupyter (SageMaker Studio Lab) document. This setup is essential for all the remaining sections of this tutorial and must therefore be executed before running commands from any other section.

Install dependencies into the SageMaker environment (this can take up to 5 minutes).

In [None]:
%conda install tensorflow opencv

Install project code.

In [None]:

## Download VGG Chapbooks project code repository and setup environment
import os
import sys
import tensorflow.compat.v1 as tf
import cv2
import datetime
import json

BASEDIR = %pwd

if 'nls-chapbooks-illustrations' not in os.getcwd():
  !git clone --recurse-submodules https://gitlab.com/vgg/nls-chapbooks-illustrations.git
  os.chdir('nls-chapbooks-illustrations/automl/efficientdet')
  !git pull origin master  # update EfficientDet code to the latest version
  %pip install -r requirements.txt
  %pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

## Errors building wheel for cocoapi

## Create folders used to store data (images, annotations, etc.) for this tutorial
DATA_DIR = BASEDIR + '/sample_data/chapbooks/'
DEMO_DIR = os.path.join(DATA_DIR, 'demo')
DET_DIR = os.path.join(DATA_DIR, 'demo', 'detection-results')
TRAIN_DIR = os.path.join(DATA_DIR, 'train')

if not os.path.exists(DET_DIR):
  os.makedirs(DET_DIR)
if not os.path.exists(TRAIN_DIR):
  os.makedirs(TRAIN_DIR)

Define a utility function to show images:

In [None]:
from matplotlib import pyplot as plt

## We define a utility function that will be used throughout this tutorial
def show_thumbnail(img_fn, tsize=500):
  '''
  Show a thumbnail sized version of an image in this notebook.
  The longer side of the image is scaled down to at most tsize pixels.
  '''
  img = cv2.imread(img_fn)
  if img is None:
    raise FileNotFoundError('Unable to read image: ' + img_fn)

  # OpenCV stores images as (height, width, channels)
  h, w = img.shape[:2]
  if w > tsize or h > tsize:
    # Scale both sides by the same factor to preserve the aspect ratio
    scale = tsize / max(w, h)
    new_width = max(1, int(w * scale))
    new_height = max(1, int(h * scale))
    # cv2.resize() expects the target size as (width, height)
    img = cv2.resize( img, (new_width, new_height) )

  # OpenCV loads images as BGR while pyplot expects RGB
  img2 = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

  # Use pyplot instead of cv2_imshow
  #  https://gist.github.com/mstfldmr/45d6e47bb661800b982c39d30215bc88
  #  https://stackoverflow.com/questions/36367986/
  plt.figure(figsize=(15,15))
  plt.imshow(img2)
  plt.xticks([])
  plt.yticks([])
  plt.show()


## 2. Demo of an Automatic Book Illustration Detector

In this section, we demonstrate the automatic illustration detection capabilities developed in the [VGG Chapbooks](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/) project. First, we download a [test image](https://gitlab.com/vgg/nls-chapbooks-illustrations/-/blob/master/data/images/test_images/BL_compultensian-polyglot-bible-g_11955_title_page.jpg). To use a different test image, replace the URL assigned to `image_url` in the cell below (or enter it in the text input box on the right-hand side, if your notebook environment shows one).

In [None]:
## Download test image
image_url =  'https://gitlab.com/vgg/nls-chapbooks-illustrations/-/raw/master/data/images/test_images/BL_tyndales-new-testament-1526-c_188_a_17_f001r.jpg'#@param
test_image_filename = 'test_image.jpg'
test_image_path = os.path.join(DEMO_DIR, test_image_filename)
!wget {image_url} -O {test_image_path}

show_thumbnail(test_image_path)

Next, we apply the pretrained Illustration Detector to this test image.

In [None]:
## Apply illustration detector to test image
os.chdir(BASEDIR + '/nls-chapbooks-illustrations/tools')
!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ python detect-illustration.py \
  --model-name=efficientdet-d0 \
  --saved-model-dir={BASEDIR}/nls-chapbooks-illustrations/data/efficientdet/saved_model/v1/  \
  --hparams={BASEDIR}/nls-chapbooks-illustrations/data/efficientdet/hparams.yaml \
  --input-image={test_image_path} \
  --output-image-dir={DET_DIR} \
  --output-json-fn={DET_DIR}/metadata.json

Finally, we visualise the detection results and show the confidence (a value of 1.0 implies 100% confidence) of these detections.

In [None]:
## Show detection results
!ls -l {DET_DIR}
show_thumbnail( os.path.join(DET_DIR, 'test_image.jpg'), tsize=800 )
with open( os.path.join(DET_DIR, 'metadata.json'), 'r' ) as f:
  d = json.load(f)
  print( json.dumps(d, indent=4) )

## 3. Creating an Automatic Book Illustration Detector
In this section, we describe the process of creating an automatic book illustration detector. This involves two steps: creating manually annotated examples of the object (i.e. book illustrations), and a training process in which an object detector learns to identify these objects in an image using the manually annotated samples. The training process is fully automatic, so the only laborious part is the manual annotation of object instances. To reduce the workload, we have provided samples of manual annotations and learners only need to manually annotate 5 images. The process is described below.

### 3.1 Create Manually Annotated Dataset
To train an object detector, we need examples of how the object appears in an image. Since we are creating a book illustration detector, we collect some images of book pages containing an illustration and manually annotate (i.e. draw a rectangular box) the location of these illustrations.

For this tutorial, we have prepared a set of 25 images that each contain an illustration and are taken from the NLS Chapbooks Dataset. Here are the steps to view and create the required manual annotations.

1. Download the [nls-chapbooks-25.zip (9MB)](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-25.zip) file which contains the following.

  * 25 images from Chapbooks for training in `img/` folder
  * Manual annotations of **only 20 training images** in `train.json` file (the remaining 5 manual annotations should be done by the learner)
  * 50 images from Chapbooks for testing in `img/` folder
  * Manual annotations of all 50 test images in `test.json` file
  * List Annotator (LISA) application `lisa.html` to create new manual annotations and view existing annotations

2. Open the `lisa.html` file in a web browser such as Firefox or Chrome. (Safari and Internet Explorer are not recommended.)

3. Click "Browse" (or Choose File) in the "Load Existing Project" section and select the training annotations contained in the `train.json` LISA project.

4. Draw a rectangular bounding box around the illustration in each of the 5 chapbook images that are missing manual annotations. To draw a bounding box around an illustration, press your mouse or trackpad button and drag your pointer over the illustration.

5. After all the manual annotations are created, press `Ctrl` + `S` (i.e. hold the Control key and press the `S` key) and save the annotations as `train25.json` in the same folder as the 'nls-chapbooks-25' folder that you previously downloaded.


The manual annotations are complete. Let us now prepare the training and testing image dataset in this environment. We first download a copy of [nls-chapbooks-25.zip](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-25.zip) file.

In [None]:
os.chdir(TRAIN_DIR)
if not os.path.exists( os.path.join(TRAIN_DIR, 'nls-chapbooks-25.zip') ):
  !wget https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-25.zip
  !unzip nls-chapbooks-25.zip
!ls nls-chapbooks-25

The `train.json` file extracted from the [nls-chapbooks-25.zip](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-25.zip) file contains manual annotations for only 20 images in the training dataset. You can now upload the `train25.json` file that you saved earlier to this environment. Click on the File Browser (folder icon) on the left hand side panel of this notebook. In the folder tree view, click "sample_data -> chapbooks -> train -> nls-chapbooks-25". Next, click on the Upload Files button above this pane and select the `train25.json` file that you saved earlier on your local computer. To check that the upload was successful, run the following command and ensure that one of the listed entries corresponds to the `train25.json` file.

In [None]:
## Ensure that the user uploaded train25.json file has been placed correctly
!ls -l {TRAIN_DIR}/nls-chapbooks-25
if not os.path.exists( os.path.join(TRAIN_DIR, 'nls-chapbooks-25', 'train25.json') ):
  raise ValueError('Error: the train25.json file has not been uploaded.\nClick the File Browser (folder icon) in the left hand side panel and upload the file.')

### 3.2 Convert Annotations

The manual annotations of bounding boxes corresponding to illustrations are contained in the `train25.json` file and the corresponding images are stored in the `img` folder. Manual annotations for the test set are already contained in the `test.json` file. We can now export the annotations to the [COCO](https://cocodataset.org/#format-data) format, which is the most commonly used format for training object detectors, including our EfficientDet model.
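
For orientation, a COCO detection file is a JSON object with three main lists: `images`, `annotations` and `categories`, where each bounding box is stored as `[x, y, width, height]` in pixels. The following is a minimal sketch of that layout. The file name, image size and box values are made up for illustration and are not taken from the chapbooks dataset, and the exact set of fields written by `lisa_to_coco.py` may differ.

```python
import json

# Hypothetical values, for illustration only
coco = {
    "images": [
        {"id": 1, "file_name": "page-0001.jpg", "width": 800, "height": 1200}
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,        # refers to the "id" of an entry in "images"
            "category_id": 1,     # refers to the "id" of an entry in "categories"
            "bbox": [120, 300, 400, 350],  # [x, y, width, height] in pixels
            "area": 400 * 350,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "illustration"}],
}

def boxes_per_image(coco_dict):
    """Count the annotated boxes for each image id."""
    counts = {img["id"]: 0 for img in coco_dict["images"]}
    for ann in coco_dict["annotations"]:
        counts[ann["image_id"]] += 1
    return counts

print(json.dumps(boxes_per_image(coco)))
```

A helper like `boxes_per_image` can be a quick way to spot images that were accidentally left without an annotation before starting the training.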

In [None]:
## Convert manual annotations to COCO format
os.chdir(BASEDIR + '/nls-chapbooks-illustrations/tools')
!python lisa_to_coco.py --lisa_project_fn={TRAIN_DIR}/nls-chapbooks-25/train25.json
!python lisa_to_coco.py --lisa_project_fn={TRAIN_DIR}/nls-chapbooks-25/test.json

## Expected output
# Exporting annotations in 25 images to COCO format
# ...
# Written COCO dataset to /content/sample_data/chapbooks/train/nls-chapbooks-25/train25_train_coco.json
# Exporting annotations in 50 images to COCO format
# ...
# Written COCO dataset to /content/sample_

The program code for training EfficientDet object detector model uses the [tfrecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) data storage format to represent images and their manual annotations in a compact form. Therefore, we convert our annotations in [COCO](https://cocodataset.org/#format-data) format to the tfrecord format using the [create_coco_tfrecord.py](https://github.com/google/automl/blob/master/efficientdet/dataset/create_coco_tfrecord.py) script so that it can be used for EfficientDet training. 

In [None]:
## Convert to tfrecord format
TFRECORD_DIR = os.path.join(TRAIN_DIR, 'tfrecord', 'nls-chapbooks-25')
if not os.path.exists(TFRECORD_DIR):
  os.makedirs(TFRECORD_DIR)
os.chdir(BASEDIR + '/nls-chapbooks-illustrations/automl/efficientdet/')
!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ python \
  dataset/create_coco_tfrecord.py \
  --logtostderr \
  --image_dir={TRAIN_DIR}/nls-chapbooks-25/img/ \
  --object_annotations_file={TRAIN_DIR}/nls-chapbooks-25/train25_train_coco.json \
  --output_file_prefix={TFRECORD_DIR}/train \
  --num_shards=1

!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ python \
  dataset/create_coco_tfrecord.py \
  --logtostderr \
  --image_dir={TRAIN_DIR}/nls-chapbooks-25/img/ \
  --object_annotations_file={TRAIN_DIR}/nls-chapbooks-25/test_train_coco.json \
  --output_file_prefix={TFRECORD_DIR}/test \
  --num_shards=1

The next command is used to confirm that we have the following two files in the `/sample_data/chapbooks/train/tfrecord/nls-chapbooks-25/` folder.
```
test-00000-of-00001.tfrecord
train-00000-of-00001.tfrecord
```

In [None]:
!ls -l {TRAIN_DIR}/tfrecord/nls-chapbooks-25/

### 3.3 Train Object Detector Using Manually Annotated Dataset

Now we can start the training of our EfficientDet object detector using the 25 manually annotated pages. We will use the test dataset -- containing 50 manually annotated instances -- to evaluate the performance of our final trained model.

In [None]:
## Start Training Process
MODEL_BASE_DIR = os.path.join(TRAIN_DIR, 'model')
if not os.path.exists(MODEL_BASE_DIR):
  os.makedirs(MODEL_BASE_DIR)
MODEL_DIR = os.path.join(MODEL_BASE_DIR, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
if not os.path.exists(MODEL_DIR):
  os.makedirs(MODEL_DIR)

os.chdir(BASEDIR + '/nls-chapbooks-illustrations/automl/efficientdet/')
if not os.path.exists('efficientdet-d0'):
  !wget  https://storage.googleapis.com/cloud-tpu-checkpoints/efficientdet/coco/efficientdet-d0.tar.gz
  !tar zxf efficientdet-d0.tar.gz

!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ python \
  main.py --mode=train \
  --train_file_pattern={TFRECORD_DIR}/train-*-of-00001.tfrecord \
  --val_file_pattern={TFRECORD_DIR}/test-*-of-00001.tfrecord \
  --model_name=efficientdet-d0 \
  --model_dir={MODEL_DIR}  \
  --ckpt=efficientdet-d0 \
  --train_batch_size=8 \
  --num_examples_per_epoch=25 --num_epochs=15  \
  --hparams="num_classes=1,moving_average_decay=0" \
  --eval_after_train=True --tf_random_seed=9973


The training process (15 epochs) takes around 7 minutes. After the training is complete, the trained book illustration detector is automatically evaluated on our test dataset (50 manually annotated instances that were not present in the training set). This provides a reasonable estimate of the performance of this model when it is applied to unseen book images.

We will use the following two metrics to assess the performance of the trained model: Average Precision (AP) and Average Recall (AR). A higher precision implies that more of the detections correspond to actual book illustrations and match their location closely (i.e. there are fewer false or poorly placed detections). A higher recall implies that most of the book illustrations were detected by the illustration detector (i.e. it did not miss them altogether).
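
To make these two ideas concrete, here is a simplified sketch that counts a detection as correct when its intersection-over-union (IoU) with a not-yet-matched ground truth box is at least 0.5, and then computes precision and recall for a single page. This is not the COCO evaluation used by the EfficientDet code, which additionally averages over several IoU thresholds and confidence levels. The function names and the box coordinates are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(detections, ground_truth, iou_threshold=0.5):
    """Greedy matching: each ground truth box can be matched at most once."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det in detections:
        # Pick the remaining ground truth box that overlaps this detection most
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= iou_threshold:
            true_positives += 1
            unmatched.remove(best)
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical page with two illustrations and three detections
ground_truth = [(10, 10, 110, 110), (200, 200, 300, 300)]
detections = [(12, 8, 108, 112), (400, 400, 450, 450), (205, 195, 295, 305)]
print(precision_recall(detections, ground_truth))
```

In this made-up example both illustrations are found, so recall is perfect, while one of the three detections does not overlap any illustration, which lowers precision.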

The performance of our retrained illustration detector is as follows. Note that the values of the performance metrics (AP and AR) may differ slightly between runs (e.g. AP=0.832, AR=0.831). What do you think may be the reason for these differences?

```
AP = 0.841
AR = 0.838 
```

This is a remarkably good level of performance obtained from just 25 training samples; the [researchers](https://arxiv.org/abs/1911.09070) who developed the EfficientDet model should be thanked for creating such a lightweight, high-performing model and for sharing it as an open source project that has enabled projects like the [VGG Chapbooks project](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/).

### 3.4 Visualise Results from Newly Trained Illustration Detector

We test the newly trained book illustration detector.


In [None]:
## Download test image
image_url =  'https://raw.githubusercontent.com/gbergel/chapbooks-sagemaker-lab/gbergel-patch-3/Wynken.jpg' #@param
test_image2_filename = 'test_image2.jpg'
test_image2_path = os.path.join(DEMO_DIR, test_image2_filename)
!wget {image_url} -O {test_image2_path}

show_thumbnail(test_image2_path)


Next, we convert the newly trained book illustration detector into a format (i.e. the [saved model format](https://www.tensorflow.org/guide/saved_model)) that allows the detector to run faster.

In [None]:
## Convert model to saved-model format (for faster inference)
os.chdir(BASEDIR + '/nls-chapbooks-illustrations/automl/efficientdet/')
!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ \
  python model_inspect.py \
  --runmode=saved_model \
  --model_name=efficientdet-d0 \
  --ckpt_path={MODEL_DIR} \
  --saved_model_dir={MODEL_DIR}/savedmodel \
  --hparams="num_classes=1,moving_average_decay=0"


Next, we apply the book illustration detector on the downloaded test image.

In [None]:
## Apply illustration detector to test image
os.chdir(BASEDIR + '/nls-chapbooks-illustrations/tools')
!PYTHONPATH={BASEDIR}/nls-chapbooks-illustrations/automl/efficientdet/ \
  python detect-illustration.py \
  --model-name=efficientdet-d0 \
  --saved-model-dir={MODEL_DIR}/savedmodel/  \
  --hparams={BASEDIR}/nls-chapbooks-illustrations/data/efficientdet/hparams.yaml \
  --input-image={test_image2_path} \
  --output-image-dir={DET_DIR} \
  --output-json-fn={DET_DIR}/metadata.json

Finally, we visualise the detection result.

In [None]:
## Show detection result
show_thumbnail( os.path.join(DET_DIR, 'test_image2.jpg'), 1200 )
with open( os.path.join(DET_DIR, 'metadata.json'), 'r' ) as f:
  d = json.load(f)
  print( json.dumps(d, indent=4) )

### 3.5 What can be a challenging test image for this illustration detector trained on only 25 images?

Deep learning models often struggle when the test data differ from the training data. Since we trained on 25 images taken from the NLS Chapbooks dataset, the [following image](https://gitlab.com/vgg/nls-chapbooks-illustrations/-/blob/master/data/images/test_images/BL_tyndales-new-testament-1526-c_188_a_17_f001r.jpg) taken from the [British Library](https://www.bl.uk/sacred-texts/articles/from-sacred-scriptures-to-the-peoples-bible) is a challenging test image for this model. Learners are encouraged to apply this illustration detector to this challenging image. The result may be disheartening, so here are some notes to help learners think through this new observation.

*   Each detection comes with a confidence level (0% to 100%). If we only want to retain high confidence detections, we can set a high threshold (e.g. 0.9) in order to discard incorrect detections such as the second detection (confidence = 0.81), which corresponds to only a part of the illustration.
*   Recall that we have trained this book illustration detector using only 25 instances of book illustration. More training samples may help improve the performance.
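
The thresholding described in the first note can be sketched as follows. The detection records here are hypothetical: the field names `score` and `bbox`, the box coordinates and the first score are assumptions made for illustration, so check the actual structure of `metadata.json` before adapting this.

```python
# Hypothetical detections; the field names are assumed and are not
# taken from the metadata.json file written by detect-illustration.py
detections = [
    {"bbox": [50, 80, 400, 520], "score": 0.97},
    {"bbox": [60, 300, 200, 480], "score": 0.81},
]

def keep_confident(dets, threshold=0.9):
    """Return only the detections whose confidence is at least `threshold`."""
    return [d for d in dets if d["score"] >= threshold]

confident = keep_confident(detections, threshold=0.9)
print(len(confident), "of", len(detections), "detections retained")
```

Raising the threshold removes more false detections but also increases the risk of discarding genuine illustrations, so the chosen value is a trade-off between precision and recall.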



## 4. Additional Learning Exercise (Optional)
Here is an extra challenge. How much performance improvement can we obtain by training on more samples? For example, what performance improvements will we obtain if we double the number of training samples to 50? What about training on 100 samples?

We have in fact created datasets containing 50 and 100 manually annotated images while retaining the same 50 test images. Since the test images remain the same, we can compare the performance improvement obtained by increasing the number of training samples. Note that none of the test images are contained in the training dataset. Here are the download links for these additional training datasets.

*   [nls-chapbooks-50.zip](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-50.zip) (12MB)
  
  - contains 50 manually annotated instances of book illustration
  - test set contains 50 instances, which are the same as the test instances contained in the nls-chapbooks-25.zip and nls-chapbooks-100.zip datasets.

*   [nls-chapbooks-100.zip](https://www.robots.ox.ac.uk/~vgg/research/chapbooks/dh2022/data/nls-chapbooks-100.zip) (18MB)

  - contains 100 manually annotated instances of book illustration
  - test set contains 50 instances, which are the same as the test instances contained in the nls-chapbooks-25.zip and nls-chapbooks-50.zip datasets.

Here are the steps you may want to follow in order to run experiments on these two datasets.

1.   Download the new dataset
2.   Convert to COCO format
3.   Convert to tfrecord format
4.   Start the training ensuring that settings are consistent across experiments
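
One way to keep the settings consistent, shown here only as a sketch, is to build the training arguments from a single function so that only the dataset size and the paths change between runs. The flags are the ones used in section 3.3. The function name, the folder naming under `tfrecord_root` and `model_root`, and the example paths are assumptions.

```python
import os

def training_args(num_train_images, tfrecord_root, model_root):
    """Build main.py arguments for one experiment; only size-dependent values vary."""
    name = 'nls-chapbooks-%d' % num_train_images  # assumed folder naming
    tfrecord_dir = os.path.join(tfrecord_root, name)
    return [
        '--mode=train',
        '--train_file_pattern=' + os.path.join(tfrecord_dir, 'train-*-of-00001.tfrecord'),
        '--val_file_pattern=' + os.path.join(tfrecord_dir, 'test-*-of-00001.tfrecord'),
        '--model_name=efficientdet-d0',
        '--model_dir=' + os.path.join(model_root, name),
        '--ckpt=efficientdet-d0',
        '--train_batch_size=8',
        '--num_examples_per_epoch=%d' % num_train_images,
        '--num_epochs=15',
        '--hparams=num_classes=1,moving_average_decay=0',
        '--eval_after_train=True',
        '--tf_random_seed=9973',
    ]

# Print the command for each experiment so that the settings can be compared
for n in (25, 50, 100):
    print('python main.py ' + ' '.join(training_args(n, 'tfrecord', 'model')))
```

Printing the three commands side by side makes it easy to confirm that the batch size, number of epochs, hyperparameters and random seed are identical across experiments.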


Here are some questions that you can think about before starting the experiment:

*   Will the performance improve?
*   By how much (e.g. 1% or 5% or 10%) will the performance improve when the number of training samples increases first from 25 to 50 and then to 100?

When you have the performance metrics, you can plot them using the following code by updating the second and third numbers in the `AP` and `AR` variables based on your experiments. The second and third numbers currently in the code are placeholders, so the plot only becomes meaningful after you replace them with your own results.


In [None]:
import matplotlib.pyplot as plt

AP = [0.841, 0.871, 0.891]  # update the 2nd and 3rd numbers based on your experiments
AR = [0.838, 0.858, 0.878]  # update the 2nd and 3rd numbers based on your experiments
TRAINING_IMAGE_COUNT = [25, 50, 100]

plt.plot(TRAINING_IMAGE_COUNT, AP, color='#0072B2', marker='o', label='Average Precision (AP)')
plt.plot(TRAINING_IMAGE_COUNT, AR, color='#D55E00', marker='o', label='Average Recall (AR)')
plt.xlabel('Number of training images')
plt.ylabel('Detection performance (AP and AR)')
plt.title('Book Illustration Detection Performance Dependence on Number of Training Images')
plt.legend()  # show the labels assigned to the AP and AR curves
#plt.ylim(0.7, 1.0)
plt.show()

## 5. Poll
What other types of automatic detectors can you think of that may be useful for digital humanists/book historians? Do share your thoughts with fellow learners during the workshop, or afterwards.



## 6. Frequently Asked Questions (FAQ)

* I got an error stating that something was not found or not defined (e.g. `NameError: name 'os' is not defined`)
> Most likely, you are executing a command cell without first executing the previous cells, which contain some dependencies (e.g. import statements, folder creation, etc.) that are used by the cell. Did you run the commands given in section "1. Download and Install the Required Tools"?


