# Generate GeM annotation

This Jupyter notebook generates XML annotation that describes the content and layout of multimodal documents, according to the schema defined by the Genre and Multimodality (GeM) model (Bateman 2008). The goal of the notebook is to facilitate the description of layout, which has previously been identified as a major bottleneck in annotating documents using the GeM model (Thomas 2009; Hiippala 2015).

Note that this notebook does not generate traditional human-annotated GeM markup, but a machine-readable variant referred to as *autogem*. Tools for visualizing *autogem* annotation will be provided as part of the <a href="https://github.com/thiippal/gem-tools">gem-tools</a> repository.

The notebook is intended to be friendly to novice users; most of the functions therefore reside in an external file named generator.py. More advanced users may examine this file to better understand how the notebook operates.

**References**

Bateman, J.A. (2008) *Multimodality and Genre: A Foundation for the Systematic Analysis of Multimodal Documents*. London: Palgrave.

Hiippala, T. (2015) *The Structure of Multimodal Documents: An Empirical Approach*. New York and London: Routledge.

Thomas, M. (2009) *Localizing pack messages: A framework for corpus-based cross-cultural
multimodal analysis*. PhD thesis, University of Leeds.

## 1. Import the necessary packages.

In [None]:
# Computer vision
import cv2

# File handling
import codecs

# GeM generator
from generator import classify, describe, detect_roi, false_positives, generate_photo, generate_text, load_model, preprocess, project, sort_contours

# Jupyter notebook
from IPython.display import Image

## 2. Set up the classifier.

#### Load the pre-trained data and labels.

In [None]:
model = load_model()

## 3. Process the document image.

#### Preprocess the document image.

For best results, use documents with a resolution of 300 DPI.

In [None]:
image, original, filename, filepath = preprocess("test_images/2005-hwy-side_b-5.jpg")

#### Detect regions of interest in the document image.

Define a kernel for morphological operations.

In [None]:
kernel = (11, 11)
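The kernel size controls how aggressively neighbouring pixels are merged during morphological operations. How `detect_roi` uses the kernel is defined in generator.py and is not shown here. As a rough illustration only, the pure-Python sketch below applies binary dilation with a rectangular kernel to a tiny grid, showing how a wider kernel joins two separate marks into a single region. The function `dilate` and the toy grid are hypothetical and stand in for the OpenCV operations used in practice.

```python
def dilate(grid, kernel):
    """Binary dilation of a 2D list of 0/1 values with a rectangular kernel.

    kernel is a (width, height) tuple of odd integers.
    Hypothetical helper for illustration; not part of generator.py.
    """
    kw, kh = kernel
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A pixel is set if any pixel under the kernel window is set.
            for dr in range(-(kh // 2), kh // 2 + 1):
                for dc in range(-(kw // 2), kw // 2 + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc]:
                        out[r][c] = 1
    return out


# Two marks separated by a gap of three empty pixels.
row = [[1, 0, 0, 0, 1]]

small = dilate(row, (1, 1))  # a 1x1 kernel leaves the grid unchanged
large = dilate(row, (5, 1))  # a 5x1 kernel closes the gap between the marks
```

A larger kernel therefore tends to group nearby words or lines into one region of interest, while a smaller kernel keeps them apart.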

In [None]:
contours = detect_roi(image, kernel)

#### Sort the detected contours.

In [None]:
sorted_contours = sort_contours(contours)
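The sorting criterion used by `sort_contours` lives in generator.py and is not shown here. A minimal sketch, assuming the contours are ordered top to bottom and then left to right by their bounding boxes, could look like the following. The function `sort_boxes` and the sample boxes are hypothetical.

```python
def sort_boxes(boxes):
    """Sort (x, y, w, h) bounding boxes top to bottom, then left to right."""
    # The sort key is (y, x): vertical position first, horizontal second.
    return sorted(boxes, key=lambda box: (box[1], box[0]))


boxes = [(300, 50, 100, 40), (20, 400, 200, 80), (20, 50, 100, 40)]
ordered = sort_boxes(boxes)
```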

#### Classify the detected contours.

In [None]:
classified_contours, contour_types = classify(sorted_contours, image, model)

#### Draw the detected contours for examination.

In [None]:
Image(filename="output/image_contours.png")

#### Mark false positives and erroneous or missing elements.

Check the image above for any false positives. Enter their numbers, separated by a space, below (e.g. 11 24 32).

In [None]:
fp_list = false_positives(raw_input())
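The `false_positives` function is defined in generator.py. To illustrate the expected input format, a minimal sketch of such a parser, assuming it simply converts a space-separated string into a list of integers, might be written as follows. The name `parse_false_positives` is hypothetical.

```python
def parse_false_positives(user_input):
    """Convert a string such as '11 24 32' into a list of integers."""
    # str.split() without arguments ignores repeated and surrounding whitespace.
    return [int(token) for token in user_input.split()]


fp_example = parse_false_positives("11 24 32")
empty_example = parse_false_positives("")
```

Returning integers matters here, because the annotation loop later in the notebook tests membership using the integer index produced by `enumerate`.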

Do you wish to mark additional elements in the document image (y/n)?

In [None]:
# Write a function that takes a raw input: 'y' opens a new window for marking the areas, 'n' continues.

#### Project the contours on the original high resolution document image.

In [None]:
hires_contours = project(image, original, classified_contours)
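How `project` maps contours between the two images is defined in generator.py. The underlying idea can be sketched, assuming that the working image is a uniformly downscaled copy of the original, by multiplying each coordinate by the ratio of the image sizes. The function `scale_box` and the example dimensions are hypothetical.

```python
def scale_box(box, small_size, large_size):
    """Scale an (x, y, w, h) box from a small image to a larger one.

    small_size and large_size are (width, height) tuples in pixels.
    """
    x, y, w, h = box
    # Separate ratios for each axis; they are equal for a uniform resize.
    ratio_x = float(large_size[0]) / small_size[0]
    ratio_y = float(large_size[1]) / small_size[1]
    return (int(round(x * ratio_x)), int(round(y * ratio_y)),
            int(round(w * ratio_x)), int(round(h * ratio_y)))


scaled = scale_box((10, 20, 100, 50), (600, 800), (2400, 3200))
```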

## 4. Begin the annotation.

Open the XML file.

In [None]:
layout_file_name = 'output/' + str(filename) + '-layout-2.xml'
xmlfile = codecs.open(layout_file_name, 'w', 'utf-8')

xml_opening = '<?xml version="1.0" encoding="UTF-8"?>\n\n <gemLayout>\n'

Write the XML declaration and the opening tag to the file.

In [None]:
xmlfile.write(xml_opening)

In [None]:
# Height and width of the original image
oh = original.shape[0]
ow = original.shape[1]

# Lists for the three annotation layers
segmentation = []
area_model = []
realization = []

for num, hc in enumerate(hires_contours):
    # Skip contours marked as false positives
    if num in fp_list:
        continue

    # Extract the bounding box of the contour in the original image
    (x, y, w, h) = cv2.boundingRect(hc)
    bounding_box = original[y:y+h, x:x+w]

    if contour_types[num] == 'text':
        # Generate XML entries
        lu, sa, re = generate_text(original, x, w, y, h, num)
        # Append descriptions to list
        segmentation.append(lu)
        area_model.append(sa)
        realization.append(re)
    elif contour_types[num] == 'photo':
        # Generate XML entries
        vlu, vsa, vre = generate_photo(original, x, w, y, h, num)
        # Append descriptions to list
        segmentation.append(vlu)
        area_model.append(vsa)
        realization.append(vre)

## 5. Generate the GeM XML file

Generate annotation for layout layer segmentation.

In [None]:
segmentation_opening = '\t<segmentation>\n'

xmlfile.write(segmentation_opening)

for s in segmentation:
    xmlfile.write("".join(s))

segmentation_closing = '\t</segmentation>\n'

xmlfile.write(segmentation_closing)

Generate annotation for area model.

In [None]:
areamodel_opening = '\t<area-model>\n'

xmlfile.write(areamodel_opening)

for a in area_model:
    xmlfile.write("".join(a))

areamodel_closing = '\t</area-model>\n'

xmlfile.write(areamodel_closing)

Generate annotation for realization information.

In [None]:
realization_opening = '\t<realization>\n'

xmlfile.write(realization_opening)

for r in realization:
    xmlfile.write("".join(r))

realization_closing = '\t</realization>\n'

xmlfile.write(realization_closing)

Write the closing tag.

In [None]:
xmlfile_closing = '</gemLayout>'

xmlfile.write(xmlfile_closing)

In [None]:
xmlfile.close()
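Once the file is closed, it can be useful to confirm that the output is well-formed XML. The sketch below uses `xml.etree.ElementTree` from the standard library to parse a small sample string that mirrors the structure written above. The sample content and the helper `top_level_tags` are illustrative only; the real file could be checked with `ET.parse(layout_file_name)` instead.

```python
import xml.etree.ElementTree as ET


def top_level_tags(xml_bytes):
    """Parse XML and return the root tag and the tags of its children."""
    # ET.fromstring raises ET.ParseError if the input is not well-formed.
    root = ET.fromstring(xml_bytes)
    return root.tag, [child.tag for child in root]


# Illustrative sample with empty layers; not actual notebook output.
sample = (b'<?xml version="1.0" encoding="UTF-8"?>\n\n'
          b'<gemLayout>\n'
          b'\t<segmentation>\n\t</segmentation>\n'
          b'\t<area-model>\n\t</area-model>\n'
          b'\t<realization>\n\t</realization>\n'
          b'</gemLayout>')

root_tag, children = top_level_tags(sample)
```

A parse error at this stage usually points to an unclosed tag or an unescaped character in one of the generated entries.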