# Plain examples and tests for the Dataset and the Frame classes
If you are looking for a quick introduction to the `smartdoc15_ch1` package, we recommend starting with the tutorials instead.

## Import

In [1]:
from smartdoc15_ch1 import Dataset

## Dataset loading
We use a reduced version of the dataset for testing purposes here; it contains only a fraction of the data.

The dataset is made accessible by creating an instance of the `Dataset` class.

In [2]:
d = Dataset(data_home="/data/competitions/2015-ICDAR-smartdoc/challenge1/99-computable-version-2017-test",
            download_if_missing=False)

See `Dataset` documentation for a description of the arguments of the constructor.

## Dataset format

The resulting instance is a modified `list` which makes it easy to access and iterate over elements.

In [3]:
d[0]

{'_scale_factor': None,
 'bg_id': 0,
 'bg_name': 'background01',
 'bl_x': 844.579,
 'bl_y': 851.507,
 'br_x': 1337.49,
 'br_y': 750.1089999999999,
 'frame_index': 181,
 'image_path': 'background01/datasheet004/frame_0181.jpeg',
 'image_path_absolute': '/data/competitions/2015-ICDAR-smartdoc/challenge1/99-computable-version-2017-test/smartdoc15-ch1_home/frames/background01/datasheet004/frame_0181.jpeg',
 'model_height': 2970.0,
 'model_id': 3,
 'model_name': 'datasheet004',
 'model_subid': 3,
 'model_width': 2100.0,
 'modeltype_id': 0,
 'modeltype_name': 'datasheet',
 'tl_x': 713.449,
 'tl_y': 234.484,
 'tr_x': 1141.9,
 'tr_y': 180.408}

Index-based access is available, as well as iteration.

In [4]:
len(d)

50

*(As noted above, we use a reduced version of the dataset here.)*

In [5]:
for frame in d[:10]:
    print(frame["image_path"])

background01/datasheet004/frame_0181.jpeg
background02/tax003/frame_0128.jpeg
background03/patent002/frame_0163.jpeg
background04/letter001/frame_0116.jpeg
background02/magazine001/frame_0045.jpeg
background05/patent001/frame_0062.jpeg
background04/patent005/frame_0048.jpeg
background02/patent002/frame_0106.jpeg
background03/tax002/frame_0158.jpeg
background02/datasheet003/frame_0129.jpeg
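
Since iteration yields one frame at a time, ordinary Python idioms are enough to group or filter frames. The sketch below groups model names by background using only the relative paths printed above, so that it runs without the dataset; it assumes the `<background>/<model>/<file>` layout visible in that listing.

```python
from collections import defaultdict

# Relative image paths copied from the listing above.
image_paths = [
    "background01/datasheet004/frame_0181.jpeg",
    "background02/tax003/frame_0128.jpeg",
    "background03/patent002/frame_0163.jpeg",
    "background04/letter001/frame_0116.jpeg",
    "background02/magazine001/frame_0045.jpeg",
]

# Group model names by background, assuming the
# "<background>/<model>/<file>" layout seen in the listing.
models_by_background = defaultdict(list)
for path in image_paths:
    background, model, _filename = path.split("/")
    models_by_background[background].append(model)

for background in sorted(models_by_background):
    print(background, models_by_background[background])
```

With real frames, the `bg_name` and `model_name` keys give the same information without parsing the path.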


## Dataset content

Dataset elements are `Frame` objects. Each of them describes a frame of the dataset.
`Frame` objects are `dict` objects with a few more methods and properties.

In [6]:
d0 = d[0]
type(d0)

smartdoc15_ch1.smartdoc_loader_v2.Frame

In [7]:
d0

{'_scale_factor': None,
 'bg_id': 0,
 'bg_name': 'background01',
 'bl_x': 844.579,
 'bl_y': 851.507,
 'br_x': 1337.49,
 'br_y': 750.1089999999999,
 'frame_index': 181,
 'image_path': 'background01/datasheet004/frame_0181.jpeg',
 'image_path_absolute': '/data/competitions/2015-ICDAR-smartdoc/challenge1/99-computable-version-2017-test/smartdoc15-ch1_home/frames/background01/datasheet004/frame_0181.jpeg',
 'model_height': 2970.0,
 'model_id': 3,
 'model_name': 'datasheet004',
 'model_subid': 3,
 'model_width': 2100.0,
 'modeltype_id': 0,
 'modeltype_name': 'datasheet',
 'tl_x': 713.449,
 'tl_y': 234.484,
 'tr_x': 1141.9,
 'tr_y': 180.408}

Frames contain everything one may want to know about a particular frame, and every value can be accessed as in a regular `dict` object.

In [8]:
d0["bg_name"], d0['model_name'], d0['frame_index']

('background01', 'datasheet004', 181)

Here, `d0` represents frame `181` of the capture of `datasheet004` in `background01`.

Please note that `frame_index` values start at `1`, as codecs usually number frames this way. This is the only value indexed from `1`; all other values (labels in particular) are indexed starting at `0`.
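
The 1-based convention matters whenever `frame_index` is used to address a Python sequence. The following sketch shows the conversion and rebuilds a file name following the `frame_NNNN.jpeg` pattern seen in `image_path`; both helper functions are hypothetical and not part of the package.

```python
def zero_based_position(frame_index):
    """Convert a 1-based frame_index into a 0-based sequence position."""
    if frame_index < 1:
        raise ValueError("frame_index starts at 1")
    return frame_index - 1


def frame_filename(frame_index):
    """Build a file name following the 'frame_NNNN.jpeg' pattern
    observed in the image_path values (an assumption, not a documented API)."""
    return "frame_{:04d}.jpeg".format(frame_index)


print(zero_based_position(181))
print(frame_filename(181))
```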

## Reading images
The main purpose of this structure is to make it easy to iterate over the dataset images.

One can read the image associated with a frame using the `Frame.read_image()` method.

In [9]:
img0 = d0.read_image()
img0.shape

(1080, 1920)

By default, images are loaded in grayscale format, unscaled.

You can ask for the color image if needed.

In [10]:
img0_color = d0.read_image(color=True)
img0_color.shape

(1080, 1920, 3)

And you can also force the image to be scaled. This reduces the image in both dimensions by the same factor.

In [11]:
img0_reduced = d0.read_image(force_scale_factor=0.5)
img0_reduced.shape


(540, 960)

## Easy-to-use segmentation for each frame
You may need to access the segmentation target for a given frame without browsing the entire segmentation array by index. We offer two formats:
- a dict format with a textual key for each coordinate
- a list format with coordinates in the same order as the `Dataset.segmentation_targets` array: "tl_x", "tl_y", "bl_x", "bl_y", "br_x", "br_y", "tr_x", "tr_y"

In [12]:
d0.segmentation_dict

{'bl_x': 844.579,
 'bl_y': 851.507,
 'br_x': 1337.49,
 'br_y': 750.1089999999999,
 'tl_x': 713.449,
 'tl_y': 234.484,
 'tr_x': 1141.9,
 'tr_y': 180.408}

In [13]:
d0.segmentation_list

[713.449,
 234.484,
 844.579,
 851.507,
 1337.49,
 750.1089999999999,
 1141.9,
 180.408]
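
The two formats carry the same information, so converting between them only requires the coordinate order documented above. The sketch below uses plain Python structures filled with the values of `d0`; the helper names are hypothetical and not part of the package.

```python
# Coordinate order documented for Dataset.segmentation_targets.
COORD_ORDER = ["tl_x", "tl_y", "bl_x", "bl_y", "br_x", "br_y", "tr_x", "tr_y"]

# Values copied from the segmentation_dict output of d0.
seg_dict = {
    "bl_x": 844.579, "bl_y": 851.507,
    "br_x": 1337.49, "br_y": 750.1089999999999,
    "tl_x": 713.449, "tl_y": 234.484,
    "tr_x": 1141.9, "tr_y": 180.408,
}


def dict_to_list(segmentation):
    """Flatten a coordinate dict into the documented list order."""
    return [segmentation[key] for key in COORD_ORDER]


def list_to_dict(coordinates):
    """Rebuild the coordinate dict from the flat list."""
    return dict(zip(COORD_ORDER, coordinates))


def list_to_corners(coordinates):
    """Pair the flat list into (x, y) corners: tl, bl, br, tr."""
    return list(zip(coordinates[0::2], coordinates[1::2]))


seg_list = dict_to_list(seg_dict)
print(seg_list)
print(list_to_corners(seg_list))
```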

## Scaled segmentation for frames
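You may need the scaled segmentation for a given frame. It is available when the scaling factor was defined at `Dataset` creation, not when scaling was forced with the `force_scale_factor` parameter of `Frame.read_image()`. We provide scaled equivalents of the previous segmentation accessors. In this example no scale factor was set (`_scale_factor` is `None`), so the values below are identical to the unscaled ones.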

In [14]:
d0.segmentation_dict_scaled

{'bl_x': 844.579,
 'bl_y': 851.507,
 'br_x': 1337.49,
 'br_y': 750.1089999999999,
 'tl_x': 713.449,
 'tl_y': 234.484,
 'tr_x': 1141.9,
 'tr_y': 180.408}

In [15]:
d0.segmentation_list_scaled

[713.449,
 234.484,
 844.579,
 851.507,
 1337.49,
 750.1089999999999,
 1141.9,
 180.408]
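
To make the relation between scaled and unscaled coordinates concrete, here is a minimal sketch. It assumes that scaling multiplies every coordinate by the scale factor and that a factor of `None` means no scaling; this matches the output above but is not confirmed to be the package's implementation, and `scale_segmentation` is a hypothetical helper.

```python
def scale_segmentation(coordinates, scale_factor):
    """Scale a flat coordinate list.

    Assumption: scaled coordinates are the originals multiplied by the
    factor, and None leaves them unchanged.
    """
    if scale_factor is None:
        return list(coordinates)
    return [value * scale_factor for value in coordinates]


# Values copied from the segmentation_list output of d0.
seg_list = [713.449, 234.484, 844.579, 851.507,
            1337.49, 750.1089999999999, 1141.9, 180.408]

print(scale_segmentation(seg_list, None))
print(scale_segmentation(seg_list, 0.5))
```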

## Target values
If you process frames in batch, you may want to access the target values for each possible task in a single line.
The following properties enable you to do so, listing the expected values (unscaled in the case of the segmentation) for the segmentation and the classification tasks.

The expected values are returned as NumPy arrays whose rows follow the same order as the `Dataset` object.

In [16]:
d.model_classif_targets[:10]

array([ 3, 27, 21,  5, 10, 20, 24, 21, 26,  2])

In [17]:
d.modeltype_classif_targets[:10]

array([0, 5, 4, 1, 2, 4, 4, 4, 5, 0])

In [18]:
d.segmentation_targets[:10]

array([[ 713.449,  234.484,  844.579,  851.507, 1337.49 ,  750.109,
        1141.9  ,  180.408],
       [ 824.825,  299.55 ,  967.431,  716.278, 1356.08 ,  618.052,
        1154.95 ,  231.329],
       [ 847.211,  307.687, 1104.4  ,  718.781, 1441.27 ,  594.959,
        1153.41 ,  234.478],
       [1281.59 ,  224.198,  651.798,  268.267,  643.051,  689.852,
        1353.73 ,  643.7  ],
       [ 817.823,  303.746,  765.635,  717.812, 1177.38 ,  735.762,
        1156.65 ,  319.191],
       [ 637.07 ,  179.005,  543.07 ,  802.596, 1048.44 ,  867.4  ,
        1055.05 ,  289.651],
       [ 750.351,  865.451, 1470.5  ,  754.412, 1343.21 ,  333.88 ,
         711.638,  423.068],
       [ 989.385,  244.61 , 1092.06 ,  710.809, 1485.05 ,  656.634,
        1330.85 ,  211.433],
       [ 817.564,  210.388,  986.025,  794.638, 1296.8  ,  713.184,
        1132.31 ,  214.224],
       [ 788.263,  216.669,  864.158,  815.337, 1364.16 ,  749.601,
        1214.21 ,  178.942]])

For the segmentation task, the actual physical shape of the document objects (width and height, as in `model_width` and `model_height`) is required to compare the returned results with the expected ones.

In [19]:
d.model_shapes[:10]

array([[2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.],
       [2100., 2970.]])
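
One possible way to use the model shape, shown here as an illustration rather than as the package's or the competition's actual evaluation code, is to map frame coordinates onto the document plane with a perspective transform (homography). The sketch below is pure Python and uses the corners of `d0`; the corner-to-rectangle correspondence (top-left at the origin, y growing downward) is an assumption, and all function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the row with the largest pivot to the current position.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def homography(src, dst):
    """Return the 8 coefficients (h33 fixed to 1) mapping 4 src points to 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def project(h, point):
    """Apply the homography coefficients to one (x, y) point."""
    x, y = point
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Corners of d0 in the frame, in tl, bl, br, tr order.
corners = [(713.449, 234.484), (844.579, 851.507),
           (1337.49, 750.109), (1141.9, 180.408)]
width, height = 2100.0, 2970.0
# Assumed target rectangle in the document plane, same corner order.
model_corners = [(0.0, 0.0), (0.0, height), (width, height), (width, 0.0)]

h = homography(corners, model_corners)
for corner in corners:
    px, py = project(h, corner)
    print(round(px, 3), round(py, 3))
```

A predicted quadrilateral could be projected with the same coefficients, so that predictions and ground truth are compared in document coordinates rather than in image coordinates.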

For the sake of completeness, background labels can be obtained easily as well.

In [20]:
d.background_labels[:10]

array([0, 1, 2, 3, 1, 4, 3, 1, 2, 1])

## Listing label values
Some properties help you list the possible values for `bg_id`, `model_id` and `modeltype_id`, along with the corresponding names.

In [21]:
d.unique_background_ids

array([0, 1, 2, 3, 4])

In [22]:
d.unique_background_names

array(['background01', 'background02', 'background03', 'background04',
       'background05'], dtype=object)

In [23]:
d.unique_model_ids

array([ 0,  1,  2,  3,  4,  5,  7,  8,  9, 10, 12, 13, 15, 16, 17, 18, 19,
       20, 21, 23, 24, 25, 26, 27, 28, 29])

In [24]:
d.unique_model_names

array(['datasheet001', 'datasheet002', 'datasheet003', 'datasheet004',
       'datasheet005', 'letter001', 'letter003', 'letter004', 'letter005',
       'magazine001', 'magazine003', 'magazine004', 'paper001',
       'paper002', 'paper003', 'paper004', 'paper005', 'patent001',
       'patent002', 'patent004', 'patent005', 'tax001', 'tax002',
       'tax003', 'tax004', 'tax005'], dtype=object)

In [25]:
d.unique_modeltype_ids

array([0, 1, 2, 3, 4, 5])

In [26]:
d.unique_modeltype_names

array(['datasheet', 'letter', 'magazine', 'paper', 'patent', 'tax'],
      dtype=object)
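
These arrays can be combined to translate numeric targets back into readable names. The sketch below assumes that the id and name arrays are aligned element by element, which is consistent with the values shown in this document; it uses plain lists copied from the outputs above so that it runs without the dataset.

```python
# Values copied from the outputs above.
unique_modeltype_ids = [0, 1, 2, 3, 4, 5]
unique_modeltype_names = ["datasheet", "letter", "magazine",
                          "paper", "patent", "tax"]
modeltype_targets = [0, 5, 4, 1, 2, 4, 4, 4, 5, 0]

# Assumption: ids and names are listed in matching order.
id_to_name = dict(zip(unique_modeltype_ids, unique_modeltype_names))

target_names = [id_to_name[target] for target in modeltype_targets]
print(target_names)
```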

Finally, the underlying pandas `DataFrame` is made available directly in case you need more flexibility.

In [27]:
d.raw_dataframe[:10]

| | bg_name | bg_id | model_name | model_id | modeltype_name | modeltype_id | model_subid | image_path | frame_index | model_width | model_height | tl_x | tl_y | bl_x | bl_y | br_x | br_y | tr_x | tr_y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | background01 | 0 | datasheet004 | 3 | datasheet | 0 | 3 | background01/datasheet004/frame_0181.jpeg | 181 | 2100.0 | 2970.0 | 713.449 | 234.484 | 844.579 | 851.507 | 1337.49 | 750.109 | 1141.9 | 180.408 |
| 1 | background02 | 1 | tax003 | 27 | tax | 5 | 2 | background02/tax003/frame_0128.jpeg | 128 | 2100.0 | 2970.0 | 824.825 | 299.55 | 967.431 | 716.278 | 1356.08 | 618.052 | 1154.95 | 231.329 |
| 2 | background03 | 2 | patent002 | 21 | patent | 4 | 1 | background03/patent002/frame_0163.jpeg | 163 | 2100.0 | 2970.0 | 847.211 | 307.687 | 1104.4 | 718.781 | 1441.27 | 594.959 | 1153.41 | 234.478 |
| 3 | background04 | 3 | letter001 | 5 | letter | 1 | 0 | background04/letter001/frame_0116.jpeg | 116 | 2100.0 | 2970.0 | 1281.59 | 224.198 | 651.798 | 268.267 | 643.051 | 689.852 | 1353.73 | 643.7 |
| 4 | background02 | 1 | magazine001 | 10 | magazine | 2 | 0 | background02/magazine001/frame_0045.jpeg | 45 | 2100.0 | 2970.0 | 817.823 | 303.746 | 765.635 | 717.812 | 1177.38 | 735.762 | 1156.65 | 319.191 |
| 5 | background05 | 4 | patent001 | 20 | patent | 4 | 0 | background05/patent001/frame_0062.jpeg | 62 | 2100.0 | 2970.0 | 637.07 | 179.005 | 543.07 | 802.596 | 1048.44 | 867.4 | 1055.05 | 289.651 |
| 6 | background04 | 3 | patent005 | 24 | patent | 4 | 4 | background04/patent005/frame_0048.jpeg | 48 | 2100.0 | 2970.0 | 750.351 | 865.451 | 1470.5 | 754.412 | 1343.21 | 333.88 | 711.638 | 423.068 |
| 7 | background02 | 1 | patent002 | 21 | patent | 4 | 1 | background02/patent002/frame_0106.jpeg | 106 | 2100.0 | 2970.0 | 989.385 | 244.61 | 1092.06 | 710.809 | 1485.05 | 656.634 | 1330.85 | 211.433 |
| 8 | background03 | 2 | tax002 | 26 | tax | 5 | 1 | background03/tax002/frame_0158.jpeg | 158 | 2100.0 | 2970.0 | 817.564 | 210.388 | 986.025 | 794.638 | 1296.8 | 713.184 | 1132.31 | 214.224 |
| 9 | background02 | 1 | datasheet003 | 2 | datasheet | 0 | 2 | background02/datasheet003/frame_0129.jpeg | 129 | 2100.0 | 2970.0 | 788.263 | 216.669 | 864.158 | 815.337 | 1364.16 | 749.601 | 1214.21 | 178.942 |
