In [1]:
import random
random.seed(123)

## Models / inferencers
Models and inferencers accept lists of images and return lists of results (either segmentation or recognition results).

I have made a dummy `SegmentationModel` and `RecognitionModel` in `models.py`. These do the same thing as the current inferencers.

```python
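from dataclasses import dataclass

import numpy as np


# Defined before the model so the return annotation below resolves
@dataclass
class SegmentationResult:
    boxes: np.ndarray
    masks: np.ndarray
    scores: np.ndarray
    labels: np.ndarray


class SegmentationModel:
    def __call__(self, images: list[np.ndarray]) -> list[SegmentationResult]:
        ...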
```


It would be nice to wrap all models in a "batching" function that divides an input list into chunks if it is too long. This is tracked as a card in DevOps.
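A minimal sketch of what such a batching wrapper could look like. The names `batched`, `batch_size` and `toy_model` are made up for illustration and are not part of `htrflow_core`; the only assumption is that a model is a callable taking a list and returning a list of the same length.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(model: Callable[[list[T]], list[R]], batch_size: int) -> Callable[[Sequence[T]], list[R]]:
    """Wrap `model` so that inputs are processed in chunks of at most `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    def wrapper(inputs: Sequence[T]) -> list[R]:
        results: list[R] = []
        for start in range(0, len(inputs), batch_size):
            chunk = list(inputs[start:start + batch_size])
            results.extend(model(chunk))  # results keep the input order
        return results

    return wrapper


# Hypothetical stand-in for a model: records the size of each chunk it receives
seen_chunk_sizes = []


def toy_model(images):
    seen_chunk_sizes.append(len(images))
    return [f"result for {image}" for image in images]


batched_model = batched(toy_model, batch_size=2)
outputs = batched_model(["a", "b", "c", "d", "e"])
```

With five inputs and `batch_size=2`, the wrapped model is called three times, and the caller still gets one flat list of five results.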


## Using the Volume class

To load images, create a `Volume`. The name of this class is not set in stone. It represents what Catrin called a "batch", a division of an archive volume, but I don't want to use "batch" because of potential confusion with a model's batch (the number of inputs it operates on simultaneously).


In [2]:
from htrflow_core.volume import Volume

images = ['../assets/demo_image.jpg'] * 5 

volume = Volume(images)

The `Volume` instance holds a tree. We see the root `node` and its five children, each representing one input image:

In [3]:
print(volume)

└──<htrflow_core.volume.Node object at 0x7f70f6f38250>
    ├──626x1629 image demo_image
    ├──626x1629 image demo_image
    ├──626x1629 image demo_image
    ├──626x1629 image demo_image
    └──626x1629 image demo_image


The images are available through `volume.images()`. We pass them through a segmentation model:

In [4]:
from htrflow_core.models.dummy_models import SegmentationModel

model = SegmentationModel()
results = model(volume.images())
print(results[0])

SegmentationResult(metadata={'model_name': 'SegmentationModel'}, image=array([[[118, 120, 128],
        [115, 117, 125],
        [114, 116, 124],
        ...,
        [215, 219, 220],
        [209, 213, 214],
        [206, 210, 211]],

       [[110, 112, 120],
        [110, 112, 120],
        [110, 112, 120],
        ...,
        [211, 215, 216],
        [207, 211, 212],
        [209, 213, 214]],

       [[109, 112, 120],
        [109, 112, 120],
        [104, 107, 115],
        ...,
        [207, 211, 212],
        [205, 209, 210],
        [209, 213, 214]],

       ...,

       [[146, 152, 151],
        [147, 153, 152],
        [147, 153, 152],
        ...,
        [212, 218, 213],
        [214, 222, 211],
        [211, 221, 204]],

       [[144, 150, 149],
        [146, 152, 151],
        [148, 154, 153],
        ...,
        [217, 223, 212],
        [220, 231, 205],
        [216, 234, 187]],

       [[147, 153, 152],
        [149, 155, 154],
        [151, 157, 156],
        ...,
   

The results are a list of `SegmentationResult` objects. To apply them to the input images, we pass them back to the volume through its `update` method, which returns the new regions as a list of images.

In [5]:
regions = volume.update(results)

The volume tree has now grown:

In [6]:
print(volume)

└──<htrflow_core.volume.Node object at 0x7f70f6f38250>
    ├──626x1629 image demo_image
    │   └──156x406 region at (345, 11)
    ├──626x1629 image demo_image
    │   ├──117x406 region at (17, 0)
    │   ├──156x406 region at (948, 262)
    │   └──156x309 region at (0, 85)
    ├──626x1629 image demo_image
    │   ├──156x406 region at (480, 173)
    │   ├──156x406 region at (690, 11)
    │   ├──149x406 region at (570, 0)
    │   ├──156x332 region at (1296, 381)
    │   └──156x292 region at (0, 16)
    ├──626x1629 image demo_image
    │   ├──99x213 region at (1415, 0)
    │   └──116x406 region at (678, 509)
    └──626x1629 image demo_image
        ├──156x278 region at (0, 234)
        ├──156x406 region at (786, 133)
        ├──156x406 region at (1105, 461)
        └──90x406 region at (442, 0)
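One way to picture how the tree grows: each result is paired with one leaf, and the segments in that result become the leaf's children. The sketch below is a toy model of that idea only. `ToyNode` and the free function `update` are hypothetical and do not mirror the actual `Volume` implementation; results are plain lists of labels rather than `SegmentationResult` objects.

```python
class ToyNode:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []

    def leaves(self):
        """Return the leaf nodes at or below this node, left to right."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def update(root, results):
    """Attach each result's segments as children of the corresponding leaf."""
    leaves = root.leaves()  # collected before the tree is modified
    if len(leaves) != len(results):
        raise ValueError("expected exactly one result per leaf")
    new_nodes = []
    for leaf, segments in zip(leaves, results):
        for segment in segments:
            child = ToyNode(segment, parent=leaf)
            leaf.children.append(child)
            new_nodes.append(child)
    return new_nodes


root = ToyNode("root")
root.children = [ToyNode("image 0", root), ToyNode("image 1", root)]

# First pass: the leaves are the two images
regions = update(root, [["region A"], ["region B", "region C"]])

# Second pass: the leaves are now the three regions, not the images
lines = update(root, [["line 1", "line 2"], [], ["line 3"]])
```

In this sketch a region that receives no segments ("region B") simply stays a leaf, so it would be offered to the next model again.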


The new regions can be passed through a segmentation model (such as a line model) again. The `update` method always updates the leaves of the tree.

In [7]:
results = model(volume.segments())
volume.update(results)
print(volume)

└──<htrflow_core.volume.Node object at 0x7f70f6f38250>
    ├──626x1629 image demo_image
    │   └──156x406 region at (345, 11)
    │       ├──37x100 region at (517, 129)
    │       ├──22x100 region at (636, 144)
    │       ├──38x100 region at (543, 125)
    │       ├──38x100 region at (486, 122)
    │       └──38x69 region at (681, 38)
    ├──626x1629 image demo_image
    │   ├──117x406 region at (17, 0)
    │   │   └──28x100 region at (216, 70)
    │   ├──156x406 region at (948, 262)
    │   │   ├──33x100 region at (1070, 384)
    │   │   ├──38x87 region at (948, 359)
    │   │   └──38x57 region at (1296, 329)
    │   └──156x309 region at (0, 85)
    │       ├──38x76 region at (7, 159)
    │       ├──38x76 region at (142, 124)
    │       ├──34x76 region at (218, 85)
    │       ├──38x76 region at (215, 125)
    │       └──38x76 region at (52, 105)
    ├──626x1629 image demo_image
    │   ├──156x406 region at (480, 173)
    │   │   ├──38x100 region at (623, 272)
    │   │   ├──38x10

When the segmentation is done, the segments can be passed to a text recognition model. The results are passed to the volume in the same manner as before:

In [8]:
from htrflow_core.models.dummy_models import RecognitionModel

recognition_model = RecognitionModel()
results = recognition_model(volume.segments())
volume.update(results)
print(volume)

└──<htrflow_core.volume.Node object at 0x7f70f6f38250>
    ├──626x1629 image demo_image
    │   └──156x406 region at (345, 11)
    │       ├──37x100 region at (517, 129) "Dolor velit non non tempora magnam ut adipisci."
    │       ├──22x100 region at (636, 144) "Dolor quiquia quisquam adipisci velit velit quiquia quiquia."
    │       ├──38x100 region at (543, 125) "Ipsum labore dolorem ut neque ipsum velit."
    │       ├──38x100 region at (486, 122) "Consectetur est numquam voluptatem quiquia ipsum."
    │       └──38x69 region at (681, 38) "Magnam etincidunt consectetur neque quaerat ut sit ipsum."
    ├──626x1629 image demo_image
    │   ├──117x406 region at (17, 0)
    │   │   └──28x100 region at (216, 70) "Modi sed non tempora."
    │   ├──156x406 region at (948, 262)
    │   │   ├──33x100 region at (1070, 384) "Numquam quiquia ut etincidunt sit quaerat adipisci."
    │   │   ├──38x87 region at (948, 359) "Est etincidunt dolore modi."
    │   │   └──38x57 region at (1296, 329) "

## Accessing nodes
Specific nodes are accessed by tuple indexing. Here we access the first line of the first region of the first image, and then the region itself (the notebook only displays the value of the last expression):

In [9]:
# Access image 0, region 0, subregion 0
volume[0, 0, 0]

# Access image 0, region 0
volume[0, 0]

<volume.RegionNode at 0x7f296eceb130>
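Tuple indexing like this can be supported with a short `__getitem__` that walks down the tree one index at a time. The following is a sketch under that assumption, using a hypothetical `ToyNode` class; it is not taken from `htrflow_core`.

```python
class ToyNode:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

    def __getitem__(self, index):
        # node[i] and node[i, j, ...] both work: a bare int is treated
        # as a path with a single step
        path = index if isinstance(index, tuple) else (index,)
        node = self
        for i in path:
            node = node.children[i]
        return node


tree = ToyNode("root", [
    ToyNode("image 0", [
        ToyNode("region 0", [ToyNode("line 0"), ToyNode("line 1")]),
        ToyNode("region 1"),
    ]),
])

first_line = tree[0, 0, 0]    # image 0, region 0, line 0
first_region = tree[0, 0]     # image 0, region 0
```

An out-of-range step raises the usual `IndexError` from the underlying list, which seems like reasonable behaviour for a missing node.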

The image associated with each node is accessed through the `image` attribute. The image isn't stored directly in the node. Instead, the node refers to its parent's image and crops it according to its own box:

```python

class BaseImageNode:

    @property
    def image(self):
        # The box is (x1, x2, y1, y2), relative to the parent's image
        x1, x2, y1, y2 = self.box
        return self.parent.image[y1:y2, x1:x2]

    ...
```
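To make the cropping concrete, here is a small self-contained version that can be run on its own. `PageNode` and `CropNode` are illustrative stand-ins, not the real node classes, and the box values are made up. It assumes, as above, that a box is `(x1, x2, y1, y2)` relative to the parent's image.

```python
import numpy as np


class PageNode:
    """Root of the chain: the only node that owns pixel data."""

    def __init__(self, image):
        self.image = image


class CropNode:
    """Stores only a box; the pixels are cropped from the parent on access."""

    def __init__(self, parent, box):
        self.parent = parent
        self.box = box  # (x1, x2, y1, y2), relative to the parent's image

    @property
    def image(self):
        x1, x2, y1, y2 = self.box
        return self.parent.image[y1:y2, x1:x2]


page = PageNode(np.zeros((626, 1629, 3), dtype=np.uint8))
region = CropNode(page, box=(345, 751, 11, 167))
line = CropNode(region, box=(10, 110, 5, 43))

region_shape = region.image.shape  # (height, width, channels)
line_shape = line.image.shape
shares_pixels = np.shares_memory(line.image, page.image)
```

Because NumPy basic slicing returns views, the nested crop does not copy any pixels. The flip side is that writing to `line.image` in place would also modify the page image.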


In [10]:
volume[0, 0, 0].image

array([[[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       ...,

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]],

       [[255, 255, 255],
        [255, 255, 255],
        [255, 255, 255],
        ...,
        [255, 255, 255],
        [255, 255, 255],
        [255, 255, 255]]

## Coordinates

All nodes have a `coordinate` attribute. This is the location of the node's top-left corner *relative to the original image*. The base image node's coordinate is thus (0,0):

In [None]:
print(volume[0].coordinate)

For first-level regions, `coordinate` is the same as the top-left corner of the segment bounding box.

In [None]:
print('Coordinate:', volume[0, 0].coordinate)
print('Bounding box:', volume[0, 0].data['segment'].box, '(x1, x2, y1, y2)')

But for nested regions the two differ, because `coordinate` is relative to the original image, while the segment bounding box is relative to the parent region.

In [None]:
print('Global coordinate:', volume[0, 0, 0].coordinate)
print('Local bounding box:', volume[0, 0, 0].data['segment'].box)
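The relationship between the two can be sketched as a sum of offsets up the tree: a node's global coordinate is its parent's global coordinate plus the top-left corner of its own local box. The `ToyNode` class below is hypothetical, the `(x, y)` ordering of `coordinate` is an assumption, and the box values are illustrative.

```python
class ToyNode:
    def __init__(self, box=None, parent=None):
        # (x1, x2, y1, y2) relative to the parent; None for the base image
        self.box = box
        self.parent = parent

    @property
    def coordinate(self):
        """Top-left corner relative to the original image, as (x, y)."""
        if self.parent is None:
            return (0, 0)
        parent_x, parent_y = self.parent.coordinate
        x1, _, y1, _ = self.box
        return (parent_x + x1, parent_y + y1)


image = ToyNode()
region = ToyNode(box=(345, 751, 11, 167), parent=image)
line = ToyNode(box=(172, 272, 118, 155), parent=region)

image_coordinate = image.coordinate
region_coordinate = region.coordinate  # equals the corner of its own box
line_coordinate = line.coordinate      # region offset + local offset
```

Here the line's local corner is (172, 118), but its global coordinate is (345 + 172, 11 + 118) = (517, 129).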