Geometric transforms proposal #7486

KumoLiu · 2024-02-22T09:28:39Z

Design

Design goals

Geometry has first-class support
- Users should be able to create models and pipelines that are purely geometry-based
- Users should be able to create models and pipelines that are combinations of pixel data and geometry data
It should be easy for users to make hybrid workflows
- In hybrid workflows, we should make it easy to update geometry based on transforms to pixel data
Minimal API changes
- We should minimise changes to the API

Characteristics of geometry and pixel data

geometry data
- points: positions in world space
  - may have some kind of vertex / edge descriptor with which to interpret the points
pixel data
- pixel resolution: a mapping from pixel-space to world space
- bounding box: the geometric bounds of the pixel data in world space
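
As a rough illustration of the two pixel-data characteristics above, the sketch below assumes the pixel-to-world mapping is stored as a homogeneous affine matrix and that pixel centres sit at integer indices, so the image edge lies half a pixel beyond the first and last centres. The helper name `world_bounds` is hypothetical and not part of MONAI.

```python
import itertools

import numpy as np


def world_bounds(affine, spatial_shape):
    """Return (min_corner, max_corner) of the pixel data in world space."""
    ndim = len(spatial_shape)
    # Corners of the pixel grid in index coordinates (edges, not centres).
    corners = np.array(
        list(itertools.product(*[(-0.5, n - 0.5) for n in spatial_shape]))
    )
    homog = np.concatenate([corners, np.ones((len(corners), 1))], axis=1)
    world = (homog @ affine.T)[:, :ndim]
    return world.min(axis=0), world.max(axis=0)


# Assumed example: 2 mm pixels, world origin of pixel (0, 0) at (10, 20).
affine = np.array([[2.0, 0.0, 10.0],
                   [0.0, 2.0, 20.0],
                   [0.0, 0.0, 1.0]])
lower, upper = world_bounds(affine, (4, 3))
```

Taking the min and max over all corners keeps the result valid when the affine contains a rotation or a flip, at the cost of returning an axis-aligned box.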

Pixel-space vs world-space

We define two spaces in which operations can be carried out
- world space
  - change the object in world space. this can mean rotation, size, location, shearing, etc.
  - applies to both pixel data and geometry data
- pixel space
  - a geometric description of a change to the way pixel data is sampled
  - has no effect on world space
  - applies only to pixel data
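
The distinction can be sketched with homogeneous matrices, assuming pixel data carries a pixel-to-world affine and geometry is an array of world-space points. A world-space operation is applied on the left of the affine and to the points; a pixel-space operation is applied on the right of the affine and leaves the points alone. All names here are illustrative, and half-pixel offsets are ignored for brevity.

```python
import numpy as np


def apply_matrix(matrix, points):
    """Apply a homogeneous matrix to an (N, D) array of points."""
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return (homog @ matrix.T)[:, :-1]


# Pixel-to-world mapping: 2 mm spacing, origin at (10, 20).
affine = np.array([[2.0, 0.0, 10.0],
                   [0.0, 2.0, 20.0],
                   [0.0, 0.0, 1.0]])
points_world = np.array([[12.0, 24.0]])  # sits on pixel index (1, 2)

# World-space change: translate everything by (5, 0).
world_shift = np.array([[1.0, 0.0, 5.0],
                        [0.0, 1.0, 0.0],
                        [0.0, 0.0, 1.0]])
affine_after_world = world_shift @ affine
points_after_world = apply_matrix(world_shift, points_world)

# Pixel-space change: resample to half the pixel spacing.
resample = np.diag([0.5, 0.5, 1.0])
affine_after_pixel = affine @ resample
points_after_pixel = points_world  # world positions are untouched
```

After the world-space change the point still falls on pixel index (1, 2); after the pixel-space change it falls on index (2, 4) of the finer grid while its world position is unchanged.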

Stages of a mixed pixel / geometry pipeline

1. Load data sources
   a. pixel data
   b. geometry data
2. Align pixel data with geometry data (depends on task)
3. Apply various transforms to aligned pixel and geometry data
   a. our transforms should always keep pixel and geometry data aligned, for any given sequence of spatial transforms applied to both

Spatial transform categories

Categories of spatial transform

agnostic: work the same way on pixel and geometry data
- flip, zoom, etc.
image-specific: transforms that make sense only for raster data
- resample, spacing, etc.
hybrid: transforms that must also take images into account
- rotate, etc.

A closer look at hybrid transforms

rotate must perform slightly different operations on pixel and geometry data

the rotation itself in world space is the same for pixel and geometry data
if keep_size is false, the extents of pixel data bounds will change
- this is a pixel space change
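
A minimal sketch of the pixel-space part of rotate, assuming a 2D image rotated about its centre: when keep_size is false, the output extent is the axis-aligned bounding box of the rotated corners. The function name `rotated_extent` is hypothetical; the world-space rotation applied to points is the same in either case.

```python
import numpy as np


def rotated_extent(spatial_shape, angle_rad, keep_size=True):
    """Extent (in pixels) of a 2D image after rotation about its centre."""
    if keep_size:
        # The sampling grid is unchanged; rotated content may be cropped.
        return np.asarray(spatial_shape, dtype=float)
    h, w = spatial_shape
    corners = np.array([[0, 0], [h, 0], [0, w], [h, w]], dtype=float)
    corners -= np.array([h, w]) / 2.0  # rotate about the image centre
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rotation = np.array([[c, -s], [s, c]])
    rotated = corners @ rotation.T
    # Bounding box of the rotated corners gives the new extent.
    return rotated.max(axis=0) - rotated.min(axis=0)


extent_90 = rotated_extent((100, 200), np.pi / 2, keep_size=False)
extent_45 = rotated_extent((100, 100), np.pi / 4, keep_size=False)
```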

Transform API

The transform API has the following layers

dictionary transform -> array transform -> functional transform

Dictionary transforms

Dictionary transforms specific to images can refer to geometry by name rather than requiring tensors to be passed in directly

class Spacingd(MapTransform, InvertibleTransform, LazyTransform):
    def __init__(
        self, keys, pixdim, diagonal, mode, padding_mode, align_corners, dtype, scale_extent,
        recompute_affine, min_pixdim, max_pixdim, ensure_same_shape, allow_missing_keys):

As such, there shouldn't need to be any changes to the API for dictionary transforms:

geometry tensors are referred to by name, as are pixel tensors
transforms that aren't image-specific can just process all tensors independently of each other
transforms that are image-specific can perform the operation on image tensors first
the world-space component of the transform can then be applied to the geometry tensors
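
The ordering described above might look roughly like the following, assuming each image entry is a (pixels, affine) pair and each geometry entry is an (N, 2) array of world-space points. `rotate_d` is a hypothetical stand-in, not the real `Rotated`; it updates only the affine rather than resampling pixels.

```python
import numpy as np


def apply_world_matrix(points, matrix):
    """Apply a homogeneous world-space matrix to (N, D) points."""
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return (homog @ matrix.T)[:, :-1]


def rotate_d(data, image_keys, geometry_keys, angle_rad):
    """Rotate named image and geometry entries about the world origin."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    world = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    out = dict(data)
    # Step 1: perform the operation on the image entries.
    for key in image_keys:
        pixels, affine = data[key]
        out[key] = (pixels, world @ affine)
    # Step 2: apply the world-space component to the geometry entries.
    for key in geometry_keys:
        out[key] = apply_world_matrix(data[key], world)
    return out


sample = {
    "image": (np.zeros((2, 2)), np.eye(3)),
    "point": np.array([[1.0, 0.0]]),
}
result = rotate_d(sample, ["image"], ["point"], np.pi / 2)
```

Because both entries are addressed by key, the caller's signature stays the same whether or not geometry is present.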

Array transforms

Array transforms specific to images need to be modified so that geometry data can be updated. This can be done via additional operation parameters that take a tensor or tuple of tensors:

class Spacing(InvertibleTransform, LazyTransform):
    def __call__(
        self, data_array, mode, padding_mode, align_corners, dtype, scale_extent, output_spatial_shape, lazy,
        inputs_to_update # New
    ):

Functional transforms

def spacing(
    data_array, mode, padding_mode, align_corners, dtype, scale_extent, output_spatial_shape, lazy,
    inputs_to_update # New
):

Functional transforms that are specific to image data first calculate the pixel-space and world-space transform components to be applied to the image data. They then call a function that applies the appropriate transform to geometry data.
Note: geometry data should need only one operation for applying transforms to it; ideally we should not need to write *_image and *_point functions for each operation
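
One way to read this, sketched under the assumption that every functional transform can express its effect as a pixel-space matrix and a world-space matrix: a single shared helper applies the world-space component to whatever is passed via inputs_to_update. The names `spacing_sketch` and `apply_to_geometry` are hypothetical, and the signature is far smaller than the real `spacing` function.

```python
import numpy as np


def apply_to_geometry(points, world_matrix):
    """The one operation shared by all transforms for updating geometry."""
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return (homog @ world_matrix.T)[:, :-1]


def spacing_sketch(affine, spatial_shape, pixdim, inputs_to_update=()):
    """Compute the new affine and shape for a change of pixel spacing."""
    # Current spacing per axis is the norm of each affine column.
    current = np.linalg.norm(affine[:-1, :-1], axis=0)
    scale = np.asarray(pixdim, dtype=float) / current
    pixel_component = np.diag(np.append(scale, 1.0))
    # Respacing changes sampling only, so the world component is identity.
    world_component = np.eye(len(affine))
    new_affine = world_component @ affine @ pixel_component
    new_shape = tuple(int(round(n / s)) for n, s in zip(spatial_shape, scale))
    updated = tuple(apply_to_geometry(p, world_component) for p in inputs_to_update)
    return new_affine, new_shape, updated


affine = np.diag([2.0, 2.0, 1.0])
points = np.array([[12.0, 24.0]])
new_affine, new_shape, (new_points,) = spacing_sketch(
    affine, (100, 50), (1.0, 1.0), inputs_to_update=(points,)
)
```

For spacing the geometry comes back unchanged, which is the expected result for a purely pixel-space transform; a hybrid transform such as rotate would pass a non-identity world component through the same helper.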

Implementation

1. Integration of 'kind' Property to MetaTensor:

We propose adding a 'kind' property to MetaTensor. The property lets transforms identify the type of data they receive and handle it appropriately. Its value can be retrieved using data.kind.

2. Data Input/Output Enhancements:

Introduce LoadPoint and LoadPointd with properties refer and refer_key. These properties indicate which reference the loaded points correspond to, and hence which coordinate system they are in, so that information such as the affine can be retrieved from the reference.
Usage Examples:
LoadPointd(keys="point", refer_key="image") and LoadPoint(data=point, refer=image)
Subject for Discussion: What data formats should we aim to support?
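
To make the role of the reference concrete, here is a minimal sketch that assumes points are stored as comma-separated pixel-index coordinates of the reference image and are converted to world space using the reference's affine. The CSV format, the function name `load_points_sketch`, and the `refer_affine` parameter are all assumptions made for illustration; the proposal leaves supported formats open.

```python
import io

import numpy as np


def load_points_sketch(source, refer_affine=None):
    """Load (N, D) points; map them to world space if a reference is given."""
    points = np.loadtxt(source, delimiter=",", ndmin=2)
    if refer_affine is None:
        # No reference: assume the points are already in world space.
        return points
    homog = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    return (homog @ refer_affine.T)[:, :-1]


# Assumed reference affine: 2 mm spacing, origin at (10, 20).
refer_affine = np.array([[2.0, 0.0, 10.0],
                         [0.0, 2.0, 20.0],
                         [0.0, 0.0, 1.0]])
csv_text = io.StringIO("1,2\n0,0\n")
world_points = load_points_sketch(csv_text, refer_affine)
```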

3. Improvements to Transform API:

The core idea is to house the computational logic within the associated operator and register it to the transform. This modification will minimize changes to the transform API. To accommodate a new data type in MONAI, current user-facing API logic would remain unaltered. New operators will simply be added as required.
Example:

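from typing import Any

class Flip:
    def __init__(self) -> None:
        # Operators are tried in order; each one handles a single kind of data.
        self.operators = [flip_image, flip_point]

    def __call__(self, data, *args: Any, **kwds: Any) -> Any:
        for _operator in self.operators:
            ret = _operator(data)
            if ret is not None:
                return ret

    def register(self, operator) -> None:
        # Supporting a new kind of data only requires adding an operator here.
        self.operators.append(operator)

def flip_image(data):
    # Return None so that Flip falls through to the next operator.
    if data.kind != "pixel":
        return None
    ...
    return data

def flip_point(data):
    if data.kind != "point":
        return None
    ...
    return data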
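
A runnable variant of the same idea is sketched below, using a small stand-in class instead of MetaTensor. The `Tagged` class, the "signal" operator, the choice to flip points by negating the first coordinate, and raising `TypeError` for unhandled kinds are all assumptions made for the example.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Tagged:
    """Stand-in for a tensor that carries a 'kind' tag."""
    array: np.ndarray
    kind: str


def flip_image(data):
    if data.kind != "pixel":
        return None
    return Tagged(np.flip(data.array, axis=0), "pixel")


def flip_point(data):
    if data.kind != "point":
        return None
    flipped = data.array.copy()
    flipped[:, 0] = -flipped[:, 0]  # mirror about the world origin on axis 0
    return Tagged(flipped, "point")


class Flip:
    def __init__(self):
        self.operators = [flip_image, flip_point]

    def register(self, operator):
        self.operators.append(operator)

    def __call__(self, data):
        for operator in self.operators:
            ret = operator(data)
            if ret is not None:
                return ret
        raise TypeError(f"no operator registered for kind {data.kind!r}")


def flip_signal(data):
    """Hypothetical operator for a future 'signal' kind."""
    if data.kind != "signal":
        return None
    return Tagged(data.array[::-1].copy(), "signal")


flip = Flip()
flip.register(flip_signal)
flipped_signal = flip(Tagged(np.array([1, 2, 3]), "signal"))
```

The call site `flip(data)` is identical for every kind, which is the property the proposal relies on to keep the user-facing API stable.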

4. User Experience Enhancements:

Users should be able to list pixel and geometry keys side by side in one pipeline, with geometry loaded against a reference image and transformed together with it.
Code example:

import monai.transforms as mt

data = {
    "image": image_path,
    "point": point_path
}

trans = mt.Compose([
    mt.LoadImaged(keys="image"),
    mt.LoadPointd(keys="point", refer_key="image"),
    mt.Flipd(keys=["image", "point"]),
    mt.Rotated(keys=["image", "point"]),
])


KumoLiu · 2024-02-22T09:31:39Z

Over the past few weeks, we've had insightful discussions about 'geometric transforms'. I have summarized the consensus in this new ticket. If everyone agrees, I'm looking forward to getting started on the Transform API. This is not the end of our conversation: as we progress, we can continue to share our thoughts and refine our approach, for example by deciding which data types we need to support. Your input is valued, and I am keen to learn from everyone.

cc @ericspod @aylward @Nic-Ma @atbenmurray @vikashg

atbenmurray · 2024-02-22T11:19:20Z

@KumoLiu Great! I've been working on a PR that at present has only the API changes, so we get a clear picture of the modifications to the API. I've also been thinking about the differences between "raster-space" and "world-space" transforms and how they relate to images and geometry respectively, as a way of solving @vikashg's rotate use-case. I'll do a brief presentation of that tomorrow if that's ok.

atbenmurray · 2024-03-08T14:59:32Z

I've added a design section to this item as per our discussion last Friday


ericspod · 2024-04-23T10:38:09Z

We have added the "kind" property to MetaTensor now with a recent PR. The kinds that we have are "pixel" and "point" but to these we can add "signal" and "text" later for future use. I mention it here to just put it out there for us.

KumoLiu added this to Geometric Transforms Feb 22, 2024

KumoLiu added the Feature request label Feb 22, 2024

KumoLiu mentioned this issue Feb 23, 2024: Add kind property in MetaTensor #7488 (Merged)

atbenmurray mentioned this issue Feb 23, 2024: Geom api #7490 (Closed)

This was referenced Feb 29, 2024: Geometric transform -- Flip #7508 (Draft); Geometric transform -- Resize #7509 (Draft)

KumoLiu added this to the Data Input/Output Enhancements for Geometric milestone Mar 1, 2024

KumoLiu mentioned this issue Apr 12, 2024: Add kind property in MetaTensor (#7488) #7635 (Closed)
