Geometric transforms proposal #7486

Open
KumoLiu opened this issue Feb 22, 2024 · 4 comments

KumoLiu commented Feb 22, 2024

Design

Design goals

  • Geometry has first-class support
    • Users should be able to create models and pipelines that are purely geometry-based
    • Users should be able to create models and pipelines that are combinations of pixel data and geometry data
  • It should be easy for users to make hybrid workflows
    • In hybrid workflows, we should make it easy to update geometry based on transforms to pixel data
  • Minimal API changes
    • We should minimise changes to the API

Characteristics of geometry and pixel data

  • geometry data
    • points: positions in world space
      • may have some kind of vertex / edge descriptor with which to interpret the points
  • pixel data
    • pixel resolution: a mapping from pixel-space to world space
    • bounding box: the geometric bounds of the pixel data in world space

Pixel-space vs world-space

  • We define two spaces in which operations can be carried out
    • world space
      • changes the object itself in world space, e.g. its rotation, size, location or shear
      • applies to both pixel data and geometry data
    • pixel space
      • a geometric description of a change to the way pixel data is sampled
      • has no effect on world space
      • applies only to pixel data
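
To make the distinction concrete, here is a minimal 2D sketch in plain Python. The helper `pixel_to_world` and the `origin` / `spacing` description of the grid are illustrative assumptions, not MONAI API. It shows that resampling to a finer grid (a pixel-space change) leaves the world position of a feature where it was, whereas a world-space translation moves it.

```python
def pixel_to_world(index, origin, spacing):
    """Map a pixel index to world coordinates for an axis-aligned grid."""
    return tuple(o + i * s for i, o, s in zip(index, origin, spacing))


# A feature at pixel (10, 20) on a grid with 2.0 mm spacing.
origin, spacing = (0.0, 0.0), (2.0, 2.0)
before = pixel_to_world((10, 20), origin, spacing)

# Pixel-space change: resample to 1.0 mm spacing. The same feature now sits
# at pixel (20, 40), but its world position is unchanged.
after_resample = pixel_to_world((20, 40), origin, (1.0, 1.0))

# World-space change: translate the object by (5, 0) mm by moving the origin.
after_translate = pixel_to_world((10, 20), (5.0, 0.0), spacing)

print(before, after_resample, after_translate)
```

Geometry points that mark the same feature would need updating only in the translation case.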

Stages of a mixed pixel / geometry pipeline

  1. Load data sources
    a. pixel data
    b. geometry data
  2. align pixel data with geometry data (depends on task)
  3. apply various transforms to aligned pixel and geometry data
    a. our transforms should always keep pixel and geometry data aligned, for any given sequence of spatial transforms applied to both

Spatial transform categories

Categories of spatial transform

  • agnostic: work the same way on pixel and geometry data
    • flip, zoom, etc.
  • image-specific: transforms that make sense only for raster data
    • resample, spacing, etc.
  • hybrid: transforms that must also take images into account
    • rotate, etc.

A closer look at hybrid transforms

rotate must perform slightly different operations on pixel and geometry data

  • the rotation itself in world space is the same for pixel and geometry data
  • if keep_size is false, the extents of pixel data bounds will change
    • this is a pixel space change
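
A small sketch of the two components, assuming a 2D image whose bounding box is centred on the origin. The helpers `rotate` and `extents` are hypothetical and not part of MONAI. The same rotation is applied to the box corners and to a geometry point, while the change in axis-aligned extents is what the pixel grid would have to absorb.

```python
import math


def rotate(point, angle_rad):
    """Rotate a 2D world-space point about the origin."""
    x, y = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y)


def extents(points):
    """Size of the axis-aligned bounding box around a set of points."""
    xs, ys = zip(*points)
    return (max(xs) - min(xs), max(ys) - min(ys))


# Corners of a 4 x 2 image bounding box centred on the origin.
corners = [(-2.0, -1.0), (2.0, -1.0), (2.0, 1.0), (-2.0, 1.0)]
landmark = (1.0, 0.0)  # a geometry point inside the image

angle = math.pi / 2  # 90 degrees
rotated_corners = [rotate(p, angle) for p in corners]
rotated_landmark = rotate(landmark, angle)  # same world-space rotation

old_size = extents(corners)          # 4 wide, 2 high
new_size = extents(rotated_corners)  # approximately 2 wide, 4 high
```

If keep_size is false, the pixel grid has to cover the new extents, which is the pixel-space part of the change. The landmark only needs the rotation.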

Transform API

The transform API has the following layers

dictionary transform -> array transform -> functional transform

Dictionary transforms

Dictionary transforms specific to images can refer to geometry by name, rather than requiring tensors to be passed in directly:

class Spacingd(MapTransform, InvertibleTransform, LazyTransform):
    def __init__(
        self, keys, pixdim, diagonal, mode, padding_mode, align_corners, dtype, scale_extent,
        recompute_affine, min_pixdim, max_pixdim, ensure_same_shape, allow_missing_keys):
        ...  # signature unchanged; geometry tensors are simply listed in keys

As such, there shouldn't need to be any changes to the API for dictionary transforms:

  • geometry tensors are referred to by name, as are pixel tensors
  • transforms that aren't image-specific can just process all tensors independently of each other
  • transforms that are image-specific can perform the operation on image tensors first
  • the world-space component of the transform can then be applied to the geometry tensors
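
The flow in the list above can be sketched as follows. This is an illustration under stated assumptions, not the proposed implementation: the image is a plain dict with an `origin`, and `translate_image` and `apply_dict_transform` are hypothetical names.

```python
def translate_image(image, offset):
    """Hypothetical image op: returns the updated image together with the
    world-space component of the change, expressed as a function on points."""
    new_origin = tuple(o + d for o, d in zip(image["origin"], offset))
    new_image = dict(image, origin=new_origin)

    def world_component(point):
        return tuple(p + d for p, d in zip(point, offset))

    return new_image, world_component


def apply_dict_transform(data, image_key, geometry_keys, op, **op_kwargs):
    """Run the op on the image first, then update geometry tensors by name."""
    out = dict(data)
    out[image_key], world_component = op(data[image_key], **op_kwargs)
    for key in geometry_keys:
        out[key] = [world_component(p) for p in data[key]]
    return out


sample = {"image": {"origin": (0.0, 0.0)}, "point": [(1.0, 2.0), (3.0, 4.0)]}
result = apply_dict_transform(
    sample, "image", ["point"], translate_image, offset=(10.0, 0.0)
)
```

The caller only names the keys; the image and the points stay aligned because both receive the same world-space change.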

Array transforms

Array transforms specific to images need to be modified so that geometry data can be updated. This can be done via additional operation parameters that take a tensor or tuple of tensors:

class Spacing(InvertibleTransform, LazyTransform):
    def __call__(
        self, data_array, mode, padding_mode, align_corners, dtype, scale_extent, output_spatial_shape, lazy,
        inputs_to_update,  # new: geometry tensor or tuple of tensors to update
    ):
        ...

Functional transforms

def spacing(
    data_array, mode, padding_mode, align_corners, dtype, scale_extent, output_spatial_shape, lazy,
    inputs_to_update,  # new: geometry tensor or tuple of tensors to update
):
    ...

Functional transforms that are specific to image data first calculate the pixel-space and world-space components of the transform to be applied to the image data. They then call a function that applies the appropriate transform to the geometry data.
Note: geometry data should need only one function for applying a transform to it; ideally we should not need to write *_image and *_point functions for each of the operations.
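
One way to read the note is that every operation reduces to a matrix and a single function applies it to points. A minimal 2D sketch of that idea, assuming homogeneous 3x3 matrices; `apply_affine_to_points`, `flip_matrix` and `zoom_matrix` are hypothetical names, not MONAI functions:

```python
def apply_affine_to_points(points, matrix):
    """Apply a 3x3 homogeneous matrix to a list of 2D points."""
    out = []
    for x, y in points:
        out.append((
            matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
        ))
    return out


def flip_matrix(axis):
    """World-space matrix that mirrors the given axis (0 = x, 1 = y)."""
    m = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    m[axis][axis] = -1.0
    return m


def zoom_matrix(factor):
    """World-space matrix that scales both axes by the same factor."""
    return [[factor, 0.0, 0.0], [0.0, factor, 0.0], [0.0, 0.0, 1.0]]


points = [(1.0, 2.0), (-3.0, 0.5)]
flipped = apply_affine_to_points(points, flip_matrix(0))
zoomed = apply_affine_to_points(points, zoom_matrix(2.0))
```

Flip and zoom differ only in the matrix they build; no per-operation point function is needed.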

Implementation

1. Integration of 'kind' Property to MetaTensor:

We propose adding a 'kind' property to MetaTensor. It identifies the type of data a tensor holds so that each type can be handled appropriately. The value can be read via data.kind.
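
As a rough illustration only, the tag could behave like the attribute on this stand-in class. `TaggedData`, `VALID_KINDS` and the default of "pixel" are assumptions made for the sketch; they do not describe how MetaTensor implements the property.

```python
from dataclasses import dataclass
from typing import Any

VALID_KINDS = ("pixel", "point")  # the two kinds named in this thread


@dataclass
class TaggedData:
    """Hypothetical stand-in for a MetaTensor carrying a 'kind' tag."""

    values: Any
    kind: str = "pixel"  # assumed default, for illustration only

    def __post_init__(self):
        # Reject tags that no operator would know how to handle.
        if self.kind not in VALID_KINDS:
            raise ValueError(f"unsupported kind: {self.kind!r}")


image = TaggedData(values=[[0, 1], [2, 3]])
points = TaggedData(values=[(0.5, 1.5)], kind="point")
print(image.kind, points.kind)  # pixel point
```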

2. Data Input/Output Enhancements:

Introduce LoadPoint and LoadPointd, with the properties refer and refer_key respectively. These properties indicate whether the loaded points belong to the coordinate system of a reference, and make it possible to retrieve information such as the affine from that reference.
Usage examples:
LoadPointd(keys="point", refer_key="image") and LoadPoint(data=point, refer=image)
Subject for discussion: what data formats should we aim to support?
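
A minimal sketch of what the reference could provide, assuming images and points are plain dicts and lists. `load_points` and the metadata layout are hypothetical, since the proposal does not fix them.

```python
def load_points(raw_points, refer=None):
    """Hypothetical loader: tag points and copy the affine from a reference."""
    meta = {"kind": "point"}
    if refer is not None:
        # The points are assumed to live in the reference's coordinate system,
        # so a copy of the reference affine is attached for later transforms.
        meta["affine"] = [row[:] for row in refer["affine"]]
    return {"points": [tuple(p) for p in raw_points], "meta": meta}


image = {"affine": [[2.0, 0.0, 10.0], [0.0, 2.0, -5.0], [0.0, 0.0, 1.0]]}
loaded = load_points([[1, 2], [3, 4]], refer=image)
standalone = load_points([[1, 2]])
```

Without a reference, the points carry no affine and would be treated as already being in world space.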

3. Improvements to Transform API:

The core idea is to house the computational logic for each data kind in its own operator and register that operator with the transform. This keeps changes to the transform API minimal: to accommodate a new data type in MONAI, the user-facing API stays as it is and new operators are simply added as required.
Example:

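from typing import Any


class Flip:
    def __init__(self) -> None:
        # Operators are tried in order; each one handles a single data kind.
        self.operators = [flip_image, flip_point]

    def __call__(self, data, *args: Any, **kwds: Any) -> Any:
        for _operator in self.operators:
            ret = _operator(data)
            # An operator returns None when the data is not of its kind.
            if ret is not None:
                return ret

    def register(self, operator) -> None:
        # Assumed behaviour (a stub in the original): add a further operator.
        self.operators.append(operator)


def flip_image(data):
    if data.kind != "pixel":
        return None
    ...  # flip the pixel data
    return data


def flip_point(data):
    if data.kind != "point":
        return None
    ...  # flip the point coordinates
    return data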
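
To show how a new data kind could be added without touching the transform, here is a runnable 1D variant on plain dicts. Everything in it is an assumption made for illustration: `Flip1D`, the dict layout with a "kind" entry, the "signal" kind and the choice to raise when no operator matches.

```python
class Flip1D:
    """Hypothetical 1D flip that dispatches on a 'kind' tag."""

    def __init__(self):
        self.operators = [flip_pixels, flip_points]

    def register(self, operator):
        self.operators.append(operator)

    def __call__(self, data):
        for operator in self.operators:
            result = operator(data)
            if result is not None:
                return result
        raise TypeError(f"no operator registered for kind {data['kind']!r}")


def flip_pixels(data):
    if data["kind"] != "pixel":
        return None
    return {"kind": "pixel", "values": data["values"][::-1]}


def flip_points(data):
    if data["kind"] != "point":
        return None
    # Mirror each coordinate within the range [0, extent].
    extent = data["extent"]
    return {"kind": "point", "extent": extent,
            "values": [extent - x for x in data["values"]]}


def flip_signal(data):
    if data["kind"] != "signal":
        return None
    return {"kind": "signal", "values": data["values"][::-1]}


flip = Flip1D()
flip.register(flip_signal)  # new data kind, no change to Flip1D itself
```

Calling `flip` on a dict tagged "signal" now reaches `flip_signal`, while "pixel" and "point" data are handled as before.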

4. User Experience Enhancements:

Users should be able to load image and point data side by side and apply the same dictionary transforms to both keys in a single pipeline.
Code example:

import monai.transforms as mt

# image_path and point_path are the paths of the files to load
data = {
    "image": image_path,
    "point": point_path,
}

trans = mt.Compose([
    mt.LoadImaged(keys="image"),
    mt.LoadPointd(keys="point", refer_key="image"),  # proposed transform
    mt.Flipd(keys=["image", "point"]),
    mt.Rotated(keys=["image", "point"]),
])

KumoLiu commented Feb 22, 2024

Over the past few weeks, we've had insightful discussions concerning 'geometric transforms'. I've summarized the consensus in this new ticket; if all agree, I'm looking forward to getting started on the Transform API. This is not the end of our conversation. As we progress, we can continue to share our thoughts and optimize our approach, for example by deciding which kinds of data types we need to support. Your input is valued, and I am keen to learn from everyone.

cc @ericspod @aylward @Nic-Ma @atbenmurray @vikashg

@atbenmurray

@KumoLiu Great! I've been working on a PR that at present has only the API changes, so we get a clear picture of the modifications to the API. I've also been thinking about the differences between "raster-space" and "world-space" transforms and how they relate to images and geometry respectively, as a way of solving @vikashg's rotate use-case. I'll do a brief presentation of that tomorrow if that's ok.

@atbenmurray

I've added a design section to this item as per our discussion last Friday.

KumoLiu added a commit that referenced this issue Apr 12, 2024
Part of #7486


KumoLiu added a commit to KumoLiu/MONAI that referenced this issue Apr 19, 2024
Part of Project-MONAI#7486


@ericspod

We have added the "kind" property to MetaTensor now with a recent PR. The kinds that we have are "pixel" and "point" but to these we can add "signal" and "text" later for future use. I mention it here to just put it out there for us.
