Proposing spaces and transforms #94

bogovicj · 2022-02-02T19:32:40Z

This is a preliminary proposal and discussion for axes and coordinate transformations. For more details see:

this early version of the specification.
this page of examples for a growing list of concrete descriptions of particular use cases.
this page of notes, rationale, and brainstorming

Add discrete axes

Extend the axes specification, adding a new optional field discrete whose value is a boolean. Discrete axes should not be interpolated, whereas continuous axes may be interpolated. For example, channels are usually discrete:

{"name": "c", "type": "channel", "discrete": true },
{"name": "y", "type": "space", "unit": "micrometer"},
{"name": "x", "type": "space", "unit": "micrometer"},
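As a rough illustration of how a consumer might act on the discrete flag, the sketch below picks an interpolation mode per axis. The function name `interpolation_modes` and the mode strings "nearest" and "linear" are hypothetical. Treating a missing `discrete` field as continuous is an assumption, since the proposal only says the field is optional.

```python
def interpolation_modes(axes):
    """Map each axis name to an interpolation mode based on its 'discrete' flag."""
    modes = {}
    for axis in axes:
        # Assumption: an axis without a "discrete" field is treated as continuous.
        is_discrete = axis.get("discrete", False)
        # Hypothetical mode names: "nearest" never blends neighbouring samples.
        modes[axis["name"]] = "nearest" if is_discrete else "linear"
    return modes


axes = [
    {"name": "c", "type": "channel", "discrete": True},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
modes = interpolation_modes(axes)
```

Here the channel axis would be sampled without blending, while the two spatial axes remain free to be interpolated.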

Add spaces

A space is a named list of axes and defines the coordinate system of the data. Two simple examples - a viewer may prefer the physical spatial coordinates of data:

{ 
"name" : "physical-micrometers", 
"axes" : [ 
  {"name": "y", "type": "space", "unit": "micrometer"},
  {"name": "x", "type": "space", "unit": "micrometer"}
]
}

whereas an algorithm that processes pixels may prefer the discrete pixel grid:

{
"name" : "pixel-space", 
"axes" : [ 
  {"name": "j", "type": "space", "discrete": true },
  {"name": "i", "type": "space", "discrete": true }
]
}

array space

This is a nice default that does not hurt much that I can see, and is more concise than alternatives. See my brainstorming here.

Every array / dataset has a default space whose name is the empty string. Its axes have default names dim_i, are discrete, and there exist as many axes as array dimensions. For example, a 3D dataset's array space is

{
"name" : "", 
"axes" : [ 
  {"name": "dim_0", "discrete": true },
  {"name": "dim_1", "discrete": true },
  {"name": "dim_2", "discrete": true }
]
}

Array space is shared across all datasets, so if any applications need to differentiate between them, a new discrete space should be explicitly defined with other, unique axis names.
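Because the array space depends only on the number of array dimensions, it could be generated rather than stored. The following is a minimal sketch under that reading of the proposal. The helper name `array_space` is hypothetical and not part of any existing library.

```python
def array_space(ndim):
    """Build the default, nameless array space for an array with `ndim` dimensions."""
    if ndim < 0:
        raise ValueError("ndim must be non-negative")
    return {
        "name": "",  # the default space is named by the empty string
        "axes": [{"name": f"dim_{i}", "discrete": True} for i in range(ndim)],
    }


space_3d = array_space(3)
axis_names_3d = [axis["name"] for axis in space_3d["axes"]]
```

For a 3D array this yields axes named dim_0, dim_1 and dim_2, all flagged as discrete.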

coordinateTransformations

This proposal extends the existing coordinateTransformations metadata with the idea that a coordinateTransformation is a function from one space to another. Each coordinateTransformation will now have new fields:

input_space and output_space, whose values are strings corresponding to the names of defined spaces
input_axes and output_axes whose values are arrays of strings corresponding to axis names. (See below)
name, whose value is a string that makes it possible to reference the transformation.

For example, assuming "array space" and the "physical-micrometers" space defined above:

{
  "name" : "pixels-to-micrometers",
  "type" : "scale",
  "scale" : [1.1, 2.2],
  "input_space" : "",
  "output_space": "physical-micrometers"
}

this is equivalent to:

{
  "name" : "pixels-to-micrometers",
  "type" : "scale",
  "scale" : [1.1, 2.2],
  "input_axes" : ["dim_0", "dim_1"],
  "output_axes": ["y", "x"]
}

Providing input_axes and output_axes enables transforming subsets of axes as in this example.
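To make the equivalence of the two forms concrete, here is a small sketch that applies a scale transformation to named coordinates. The helpers `axis_names` and `apply_scale` and the `spaces` registry are hypothetical. Looking up a space by name in a plain dictionary is an assumption, since where spaces are stored is still an open question in this proposal.

```python
def axis_names(transform, side, spaces):
    """Return the axis names for one side ("input" or "output") of a transform."""
    explicit = transform.get(f"{side}_axes")
    if explicit is not None:
        return list(explicit)
    # Otherwise resolve the named space in the (hypothetical) registry.
    space = spaces[transform[f"{side}_space"]]
    return [axis["name"] for axis in space["axes"]]


def apply_scale(transform, coords, spaces=None):
    """Apply a "scale" transform to a dict of {input axis name: coordinate}."""
    if transform["type"] != "scale":
        raise ValueError("only 'scale' transforms are handled in this sketch")
    in_axes = axis_names(transform, "input", spaces or {})
    out_axes = axis_names(transform, "output", spaces or {})
    factors = transform["scale"]
    if not (len(in_axes) == len(out_axes) == len(factors)):
        raise ValueError("axis lists and scale must have the same length")
    return {out: coords[inp] * f for inp, out, f in zip(in_axes, out_axes, factors)}


spaces = {
    "": {"name": "", "axes": [{"name": "dim_0", "discrete": True},
                              {"name": "dim_1", "discrete": True}]},
    "physical-micrometers": {"name": "physical-micrometers", "axes": [
        {"name": "y", "type": "space", "unit": "micrometer"},
        {"name": "x", "type": "space", "unit": "micrometer"}]},
}
by_space = {"name": "pixels-to-micrometers", "type": "scale", "scale": [1.1, 2.2],
            "input_space": "", "output_space": "physical-micrometers"}
by_axes = {"name": "pixels-to-micrometers", "type": "scale", "scale": [1.1, 2.2],
           "input_axes": ["dim_0", "dim_1"], "output_axes": ["y", "x"]}

point = {"dim_0": 10, "dim_1": 20}
from_space = apply_scale(by_space, point, spaces)
from_axes = apply_scale(by_axes, point)
```

Both calls produce the same mapping from "y" and "x" to physical coordinates, which is the sense in which the two metadata forms are equivalent for a 2D array.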

Specific questions

We considered using the name "view" instead of "space"
- but giving a name to the raw data array is nice, and that's arguably not a "view"
Is the default, nameless "array space" worthwhile? Are its axis names appropriate?
Where are these space + transform metadata stored in the container?
- all together in the root or special location? with the metadata for particular datasets?
If you have a use case this does not cover, please add that use case here User stories transformations #84
Are multiple transformations between two spaces, in the same direction allowed?
- I propose "not now, but probably later"

Thanks

This took shape with the help of lots of people.
@constantinpape @xulman @tischi @sbesson @axtimwalde @tpietzsch @d-v-b @joshmoore @jbms @thewtex @dzenanz @lassoan @satra @jni
organizers and participants of the 2022 Bioimage Hackathon, and even more.


satra · 2022-02-03T01:25:46Z

@bogovicj - this looks great and will cover a lot of ground.

We considered using the name "view" instead of "space" but giving a name to the raw data array is nice, and that's arguably not a "view"

a "view" in my head is essentially some coordinatetransformation (could be identity) on a space. don't have strong feelings but space sounds good to me.

Is the default, nameless "array space" worthwhile? Are its axis names appropriate?

are there suggestions for what this would be used for?

Where are these space + transform metadata stored in the container? all together in the root or special location? with the metadata for particular datasets?

all spaces could be stored in some root location with the name of the space added as metadata of a dataset. similarly all transforms could be stored in some root location or in an external location.

Are multiple transformations between two spaces, in the same direction allowed? I propose ["not now, but probably later"]

agree, but are useful. since different algorithms are likely to generate different transformations for registering between any two things.

some notes on coordinateTransformations:

this is where bringing affines and nonlinear transformations back in would be great.
in the proposal and tool i point to below, there was the question of what input and output spaces mean when thinking about transforms. this was obvious to most developers writing registration tools, but was really hard for us to communicate well to people using those tools. for example if we want to transform an image to a physical space going from pixels to microns then most of the time we take the grid coordinates in microns, map each of them to the corresponding pixels using the transform and then resample/interpolate as necessary. thus this transform is a function mapping coordinates in my "output space" into coordinates in my "input space". xfm(point) == pixels. thus if transforms are treated as functions, this transform (xfm) would have microns as the input space and pixels as the output, but is used to generate an image in the input space.
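The direction issue described above can be sketched in one dimension. This is only an illustration under simple assumptions: nearest-neighbour sampling, a hypothetical spacing of 0.5 microns per pixel, and made-up names `resample_1d` and `microns_to_pixels`.

```python
def resample_1d(pixels, xfm, out_coords):
    """For each target coordinate, pull the value of the pixel that xfm points to."""
    result = []
    for coord in out_coords:
        index = round(xfm(coord))  # nearest-neighbour lookup in the source array
        inside = 0 <= index < len(pixels)
        result.append(pixels[index] if inside else None)  # None marks "outside"
    return result


pixels = [10, 20, 30, 40]
spacing = 0.5  # hypothetical: microns per pixel


def microns_to_pixels(position_um):
    """As a function, this has microns as its input and pixel indices as its output."""
    return position_um / spacing


# Build samples on a micron grid at 0.0, 1.0 and 2.0 microns.
resampled = resample_1d(pixels, microns_to_pixels, [0.0, 1.0, 2.0])
```

The function maps microns to pixels, yet it is used to produce an image laid out on the micron grid, which is the source of the confusion about "input" and "output".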

some additional pointers from the neuroimaging community if people feel like reading:

Coordinate systems in neuroimaging - MEG/EEG/MRI/etc
Proposal for transforms
nitransforms - a little library associated with the transforms proposal

jbms · 2022-02-03T03:23:37Z

@satra I would agree that input and output should be used in the function sense, even if that is "backwards" from the direction of the conversion that will actually be performed. However, one thing to note is that for efficiency a flow field would normally be represented as you describe, where the physical space is the input space and the array voxel space is the output space, but I think it is customary to represent an affine transformation (or separate scale/translation) where the array voxel space is the input space and the physical space is the output space. Presumably that discrepancy can be addressed easily enough by an "invert" option.

@bogovicj My understanding of your proposal is that a "space" essentially does two things: it defines a list of dimensions, with associated names, units, and discrete/continuous flags, and it defines a namespace for those dimensions.

However, suppose I define a space as follows:

{
  "name": "physical_xyzt",
  "axes": [
    {"name": "x", "type": "space", "unit": "meter", "discrete": false},
    {"name": "y", "type": "space", "unit": "meter", "discrete": false},
    {"name": "z", "type": "space", "unit": "meter", "discrete": false},
    {"name": "t", "type": "time", "unit": "second", "discrete": false}
  ]
}

Then maybe I have a collection of arrays:

em_image (indexed by x, y, z)
functional_lightsheet (indexed by x, y, z, t, and c)
stimulus (indexed by t and k)
brain_layers (indexed by x, y)

This type of dimension correspondence is easily expressed in the netcdf data model, but the current proposal does not seem to allow that --- instead it would be necessary to define a separate space for each combination of dimensions.

An alternative we could consider is to allow specifying individual dimensions of a space, e.g. "physical.x" to indicate the "x" dimension of the "physical" space. A possible simplification then would be to not define "spaces" at all, but just define individual named dimensions; for disambiguation purposes, these names could be longer than typical for dimension names, e.g. "physical_x", or maybe there could be a naming convention like "physical.x" so that viewers might know to display just "x" as a shorthand.
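A rough sketch of that naming convention, assuming a dot separates the space from the dimension. The helpers `split_dimension` and `display_name` and the array names in the example are hypothetical and only illustrate how a viewer could derive the shorthand and find shared dimensions.

```python
def split_dimension(name):
    """Split "space.axis" into (space, axis); unqualified names get space None."""
    space, sep, axis = name.rpartition(".")
    return (space, axis) if sep else (None, name)


def display_name(name):
    """Shorthand a viewer might show, e.g. "x" for "physical.x"."""
    return split_dimension(name)[1]


# Hypothetical arrays, each indexed by a subset of the named dimensions.
arrays = {
    "em_image": ["physical.x", "physical.y", "physical.z"],
    "stimulus": ["physical.t", "k"],
    "brain_layers": ["physical.x", "physical.y"],
}

# Dimension correspondence falls out of comparing the qualified names.
shared = [d for d in arrays["em_image"] if d in arrays["brain_layers"]]
labels = [display_name(d) for d in shared]
```

With unique qualified names, no separate space is needed for each combination of dimensions. The correspondence between arrays is just name equality.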

A separate comment: it seems to me that if SI prefixes are allowed as multipliers on the units, then arbitrary multipliers should also be allowed (this came up in previous discussions about units). This would allow in many cases the coordinate space of an array itself (prior to any transformation) to have meaningful units, e.g. if it uses a post-alignment coordinate space; under the current proposal you are forced to just specify these dimensions as unitless discrete dimensions and always need a separate transform to indicate any units at all.

One last thought regarding the discrete indicator: what if instead discrete was indicated by the lack of a unit?

We could consider also allowing a string description field to be associated with a dimension and/or space, though I'm not sure how useful that would be.

joshmoore · 2022-02-03T14:25:10Z

This type of dimension correspondence is easily expressed in the netcdf data model, but the current proposal does not seem to allow that

👍 for compatibility with the NC model as being a nice to have.

bogovicj · 2022-02-07T14:09:49Z

Thanks @satra and @jbms!

multiple transforms between spaces

since different algorithms are likely to generate different transformations for registering between any two things.

Agree, and this will eventually be important to me as well. If this is a critical use case for you, please consider adding it as a user story here #84

input / output and direction

"what input and output spaces mean when thinking about transforms" - @satra

"I would agree that input and output should be used in the function sense" - @jbms

Yes. Transforms here are in the "forward" direction despite the fact that the "inverse" is used for rendering / interpolating.
That some transformations we want are not closed-form-invertible is exactly why we included the "inverse" option in this list of transforms long ago.

I'll make a new issue to discuss what specific types of transforms will be in the next version.

axes and spaces

just define individual named dimensions... these names could be longer than typical for dimension names, e.g. "physical_x"

This is actually what I'm proposing. Specifically that axis names are unique across all spaces. I agree that dimension correspondence in the way you described is important, and unique axis names give us that.

For many use cases, short axis names (e.g. "x", "y", "t") are fine and recommended. I expect "Long" axis names (e.g. "fafb-v14.x") will only be necessary when registering between images.

A possible simplification then would be to not define "spaces" at all

While not strictly necessary, I think of "spaces" as a way for a dataset to communicate to downstream applications what axes make sense to be displayed together, and to formalize naming conventions that we'd need otherwise, like "axes with a common prefix go together". I also imagine they could make UIs easier for end users, especially if that naming convention is not respected. For instance, with my user hat on, I'd prefer to see:

choose 1 of [ "array", "crop", "physical" ]

than

choose 3 of ["0", "1", "2", "i", "j", "k", "x", "y", "z"]

One last thought regarding the discrete indicator: what if instead discrete was indicated by the lack of a unit?

Interesting idea. I'm imagining "discrete" communicating "don't interpolate across this dimension," and that could be useful for axes with units, but I may be overthinking. Let's revisit if and when more user stories come in.

allowing a string description field

Nice idea, I'm open to this, let's see what others think of this.

jbms · 2022-02-07T18:22:52Z

Interesting idea. I'm imagining "discrete" communicating "don't interpolate across this dimension," and that could be useful for axes with units, but I may be overthinking. Let's revisit if and when more user stories come in.

When displaying a segmentation (specified by a segment label volume), we never want to interpolate, but we may want to say that the segmentation has the same dimensions as a microscopy image volume, which we would want to interpolate. But for that use case, we already don't need any additional information to tell us not to interpolate, because that is already indicated by the fact that we are displaying the volume as a segmentation.

imagesc-bot · 2022-05-14T08:04:52Z

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-zarr-chunking-questions/66794/38

imagesc-bot · 2022-09-16T07:28:59Z

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ome-ngff-community-call-transforms-and-tables/71792/1

imagesc-bot · 2023-01-17T09:08:10Z

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/ashlar-stitching-questions-and-developments/67418/18

m-albert · 2023-07-28T20:38:24Z

Hi @bogovicj and others, it's been great following all the discussions here and seeing how NGFF is moving towards adopting a powerful framework for coordinate systems and transforms.

I wanted to comment here that in several contexts it came up that it'd be useful to be able to specify transforms for subsets of a dataset, i.e. different transforms for different image coordinates. Here are some examples:

channel alignment Transformation Specification #28 (comment)
slice alignment affines and dimensions #74
drift correction and generally registration of timelapses (multi-view, multi-position etc.) User stories transformations #84 (comment)
multiplexed imaging User stories transformations #84 (comment)

As far as I understand, in the current form of the proposal this is not possible within the same dataset in ways other than using displacement fields.

I've seen that some time ago @bogovicj and @tischi discussed ideas for new transforms here. I especially liked the "coordinate-wise" transform, which is similar to the "ByDimension" transform composed of lower dimensional transforms acting on a subset of their input and output coordinate system's axes.

Considering the many use cases it would help (including some I'm working on 😁), I was wondering whether supporting such transforms is currently being considered/discussed?

This was referenced Feb 2, 2022

affines and dimensions #74

Closed

(Nonlinear) Transformation specification #30

Closed

Transformation Specification #28

Closed

constantinpape mentioned this issue Feb 7, 2022

Compatibility with xarray #48

Open

constantinpape mentioned this issue Feb 10, 2022

Support for non-zero origin zarr-developers/zarr-specs#122

Open

constantinpape mentioned this issue Feb 18, 2022

Finalize axes & initial transformation #85

Merged

jbms mentioned this issue Feb 19, 2022

Multiscale metadata #102

Open

constantinpape mentioned this issue Feb 20, 2022

Transformation types #101

Open

constantinpape mentioned this issue Feb 28, 2022

[Draft] Add transformation spec #63

Closed

sbesson mentioned this issue Mar 18, 2022

Extract specification snippets as standalone JSON files (latest) #110

Merged

joshmoore added this to the 0.5 milestone Apr 21, 2022

sbesson mentioned this issue May 19, 2022

Read multiple fields of view on a grid, for HCS dataset ome/ome-zarr-py#200

Open

kevinyamauchi mentioned this issue Jun 13, 2022

Add NAP-3: Spaces napari/napari#4684

Merged

jluethi mentioned this issue Jun 16, 2022

Image registration task fractal-analytics-platform/fractal-tasks-core#36

Closed

ivirshup mentioned this issue Jul 14, 2022

design discussions single-cell-data/SOMA#32

Merged

jluethi mentioned this issue Jul 26, 2022

How do we represent regions of interest in OME-NGFF? #133

Open

bogovicj mentioned this issue Sep 15, 2022

Coordinate systems and new coordinate transformations proposal #138

Draft

jstriebel mentioned this issue Oct 20, 2022

Revise how the domain of an array is specified zarr-developers/zarr-specs#144

Closed

joshmoore mentioned this issue Mar 17, 2023

website: roadmap/status page #181

Open

jluethi mentioned this issue Aug 9, 2023

Multiplexing registration overview fractal-analytics-platform/fractal-tasks-core#39

Open

yarikoptic mentioned this issue Apr 25, 2024

[BEP014] Open questions to resolve bids-standard/bids-specification#15

Open
