Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add identity to image coordinateTransformations schema #152

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

thewtex
Copy link

@thewtex thewtex commented Oct 20, 2022

No description provided.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 20, 2022

Automated Review URLs

Copy link
Contributor

@constantinpape constantinpape left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections on a technical level. I think we initially decided that we don't need an explicit representation of the identity, are there use-cases where we need it now?
(sorry, I didn't have time to keep up with the discussions around the transformation spec.)

@ivirshup
Copy link
Contributor

@thewtex, what's the relationship here to #149?

@joshmoore
Copy link
Member

@ivirshup: my understanding is that this is solely focused on adding the schema for the previous version (0.4). I assume you might have the identical change for /latest?

@ivirshup
Copy link
Contributor

@joshmoore ah, hadn't noticed the difference in files changed. Thanks for the heads up.

@thewtex
Copy link
Author

thewtex commented Oct 27, 2022

No objections on a technical level. I think we initially decided that we don't need an explicit representation of the identity, are there use-cases where we need it now?

This came up when coordinateTransformations: [] was present. The schema currently complains that there must be at least one transformation. Additionally, we could also update the schema so that the list can be empty. However, I think the approach to list an Identity transformation is a good, explicit way to indicate that this is intentional.

@thewtex, what's the relationship here to #149?
@ivirshup: my understanding is that this is solely focused on adding the schema for the previous version (0.4).

Correct, the identity transformation already exists in the documentation for v0.4, but it is not supported by the schema.

@clbarnes
Copy link

clbarnes commented Jul 5, 2023

Could we instead remove identity from the spec and modify the schema to get rid of the requirement that there be at least one coordinate transformation? Requiring that there be an empty array (as opposed to not including the field) is enough to remind people that there is meant to be some mapping from data to world coordinates. It seems like (literally) needless API surface.

@thewtex
Copy link
Author

thewtex commented Jul 15, 2023

@clbarnes that would be a functional way to indicate that there is an identity coordinateTransformation, too. Removing identity would be a change to the already published spec, though.

I would like one of the proposed solutions, a) support for an explicit identity transformation, b) support for an empty coordinateTransformation[], that address the situation of an identity transformation to be integrated so I can use the schema with v0.4 datasets. @constantinpape @joshmoore do you have any guidance on this?

A required, empty coordinateTransformation[] is a good idea, but, as an additional requirement, probably a candidate for v0.5.
@bogovicj what will work best and what do you recommend considering the current v0.5 draft?

@joshmoore
Copy link
Member

do you have any guidance on this?

My tendency would be to do the thing most like the 0.4 spec text for the moment.

@thewtex
Copy link
Author

thewtex commented Jul 16, 2023

My tendency would be to do the thing most like the 0.4 spec text for the moment.

Since this is following the 0.4 spec, can we merge this? And make additional improvements as needed in 0.5?

@clbarnes
Copy link

clbarnes commented Jul 17, 2023

Since this is following the 0.4 spec

By my reading, the identity transform is not allowed anywhere in v0.4, even though it is defined. From the 0.4 schema:

Each "datasets" dictionary ... The transformations are defined according to § 3.3 "coordinateTransformations" metadata. The transformation MUST only be of type translation or scale.

Each "multiscales" dictionary ... . The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:coordinateTransformations" and are applied after them.

The same is true in latest. The thing most like the 0.4 text, then, is to just scrub the single mention of the identity transform which can't be included in the metadata anyway.

As written, the way to express an identity transform is a trivial scale (a single scale is required) so that's probably the right place here). If that wasn't explicit enough, I think support for a zero-length (but extant) coordinateTransformations array gets the point across, while also simplifying the JSON structure (fewer characters), the schema (no length constraints, no required scale), and the spec (fewer transforms, no required scale, and fewer optional keys).

@d-v-b
Copy link
Contributor

d-v-b commented Jul 17, 2023

Since this is following the 0.4 spec

By my reading, the identity transform is not allowed anywhere in v0.4, even though it is defined. From the 0.4 schema:

Each "datasets" dictionary ... The transformations are defined according to § 3.3 "coordinateTransformations" metadata. The transformation MUST only be of type translation or scale.

Each "multiscales" dictionary ... . The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:coordinateTransformations" and are applied after them.

The same is true in latest. The thing most like the 0.4 text, then, is to just scrub the single mention of the identity transform which can't be included in the metadata anyway.

As written, the way to express an identity transform is a trivial scale (a single scale is required) so that's probably the right place here). If that wasn't explicit enough, I think support for a zero-length (but extant) coordinateTransformations array gets the point across, while also simplifying the JSON structure (fewer characters), the schema (no length constraints, no required scale), and the spec (fewer transforms, no required scale, and fewer optional keys).

+1 to this. I don't think an explicit identity transformation can make any sense in 0.4 without making a lot of changes to the spec.

@thewtex
Copy link
Author

thewtex commented Jul 24, 2023

The 0.4 spec does include the identity specification:

image

and this PR ensures it is in the schema.

A scale consisting of 1.0's is not the same -- interpreting software handles it differently. And identity means there are not extra computations.

A zero-length coordinateTransformation is not allowed in the 0.4 schema.

This backward-compatible update to the schema ensures that the text and the schema are consistent.

Removing a transformation in the spec is not backwards compatible.

@d-v-b
Copy link
Contributor

d-v-b commented Jul 24, 2023

@thewtex if I understand things correctly, there has never been a way to use the Identity transform in v0.4, because of the text here and here. In the two places were the spec describes uses of coordinateTransformations, it explicitly says that only scale and translation are allowed.

To rectify this there are two options. The first option, supported by me and @clbarnes, is to remove references to the Identity transform completely, and update the spec to state that an empty collection of coordinateTransformations is to be interpreted as expressing an identity transformation. This would technically not be a breaking change, because it was impossible to use the identity transform anyway.

The other option is this PR, but this still requires additional changes to the spec: we would need to remove this sentence to allow use of an explicit identity transform. Also, unlike removing the explicit identity transform, your PR would be a breaking change if you add support for the identity transform in multiscales -- in v0.4, only scale and translation are allowed, so after this change, ome-ngff readers will need to be updated to handle datasets using the identity transform.

@clbarnes
Copy link

clbarnes commented Jul 24, 2023

The 0.4 spec does include the identity specification:

As I said in my comment #152 (comment) , it's clearly defined, but it's not allowed. Or at least, I couldn't find anywhere that you're actually allowed to include an identity transformation - I'm very happy to be corrected. The only places coordinateTransforms are used is in the multiscale definition; however, in multiscales, coordinateTransformations MUST either be a scale or a translation, i.e. must not be an identity (as quoted in the linked comment).

So exactly as Davis says, we know what an identity transform is, but we're not allowed to use it. No current implementation can correctly support identity transforms because there is nowhere they can use them while being compliant with the spec. Our options, then, are

  1. Allow the use of an identity transform, which necessitates adding it to the schema
    • Why? What purpose does it serve?
    • This requires updating implementations
  2. Remove the definition of the literally useless transformation which we're already not allowed to use anyway
    • This simplifies the spec, simplifies the schema, and requires no change in implementations

An extension to 2 would be to update the schema to allow zero-length coordinatetransformations lists, where currently we require a trivial scale. This would update the schema (making it simpler), and possibly require updating implementations (making them simpler, and in a backwards-compatible way, because no existing spec-compliant data has identity transforms and implementations will continue to be able to read trivial scales).

@thewtex
Copy link
Author

thewtex commented Jul 24, 2023

The transformations are defined according to [[#trafo-md]]. The transformation MUST only be of type translation or scale.

This line is inconsistent. [[#trafo-md]] refers to identity, translation, and scale.

Identity transforms serve a purpose -- they indicate that the transformation is the identity. This is different than an undefined or unknown transformation.

There are inconsistencies with the specification, which this patch looks to resolve. I will push an update so the transformation list is consistent. The spec does include an identity transformation. These lines do exist:

The value of "type" MUST be one of the elements of the `type` column in the table below.

<tr><th>`identity` <td> <td>identity transformation, is the default transformation and is typically not explicitly defined

And there are implementations that use it, which is why I created this patch. If there are implementations that do not support identity, I am happy to create patches for them.

Make the list in the overview consistent with the transformation table.
@d-v-b
Copy link
Contributor

d-v-b commented Jul 24, 2023

Thanks @thewtex for removing that inconsistent sentence.

Identity transforms serve a purpose -- they indicate that the transformation is the identity. This is different than an undefined or unknown transformation.

I don't think anyone is opposing identity transforms. I think the question is how they should be represented in the spec.

What are the advantages of an identity transform expressed as an object, vs representing the identity transform implicitly as an empty list? Speaking for myself, I like the idea of keeping the spec simple -- making the identity transform implicit rather than explicit would allow me to delete some code, which is a small win. But perhaps there is some advantage to the explicit identity transform that I'm not aware of?

0.4/index.bs Outdated
@@ -367,7 +367,7 @@ to the current zarr group. The "path"s MUST be ordered from largest (i.e. highes

Each "datasets" dictionary MUST have the same number of dimensions and MUST NOT have more than 5 dimensions. The number of dimensions and order MUST correspond to number and order of "axes".
Each dictionary in "datasets" MUST contain the field "coordinateTransformations", which contains a list of transformations that map the data coordinates to the physical coordinates (as specified by "axes") for this resolution level.
The transformations are defined according to [[#trafo-md]]. The transformation MUST only be of type `translation` or `scale`.
The transformations are defined according to [[#trafo-md]]. The transformation MUST only be of type `identity`, `translation` or `scale`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we remove "The transformation MUST only be of type identity, translation or scale." entirely, and just say "The elements of coordinateTransformations are the transformations defined in [[#trafo-md]]"? This way, the transformations section of the spec is the single source of truth.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 yes added in 1a55fa1.

@clbarnes
Copy link

There are inconsistencies with the specification, which this patch looks to resolve.

Right, but there are 2 ways to resolve the inconsistency. This resolution (adding references to identity in several places) requires changes to spec-compliant implementations, and increases complexity of the spec. The alternative resolution (dropping the one reference to identity) reduces complexity of the spec and requires no implementation changes to spec-compliant implementations: we shouldn't restrict changes on behalf of things not compliant with the spec, a colossal set including "the colour blue" and "the concept of fraternal love". I'm very on board with explicit representations of not-transforms, as opposed to unknown or undefined transforms; in my opinion, an empty list of transforms very clearly says "do not apply any transforms". A nullable or possibly-undefined list of transforms (e.g. as currently supported in the multiscales representation) is what should be avoided.

@d-v-b
Copy link
Contributor

d-v-b commented Sep 7, 2023

This has been dormant for a few months. How can we move this forward?

@joshmoore joshmoore added this to the 0.5 milestone Oct 4, 2023
@thewtex
Copy link
Author

thewtex commented Oct 4, 2023

What are the advantages of an identity transform expressed as an object, vs representing the identity transform implicitly as an empty list?

When transformations are components of a multi-modal registration analysis workflow, there are often a chain of transformations that represent stages of registration, e.g. a calibration step, a re-orientation step, a template registration step, an atlas registration step. For some modalities or datasets in an analysis, a step may be the identity transformation, and it is useful to indicate that. An empty list does not capture that.

Right, but there are 2 ways to resolve the inconsistency. This resolution (adding references to identity in several places) requires changes to spec-compliant implementations

If the spec calls out "identity", which it does, an implementation that does not support "identity" is not "spec-compliant".

@clbarnes
Copy link

clbarnes commented Oct 4, 2023

If the spec calls out "identity", which it does, an implementation that does not support "identity" is not "spec-compliant".

I feel like we're going around in circles a bit. It is correct to say that the current spec includes the identity transformation. However, the spec does not allow the use of that transformation. Therefore, any implementation which allows the use of the identity transformation is not spec-compliant. So while it's possible for implementations to have an identity transformation defined, if they can parse it from the metadata, or if they can write it into the metadata, they are not compliant with the spec. Therefore, there are zero functional changes associated with removal of the identity transform, it simply allows the removal of code which must be unreachable according to the current spec. Meanwhile, allowing the use of the identity transform requires changes which hook an implementation's representation of that transform into the parsing and transforming workflow.

Does this make sense? I feel like I may not be getting my point across very clearly.

I see your point about explicit transformations allowing logic like "In my particular workflow, the 3rd transformation is always the one generated by process X, and therefore if X generates an identity transform, I must be able to represent it in the transformation array". That's totally reasonable! However, I'm not sure it's necessarily compatible with the current spec, which limits the number and order of transformations (one scale, and up to one translation, which must occur after the scale). Unless your registration workflow is specifically designed to fit to the OME-NGFF spec (which it might be) I can see it easily breaking those rules.

@thewtex
Copy link
Author

thewtex commented Oct 11, 2023

@clbarnes I understand that you are trying to assert that the spec does not allow identity transformations, and an implementation with an identity transform is not spec-compliant. This assertion was made repeatedly. I understand your point.

However, it does not make sense that the spec states that transforms supported include the identity transform, and an identity transform is not allowed.

Clearly, clarifications are required. Perhaps this PR should be integrated into the v0.5 update?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants