clearly defined anatomical orientation #208

Open
dyf opened this issue Jul 26, 2023 · 20 comments

Comments

@dyf

dyf commented Jul 26, 2023

The anatomical orientation of an array is a critical piece of metadata for downstream analysis, particularly for the increasingly common task of aligning acquired images to an atlas for anatomical quantification and standardized comparison to other data.

Currently the NGFF spec includes coordinate transformations, but the anatomical orientation of the sample once the transformation is applied is unspecified. As a result, tools simply make assumptions about orientation, which leads to wasted time and erroneous results. In systems with a fair amount of anatomical symmetry like the brain, it is impossible to retroactively inspect data to understand the orientation in which it was acquired. A place in the spec where we are explicit about anatomical orientation will allow acquisition and analysis tools to stop making assumptions.

I propose we add a field with a controlled vocabulary for anatomical orientation. Some prior art:

The ITK ecosystem uses 3 letter acronyms to describe anatomical orientation. For example RAS corresponds to:

  • the first axis increases from left to Right
  • the second axis increases from posterior to Anterior
  • the third axis increases from inferior to Superior

This works, but the acronyms are ambiguous: I continually have to look up whether R means left-to-right or right-to-left.
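The acronym convention can be unpacked mechanically. The following is a minimal sketch, assuming each letter names the direction toward which the axis increases (the reading given above for RAS). The function name `expand_orientation_code` and the lookup tables are hypothetical and are not part of ITK or NGFF.

```python
# Each letter names the anatomical direction toward which the axis increases,
# so the explicit term runs from the opposite side to that letter.
_LETTER_TO_TERM = {
    "R": "left-to-right",
    "L": "right-to-left",
    "A": "posterior-to-anterior",
    "P": "anterior-to-posterior",
    "S": "inferior-to-superior",
    "I": "superior-to-inferior",
}

# Letters that describe the same anatomical axis and so cannot appear together.
_SAME_AXIS = [{"R", "L"}, {"A", "P"}, {"S", "I"}]


def expand_orientation_code(code: str) -> list[str]:
    """Expand an acronym such as "RAS" into one explicit term per axis."""
    letters = list(code.upper())
    unknown = [c for c in letters if c not in _LETTER_TO_TERM]
    if unknown:
        raise ValueError(f"unknown orientation letters: {unknown}")
    for pair in _SAME_AXIS:
        # Reject codes such as "RLS" or "RRS" that name one axis twice.
        if sum(c in pair for c in letters) > 1:
            raise ValueError(f"code {code!r} names the same anatomical axis twice")
    return [_LETTER_TO_TERM[c] for c in letters]


print(expand_orientation_code("RAS"))
```

Writing the explicit terms out per axis removes the need to remember which end of the axis a letter refers to.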

Nifti's coordinate transforms are assumed to map data into RAS. This approach also works, however it relies on users and data generators being familiar with the Nifti spec and abiding by it.

The Brain Image Library asks for a more explicit definition of anatomical orientation. Submitters choose for each axis from a controlled vocabulary that resembles the following:

  • left-to-right
  • right-to-left
  • anterior-to-posterior
  • posterior-to-anterior
  • inferior-to-superior
  • superior-to-inferior

We have adopted this at the Allen Institute for Neural Dynamics in our data schema. We may consider adding dorsal-to-ventral, ventral-to-dorsal, rostral-to-caudal, and caudal-to-rostral to this vocabulary.

At the recent Get Your Brain Together hackathon hosted at the Allen Institute this was discussed at length.

Please consider adding an anatomicalOrientation field to axes metadata. Because this would be a controlled vocabulary, I recommend separating it from longName, which is uncontrolled (see #142). I am of course also open to this living elsewhere.
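To make the request concrete, here is a minimal sketch of how a reader might check such a field. It assumes the field sits directly on each axis object and is optional. The field placement, the helper name `find_orientation_problems`, and the example axes are illustrative only and are not part of the current NGFF spec.

```python
import json

# The six terms listed above; the dorsal/ventral and rostral/caudal terms
# are only under consideration and are left out of this sketch.
ORIENTATION_TERMS = frozenset({
    "left-to-right", "right-to-left",
    "anterior-to-posterior", "posterior-to-anterior",
    "inferior-to-superior", "superior-to-inferior",
})


def find_orientation_problems(axes):
    """Return a message for each spatial axis whose orientation is not in the vocabulary."""
    problems = []
    for axis in axes:
        value = axis.get("anatomicalOrientation")
        # Non-spatial axes and axes without the (optional) field are skipped.
        if axis.get("type") != "space" or value is None:
            continue
        if value not in ORIENTATION_TERMS:
            problems.append(f"axis {axis['name']!r}: unknown orientation {value!r}")
    return problems


# Hypothetical axes metadata carrying the proposed field.
example_axes = json.loads("""
[
  {"name": "t", "type": "time", "unit": "second"},
  {"name": "z", "type": "space", "unit": "micrometer", "anatomicalOrientation": "inferior-to-superior"},
  {"name": "y", "type": "space", "unit": "micrometer", "anatomicalOrientation": "posterior-to-anterior"},
  {"name": "x", "type": "space", "unit": "micrometer", "anatomicalOrientation": "left-to-right"}
]
""")

print(find_orientation_problems(example_axes))
```

Because the vocabulary is closed, a tool can reject an unexpected value up front rather than guessing at its meaning.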

Should this have a default, I suggest it be RAS to be consistent with Nifti.

@d-v-b
Contributor

d-v-b commented Jul 26, 2023

@dyf thanks for raising this important issue! I think what you are describing could be expressed via the coordinateSystem semantics defined over in #138, have you had a look at that PR?

If you have imaging data in the condition where it was acquired / stored in instrument coordinates, but can be transformed to anatomical coordinates via some transformation, you basically have two coordinateSystems, and your controlled vocabulary could be used for the names of the axes in the my_reference_brain coordinateSystem.

"coordinateSystems": [
    {
        "name": "instrument",
        "axes": [
            {"name": "z", "type": "space", "unit": "micrometer"},
            {"name": "y", "type": "space", "unit": "micrometer"},
            {"name": "x", "type": "space", "unit": "micrometer"}
        ]
    },
    {
        "name": "my_reference_brain",
        "axes": [
            {"name": "anterior-to-posterior", "type": "space", "unit": "micrometer"},
            {"name": "inferior-to-superior", "type": "space", "unit": "micrometer"},
            {"name": "left-to-right", "type": "space", "unit": "micrometer"}
        ]
    }
],

The transformation from one coordinateSystem to the other would be contained in different metadata, as per #138. (@bogovicj please correct me if I'm getting anything wrong here)

Would this work for you @dyf ?

@bogovicj

bogovicj commented Jul 26, 2023

Yup, the way @d-v-b described is how I had in mind to describe anatomical coordinates using the v0.5 spec - coming soon.

@jcfr

jcfr commented Jul 26, 2023

The ITK ecosystem uses 3 letter acronyms to describe anatomical orientation.

Timely discussion as we just migrated the content of the wiki page to Read the Docs.
See https://slicer.readthedocs.io/en/latest/user_guide/coordinate_systems.html

cc: @lassoan @pieper @muratmaga

@d-v-b
Contributor

d-v-b commented Jul 26, 2023

@dyf wrote

Please consider adding an anatomicalOrientation field to axes metadata. Because this would be a controlled vocabulary, I recommend separating it from longName, which is uncontrolled (see #142). I am of course also open to this living elsewhere.

Regarding anatomicalOrientation, for my data (images of cells and tissues), this concept would be meaningless, and I think that's a red flag as far as putting this in spec is concerned. I think we should try to avoid adding metadata that only applies to one subdomain of imaging -- (you might reasonably have the same response if I proposed coverSlipDirection metadata, or anything else that was specific to the details of how my images are produced).

In that vein, your point about longName being uncontrolled is interesting. Note that there is nothing stopping the community of neuroimaging scientists from introducing their own refinement of ome-ngff that restricts the type of longName from string to a concrete subtype of string, such as your controlled vocabulary. This might even be a better outcome for you, because then you are free to evolve your controlled vocabulary without first needing to change the ome-ngff spec. For example, your specification could state "image data must be stored as ome-ngff, with the following extra constraints on the names of axes.... etc". I think that's a solid general solution to problems unique to subdomains of imaging -- those communities can "subclass" the spec and add extra domain-specific constraints as needed. As long as the resulting ome-ngff files are consistent with the main spec, this seems like a good outcome.

@dyf
Author

dyf commented Jul 26, 2023

Wow, amazed at the quick response, thanks all!

How about a more domain-agnostic name? e.g. direction, orientation or physicalOrientation?

That said, I'm open to putting this into an extension (e.g. an anatomy or neuroanatomy extension). It's most important (to me) that downstream applications know what terms to expect and how they are defined, so either in the core spec or an easily discoverable extension would be fine.

Pardon my ignorance - is there an extension mechanism already?

@joshmoore
Member

joshmoore commented Jul 27, 2023

In talking to the good folks of Get Your Brain Together (late one night), I proposed the ability to add a field that is defined by an enum or ontology managed by a community outside of the NGFF process, either as part of the existing coordinate system or perhaps through a dedicated subclass of existing systems.

The "name" fields in the above examples could almost be used for these purposes but issues such as:

  • namespace collision with other communities
  • need for multiple coordinate systems of the same type

could benefit from having additional type information that the medical community can detect and "do the right thing" with. I think the question is whether this should be a specifically anatomical extension, or if there's a way to add this type information more generically.

(Oops. Comments were added while I was writing. I think nonetheless that this holds. I don't think names are sufficient alone. Another possible option would be a prefixing mechanism that is community-specific, but this would likely become unwieldy.)

@dyf
Author

dyf commented Jul 28, 2023

@joshmoore I agree that names are insufficient for the reasons you say.

Adding a new field that refers to an enum/ontology would be great. Different communities would be able to use different controlled terms by indicating what ontology they use. Validation may get tricky, but it's at least a start.

I think orientation might be a good generic term that others could use for similar purposes.

@d-v-b
Contributor

d-v-b commented Jul 28, 2023

I had a look at how the CF conventions handle this: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#standard-name. As I understand it, a physical quantity (like length) can be associated with a standard_name field, which is basically used as a lookup into a collection of standard quantities. Materially, this would result in metadata like:

{
    "name" : "my_reference_brain",
    "axes": [
        {"name": "a-p", "standard_name": "anterior-to-posterior", "type": "space", "unit": "micrometer"},
        {"name": "i-s", "standard_name": "inferior-to-superior", "type": "space", "unit": "micrometer"},
        {"name": "l-r", "standard_name": "left-to-right", "type": "space", "unit": "micrometer"}
    ]
}

I think the standard_name field would have to be nullable, because there's plenty of communities that don't have strict semantics for axes. In the case of CF, the CF metadata standard itself includes a reference to the list of standard_name values. Unless we want to do something similar (i.e., bake these names into an ome-ngff-adjacent standard), there might need to be an additional piece of metadata that links out to a resource / authority that defines the semantics of "anterior-to-posterior".

@joshmoore said:

The "name" fields in the above examples could almost be used for these purposes but issues such as:

namespace collision with other communities
need for multiple coordinate systems of the same type

Can you elaborate on these concerns? I don't really understand how namespace collision with another community would be a problem -- presumably all that matters is that community X can save and load their data. If community Y uses the same metadata names as X, why is that bad? Perhaps I'm missing something. And I didn't understand the second concern at all.

@dyf
Author

dyf commented Aug 1, 2023

I like adding a standard_name field along with a clear way to refer to an outside ontology. I would prefer an additional field for this (rather than a prefix). I would be happy to find or create an ontology we could refer to via URI.

I can't speak for @joshmoore, but for me a primary goal is to remove undocumented assumptions so that tools can reliably interpret the orientation of images. If multiple subcommunities have different, disparately (or un-) documented conventions for how to interpret name, then the situation is still muddled.

@joshmoore
Member

@d-v-b Can you elaborate on these concerns? I don't really understand how namespace collision with another community would be a problem -- presumably all that matters is that community X can save and load their data. If community Y uses the same metadata names as X, why is that bad?

I was primarily referring to collisions either on the namespace prefix itself or on the key if there is no namespace prefix. What's bad is if community X and Y cannot tell if the data came from the other community.

@d-v-b And I didn't understand the second concern at all.

I may have misunderstood @dyf's example, but if detection is based on the use of a unique value as the name, then that coordinate system can only exist once. i.e., let's not overload name with multiple responsibilities -- currently it is for unique look up of the coordinate systems, and I don't think it should also be responsible for determining orientation (or other metadata that could be requested in the future).

@dyf I like adding a standard_name field along with a clear way to refer to an outside ontology. I would prefer an additional field for this (rather than a prefix). I would be happy to find or create an ontology we could refer to via URI.

In my opinion, standard_name is not really descriptive of what we are looking for. I also agree that prefixes aren't ideal. And I'm not sure we should limit to a hard-coded ontology (i.e. it should be extensible), but 👍 for @dyf & co. finding or creating an ontology that works for their purposes.

@d-v-b
Contributor

d-v-b commented Aug 3, 2023

I think this issue relates to #203, insofar as we are thinking about giving a "proper name" to a measurement / quantity (anatomical coordinates can be thought of as special lengths).

Depending on how big the ontology is going to be, I wonder if we should consider requiring that it be versioned and stored inside the zarr metadata under the appropriate namespace (maybe under an ome.ontologies key in the container root?). This would make the data self-describing and make things much easier for parsers (i.e., you should be able to validate without an internet connection).
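As a rough illustration of the self-describing idea, the sketch below validates axes against an ontology stored in the same metadata, with no network access. Only the `ome.ontologies` key comes from the suggestion above. The nested layout, the `ontology` and `standard_name` axis fields, the ontology name, and the version string are all assumptions made for the example.

```python
# Hypothetical root metadata: a small, versioned ontology stored inline,
# plus axes that point at it by name.
root_attrs = {
    "ome": {
        "ontologies": {
            "anatomical-orientation": {
                "version": "0.1.0",  # placeholder version, not a real release
                "terms": [
                    "left-to-right", "right-to-left",
                    "anterior-to-posterior", "posterior-to-anterior",
                    "inferior-to-superior", "superior-to-inferior",
                ],
            }
        },
        "axes": [
            {"name": "a-p", "standard_name": "anterior-to-posterior",
             "ontology": "anatomical-orientation", "type": "space"},
            {"name": "i-s", "standard_name": "inferior-to-superior",
             "ontology": "anatomical-orientation", "type": "space"},
            {"name": "c", "standard_name": None, "type": "channel"},
        ],
    }
}


def validate_axes_offline(attrs):
    """Check each axis's standard_name against the ontology embedded in attrs."""
    ome = attrs["ome"]
    ontologies = ome.get("ontologies", {})
    errors = []
    for axis in ome["axes"]:
        standard_name = axis.get("standard_name")
        if standard_name is None:
            # A null standard_name means the axis makes no semantic claim.
            continue
        ontology = ontologies.get(axis.get("ontology"))
        if ontology is None:
            errors.append(f"axis {axis['name']!r}: ontology is not embedded in the metadata")
        elif standard_name not in ontology["terms"]:
            errors.append(f"axis {axis['name']!r}: {standard_name!r} is not a term of its ontology")
    return errors


print(validate_axes_offline(root_attrs))
```

A link-based design would replace the inline `terms` list with a URI, trading file size for a dependency on the linked resource staying reachable.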

@pieper

pieper commented Aug 4, 2023

It would be great to be able to use dicom headers for lossless conversion between ome-ngff and WSI or other dicom objects. A lot of the concepts mentioned in this thread already have well standardized representations in dicom and it would be a shame to devise something from scratch that duplicates the concepts in an incompatible way. I know a lot of people think dicom is complex and hard to use, but in my experience it's the data that's complex and so any standard will be complex and we might as well use the one we have.

It may not be realistic to assume that everyone will use dicom, but the ability to losslessly transcode between formats seems like a very valuable goal.

@dyf
Author

dyf commented Aug 4, 2023

@d-v-b I think this issue relates to #203, insofar as we are thinking about giving a "proper name" to a measurement / quantity (anatomical coordinates can be thought of as special lengths).

Yes, we put axis units next to anatomical orientation in our metadata schema.

@d-v-b Depending on how big the ontology is going to be, I wonder if we should consider requiring that it be versioned and stored inside the zarr metadata under the appropriate namespace (maybe under an ome.ontologies key in the container root?). This would make the data self-describing and make things much easier for parsers (i.e., you should be able to validate without an internet connection).

It's quite small. I would love to be able to include it here so that we can use it for validation.

@joshmoore
Member

d-v-b commented 4 days ago
I think this issue relates to #203, insofar as we are thinking about giving a "proper name" to a measurement / quantity (anatomical coordinates can be thought of as special lengths).

I wonder if meaning or interpretation are starting to approach what you are getting at.

d-v-b commented 4 days ago
Depending on how big the ontology is going to be, I wonder if we should consider requiring that it be versioned and stored inside the zarr metadata under the appropriate namespace (maybe under an ome.ontologies key in the container root?). This would make the data self-describing and make things much easier for parsers (i.e., you should be able to validate without an internet connection).

That might make things more self-describing for humans but not for the programs that will need to understand a priori. Further, if this is a general mechanism that we want to use for the interpretation of other fields in the future, I fear the inlining burden will become burdensome.

dyf commented 3 days ago
It's quite small. I would love to be able to include it here so that we can use it for validation.

What do you mean by "include it here", @dyf? i.e. in this issue?

pieper commented 3 days ago
It would be great to be able to use dicom headers for lossless conversion between ome-ngff and WSI or other dicom objects. A lot of the concepts mentioned in this thread already have well standardized representations in dicom and it would be a shame to devise something from scratch that duplicates the concepts in an incompatible way. I know a lot of people think dicom is complex and hard to use, but in my experience it's the data that's complex and so any standard will be complex and we might as well use the one we have. It may not be realistic to assume that everyone will use dicom, but the ability to losslessly transcode between formats seems like a very valuable goal.

💯 for working towards interoperability, @pieper. Can you show a snippet of what you think that header information would look like?

dyf commented 3 days ago

@d-v-b I think this issue relates to #203, insofar as we are thinking about giving a "proper name" to a measurement / quantity (anatomical coordinates can be thought of as special lengths).

Yes, we put axis units next to anatomical orientation in our metadata schema.

Can you share an example of that, @dyf? I think working towards a collection of N similar json blurbs that encode the same information that we can start referring to by name for this discussion would be useful.

@d-v-b
Contributor

d-v-b commented Aug 7, 2023

@joshmoore

d-v-b commented 4 days ago
I think this issue relates to #203, insofar as we are thinking about giving a "proper name" to a measurement / quantity (anatomical coordinates can be thought of as special lengths).

I wonder if meaning or interpretation are starting to approach what you are getting at.

Hmm.. these are a bit subjective, no? I feel like standard_name or ontology_identifier are a lot more explicit. But I think picking the right field name(s) would be a lot easier if we had examples of the ontology data referenced by those fields.

d-v-b commented 4 days ago
Depending on how big the ontology is going to be, I wonder if we should consider requiring that it be versioned and stored inside the zarr metadata under the appropriate namespace (maybe under an ome.ontologies key in the container root?). This would make the data self-describing and make things much easier for parsers (i.e., you should be able to validate without an internet connection).

That might make things more self-describing for humans but not for the programs that will need to understand a priori. Further, if this is a general mechanism that we want to use for the interpretation of other fields in the future, I fear the inlining burden will become burdensome.

I think the alternative to making a dataset self-describing is to use links, but links can break, either because the content at the URI has moved, or because internet connectivity is unreliable. I'd be a little uncomfortable requiring an internet connection for validating an ome-ngff with ontologies in it, at least if the ontology information is small enough to fit in JSON. Hopefully an example ontology document can clear up some of these issues.

@dyf
Author

dyf commented Aug 8, 2023

@d-v-b I want to include the terms that are in the issue (e.g. anterior-to-posterior). So, it would be 6-8 terms in a JSON file. We could extend it if needed.

If you are looking for a relevant external source, see:

https://openminds.ebrains.eu/v3/ --> controlledTerms -> anatomicalAxisOrientation.

They describe each term in jsonld, e.g.: RAS, RAI.

The difference is that I am asking for these 3-letter codes to be broken up so we can describe each axis clearly and independently (rather than assume that all arrays are 3-dimensional).

We could very easily package these terms up into a JSON file and add them to the repository.

@dclunie

dclunie commented Aug 8, 2023

FYI, the DICOM coordinate systems and orientations may be relevant when something is patient-relative.

The co-ordinate system is used when the origin of an image (its top left hand corner, TLHC) and the unit vectors defining the orientation of the rows and columns of an image are to be described. DICOM coordinates are positive in the LPS (left-posterior-superior) directions.
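Since the thread also mentions RAS as a possible default, it may help to see how the two conventions relate: LPS and RAS share the third axis and differ only in the sign of the first two. A minimal sketch, assuming points are plain (x, y, z) tuples in patient coordinates; the function names are illustrative and do not come from any DICOM or NIfTI library.

```python
def lps_to_ras(point):
    """Convert a patient-space point from LPS to RAS by negating the first two axes."""
    x, y, z = point
    return (-x, -y, z)


def ras_to_lps(point):
    """Convert a patient-space point from RAS to LPS; the same sign flip, applied in reverse."""
    x, y, z = point
    return (-x, -y, z)


# A point 10 mm toward the patient's left, 20 mm posterior, 30 mm superior.
lps_point = (10.0, 20.0, 30.0)
ras_point = lps_to_ras(lps_point)
print(ras_point)
print(ras_to_lps(ras_point) == lps_point)
```

The conversion is trivial only when the convention in use is known, which is the gap the proposed metadata is meant to close.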

For 2D images (such as a mammogram) the row and column directions of an image are defined in a patient-relative sense categorically.

These are described at:

Note that quadrupeds as well as bipeds are accounted for (theoretically).

See also this (outdated) explanation:

There are also (US) volume-relative and slide-relative coordinate systems:

@dyf
Author

dyf commented Aug 16, 2023

Do folks have an opinion on the format and location of the controlled vocabulary? I am planning to open a draft PR for this in the next week or so.

@lassoan

lassoan commented Aug 16, 2023

It would be nice if the file format could store standard coded entry data as specified in the DICOM standard (and used in all clinical imaging applications). You can find all the intricate details in the DICOM standard, but in practice it is quite simple - an entry is specified by these 3 strings:

  • coding scheme: name of the coding scheme (e.g., SCT = Snomed Clinical Terms, FMA = Foundational Model of Anatomy, ...)
  • code value: identifier specified in the coding scheme (usually not a human-readable term, but some alphanumeric code)
  • code meaning: optional human-readable description (for example, it can be displayed for the user if the application does not know the term)

@pieper

pieper commented Aug 16, 2023

+1 to @lassoan's suggestion of adopting the (scheme, value, meaning) code tuple concept from DICOM. Another feature of it is that in addition to standardizing references to useful vocabularies like SCT and FMA, anyone can designate custom vocabularies by prefixing the scheme with 99 as a flag that it's non-standard. So there's flexibility to support research use cases.
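The (scheme, value, meaning) tuple could be carried in JSON metadata along the following lines. This is a sketch only: the class name `CodedEntry`, the JSON key names, and the `99EXAMPLE` scheme with its `AP` code are made up for illustration and are not real DICOM or NGFF identifiers.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class CodedEntry:
    """A coded entry in the (scheme, value, meaning) style described above."""
    scheme: str                    # name of the coding scheme, e.g. "SCT" or "FMA"
    value: str                     # identifier within that scheme
    meaning: Optional[str] = None  # optional human-readable description

    @property
    def is_private(self) -> bool:
        # Per the comment above, a "99" prefix flags a non-standard scheme.
        return self.scheme.startswith("99")

    def to_json(self) -> dict:
        # Omit the meaning when it was not provided.
        return {k: v for k, v in asdict(self).items() if v is not None}


# Hypothetical private scheme and code for an anatomical axis direction.
entry = CodedEntry(scheme="99EXAMPLE", value="AP", meaning="anterior-to-posterior")
print(entry.is_private)
print(entry.to_json())
```

An application that does not recognize the scheme can still show the meaning string to the user, which is the fallback behavior described for code meaning above.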


8 participants