Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-34175: Simplify ingest with extended exposure table and related dimensions #54

Merged
merged 1 commit into from Apr 5, 2022

Conversation

PaulPrice
Copy link
Contributor

The ObservationInfo class gets post-processed to set up the
properties, which is not convenient for subclasses (which would
have to replicate the post-processing). Moved the post-processing
into a metaclass, so that this post-processing is done automatically
for ObservationInfo and all subclasses. This is more convenient
than the simpler class decorator solution, because the user doesn't
need to add the decorator.

@PaulPrice PaulPrice requested a review from timj March 24, 2022 18:47
@timj
Copy link
Member

timj commented Mar 24, 2022

General comments before I start looking in detail:

  • Can't this be done using __init_subclass__ rather than a metaclass?

  • I don't like the idea of having to specify the core model properties in the subclass. Can we add a concept of EXTRAS and then the subclass specifies only the things it needs in addition to the main ones? That would make it clear if a subclass is explicitly breaking with the base class data model by removing things.

  • From a data model perspective I quite like the idea of making the properties look like they are model extensions. So ObservationInfo.ext_slit_pos rather than ObservationInfo.slit_pos.

  • Did you decide against having the MetadataTranslator publish the non-standard properties as extras. It would make ObservationInfo a little more dynamic than it currently is but would let you continue to use ObservationInfo directly.

  • I don't really like the idea that the caller has to know that their instrument has to be used with a special ObservationInfo. An alternative could be to have a class method ObservationInfo.from_metadata() that can return a subclass that is known to the metadata translator subclass (once you have found the right MetadataTranslator there's a lot that it can know since it knows everything about the data). For example the astrometadata commands will work for PFS but they will only know a subset of information so the write-sidecar functionality won't work for you when writing translated metadata and astrometadata translate won't report the special items.

@PaulPrice
Copy link
Contributor Author

That's useful, thanks. I'll work on cleaning it up in the way you've suggested.

Copy link
Member

@timj timj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much better. Thank you. My main concern is that to_simple is not self-describing and that means that reading the JSON back is going to either fail or drop extensions on the floor.


return state
return self.extensions, state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we can't read a previously pickled ObservationInfo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right. Do you have reason to think these exist in the wild, or are you simply prompting me to document this change in behaviour somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure but it is a change in interface and the package has been around for a while. Presumably we can see what state there is in the pickle and treat the extensions as optional?

@@ -363,7 +443,7 @@ def to_json(self):
return json.dumps(self.to_simple())

@classmethod
def from_simple(cls, simple):
def from_simple(cls, simple, extensions=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter is not documented. How does a caller know which extensions are relevant though when they have some serialized JSON? The JSON has the keys but it doesn't know the definitions of the extensions. Maybe we should store the translator class in the JSON (using lsst.utils.introspection.get_full_type_name), and by extension in the output of to_simple, so that we can import it (with lsst.utils.doImportType) and ask for the extensions?

Does to_simple include the ext_ in the name so that we know the JSON contains extensions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that this leaves the possibility of us having to run doImportType on untrusted text. I'm not sure what other option we have though when reading in a JSON file. For scalars we can probably insert some default property definitions derived from the python type in the JSON but for some special thing equivalent to an astropy quantity that will be a lot less feasible.

@@ -1056,11 +1074,12 @@ def to_property(self):
# These translator methods can be defined in terms of other properties
CONCRETE = set()

for name, description in PROPERTIES.items():
for name, definition in PROPERTIES.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably deserves a comment here to say that subclasses with extensions are assumed to not need abstract methods created for them.

import unittest
from astro_metadata_translator import ObservationInfo, makeObservationInfo, PropertyDefinition

"""Test that extensions to the core set of properties works"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a test where you:

  • Create a little translator class that defines and extension and an extension translation method (maybe a trivial_map and an explicit to_ method)
  • Translate a simple header
  • Check that the extenion properties are filled in.

I think this could be done with a very small change to the test_translation.py test file (that uses InstrumentTestTranslator).

# Type checking is applied, like in the original
obsinfo.makeObservationInfo(extensions=self.extensions, ext_myInfo=12345)

def test_pickle(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to show that JSON round trips and you can serialize an ObservationInfo to JSON and then read the JSON back in and get an equivalent ObservationInfo.

Some instruments have important properties beyond the core set
that have been defined, and we want to be able to extract those
as well. Added the concept of 'extensions' to both the
MetadataTranslator and ObservationInfo. This allows instruments
to define additional instrument-specific properties in their
appropriate MetadataTranslator, and these get picked up by the
ObservationInfo.

Unfortunately, extension properties have to get defined in the
ObservationInfo at runtime (because we don't know what they are
at compile time), which means we need to use a different scheme
for making the fields read-only, and there's no clear place to
hang their docstrings. But this is counteracted by the fact that
users don't need to choose a specific class to use, or define
the extensions themselves.
@PaulPrice PaulPrice merged commit f4b76ee into main Apr 5, 2022
@PaulPrice PaulPrice deleted the tickets/DM-34175 branch April 5, 2022 19:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants