Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-28258: Epoch is not getting passed consistently among reference catalog loading functions #223

Merged
merged 1 commit into from Jan 13, 2021

Conversation

laurenam
Copy link
Contributor

@laurenam laurenam commented Jan 8, 2021

No description provided.

Copy link
Contributor

@parejkoj parejkoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we ensure that this works in gen3, since I don't think there are any gen3 refloader tests? See DM-27843 for an example where the gen2 and gen3 refcat loaders have diverged.

Please squash your two commits: the ones that add the test should be on the same commit. And typo/spellcheck your commit messages please.

@@ -543,8 +543,6 @@ def getMetadataBox(cls, bbox, wcs, filterName=None, photoCalib=None, epoch=None,
md.add("SMATCHV", 1, 'SourceMatchVector version number')
filterName = "UNKNOWN" if filterName is None else str(filterName)
md.add('FILTER', filterName, 'filter name for photometric data')
# Calling str on the epoch is a workaround for code that never worked anyway and was not used
# see DM-17438
md.add('EPOCH', "NONE" if epoch is None else str(epoch), 'Epoch (TAI MJD) for catalog')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the similar block of code in getMetadataCircle need to be unified, probably with @natelust 's help. Do we really need the str (and in the gen2 version below)? What happens without it? The "see DM-17438" in the removed comment isn't that helpful since I think that ticket is referring to the gen2 code, not this gen3 code.

The gen2/gen3 refcat loaders should really be derived from a shared base class, as some of the code can be shared. Can you at least lift out the shared parts of the metadata getters?

Copy link
Contributor Author

@laurenam laurenam Jan 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked that the str() representation appropriately returns a str of the float value for MJD when the epoch is passed as an astropy.time.Time object, which is what our code is (thanks to you!) doing consistently. This is the appropriate format for the metadata entry, so I think it is necessary and appropriate (I will double check what happens without the casting before merging).

As for consolidation, these methods add different metadata and other code (perhaps limited to the ’joinMatchListWithCatalog` function in this file), rely on it, so I’d prefer not to mess with that here. I would also like to punt on any changes towards Gen2/Gen3 sharing given my complete lack of instinct for any undesired consequences. If @natelust thinks what is here is “ok” for now, perhaps we could push that (or just the eventual removal of the gen2 bits) onto another ticket?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I have have looked further into any need for a str() casting on epoch which returns a string of the MJD float. While a non-casted epoch would indeed result in a TypeError: scalar 'Time' object is not subscriptable.
With the str() cast, the metadata "EPOCH" header would be a string type, which actually goes against any standards I've seen (for either "EPOCH" or "MJD*" entries which are typically of type float). So, I think what we really want here is epoch.mjd such that the metadata header is set as a float representation of the MJD stored in the astropy.time.Time object. As far as I can tell, the only code in our codebase that reads this metadata (other than the pipe_analysis scripts!) is the joinMatchListWithCatalogImpl() function which indeed tries to get this header as a float:

    try:
        epoch = matchmeta.getDouble('EPOCH')
    except (pexExcept.NotFoundError, pexExcept.TypeError):
        epoch = None  # Not present, or not correct type means it's not set

So, if it were a string, that getDouble('EPOCH') would fail with a TypeError and, since it's in a try, would get silently ignored and the epoch would get set to None. So, I have updated to use epoch.mjd.

Running with this update, I can confirm we now get the correct entry:

In [49]: packedMatches = butler.get("srcMatch", visit=visit, ccd=ccd) 
    ...: matchmeta = packedMatches.table.getMetadata()                                                            

In [50]: matchmeta.toDict()                                                                                       
Out[50]: 
{'RA': 149.787995492025,
 'DEC': 2.68818182859397,
 'RADIUS': 0.115824492696761,
 'SMATCHV': 1,
 'FILTER': 'i',
 'EPOCH': 56744.4706331655}

In [51]: matchmeta.getDouble("EPOCH")                                                                             
Out[51]: 56744.4706331655

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to be a bit careful with the EPOCH header value because it's actually part of the FITS standard:

KEYWORD:   EPOCH
REFERENCE: FITS Standard 
STATUS:    reserved
HDU:       any
VALUE:     real
COMMENT:   equinox of celestial coordinate system
DEFINITION: The value field shall contain a floating point number
giving the equinox in years for the celestial coordinate system in
which positions are expressed.  Starting with Version 1, the Standard
has deprecated the use of the EPOCH keyword and thus it shall not be
used in FITS files created after the adoption of the standard; rather,
the EQUINOX keyword shall be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think entering it as a float is at least more consistent with this than a string. As for the true/standard meaning of the header and any deprecation of the term "EPOCH", I think I'm not going to worry about that (unless you insist!) This is only used internally as metadata that gets attached to our reference match catalogs and it comes with a docstring:

In [16]: matchmeta.getComment("EPOCH")                                                                                
Out[16]: 'Epoch (TAI MJD) for catalog'

And the only code I know of that makes use of this entry does expect the "EPOCH" keyword.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given @timj 's quote from the FITS standard, epoch.mjd is incorrect: it should be epoch.jyear, since the standard calls for "the equinox in years", which is not mjd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, if we were strictly following FITS standards...but I don't think we are claiming to here. The number as it is currently being used is expecting MJD (as the docstring specifies), so we'd be better off changing the header name than changing the unit (no more units breakdowns. please!), but that may require RFC-ing...so I'm still rather inclined to leave it as is (and I do note that the word "epoch" seems to imply MJD throughout our codebase, so this doesn't feel like too much of a stretch...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please file a ticket to remove that epoch keyword and replace it with either EQUINOX (following the FITS standard) which would then be a Julian year, or MJD if we want to keep it as MJD. Julian year makes more sense to me as it is the value you would directly use in proper motion calculations, with values given in mas/year. EPOCH in daf_base DateTime.h is defined as "Julian epoch year".

tests/test_htmIndex.py Show resolved Hide resolved
tests/test_htmIndex.py Show resolved Hide resolved
@laurenam
Copy link
Contributor Author

laurenam commented Jan 9, 2021

How do we ensure that this works in gen3, since I don't think there are any gen3 refloader tests? See DM-27843 for an example where the gen2 and gen3 refcat loaders have diverged.

Hhhhmmm, I guess I was banking on ci_hsc_gen3 to pick up any problems, but I admit to still being embarrassingly naive on most-things-Gen3 and I’m not sure of the full extent of the ci_hsc_gen3 coverage (a few more details in the README there in this regard would go a long way!). Might @natelust or @TallJimbo weigh in here on whether or not we need to worry about Gen3 breakage that a Jenkins + ci_hsc would not have caught here?

@parejkoj
Copy link
Contributor

parejkoj commented Jan 9, 2021

"...worry about Gen3 breakage that a Jenkins + ci_hsc would not have caught here?" It's not so much breakage I'm worried about, but divergence. For example, I don't believe we have any checks that the gen2/gen3 ref loaders give you the same thing when you request a given region, and I don't think we have any unittests at all of the gen3 ref loader.

The epoch parameter was not being passed consistently among the different
functions that take it as an input parameter.  The epoch is used to correct
the reference catalogs for proper motion and parallax (when present).  This
correction was thus not taking effect when the passing down of the parameter
was incomplete.  This situation is remedied here.

The epoch object being passed around is consistently of type astropy.time.Time
or None.  The value desired for the metadata "EPOCH" header set in the
getMetadataCircle() and getMetadataBox() functions is the float representation
of MJD.  This is obtained via the .mdj attribute of the astropy.time.Time
object.

This also adds a unittest to check that the proper motion correction is
occurring when loading a reference catalog via the loadPixelBox() funtion
The check here uses an extreme value for the epoch such that the differences
will be significant and reasnoably tested to have changed via
assertFloatsNotEqual.  Note that this simply checks that the coordinates do
indeed change when the epoch is passed to loadPixelBox().  It makes no attempt
at assessing the correctness of the change.  This is left to the explicit
testProperMotion() test below.
@laurenam laurenam merged commit c400000 into master Jan 13, 2021
@laurenam laurenam deleted the tickets/DM-28258 branch January 13, 2021 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants