Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-13711: Add type of observation information to visitInfo #652

Merged
merged 6 commits into from Sep 12, 2022

Conversation

parejkoj
Copy link
Contributor

No description provided.

Copy link
Member

@kfindeisen kfindeisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, style comments only.

@@ -89,6 +89,7 @@ class VisitInfo : public table::io::PersistableFacade<VisitInfo>, public typehan
* @param[in] id Identifier of this full focal plane data.
* @param[in] focusZ Defocal distance of main-cam hexapod in mm. 0 is in focus.;
* Extra-focal is negative while intra-focal is positive.
* @param[in] observationType Type of this observation (e.g. science, dark, flat, bias, focus).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this open-ended, or is there a set of allowed values?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

open-ended. Butler can't enforce specific observation types.

Copy link
Member

@kfindeisen kfindeisen Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how the Butler is relevant here.

My question was motivated by the fact that, if this is going into VisitInfo, then it may be parsed by downstream code (otherwise why have it?). Establishing to the reader whether or not there are conventions, and if so what they are, is important for avoiding bugs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently butler is the main user but I really meant that the metadata translator can't know every single observation type because instruments have many different modes of operation. You can't know the full list ever because there is no global data model of allowed observing modes. "dark" and "bias" are fairly standard but DECam has modes that HSC doesn't have and writing them all down is impossible. You have to refer to the specific instrument's documentation. Each metadata translator can know the obs types for its instrument but that's the limit.

include/lsst/afw/image/VisitInfo.h Outdated Show resolved Hide resolved
src/image/VisitInfo.cc Outdated Show resolved Hide resolved
@@ -364,6 +377,7 @@ def _testIsEmpty(self, visitInfo):
self.assertEqual(visitInfo.getInstrumentLabel(), "")
self.assertEqual(visitInfo.getId(), 0)
self.assertTrue(math.isnan(visitInfo.getFocusZ()))
self.assertTrue(visitInfo.getFocusZ(), "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be assertEqual(visitInfo.getObservationType(), "")?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this: because of this typo, I also missed adding the empty tests for the other new properties.

include/lsst/afw/image/VisitInfo.h Show resolved Hide resolved
@@ -73,7 +74,8 @@ void declareVisitInfo(lsst::utils::python::WrapperCollection &wrappers) {
"rotType"_a = RotType::UNKNOWN,
"observatory"_a = coord::Observatory(nanAngle, nanAngle, nan),
"weather"_a = coord::Weather(nan, nan, nan), "instrumentLabel"_a = "", "id"_a = 0,
"focusZ"_a = nan, "observationType"_a = "");
"focusZ"_a = nan, "observationType"_a = "", "scienceProgram"_a = "",
"observationReason"_a = "", "object"_a = "", "hasSimulatedContent"_a = false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of a bool field having a default is a bit scary, since both true and false look like meaningful values. Maybe add a comment justifying why false should be the default (I assume for backward-compatibility)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I worried about that too, but the infrastructure to make this behave as a None-able field seemed excessive (I don't think we want have nullptrs in VisitInfo). I'll add a note here, and I think to the couple of places in VisitInfo.cc that also default to false. @timj said that going forward it should always be properly populated for LSST data (and we'll fix the metadata where it's not):

https://lsstc.slack.com/archives/C2JPMCF5X/p1661906553417399?thread_ts=1661881492.710089&cid=C2JPMCF5X

I'm open to suggestions though, since I'm also worried about this causing someone to be misled when using data from other sources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ObservationInfo will always set it so if your VisitInfo is set from raw data that have been processed in gen3 it will always be set and ObservationInfo declares it to be bool -- if someone creates their ObservationInfo and requests that the simulated content flag not be set and then passes it to VisitInfo then it will have a None value. If you are worried that this might be common then I don't know how easy it is to have a True/False/Undef state in C++.

Copy link
Member

@kfindeisen kfindeisen Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note is all I was asking for; I definitely don't think using something like std::optional is worth it!

(Though, given the follow-up discussion, I trust there's a better argument for its correctness than "This constructor must only be called by ObservationInfo"? I assumed it was more that any code that doesn't set hasSimulatedContent works in contexts that don't need to distinguish simulated from real data... though now that I think about it that's iffy for the same reason.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we actually have a use case for hasSimulatedContent in the pipeline yet: I suspect it's mostly for notebook analysis of various test stand data that was made to look like it's on-mountain. I've added notes, though I wish I had something more to say than "// default false for backwards compatibility"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and this reminds me: it currently prints 0 or 1, since those are the C++ true/false output values. I don't think we want to use std::boolalpha, since printing will happen from python and the capitalization will make them unhelpful. So, I'm left with just having them printed as 0 or 1 and it's up to the user to interpret?

Copy link
Member

@kfindeisen kfindeisen Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer boolalpha, it's more reader-friendly. I think it's easier to interpret "true really means True" than "1 really means True". Though I'd forgotten how hard it is to use stream flags without side effects. 😖

scienceProgram(schema.addField<std::string>(
"scienceProgram", "observing program (survey or proposal) identifier", "", 0)),
observationReason(schema.addField<std::string>(
"observationReason", "reason this observation was taken, or its purpose", "", 0)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with observationType, a list of standard values might be a better comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is free format text at the mercy of the person doing the observing or the metadata translator's interpretation of some headers.

_scienceProgram(getString(metadata, "PROGRAM")),
_observationReason(getString(metadata, "REASON")),
_object(getString(metadata, "OBJECT")),
_hasSimulatedContent(false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this reading "HAS-SIMULATED-CONTENT"?

Copy link
Contributor Author

@parejkoj parejkoj Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads from that below: there's no getBool method defined in this file, so I have to get it like the int64 fields, inside an if block.

Comment on lines 363 to 409
std::vector<std::string> keyList = {"EXPID", "EXPTIME", "DARKTIME", "DATE-AVG", "TIMESYS",
"TIME-MID", "MJD-AVG-UT1", "AVG-ERA", "BORE-RA", "BORE-DEC",
"BORE-AZ", "BORE-ALT", "BORE-AIRMASS", "BORE-ROTANG", "ROTTYPE",
"OBS-LONG", "OBS-LAT", "OBS-ELEV", "AIRTEMP", "AIRPRESS",
"HUMIDITY", "INSTRUMENT", "IDNUM", "FOCUSZ", "OBSTYPE"};
std::vector<std::string> keyList = {"EXPID",
"EXPTIME",
"DARKTIME",
"DATE-AVG",
"TIMESYS",
"TIME-MID",
"MJD-AVG-UT1",
"AVG-ERA",
"BORE-RA",
"BORE-DEC",
"BORE-AZ",
"BORE-ALT",
"BORE-AIRMASS",
"BORE-ROTANG",
"ROTTYPE",
"OBS-LONG",
"OBS-LAT",
"OBS-ELEV",
"AIRTEMP",
"AIRPRESS",
"HUMIDITY",
"INSTRUMENT",
"IDNUM",
"FOCUSZ",
"OBSTYPE",
"PROGRAM",
"REASON",
"OBJECT",
"HAS-SIMULATED-CONTENT"};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😬

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what clang-format did, since HAS-SIMULATED-CONTENT is so much longer than the rest.

@parejkoj parejkoj force-pushed the tickets/DM-13711 branch 2 times, most recently from c6d821e to a4f5593 Compare September 6, 2022 23:06
@parejkoj
Copy link
Contributor Author

parejkoj commented Sep 6, 2022

I've added two explicit stringification tests: I know that's generally implementation dependent, but I discovered some bugs (missing comas and spaces) in my initial version and figured making it explicit would help with that. It also makes the "boolalpha" printing more obvious.

@parejkoj parejkoj force-pushed the tickets/DM-13711 branch 2 times, most recently from c11d0c3 to 00ce055 Compare September 9, 2022 18:31
This shouldn't have been in this if statement.
scienceProgram, observationReason, object, hasSimulatedContent are all
requested by the commissioning team; adding them now means less file
version number churn.
I noticed I'd made some typos in my stringification, and this should
help ensure such things don't happen by making the stringification more
explicit in the future.
@parejkoj parejkoj merged commit 4fea24e into main Sep 12, 2022
@parejkoj parejkoj deleted the tickets/DM-13711 branch September 12, 2022 21:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants