
DM-15751: Allow using jointcal output to make warps #232

Merged
merged 7 commits into master from tickets/DM-15751 on Nov 21, 2018

Conversation


@parejkoj parejkoj commented Nov 7, 2018

No description provided.

@parejkoj parejkoj force-pushed the tickets/DM-15751 branch 3 times, most recently from 04fe52d to a012427 Compare November 16, 2018 16:48
@@ -70,9 +66,15 @@ class CoaddBaseConfig(pexConfig.Config):
default=False
)
modelPsf = measAlg.GaussianPsfFactory.makeField(doc="Model Psf factory")
doApplyUberCal = pexConfig.Field(
doApplyJointcal = pexConfig.Field(
Contributor

This is an unnecessary interface change unrelated to this ticket, and would cause unnecessary disruption. This config does the same thing as it did before. Please don't change the name.

doApplyUbercal appears in our pipelines docs, multiple dmtn's, and the middleware team's workflow development. The command line interfaces go beyond Science Pipelines.

In general, config parameters can be added, but not removed. If you must add a doApplyJointcal, you can deprecate the old one like we did with doPsfMatch here: https://github.com/lsst/pipe_tasks/blob/master/python/lsst/pipe/tasks/makeCoaddTempExp.py#L74 Note that this was a case where the old config no longer made sense, since you can now make psfMatched and direct warps simultaneously.
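For illustration, the deprecate-don't-remove pattern referenced above can be sketched in plain Python (a hypothetical class with a property alias; a real config would use pex_config's own deprecation support rather than this hand-rolled version):

```python
import warnings


class CoaddConfig:
    """Toy stand-in for a config class (hypothetical, not the real CoaddBaseConfig).

    The new field is doApplyJointcal; the old doApplyUberCal name is kept as a
    deprecated alias so existing pipelines, docs, and workflows keep working.
    """

    def __init__(self):
        self.doApplyJointcal = False

    @property
    def doApplyUberCal(self):
        warnings.warn("doApplyUberCal is deprecated; use doApplyJointcal",
                      FutureWarning, stacklevel=2)
        return self.doApplyJointcal

    @doApplyUberCal.setter
    def doApplyUberCal(self, value):
        warnings.warn("doApplyUberCal is deprecated; use doApplyJointcal",
                      FutureWarning, stacklevel=2)
        self.doApplyJointcal = value


config = CoaddConfig()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config.doApplyUberCal = True       # the old name still works...
assert config.doApplyJointcal is True  # ...and sets the new field
assert caught and issubclass(caught[0].category, FutureWarning)
```

Old call sites keep working for a release cycle while emitting a warning, which is the disruption-minimizing path the comment asks for.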

Contributor Author

Ok, I've reverted this change. I'm debating whether to RFC a new config name (applyJointcal is even better than "doX", I think), but that won't block this.

@@ -140,32 +142,64 @@ def getSkyInfo(self, patchRef):
"""
return getSkyInfo(coaddName=self.config.coaddName, patchRef=patchRef)

def getCalExp(self, dataRef, bgSubtracted):
"""!Return one "calexp" calibrated exposure
def getCalibratedExposure(self, dataRef, bgSubtracted):
Contributor

This is also an unnecessary interface change, but its use is limited to pipe_tasks as far as I know, so I can live with this.

fluxMag0[1]/fluxMag0[0]**2,
exposure.getBBox())

exposure.maskedImage = photoCalib.calibrateImage(exposure.maskedImage)
Contributor

What's going on here if doApplyUbercal=False? (I'm sure there's a reason it doesn't just return exposure as is, just not clear what it is)

Contributor Author

The image is calibrated and then "decalibrated" (meaning that if doApplyUbercal is False, the returned image is unchanged). Once RFC-545 is implemented, the "decalibrate" step (dividing by the mean) will be removed (see the TODO below).
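The calibrate-then-decalibrate round trip can be sketched in plain numpy (an illustration of the idea, not the actual afw PhotoCalib API; the arrays and values are made up):

```python
import numpy as np

# Hypothetical per-pixel calibration (e.g. from a jointcal PhotoCalib) and image.
calibration = np.array([[0.9, 1.0], [1.1, 1.0]])
image = np.array([[100.0, 200.0], [300.0, 400.0]])

calibrated = image * calibration                 # "calibrateImage": counts -> calibrated flux
decalibrated = calibrated / calibration.mean()   # divide out the mean -> back to ~counts

# With a spatially constant calibration (the doApplyUbercal=False case, where
# only the exposure's own single-valued calibration is applied), the round
# trip is exact, so the returned image is numerically unchanged.
flat = np.full_like(calibration, 1.0)
roundtrip = (image * flat) / flat.mean()
assert np.allclose(roundtrip, image)
```

With a spatially varying jointcal calibration the round trip is not an identity: the per-pixel structure is applied, and only the overall mean is divided back out.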

np.random.seed(10)

self.config = CoaddBaseConfig()
self.task = CoaddBaseTask(self.config, "testCoaddBase")
Contributor

CoaddBaseTask is a shared interface for AssembleCoaddTask and MakeCoaddTempExp, and was never meant to be instantiated (yes we should have probably made it a real ABC and uninstantiable).
Since people use the unit tests for examples of real code that works, I'm afraid they'll copy this bad example. Can you instantiate MakeCoaddTempExp instead?

Contributor Author
@parejkoj parejkoj Nov 20, 2018

These aren't tests of MakeCoaddTempExp, they're tests of CoaddBase. I don't see how CoaddBase could be an ABC anyway: it has no unimplemented methods, so it seems like a perfectly legitimate concrete class (as are Task and CmdLineTask). If they are supposed to be ABCs, we need to define the missing methods and what their API is.

Contributor

The implicit unimplemented method is run(). getCalExp() probably should have been in makeCoaddTempExp anyway. It's logically part of makeCoaddTempExp and assembleCoadd doesn't use it.

After the conversation below about modifying task configs, and your comment that "I and others have used this system in tests before: I wasn't aware there was anything odd about it," I feel even more strongly about leaving anti-examples out of unit tests. You can leave it if you put a big scary inline comment warning people to never use CoaddBaseTask, because it doesn't do anything and only exists to reduce code duplication between AssembleCoadd and MakeCoaddTempExp.

Contributor Author

If getCalExp() should live in makeCoaddTempExp, I'm happy to fix that right now, and then change the test appropriately. Is there anything else in the CoaddBase interface that shouldn't live in the base class?

Note that CoaddBase cannot be an ABC with run() as the unimplemented interface, because MakeCoaddTempExp and AssembleCoadd have different call signatures for run().

We cannot just declare something as "don't instantiate this object" unless we actually make that true in the code.
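The instantiability point can be checked directly: in Python, inheriting from abc.ABC only blocks instantiation once there is at least one @abstractmethod (a minimal sketch with stand-in names, not the real CoaddBase):

```python
import abc


class CoaddBaseLike(abc.ABC):
    """Stand-in for CoaddBase: an ABC subclass with no abstract methods."""

    def getSkyInfo(self):
        return "sky info"


# Merely inheriting from abc.ABC does not prevent instantiation; without an
# @abstractmethod, this is a perfectly concrete class.
task = CoaddBaseLike()
assert task.getSkyInfo() == "sky info"


class ReallyAbstract(abc.ABC):
    @abc.abstractmethod
    def run(self):
        ...


try:
    ReallyAbstract()          # TypeError: can't instantiate abstract class
except TypeError:
    pass
```

Note also that @abstractmethod does not enforce a call signature on overrides, so run() having different signatures in the two subclasses would not by itself prevent using it as the abstract hook; the deeper problem is that the shared interface is then not actually shared.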

Contributor Author

I've moved getCalibratedExposure() to makeCoaddTempExp, where it should probably have always lived, and changed the test to be MakeCoaddTempExpTestCase. Are you ok with that change?

This doesn't change the fact that CoaddBase is a concrete class, and that the coadd API seems to be unclear.

Contributor

That works for me

self.assertEqual(result.getWcs(), self.skyWcs)

def test_getCalibratedExposureNoJointcalPhotoCalib(self):
self.config.doApplyJointcal = True
Contributor

Modifying a task config to change how a method behaves in each unit test? Bold move. In normal processing, the config is frozen and this is impossible. In this case, how do you know that test_getCalibratedExposureNoJointcalPhotoCalib doesn't execute BEFORE test_getCalibratedExposure? My understanding is that unit tests do not execute in a guaranteed order.

I would instantiate two tasks: one with doApplyJointcal=False and one with doApplyJointcal=True, and let each test use the version of the task it wants to test.

Contributor

I think this is actually ok, because setUp runs (and creates a new CoaddBaseConfig, resetting state) before each test method is run.
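A minimal sketch of why this is safe (toy classes, not the real CoaddBaseConfig or test case): unittest constructs a fresh TestCase instance and calls setUp before every test method, so config edits cannot leak between tests regardless of execution order:

```python
import unittest


class ToyConfig:
    """Toy stand-in for a task config (hypothetical)."""

    def __init__(self):
        self.doApplyJointcal = False


class ToyTestCase(unittest.TestCase):
    # setUp runs before each test method on a fresh instance, so the
    # modification in test_jointcal never affects test_default.
    def setUp(self):
        self.config = ToyConfig()

    def test_default(self):
        self.assertFalse(self.config.doApplyJointcal)

    def test_jointcal(self):
        self.config.doApplyJointcal = True  # local to this test only
        self.assertTrue(self.config.doApplyJointcal)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToyTestCase)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful() and result.testsRun == 2
```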

Contributor

That's right. OK for the unit test behavior. (That also means you don't save any time by doing it this way: changing the configs after creating the task.)

Changing a live task's configs by modifying a view into the config is still non-standard. I'd be OK with it if you get @r-owen's blessing. (At least add an inline comment deterring any onlookers from copying your pattern, and have setUp set doApplyUbercal=True so you only do it once in test_getCalibratedExposure.)

Contributor Author
@parejkoj parejkoj Nov 20, 2018

When and how is the config frozen? Does it actually become immutable? Where is this documented? What do you mean by "a view into the config"?

I and others have used this system in tests before: I wasn't aware there was anything odd about it.

What do you mean "don't save any time"? I don't do this to save time, I do it to reduce code duplication: create the config in setUp, modify it for individual tests as needed.

Contributor Author

To answer some of my own questions:

  • When and how is the config frozen?: inside ArgumentParser.parse_args() by calling config.freeze().
  • Does it actually become immutable?: Yes, in that after freeze(), attempting to modify a config Field raises FieldValidationError.
  • Where is this documented?: The Returns section of parse_args says "validated and frozen" for the config element of the returned argparse namespace. That's the only mention I could find, and I'd never have thought to look there for it.

All of the above suggests to me that there's nothing wrong with modifying a config if you never have an ArgumentParser, as in this case.
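The freeze semantics described above can be modeled in a few lines of plain Python (a toy stand-in for pex_config's behavior, not the real implementation):

```python
class FieldValidationError(Exception):
    """Stand-in for the error pex_config raises on modifying a frozen field."""


class FreezableConfig:
    """Toy model of a freezable config (illustrative names only)."""

    def __init__(self):
        object.__setattr__(self, "_frozen", False)
        self.doApplyJointcal = False

    def freeze(self):
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if getattr(self, "_frozen", False):
            raise FieldValidationError(f"config is frozen; cannot set {name}")
        object.__setattr__(self, name, value)


config = FreezableConfig()
config.doApplyJointcal = True   # fine: nothing has frozen the config yet
config.freeze()                 # what ArgumentParser.parse_args() does
try:
    config.doApplyJointcal = False
except FieldValidationError:
    pass                        # frozen: modification now raises
assert config.doApplyJointcal is True
```

This is the crux of the disagreement: freezing is triggered by the command-line machinery, not by Task construction itself, so a test that never touches an ArgumentParser sees a mutable config.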

Member
@TallJimbo TallJimbo Nov 20, 2018

Many tasks have constructor logic that depends on configuration (for instance, those that produce catalogs finalize their schemas at construction). Allowing configuration to be modified after task construction for those could lead to hard-to-find bugs downstream (e.g. because the config and the schema are out of sync). As a result, I've always advocated that all Task configuration be considered frozen after initialization, and in fact until recently I thought we enforced it in all Tasks. It turns out we only enforce that in CmdLineTasks, but there's nothing that would make them special in this respect.

I've just created DM-16548 to fix that (I should have done that last week when I learned of the problem; glad this discussion reminded me of it). Comments there on where to better document this are very welcome.

Once that lands, I think this test code will have to change, and if it's not too hard it might be best to change it up front - I can certainly believe that the local logic here is perfectly fine, but I think it's something of an anti-pattern that we don't want cargo-culted to other tests and tasks.

Contributor

I agree with @TallJimbo that tasks should always assume their configuration is frozen. For the record: it is the command parser that freezes the config (somebody has to do it). So the config received by the command line task and all its subtasks is frozen, and tasks should be written assuming that their config is frozen.

Contributor

For what it's worth, this technique is used in other tests in this package and (I suspect) elsewhere, some of them of quite long standing; like @parejkoj, I was surprised to see it being called out as an anti-pattern.

The nearest we get to advising against this in official documentation is (as far as I've found) in the pipe_base documentation which asserts:

The configuration is read-only; once you construct a task, the same configuration will be used to process all data with that task

From that wording, it's not clear if that's a statement of technical fact or of policy, though.

I'd be happy to see an update to that documentation — or, as DM-16548 suggests, a technical fix — to establish policy here. (And in the shorter term, given the prevailing opinions expressed above, I suggest modifying the approach being taken in this code even prior to any documentation fixes.)

Contributor Author

As I said above, the freezing happens inside ArgumentParser, so it is decoupled from the Task system (you don't need an ArgumentParser to write Task tests), and thus surprising to anyone not involved in developing pipe_base. I can rework these tests, but as @jdswinbank says, making config freezing happen in Task in DM-16548 will likely require changing quite a few tests.

I don't know if that requires an RFC: it is an interface change, but it's a change to what the system was apparently intended to be. (I won't argue against such a change: I'm just pointing out the distinction between the actual behavior here vs. what the original authors apparently intended.)

@@ -189,6 +189,7 @@ def buildInputImages(self, butler, obsCatalog=None, truthCatalog=None, tract=0):
visit = obsRecord.getI(visitKey)
self.log.info("Generating image for visit={visit}, ccd={ccd}".format(ccd=ccd, visit=visit))
exposure = lsst.afw.image.ExposureF(obsRecord.getBBox())
exposure.maskedImage.image.array += 1e-8
Contributor

Add inline comment about how you chose 1e-8
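The commit message below notes that identically zero test images produce NaN calibrated variance. One plausible mechanism (assumed here for illustration, not taken from the afw source) is relative-error propagation through the calibration, which divides by the image value:

```python
import numpy as np

image = np.zeros((2, 2))            # an identically zero test image
variance = np.ones((2, 2))
calibration = np.full((2, 2), 2.0)

# Error propagation for a product is often written in relative terms:
#   var_out / out**2 = var_img / img**2 + ...
# With img == 0 this multiplies 0 by inf, so the variance comes out NaN.
with np.errstate(divide="ignore", invalid="ignore"):
    out = image * calibration
    var_out = out**2 * (variance / image**2)
assert np.isnan(var_out).all()

# Adding a tiny constant like 1e-8 keeps the pixels effectively zero for the
# test's purposes while avoiding the division by zero.
safe = image + 1e-8
var_safe = (safe * calibration)**2 * (variance / safe**2)
assert np.isfinite(var_safe).all()
```

An inline comment in the test recording exactly this rationale is what's being requested.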

@@ -22,6 +22,7 @@
import lsst.pex.config as pexConfig
import lsst.afw.geom as afwGeom
import lsst.afw.image as afwImage
import lsst.daf.persistence
Contributor

Inconsistent style. I don't care. Just pointing it out.

Contributor Author

The dev guide doesn't say anything about which of these we prefer. Some high-level guidance there would be welcome.

Contributor

Coder's choice. If a module already has a style for imports, I'll copy it for consistency.
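The import styles in question, shown with a stdlib module so the snippet runs anywhere (the module actually under discussion is lsst.daf.persistence, with dafPersist as the conventional alias):

```python
import os.path               # fully qualified: os.path.join(...)
import os.path as osp        # aliased: osp.join(...)
from os import path          # from-import: path.join(...)

# All three name the same module; the difference is purely how call sites read.
assert os.path.join("a", "b") == osp.join("a", "b") == path.join("a", "b")
```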

Rename doApplyUberCal -> doApplyJointcal
Rename getCalExp() -> getCalibratedExposure()

Remove meas_mosaic dependency

Add some basic tests for CoaddBaseTask.getCalibratedExposure()
meas_mosaic's photoCalib FluxFitBoundedField is slower than jointcal's
so we use the old mosaic code to update the exposures.
Identically zero images result in NaN calibrated variance.
Modify test task config to preempt DM-16548.

Move getCalibratedExposure to MakeCoaddTempExp and rework test.
@yalsayyad
Contributor

Marked as approved with the note that dafPersist is the minimum-ambiguous alias that matches the others in that file, and obs_subaru needs some useMeasMosaic=True

@parejkoj
Contributor Author

On a new thought, I've removed all of the custom exception handling from getCalibratedExposure(): I don't think it actually added anything useful to the stability of the system. And all exceptions are masked in run() at present anyway...

@parejkoj parejkoj force-pushed the tickets/DM-15751 branch 2 times, most recently from ea4ffd6 to 065ec21 Compare November 21, 2018 09:25
@@ -438,12 +447,18 @@ def getCalibratedExposure(self, dataRef, bgSubtracted):
calexp's background model and add it to the calexp.
@return calibrated exposure

@raises MissingExposureError If data for the exposure is not available.
Contributor

Doc string says it raises a MissingExposureError. Looks like you ended up naming the class MissingExposure? I think I like MissingExposureError better.

Why is a MissingExposureError better than a dafPersist.NoResults?
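The pattern at issue, re-raising a low-level butler error as a more specific exception so the log names the missing dataset, can be sketched as follows (stand-in classes and a hypothetical fetch function, not the real daf_persistence API):

```python
class NoResults(Exception):
    """Stand-in for lsst.daf.persistence.NoResults."""


class MissingExposureError(Exception):
    """Raised when the butler has no data for the requested exposure."""


def get_calexp(data_id, repository):
    """Hypothetical fetch: wrap NoResults in a more specific exception."""
    try:
        if data_id not in repository:
            raise NoResults(
                f"No locations for get: datasetType:calexp dataId:{data_id}")
        return repository[data_id]
    except NoResults as err:
        # Chain the original error so the log keeps the butler's message
        # but gains the '"calexp" not found:' prefix shown in the logs below.
        raise MissingExposureError(f'"calexp" not found: {err}') from err


try:
    get_calexp("visit=36837", {})
except MissingExposureError as err:
    assert '"calexp" not found' in str(err)
    assert isinstance(err.__cause__, NoResults)
```

The gain over catching dafPersist.NoResults directly is semantic: callers can catch "this exposure is absent" without also catching every other way a get() can fail.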

Contributor

I know we talked about this in slack, but for posterity: the logs go from

coaddDriver.makeCoaddTempExp WARN: Calexp DataId(initialdata={'ccd': 23, 'visit': 36837, 'pointing': 1299, 
'filter': 'HSC-Y', 'field': 'SSP_WIDE', 'dateObs': '2015-07-23', 'taiObs': '2015-07-23', 'expTime': 200.0, 'tract': 
9697}, tag=set()) not found; skipping it: No locations for get: datasetType:calexp dataId:DataId(initialdata={'ccd':
 23, 'visit': 36837, 'pointing': 1299, 'filter': 'HSC-Y', 'field': 'SSP_WIDE', 'dateObs': '2015-07-23', 'taiObs': 
'2015-07-23', 'expTime': 200.0, 'tract': 9697}, tag=set())

to

coaddDriver.makeCoaddTempExp WARN: Calexp DataId(initialdata={'ccd': 32, 'visit': 36837, 'pointing': 1299, 
'filter': 'HSC-Y', 'field': 'SSP_WIDE', 'dateObs': '2015-07-23', 'taiObs': '2015-07-23', 'expTime': 200.0, 'tract': 
9697}, tag=set()) not found; skipping it: "calexp" not found: No locations for get: datasetType:calexp 
dataId:DataId(initialdata={'ccd': 32, 'visit': 36837, 'pointing': 1299, 'filter': 'HSC-Y', 'field': 'SSP_WIDE', 
'dateObs': '2015-07-23', 'taiObs': '2015-07-23', 'expTime': 200.0, 'tract': 9697}, tag=set()) ```

Contributor Author

Good catch on the name. I bounced back and forth between those a few times.

In this case, this "catch and raise" will make more sense with DM-16537. The idea is to better follow the logic you described to me on slack: sometimes exposures don't exist in a patch, and you want to ignore that. However, other errors should not be masked, for example ImportError for meas_mosaic, the jointcal calibrations being missing (jointcal persists all exposures it processes in each visit, even if the Exposure isn't in the tract), or a legitimate math error. This makes the distinction between those cases explicit.

I've also wrapped the meas_mosaic line in a similar try:except, because my understanding is that mosaic isn't guaranteed to persist every Exposure in a visit.
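A sketch of the intended control flow (hypothetical names illustrating the pattern described above, not the actual makeCoaddTempExp code): expected missing-data errors are logged and skipped, while everything else propagates:

```python
class MissingExposureError(Exception):
    """Expected condition: this detector simply has no data for the patch."""


def process_patch(data_refs, fetch):
    """Warp every available input; skip refs whose exposure is missing,
    but let any other error propagate to the caller."""
    results = []
    for ref in data_refs:
        try:
            results.append(fetch(ref))
        except MissingExposureError as err:
            print(f"WARN: {ref} not found; skipping it: {err}")
    return results


def fetch(ref):
    """Hypothetical per-detector fetch used for the demonstration."""
    if ref == "ccd=23":
        raise MissingExposureError("no calexp")
    if ref == "ccd=99":
        raise ValueError("legitimate math error")  # deliberately not swallowed
    return f"exposure({ref})"


assert process_patch(["ccd=23", "ccd=24"], fetch) == ["exposure(ccd=24)"]
try:
    process_patch(["ccd=99"], fetch)
except ValueError:
    pass  # non-missing-data errors escape, as described
```

Whether errors beyond missing data should escape at all is exactly what the reply below disputes.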

Contributor

We're not ignoring errors, we are logging them and continuing to process the other dataIds. We take the logs seriously. @hsinfang compares logs of the biweekly DRPs with the rerun before it for any new errors and files tickets for each of them. A "jointcal calibration missing" and a "legitimate math error" are no reason to bring down an entire job either. We'd never get through whole tracts.

MissingExposureError doesn't seem relevant to this ticket, but it does no harm.

@@ -97,29 +107,10 @@ def test_getCalibratedExposure(self):
self.assertEqual(result.getCalib().getFluxMag0()[0], 1/self.exposurePhotoCalib.getCalibrationMean())
self.assertEqual(result.getWcs(), self.skyWcs)

def test_getCalibratedExposureNoJointcalPhotoCalib(self):
Contributor

Why did you need to remove these two tests now?

Contributor Author

Since I removed the try:except blocks around the jointcal get()s, these were no longer useful tests of getCalibratedExposure().

Explicitly raising on a missing calexp/meas_mosaic dataset makes it clear to the calling
code when something was missing, vs. some other error.

Remove uninformative exception tests (exceptions from loading jointcal just
fall through now).
@parejkoj parejkoj merged commit be3b003 into master Nov 21, 2018
@parejkoj parejkoj deleted the tickets/DM-15751 branch September 1, 2021 20:37