Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-25304: Add task to extract and process bright stars #402

Merged
merged 2 commits into from Jan 27, 2021

Conversation

MorganSchmitz
Copy link
Contributor

Main PR out of the 4 for DM-25304, the others being daf_butler, obs_base and obs_subaru.

Jenkins run #32381.

Copy link
Contributor

@natelust natelust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, some stuff for you to consider. Once you address these I will take another look

Computes the flux of an object in an annulus. This is required to
normalize each bright star stamp as their central pixels are likely
saturated and/or contain ghosts, and cannot be used.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you probably want to inherit the docstring from PipelineTask, but as it is your task will have weird documentation generated by sphinx. I will help you look into what the best way to have proper documentation is.

return annulusStat.getValue()

@pipeBase.timeMethod
def run(self, inputExposure, refObjLoader=None, dataId=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataId needs to be documented

RunnerClass = pipeBase.ButlerInitializedTaskRunner

def __init__(self, butler=None, initInputs=None, *args, **kwargs):
pipeBase.Task.__init__(self, *args, **kwargs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

super() MUST be used here instead of a direct call to pipeBase.Task (im surprised this worked at all)

pipeBase.Task.__init__(self, *args, **kwargs)
# Compute (model) stamp size depending on provided "buffer" value
self.modelStampSize = (int(self.config.stampSize[0]*self.config.modelStampBuffer),
int(self.config.stampSize[1]*self.config.modelStampBuffer))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you explicitly choose int here (that will floor)? If so why not ceil? if not, think about which is more appropriate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two reasons to favor int:

  • if the resulting stamp size corresponds to an even number of pixels, we will add an extra pixel. So we end up with either the integer number immediately below or immediately above the exact (real number) stampSize*modelStampBuffer. Going with ceil would lead to either the integer immediately higher, or the one above that.
  • we will be saving quite a lot of stamps of that size, so since it hardly matters in terms of the resulting processed stamps (by construction, we are taking a buffer to ensure we don't lose any useful pixel even after warping), I figure it is always best to go with one less pixel in each dimension.

self.refCatLoader.ref_dataset_name = "gaia_dr2_20200414"


class ProcessBrightStarsTask(pipeBase.PipelineTask, pipeBase.CmdLineTask):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have sat on this long enough you might be able to drop gen2 support

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not just yet, if only because:

a) I am (and will keep on) running this on RC2 a lot;
b) the Gaia refcat is not yet included in any gen3 repos (as far as I know).

Happy to clean things up when the time comes though!

badMasks = self.config.badMaskPlanes
andMask = annulusMask.getPlaneBitMask(badMasks[0])
for bm in badMasks[1:]:
andMask = andMask | annulusMask.getPlaneBitMask(bm)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid special casing indexes like 0 an rest and using temporary variables consider:

from operator import ior
from functools import reduce
...
reduce(ior, (annulusMask.getPlaneBitMask(bm) for bm in badMasks))

Its up to you which you find nicer though, if you want to keep yours that's fine, though I would suggest switching to |=

# Extract stamps around bright stars
extractedStamps = self.extractStamps(inputExposure, refCatLoader=refObjLoader)
# Warp (and shift, and potentially rotate) them
self.log.info("Applying warp to %i star stamps from exposure %s" % (len(extractedStamps.starIms),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same from above

len(warpedStars), dataId))
fluxes = []
for wstar in warpedStars:
annularFlux = self.computeAnnularFlux(wstar)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above comment on computeAnnularFlux

gaiaGMag=extractedStamps.GMags[j],
gaiaId=extractedStamps.gaiaIds[j],
annularFlux=fluxes[j])
for j in range(len(warpedStars))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider changing this to for j, warp in enumerate(warpedStars) you still get j to index by, but I think it is a bit clearer. This would be even more of a win if you have the annularFlux calculated by a method, either in this loop or another (or automatically triggered or something if r1, r2 are specified in constructor).

and cpix[1] >= self.config.stampSize[1]/2
and cpix[1] < inputExposure.getDimensions()[1] - self.config.stampSize[1]/2):
starIms += [inputExposure.getCutout(sp, geom.Extent2I(self.config.stampSize))]
pixCenters += [cpix]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dont use += [] just to append to a list, you are for sure allocating a new list each time to throw it away. Depending on when the size constraints are hit, you are reallocating the main list anyway. It is better to just use append, and let the reallocating happen when needed.

@MorganSchmitz MorganSchmitz force-pushed the tickets/DM-25304 branch 2 times, most recently from e1d8112 to 523b02f Compare October 21, 2020 11:12
dataRef.put(output.brightStarStamps, "brightStarStamps")
return pipeBase.Struct(brightStarStamps=output.brightStarStamps)

def runQuantum(self, butlerQC, inputRefs, outputRefs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We absolutely can not merge this as is. You cannot assume that 1) a gen2 butler is available 2) this is run on a machine that has this file path. If this would merge, it would seem to work in some cases, and then would fail in ci and integration tests.

At this point I dont think we should be prioritizing getting things done with the mentality of defaulting to gen2 and then addressing gen3 later as a secondary concern. Gen2 is deprecated, so unless there is some technical reason something can not be done in gen3 yet, I feel this should be the bar for considering something "working". The only technical limitation I know of getting Gaia support is simply doing the work.

There are two paths forward I see. 1) work with the current bootstrapping to get Gaia calibrations migrated when the repo is converted to a gen3 repo. 2) Do the work to get Gaia ingested naively.

What you have below is fine for proving that you code could work, but it is not there yet for merging to master.

The image from which bright star stamps should be extracted.
refObjLoader : `LoadIndexedReferenceObjectsTask`, optional
Loader to find objects within a reference catalog.
dataId : `dict`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In gen3 world, this will be a lsst.daf.butler.DataId (which itself is already either a DataCoordinate or dict). It should be included in the documentation, and it is up to you if you want to also include dict (as one might not know DataId could be a dict) or you could document it as "dict or lsst.daf.butler.DataCoordinate"

@MorganSchmitz MorganSchmitz merged commit cbd0928 into master Jan 27, 2021
@MorganSchmitz MorganSchmitz deleted the tickets/DM-25304 branch January 27, 2021 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants