Add imageMapReduce task #46

Merged
merged 1 commit into from Mar 8, 2017

Conversation

Contributor

@djreiss djreiss commented Feb 23, 2017

Unit tests are in PR #45.

Contributor

@parejkoj parejkoj left a comment

The overall design is reasonable. Several comments about docs, new unit tests and returning Structs.

I think it would be good for you to run coverage (either via pytest-cov, or coverage.py), to help find where your current tests aren't getting to. I can help you with that, if you like.

Although I have some comments to that effect above, you do need some more docs about what happens when Map doesn't return an Exposure (e.g., moving average as you suggested), both at the file/class level, and for the method args/return values.

I do worry about implementing MapReduce in the stack in ip_diffim, instead of either taking an off-the-shelf Python MR code, or putting it somewhere higher (e.g. lsst.utils?). The implementation here is somewhat specific to your use case, though, so I don't know how feasible either of those would be. At the very least, you should think about how one would make this parallel via multiprocessing or MPI, since MR lends itself perfectly to that.
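To make the parallelization point concrete, here is a minimal sketch of a tile-based map/reduce over a plain NumPy array. It is not the ImageMapReduceTask implementation: the function names, the median-subtraction mapper, and the `map_func` parameter are all hypothetical, chosen only to show that the map step is independent per tile and can therefore be handed to any parallel `map`.

```python
import numpy as np


def split_into_tiles(image, tile_size):
    """Yield (y0, x0, tile) for non-overlapping tiles covering the image."""
    ny, nx = image.shape
    for y0 in range(0, ny, tile_size):
        for x0 in range(0, nx, tile_size):
            yield y0, x0, image[y0:y0 + tile_size, x0:x0 + tile_size]


def mapper(args):
    """Hypothetical map step: subtract each tile's median from the tile."""
    y0, x0, tile = args
    return y0, x0, tile - np.median(tile)


def reducer(results, shape):
    """Reduce step: paste the processed tiles back into a full-size image."""
    out = np.zeros(shape, dtype=float)
    for y0, x0, tile in results:
        out[y0:y0 + tile.shape[0], x0:x0 + tile.shape[1]] = tile
    return out


def map_reduce(image, tile_size=32, map_func=map):
    """Run mapper over all tiles with map_func, then stitch the results.

    map_func defaults to the serial builtin map; any callable with the
    same (function, iterable) signature can be substituted.
    """
    tiles = list(split_into_tiles(image, tile_size))
    results = list(map_func(mapper, tiles))
    return reducer(results, image.shape)
```

Because `mapper` is a module-level function taking one picklable argument, passing `pool.map` from a `multiprocessing.Pool` as `map_func` should parallelize the map step without touching the splitting or reducing code.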

import lsst.pex.config as pexConfig
import lsst.pipe.base as pipeBase

__all__ = ("ImageMapReduceTask", "ImageMapReduceConfig",
Contributor

__all__ should probably be at the very top. There was recently discussion of this, with this ticket filed: https://jira.lsstcorp.org/browse/DM-9596

It's good that you're using a tuple for these. Apparently there are "interesting" corner cases involving it being a list.

Contributor Author

Thanks for pointing that out, but I'll leave it as-is until there's an implemented RFC.


class ImageMapperSubtask(with_metaclass(abc.ABCMeta, pipeBase.Task)):
    """Abstract base class for any task that is to be
    used as `ImageMapReduceConfig.mapperSubtask`
Contributor

This docstring could use some expansion: what would a non-abstract child class used by mapperSubtask actually do?

    _DefaultName = "ip_diffim_ImageMapperSubtask"

    def run(self, subExp, expandedSubExp, fullBBox, **kwargs):
        """Perform operation on given sub-exposure.
Contributor

Perform what operation?

    ConfigClass = ImageMapperSubtaskConfig
    _DefaultName = "ip_diffim_ImageMapperSubtask"

    def run(self, subExp, expandedSubExp, fullBBox, **kwargs):
Contributor

You need @abc.abstractmethod on here, and any other methods below that must be overridden in order to have a complete non-abstract child class.
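A minimal sketch of what the decorator buys you, using the standard-library `abc.ABC` instead of `with_metaclass(abc.ABCMeta, pipeBase.Task)` so that it runs without the LSST stack. The class names `MapperBase`, `IncompleteMapper`, and `DoublingMapper` are hypothetical stand-ins, not classes from this PR.

```python
import abc


class MapperBase(abc.ABC):
    """Hypothetical stand-in for an abstract mapper base class."""

    @abc.abstractmethod
    def run(self, subExp, expandedSubExp, fullBBox, **kwargs):
        """Operate on one sub-exposure; concrete subclasses must override."""


class IncompleteMapper(MapperBase):
    """Does not override run(), so it remains abstract."""


class DoublingMapper(MapperBase):
    """Hypothetical concrete mapper: doubles whatever it is given."""

    def run(self, subExp, expandedSubExp, fullBBox, **kwargs):
        return subExp * 2


def can_instantiate(cls):
    """Return True if cls() succeeds, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

Without the decorator, `IncompleteMapper()` would construct successfully and the missing override would only surface later, when `run` is called.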

subExp : afw.Exposure
the sub-exposure upon which to operate
expandedSubExp : afw.Exposure
the expanded sub-exposure upon which to operate
Contributor

What's an "expanded sub-exposure"?

scaleByFwhm = self.config.scaleByFwhm
bbox = exposure.getBBox()

psfFwhm = (exposure.getPsf().computeShape().getDeterminantRadius() *
Contributor

Technically, our coding standards don't like spaces around *, but I don't really care.


def rescaleValue(val):
    if scaleByFwhm:
        return np.rint(val * psfFwhm).astype(int)
Contributor

What happens if the psf has different dimensions in X vs. Y?

Contributor Author

It uses the average.
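For context on which "average" this is: as far as I understand, a determinant radius is the fourth root of the determinant of the second-moment matrix, which for an axis-aligned ellipse is the geometric mean of the two axis widths rather than the arithmetic mean. The sketch below works on bare moment values under that assumption; `determinant_radius` is a hypothetical helper, not an afw call.

```python
def determinant_radius(ixx, iyy, ixy=0.0):
    """Fourth root of the determinant of the second-moment matrix.

    ixx, iyy, ixy are the second moments of the PSF shape. For an
    axis-aligned ellipse with widths sigma_x and sigma_y this equals
    sqrt(sigma_x * sigma_y).
    """
    det = ixx * iyy - ixy ** 2
    if det <= 0:
        raise ValueError("second-moment matrix must be positive definite")
    return det ** 0.25
```

For a round PSF the result is just its width; for an elongated one (say sigma_x = 2, sigma_y = 3, so ixx = 4, iyy = 9) it is sqrt(6), a single scale between the two axis widths.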

nGridX = bbox.getWidth() / gridStepX
# Readjust gridStepX so that it fits perfectly in the image.
gridStepX = float(bbox.getWidth() - gridSizeX) / float(nGridX)
if gridStepX <= 0:
Contributor

<=0 here, or <=1? (same below)

Contributor Author

@djreiss djreiss Mar 2, 2017

This is just to catch a case which shouldn't happen. I'll raise an error instead.
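A sketch of the "raise an error instead" approach, written against plain numbers rather than a bounding box. The function name, argument names, and error messages are hypothetical, and the integer cell count is an assumption about the intent of the hunk above.

```python
def fit_grid_step(image_width, grid_size, grid_step):
    """Return (n_grid, adjusted_step) so that grid cells span the image exactly.

    image_width: width of the image in pixels
    grid_size: width of one grid cell in pixels
    grid_step: requested spacing between cell origins in pixels
    """
    if grid_step <= 0:
        raise ValueError("grid_step must be positive")
    # Number of whole steps that fit across the image.
    n_grid = int(image_width // grid_step)
    if n_grid <= 0:
        raise ValueError("grid_step is larger than the image")
    # Readjust the step so the last cell ends exactly at the image edge.
    adjusted_step = float(image_width - grid_size) / float(n_grid)
    if adjusted_step <= 0:
        raise ValueError("grid_size must be smaller than the image width")
    return n_grid, adjusted_step
```

With a 100-pixel-wide image, 10-pixel cells, and a requested step of 10, the adjusted step is 9.0, so cell origins at 0, 9, ..., 90 place the last cell flush with the right edge.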

centroidY = self.config.gridCentroidsY[i]
centroid = afwGeom.Point2D(centroidX, centroidY)
bb0 = afwGeom.Box2I(bbox0)
xoff = int(np.floor(centroid.getX())) - bb0.getWidth()//2
Contributor

Good use of explicit integer division.
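Why the explicit `//` matters: under Python 3 (or `from __future__ import division`), `/` on two integers yields a float, which would make the offset non-integral. A small sketch with a hypothetical helper that mirrors the arithmetic in the hunk above on bare numbers:

```python
import math


def centered_offset(centroid, box_width):
    """Integer offset that centers a box of box_width pixels on centroid."""
    # floor() picks the pixel containing the centroid; // keeps the
    # half-width an integer regardless of Python version.
    return int(math.floor(centroid)) - box_width // 2
```

The result is always an `int`, so it can be used directly as a pixel offset.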

return bb0, bb1

xoff = 0
while(xoff <= bbox.getWidth()):
Contributor

I've missed while loops!

Contributor Author

djreiss commented Mar 1, 2017

> The overall design is reasonable. Several comments about docs, new unit tests and returning Structs.
>
> I think it would be good for you to run coverage (either via pytest-cov, or coverage.py), to help find where your current tests aren't getting to. I can help you with that, if you like.
>
> Although I have some comments to that effect above, you do need some more docs about what happens when Map doesn't return an Exposure (e.g., moving average as you suggested), both at the file/class level, and for the method args/return values.
>
> I do worry about implementing MapReduce in the stack in ip_diffim, instead of either taking an off-the-shelf Python MR code, or putting it somewhere higher (e.g. lsst.utils?). The implementation here is somewhat specific to your use case, though, so I don't know how feasible either of those would be. At the very least, you should think about how one would make this parallel via multiprocessing or MPI, since MR lends itself perfectly to that.

Thanks for the excellent review. I have no issues with nearly all of your comments. I agree with the potential for moving this out of ip_diffim if there is a use-case, since it is not specific to image differencing. For now, I have plans for two separate uses of ImageMapReduce in ip_diffim, which is why I placed it there.

Contributor

@parejkoj parejkoj left a comment

Thanks for the updates. The docs are a lot clearer now.

See the one minor comment.

@@ -175,7 +246,8 @@ class ImageMapReduceConfig(pexConfig.Config):
)

     # Separate gridCentroidsX and gridCentroidsY since pexConfig.ListField accepts limited dtypes
-    # (i.e., no Point2D)
+    # (i.e., no Point2D). The resulting set of centroids is the "vertical stack" of
+    # `gridCentroidsX` and `gridCentroidsY`.
Contributor

Please give an example here, showing what you get for, e.g., (1, 2), (3, 4).
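One possible reading of that example, sketched with NumPy: assuming `gridCentroidsX = [1, 2]` and `gridCentroidsY = [3, 4]`, and assuming the i-th centroid pairs the i-th X with the i-th Y, the "vertical stack" is a 2 x N array whose columns are the centroids. `stack_centroids` is a hypothetical helper, not part of the config class.

```python
import numpy as np


def stack_centroids(xs, ys):
    """Stack X and Y lists into a 2 x N array; column i is centroid i."""
    if len(xs) != len(ys):
        raise ValueError("gridCentroidsX and gridCentroidsY must have equal length")
    return np.vstack([xs, ys])


# Assumed inputs: X coordinates (1, 2) and Y coordinates (3, 4).
stacked = stack_centroids([1.0, 2.0], [3.0, 4.0])

# Reading the columns gives the (x, y) pairs: (1, 3) and (2, 4).
centroids = [(float(x), float(y)) for x, y in stacked.T]
```

Under this reading the two lists are not themselves points: (1, 2) and (3, 4) as inputs produce the centroids (1, 3) and (2, 4).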

@djreiss djreiss merged commit ab7f4cf into master Mar 8, 2017
Contributor Author

djreiss commented Mar 8, 2017

Waited until pybind11 merge. Confirmed build on Jenkins.

@ktlim ktlim deleted the tickets/DM-8520 branch August 25, 2018 06:44