Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tickets/DM-13911: Create multiband classes #353

Merged
merged 29 commits into from Jul 20, 2018
Merged

tickets/DM-13911: Create multiband classes #353

merged 29 commits into from Jul 20, 2018

Conversation

fred3m
Copy link
Contributor

@fred3m fred3m commented May 10, 2018

No description provided.

a `_singles` attribute that is a list of single band objects,
a `_filters` attribute that is a list of filter
names, a `_bbox` attribute is a `Box2I`,
and a `_xy0` attribute that is a `Point2I`.
Copy link
Member

@TallJimbo TallJimbo May 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need both of these; _bbox.getMin() should always be equivalent to _xy0. This also applies to all of the attributes and properties relating to xy0; those should all delegate to the bbox, so you only have it in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll remove _xy0

Slices used to extract a subset of an image
(used when `singles` inherits from `MultibandBase` to
create a new object with a slice of the original).
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the base class __init__ is being passed singles, shouldn't it be responsible for creating the _singles attribute from it?

On that note, it'd probably be cleaner to have the base __init__ also take bbox and filters and create those attributes as well, rather than requiring derived __init__ to create any attributes used by the base class.

You should then remove the exception raise below and expect derived class __init__ to call base class __init__.


@property
def singles(self):
"""List of afw single band objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better return a tuple, so it's impossible to do obj.singles.append(another).

return self._singles

@property
def bbox(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid bbox and xy0 don't meet our requirements for properties (https://developer.lsst.io/python/style.html#properties-should-be-used-when-they-behave-like-regular-instance-attributes), because they return mutable objects that can be modified with surprising results. In other words, you can do

obj.bbox.grow(3)

or

obj.xy0.setX(4)

And end up with an inconsistent object. Our convention would be to just define getters and setters for these, since it's much less likely that someone would expect e.g. obj.getBBox().grow(3) to affect obj.

def yx0(self):
"""Minimum (y,x) position
"""
return (self.y0, self.x0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of having a reversed version of xy0 in this form:

  • the symmetry with xy0 in terms of naming would imply that it returns some kind of point object
  • anything defined here should also be available on single-band Image/MaskedImage/Exposure objects.

I'm guessing you want it to offset indices into numpy array views? If so, I'm not opposed to having something that does this, but I'd like a name that suggests its purpose ("arrayOrigin"?) rather than symmetry with xy0, and whatever you go with should be added to the single-band classes.


def __repr__(self):
result = "<{0}, filters={1}, yx0={2}, image shape={3}>".format(
self.__class__.__name__, self.filters, self.yx0, self.array.shape[-2:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest showing the bbox here instead of yx0 and shape; LSST convention is (x, y) ordering, even if we need to support the transposed version for NumPy interoperability in some places.

def array(self):
"""Data cube array in multiple bands
"""
return self._array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add a setter for this property; could be implemented just as:

@array.setter
def _setArray(self, value):
    self.array[:, :] = value

def shape(self):
"""Shape of the Multiband Object
"""
return self.array.shape
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove this; requiring the user to call obj.array.shape is a bit more verbose but much clearer, and it's another place where I think it'd be confusing to have a method on a multi-band class but not its single-band counterpart.

if yslice.start is not None:
y0 += yslice.start
else:
raise IndexError("y index must be a slice")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure this works with box slices as well.

# whose data points to the appropriate slice
# of self.array
self._updateSingles(singles.imageType)
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend splitting this up into at least two private methods called by __init__ for clarity.

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good from a large-scale structural standpoint, but I'd like to see some refactoring to reduce complexity and maintenance load:

  • I'd like to see most constructors split up into a simpler constructor and one or more static/class methods - I feel like they're simply trying to process too many mostly-orthogonal arguments at once.
  • There are some methods (mostly bbox setters) that I suspect we could do without, and perhaps even some classes (the single-pixel ones). While I know you've already done a lot of work implementing those, so it may not seem keeping them does much harm, I'm worried that the maintenance load could be larger than any future convenience they might provide. I am of course quite happy to be convinced they should be kept if they are things you have direct plans to use.

@@ -23,3 +23,4 @@
"""Application Framework detection-related classes including Source
"""
from .detectionLib import *
from .multiband import *
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add newline.

List of filter names.
singles: list
List of single band objects
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__init__ documentation should go in the class docstring (https://developer.lsst.io/python/numpydoc.html#placement-of-class-docstrings)

"""Initialize a `MultibandBase` object

Must be overloaded in derived classes to use `array` and slicing
functionality.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should say a bit more about what derived class __init__ methods must do to use that functionality, and whether they're permitted not to.

Should also probably say that derived classes must call the base class __init__.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an outdated comment, thanks for pointing this out. I also clarified the docs that the source must either call the base class __init__ or set the _filters, _singles, _bbox, and _singleType properties.

copy of the instance that inherits from `MultibandBase`
"""
err = "_copySingles must be implemented in an inherited class to enable copies"
raise NotImplementedError(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered using collections.ABC and abstractmethod here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't, but this is a good idea and is now implemented (and also helped me catch that I didn't implement copy for MultibandFootprint).


@property
def singles(self):
"""List of afw single band objects
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably shouldn't say "afw" here, as subclasses of MultibandBase don't have to use afw objects, even if all current ones do.

-------
targetSlices, imageIndices: tuple, tuple
Slices of the target and input images
in the form (by, bx), (iy, ix).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will only be parsed correctly by Sphinx/Numpydoc if you split it into multiple lines.

-------
newImage: `afw.Image` or `afw.MaskedImage`
The new image with the input image projected
into it's bounding box.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's -> its.

"""Project an image into a bounding box

Project an image into a new image with a
(potentially) different bounding box.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Project" isn't quite clear enough of a verb to explain what this function does for the more detailed description, I think.

I think you want something like "Return a new image whose pixels are equal to those of image within bbox, and equal to fill outside".

from lsst.afw.image.multiband import MaskedPixel, MultibandMaskedPixel, MultibandExposure


def _testImageFilterSlicing(cls, mImage, singleType, bbox, value):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this cls should be self, or maybe something indicating that it's a TestCase instance, not a type object of some kind. Might be best to put these helper methods in a mixin class or two.

At least one-line docstrings describing what kinds of operations are tested would be welcome as well.

(These comments of course apply to the test helpers below as well).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the typical syntax for a function in python that takes a class as the first argument is to use cls as opposed to self. I did add the docstrings as requested.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I was thinking that you were passing in the TestCase instance as the first argument. The fact that you aren't also explains the use of vanilla asserts. So this is indeed okay, though it might be even better to make these TestCase methods so you can use asserts that produce better error messages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I explained this wrong and now realize why I need to rename them. These are instances of TestCases, so I should probably rename cls to testCase.

assert isinstance(mImage["R"], singleType)
assert isinstance(mImage[:], type(mImage))
idx = np.int32(2)
assert isinstance(mImage[idx], singleType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines should be cls.assertIsInstance(..., ...).

@TallJimbo
Copy link
Member

@fred3m, I'm about to get on an airplane, and I don't think there's anything else controversial to discuss right now, so I'm approving changes and you should feel welcome to merge after you things are in good shape. Of course, I'm happy to take another look if you aren't in a rush, but it probably won't happen today.

@fred3m
Copy link
Contributor Author

fred3m commented Jul 11, 2018

@TallJimbo I made substantial updates to the code based on your review, so would you mind looking over it again before I merge? The main changes are

  1. Using multiple staticmethods as constructors to simplify the __init__ methods. The implementation might look a bit strange at first since, for example, MultibandImage.__init__ takes an image array instead of a list of single band Image objects, but because the array is the property that we want to make a view into, this is the logical property to pass to __init__.

  2. Slight modification of the single band functions you added to slicing.py so that the multiband classes call the same functions, ensuring compatibility and less future maintenance.

  3. Storing the MultibandMaskedImage in MultibandFootprint. I used the MultibandMaskedImage because HeavyFootprint doesn't (at this time) have a constructor that takes a one dimensional array of values. In the future it probably should, since building a HeavyFootprint from a 1D slice of a 2D array (filters, 1D values) would be beneficial, but since this ticket has already been dragging on for 2.5 months I feel like that should be left for a future ticket.

Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken another (much less thorough) look, and left another round of comments. Nothing big and structural this time, but there are a few smaller issues that should probably involve a sweep for similar occurrences throughout the changeset.


print(mImage["G"])
print(mImage[0])
print(mImage[:1].__repr__())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to write print(repr(mImage[:1])). Same throughout the docs.

# 333 if index < 0:
#
#IndexError: Indices must be <0 or between 1000 and 1199, received 1
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the traceback itself is particularly useful here, and I'm sure it will get out-of-date fast (if it isn't already).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought I made a note when I asked you to look over this again but it looks like I didn't. I updated everything except the docs, just in case there are any more API changes. I have a jupyter notebook that contains all of the statements in the docs and will re-run them all (to make sure that the docs are up to date) before merging.

used to fill the single band objects (in fact the single band `Image` objects are initialized
with pointers to the `Multiband.array`).
Accessing this property may be necessary since images can not be set directly and must be
updated by using the `array` property.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't images be set directly? You have implemented MultibandImage.__setitem__, and setting pixel values via Image.__setitem__ should do what the caller expects.

==================

A `MultibandFootprint` is a collection of `HeavyFootprint` objects, one in each band,
that do not necessarily need to have the same (or even overlapping) `SpanSet`s.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this has changed.

----------
offset: `Extent2I`
Amount to shift the bounding box in x and y.
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should document that this and the method above return new objects rather than operating in place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't realize that's what it was supposed to do, although looking back at the RFC now the name makes a lot more sense... I'll fix this for all of the classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I may have mispoken a bit between this comment and the one above. On the single-band objects, setXY0 does operate in-place (as should any method with that name). The names shiftedBy and shiftedTo (by being past tense) imply that they return new objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the RFC, it looks like shiftedTo/By should return a view into the image, so a shallow copy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, shallow copy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arg, this is a breaking change until RFC-343 is implemented. Specifically, since the single band objects don't have shiftedBy and shiftedTo methods, the generic methods in MultibandBase don't know how to make a shallow copy of the singles and then modify only their bounding box while keeping a view to the original object.

So I'll leave the shiftedBy and shiftedTo methods in place (but with a NotImplementedError raised before the code executes, and also re-implement setXy0 with a note in DM-10781 to also fix the multiband classes.

kwargs: dict
Keyword arguments to pass to `_fullInitialize` to
initialize a new instance of an inherited class that are the
same in all bands.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any mention of _fullInitialize anywhere else in this file; is this documentation out of date?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that was used before we changed the constructors.


see `imageFactory` for a description of parameters
"""
return imageFactory(MultibandImage, filters, filterKwargs, singleType, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want this method to behave like the Factory method on one of the single-band classes, it'd need to support the same constructor arguments as the class's constructor. But I think that's probably unwise; Factory is an idiosyncratic crutch for a lack of a set of better-named factory methods (which has since largely been addressed since Factory was added to the single-band classes, but it's so broadly used now that it's hard to get rid of).

In any case, if you want this method to do something else, please give it another name. If you just wanted these classes to behave more like their single-band counterparts, don't worry about it in this particular case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I wrote this was to give users the ability to pass a butler and create a MultibandExposure from a collection of keyword arguments, some of which may differ by filter. If we think that it's too much of a burden to support that I can remove both imageFactory and tripleFactory, otherwise I can just rename them something like makeImageFromKwargs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed to keeping anything anything you have a use case for.

return cls(filters, image, mask, variance)


def tripleFromArrays(cls, filters, image, mask, variance, bbox=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verb-based name here, too.

----------
coords: `Point2D` or `tuple`
Coordinates to evaluate the PSF. If `coords` is `None`
then `Psf.getAveragePosition()` is used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tend to use "coord" to mean sky coordinates; it'd be a bit more in keeping withe precedent and a bit less confusing to stack users to call this "position".

@@ -87,67 +87,40 @@ def handleNegativeIndex(index, size, origin, default):
return index


def makePointFromIndices(x, y, origin, bbox):
"""Create a Point2I from an x, y pair, correctly handling negative indices.
def makeBoxFromSlices(sliceArgs, bboxGetter):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this now does what interpretSliceArgs used to do (I think?) could we rename it to interpretSliceArgs and come up with a new name for the new thing? Or do we not need the thing called interpretSliceArgs anymore, since I don't see it used by __getitem__ and __setitem__ below?

In any case, it definitely shouldn't be called somethign Box- or Slice-specific, since it also works with single-pixel indexing arguments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need two different functions, but I'm open to changing the names.

Because most of the multiband classes slice arrays, interpretSliceArgs returns indices or slices, which is what the multiband classes need. However single band objects require the additional step of transforming those indices into either a Box2I or Point2I, so the new name was supposed to convey that this is used for stack primitives.

Do you have a suggestion for a better name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about translateSliceArgs?

@fred3m fred3m force-pushed the tickets/DM-13911 branch 2 times, most recently from f71f2b5 to 6d6a42b Compare July 19, 2018 17:21
@fred3m fred3m merged commit 7a076d4 into master Jul 20, 2018
@ktlim ktlim deleted the tickets/DM-13911 branch August 25, 2018 06:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants