New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-35617: Add photometric repeatability metric #9
Conversation
@@ -272,3 +277,71 @@ def getInputSchema(self) -> KeyedDataSchema: | |||
|
|||
def __call__(self, data: KeyedData, **kwargs) -> Vector: | |||
return cast(Vector, data[self.vectorKey.format(**kwargs)]) | |||
|
|||
|
|||
class AndSelector(VectorAction): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs docstring. This is useful to split out, but we should think about if we want to update KeyedDataSelectorAction
to make use of it. It would save duplication, but add an additional level in .
hierarchy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose not to add the AndSelector
and instead use MultiCriteriaDownselectVector
.
8fb7b45
to
f6ee8f5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think it looks good! I did a test run on rc2_subset, and it ran fine.
There are a few comments, but otherwise the main barrier to merging is making sure the plot works. (I tried to test that, but came up against some errors.)
mask *= subMask # type: ignore | ||
return cast(Vector, data[self.vectorKey.format(**kwargs)])[mask] | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the purpose of MultiCriteriaDownselectVector (or any of the "Downselect" modules, actually). This looks like it does the same thing as "AndSelector" in vector.selectors, but just returns the subselection rather than the mask. In that case, why not use the selector, and simply apply the mask in the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DownselectVector
returns a vector where a single configurable action has been used to apply selection criteria. Usually this will be a shorter vector due to the applied filtering.
In this instance, I needed to apply several per-group selection criteria (e.g., number of repeated measurements, mean signal-to-noise ratio) to keep only the per-group signal-to-noise measurements of interest.
The first time I implemented this, I introduced the AndSelector
, but then I felt that this made the syntax more complicated/verbose than I wanted. Basically I needed to introduce the new AndSelector and then set the multiple selectors for that selector. It added an extra layer of nesting that seemed unwarranted.
I also considered to make DownselectVector
always allow for the possibility of multiple selectors by using selectors = ConfigurableActionStructField
rather than selector = ConfigurableActionField
. This also seemed unsatisfying because one would need to add an extra layer in cases where there is only a single criterion applied. For example
self.process.filterActions.xStars = DownselectVector(
vectorKey="mags", selector=VectorSelector(vectorKey="starSelector")
)
would become
self.process.filterActions.xStars = DownselectVector(vectorKey="mags")
self.process.filterActions.xStars.selectors.star = VectorSelector(vectorKey="starSelector")
I wasn't sure that we wanted to introduce that additional layer.
I'm still not fully convinced that introducing MultiCriteriaDownselectVector
is the optimal solution though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be remembering the And selector you had, but I could have sworn there was something different elsewhere. Anyway, the way I had imagined this to work is that you would build the mask you want (with all the various selectors) in the build stage, and then you would just apply that mask in the filter stage, and thus you would only need the one selector (VectorSelector
), and then you would have the vector around for anything else you might need it for. I am not saying things MUST be done how I thought them out, just that that process made sense to me reading things out. I wanted to bring that up to the discussion for consideration, but if you determine your way is best I am happy to have the new selector, but it does make the other kind of redundant.
python/lsst/analysis/tools/analysisMetrics/photometricRepeatabilityMetrics.py
Outdated
Show resolved
Hide resolved
98f50bc
to
faea362
Compare
@@ -325,3 +330,70 @@ def getInputSchema(self) -> KeyedDataSchema: | |||
|
|||
def __call__(self, data: KeyedData, **kwargs) -> Vector: | |||
return cast(Vector, data[self.vectorKey.format(**kwargs)]) | |||
|
|||
|
|||
# class AndSelector(VectorAction): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't intend this to go in (see later comment) it should be deleted
errCol = f"{fluxCol}{self.uncertaintySuffix.format(**kwargs)}" | ||
result = cast(Vector, data[fluxCol]) / data[errCol] # type: ignore | ||
|
||
return np.array(cast(Vector, result)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it you I discussed this with before and it was resolved? (Sorry I have been looking at so many reviews recently)
def getInputSchema(self) -> KeyedDataSchema: | ||
return ((self.vectorKey, Vector),) | ||
|
||
def __call__(self, data: KeyedData, **kwargs) -> Vector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MyPy is basically saying it cant work out what type you want bands to be, because match arms are defining it to be different things:
- is defining it as tuple[Any, ...] - a multi element tuple of any type
- I think it is infering the same as 1
- it thinks it is a tuple with a single element of type Any i.e. tuple[Any]
- None
solution to all this is before the match statement add:
bands: tuple[Any, ...] | None
That declares the type for the variable before it is used anywhere
|
||
def __call__(self, data: KeyedData, **kwargs) -> Scalar: | ||
mask = self.getMask(**kwargs) | ||
values = data[self.vectorKey.format(**kwargs)][mask] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for now, there is no way for MyPy to know if when you do data[key]
the result will be a Vector
or Scalar
. If it is a Scalar
then you can't do [mask]
because a number does not understand __getitem__
. To get around this do:
from typing import cast
...
values = cast(Vector, data[self.vectorKey.format(**kwargs)])[mask]
7792b13
to
bb118f0
Compare
15aee5e
to
0ff3246
Compare
No description provided.