
DM-40633: Ignore many numpy warnings #155

Merged: 6 commits merged into main from tickets/DM-40633 on Dec 11, 2023

Conversation

@taranu (Contributor) commented Oct 12, 2023

These warnings spam the logs with mostly useless messages. You can tell which action they occur in, and the cause is usually obvious (divide by zero, NaN, etc.), but these conditions are typically expected and not necessarily a problem for the task.
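For context, a minimal sketch of the suppression pattern (the exact filters and call sites in this PR may differ, and safe_log10 is an illustrative name, not from the codebase):

import warnings

import numpy as np


def safe_log10(values: np.ndarray) -> np.ndarray:
    """Compute log10, silencing the expected divide-by-zero/invalid warnings."""
    with warnings.catch_warnings():
        # Ignore only the specific, expected numpy RuntimeWarnings;
        # anything else still propagates.
        warnings.filterwarnings("ignore", "divide by zero encountered", RuntimeWarning)
        warnings.filterwarnings("ignore", "invalid value encountered", RuntimeWarning)
        result = np.log10(values)
    return result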

# of a task. When DM-39114 is implemented, this step should not
# be required and may be removed.
weakref.finalize(results, _plotCloser, *weakrefArgs)
return results
Contributor:

Return should be out of the with context.
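That is, the pattern being asked for looks like this (illustrative names, not the PR's code):

import warnings

import numpy as np


def mean_ignoring_warnings(values: np.ndarray) -> float:
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", RuntimeWarning)
        result = float(np.nanmean(values))  # work happens inside the guard
    return result  # return after the warning filters have been restored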

@@ -155,11 +155,11 @@ def makePlot(
* Statistics that are shown on the plot or used by the plotting code:
* ``approxMagDepth``
The approximate magnitude corresponding to the SN cut used.
* ``f"{self.plotName}_sigmaMAD"``
* ``f"{self.plotName}_nanmedian"``
The sigma mad of the distances to the line fit.
Collaborator:

Why has this changed from sigmaMad to median?

Contributor (Author):

Whoops, that was a find/replace fail I missed correcting.

The sigma mad of the distances to the line fit.
* ``f"{self.identity or ''}_median"``
The median of the distances to the line fit.
* ``f"{self.identity or ''}_hardwired_sigmaMAD"``
* ``f"{self.identity or ''}_hardwired_nanmedian"``
Collaborator:

Same comment here

@@ -36,7 +36,7 @@
from scipy.stats import binned_statistic_2d, binned_statistic_dd

from ...interfaces import KeyedData, KeyedDataSchema, PlotAction, Scalar, Vector
- from ...statistics import nansigmaMad
+ from ...statistics import nanmedian, nansigmaMad
Collaborator:

I think nanMedian and nanSigmaMad would be easier to read

Contributor (Author):

I agree, but I was following the numpy naming convention. Having said that, it might make more sense to not follow numpy so as to distinguish them instead?

Collaborator:

I think that makes sense. Less confusion about where they come from.

-------
count : `Scalar`
The number of unique rows in a given column.
"""
@sr525 (Collaborator), Nov 27, 2023:

Why did we get rid of the more extensive docs? Same comment for the other places as well.

@taranu (Contributor, Author), Nov 27, 2023:

@timj and/or @jonathansick can correct me if I'm wrong, but I thought the Parameters section in a class docstring is meant to document __init__, even for callable types. I don't think we should add docs for __init__ to Config classes.

If we are to keep these, they should be docs for __call__ on every Action, but IMO this is unnecessary in most cases as here the doc for count is redundant with the class docstring.


def divide(dividend: Scalar | Vector, divisor: Scalar | Vector) -> Scalar | Vector:
"""Return dividend/divisor."""
with warnings.catch_warnings():
Collaborator:

Can we not catch the issue rather than the warnings?

if divisor == 0:
    return NaN

Or something? I guess it does the same thing, but it feels more honest. It also doesn't solve my problem that what I really want is for it to tell me why it has failed. Maybe we just need a metric for each, whatever that is: number of 0s, number of NaNs, etc.

Contributor (Author):

We can, but it would be more complicated than that since divisor can be a vector.

Yes, earlier I was trying to say we should probably have metrics for each algorithm's individual flag columns where they are meaningful (and in the hopefully rare case of bad values without a flag set), but see further comments below.
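For the record, a hedged sketch of what the explicit check could look like once vector divisors are handled (illustrative only; the PR suppresses the warnings instead):

import numpy as np


def divide_checked(dividend, divisor):
    """Return dividend/divisor with division by zero mapped to NaN explicitly.

    Handles both scalar and vector divisors, which is why a bare
    ``if divisor == 0`` is not enough.
    """
    dividend = np.asarray(dividend, dtype=float)
    divisor = np.asarray(divisor, dtype=float)
    # Start from NaN everywhere, then divide only where the divisor is nonzero.
    result = np.full(np.broadcast(dividend, divisor).shape, np.nan)
    np.divide(dividend, divisor, out=result, where=divisor != 0)
    return result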


def nansigmaMad(vector: Vector) -> Scalar:
"""Return the sigma_MAD of a vector."""
return cast(Scalar, sps.median_abs_deviation(vector, scale="normal", nan_policy="omit"))
Collaborator:

Is that the sigma MAD or just the MAD?

Contributor (Author):

scale="normal" makes it sigma_MAD.

@sr525 (Collaborator) commented Nov 27, 2023:

Would we be better off having a data clean up step before it gets fed to any actions? Something that makes a mask that covers all the NaNs etc and logs a message about how many there were, then we do the actions on clean data?

@taranu (Contributor, Author) commented Nov 27, 2023:

> Would we be better off having a data clean up step before it gets fed to any actions? Something that makes a mask that covers all the NaNs etc and logs a message about how many there were, then we do the actions on clean data?

As I said above, we should have metrics for algorithmic flag columns. For the most part, numpy errors are cropping up where we are not adding the appropriate flag column(s) to the standard Visit/CoaddFlagSelector.

Whether we should add selectors for flag columns is an interesting question. For example, for a size vs magnitude plot both axes have their own relevant flags. I think it would be useful to have a labelled summary with the number of sources with an x flag set, a y flag set, neither or both. This can get more complicated with extra columns (e.g. if there is a S/N selector using a column that isn't on either axis).
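A sketch of what such a labelled summary might look like (hypothetical helper and column names; nothing here is in the PR):

import numpy as np


def summarize_axis_flags(x_flag: np.ndarray, y_flag: np.ndarray) -> dict[str, int]:
    """Count sources by which axis flags are set, e.g. for a size vs.
    magnitude plot where each axis has its own failure flag."""
    return {
        "x_flag_only": int(np.sum(x_flag & ~y_flag)),
        "y_flag_only": int(np.sum(~x_flag & y_flag)),
        "both_flags": int(np.sum(x_flag & y_flag)),
        "neither_flag": int(np.sum(~x_flag & ~y_flag)),
    }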

@taranu force-pushed the tickets/DM-40633 branch 2 times, most recently from 8e9945a to 3c1b532 on November 30, 2023 21:25
@taranu force-pushed the tickets/DM-40633 branch 2 times, most recently from d62184d to f8c562a on December 7, 2023 20:41
@taranu merged commit edc5d77 into main on Dec 11, 2023
8 checks passed
@taranu deleted the tickets/DM-40633 branch on December 11, 2023 23:24