DM-37075: Create sky object plots including GaaP fluxes and band ratios #48

laurenam · 2022-12-06T00:05:42Z

No description provided.

leeskelvin · 2022-12-20T15:34:52Z

python/lsst/analysis/tools/analysisPlots/skySource.py

+        self.produce.panels["panel_flux"].hists = dict(
+            hist_psf_flux="psfFlux", hist_ap09_flux="ap09Flux", hist_gaap1p0_flux="gaap1p0Flux"
+        )
+


I don't think you want this newline here, do you? Dropping it would keep all of the "panel_flux" produce code block together (as below with the "panel_sn" produce code block).

leeskelvin · 2022-12-20T15:41:52Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+    def validate(self):
+        super().validate()
+        if self.histDensity and self.expectedValue is None:
+            raise FieldValidationError("Must provide expectedValue if histDensity is True")


For me at least, this line of code fails, giving this error:

TypeError: FieldValidationError.__init__() missing 2 required positional arguments: 'config' and 'msg'

I'm not super familiar with this error, but grepping around the rest of the stack, it seems you need to raise it by passing in three args:

raise FieldValidationError(field, config, msg)

Thanks for catching that! It's a new error type for me too and I had very good intentions of testing it... 😔

I now get, e.g.:

FieldValidationError: Field 'produce.panels['panel_gaapSn'].referenceValue' failed validation: Must provide referenceValue if histDensity is True
For more information see the Field definition at:
  File pex/config/config.py:104 (__call__) and the Config definition at:
  File analysis/tools/actions/plot/histPlot.py:50 (<module>)
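For illustration, a minimal standalone sketch of the failure mode and the fix. The FieldValidationError class here is a stand-in written for this example: it only mimics the three-argument (field, config, msg) signature quoted above and is not the real lsst.pex.config class. HistPanelSketch is likewise a hypothetical name, and passing the field as a plain string is a simplification.

```python
class FieldValidationError(ValueError):
    """Stand-in exception with the (field, config, msg) signature.

    Assumption: this is NOT the lsst.pex.config class, only a mimic of the
    call signature discussed in the thread.
    """

    def __init__(self, field, config, msg):
        self.field = field
        self.config = config
        super().__init__(f"Field {field!r} failed validation: {msg}")


class HistPanelSketch:
    """Hypothetical, simplified panel config used only for this example."""

    def __init__(self, histDensity=False, referenceValue=None):
        self.histDensity = histDensity
        self.referenceValue = referenceValue

    def validate(self):
        if self.histDensity and self.referenceValue is None:
            # All three arguments are supplied. Passing only the message
            # would fail with a TypeError before the intended error is raised.
            # Simplification: the field is identified by name here.
            raise FieldValidationError(
                "referenceValue",
                self,
                "Must provide referenceValue if histDensity is True",
            )
```

Calling `FieldValidationError("message only")` against this stand-in reproduces the "missing 2 required positional arguments" TypeError, while `HistPanelSketch(histDensity=True).validate()` raises the intended validation error.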

leeskelvin · 2022-12-20T15:45:23Z

python/lsst/analysis/tools/actions/plot/histPlot.py

@@ -163,7 +217,7 @@ def _makeAxes(self, fig):
            ncols = 2
        nrows = int(np.ceil(num_panels / ncols))

-        gs = GridSpec(nrows, ncols, left=0.13, right=0.99, bottom=0.1, top=0.88, wspace=0.25, hspace=0.45)
+        gs = GridSpec(nrows, ncols, left=0.12, right=0.99, bottom=0.1, top=0.88, wspace=0.31, hspace=0.45)


I'd probably also recommend reformatting this line too:

gs = GridSpec( nrows, ncols, left=0.12, right=0.99, bottom=0.1, top=0.88, wspace=0.31, hspace=0.45, )

Apologies, this one slipped through my net. I had my local black formatter set up incorrectly with regard to line length. You can change this back to:

gs = GridSpec(nrows, ncols, left=0.12, right=0.99, bottom=0.1, top=0.88, wspace=0.31, hspace=0.45)

if you'd prefer? Sorry for the mix-up - I thought I'd caught all of these comments and removed them, but missed this one.

Oh, I figured you were giving me your preferred aesthetic for formatting (the PR checks had all passed, so we already knew black was “happy”). I don’t really have a preference on this one…it’s so close to being at max length that having it this way is arguably preferable in case anyone adds to it, so I’m inclined to leave the update?

I had to get back in there, so I did switch it back...

leeskelvin · 2022-12-20T17:57:55Z

python/lsst/analysis/tools/actions/plot/histPlot.py

@@ -66,6 +73,21 @@ class HistPanel(Config):
        "is plotted, the percentile limit is the maximum value across all input data.",
        default=98.0,
    )
+    expectedValue = Field[float](
+        doc="Value at which to add a black solid vertical line.  Ignored if set to None.",


Double space after period.

Yeah, 'cuz I'm down with the double space after period, and also, from: https://peps.python.org/pep-0008/#comments

You should use two spaces after a sentence-ending period in multi-sentence comments, except after the final sentence.

I could argue that the precedent is set for almost every file in the stack by the template preamble: https://github.com/lsst/templates/blob/main/file_templates/stack_license_preamble_py/example.py
...but I will change to single space if you insist 😉

👉 https://developer.lsst.io/python/style.html#sentences-in-comments-should-not-be-separated-by-double-spaces 😉 😄

I stand corrected!! Perhaps the template needs updating to meet our standards 😆 (and I did try to search for this in the dev guide...my key word "period" resulted in me missing this!)

All of the offending double spaces should now be removed!

leeskelvin · 2022-12-20T17:58:07Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+        optional=True,
+    )
+    histDensity = Field[bool](
+        doc="Whether to plot the histogram as a normalized probability distribution.  Must also "


Double space after period.

leeskelvin · 2022-12-20T17:58:45Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+            panel_range = [minMed - 3.5 * maxMad, maxMed + 3.5 * maxMad]
+            if panel_range[1] - panel_range[0] == 0:
+                log.info(
+                    "NOTE: panel_range for {} based on med/sigMad was 0.  Computing using "


Double space after period.

leeskelvin · 2022-12-20T17:59:00Z

python/lsst/analysis/tools/actions/plot/histPlot.py

        # add a buffer to the top of the plot to allow headspace for labels
        ylims = list(ax.get_ylim())
        if ax.get_yscale() == "log":
            ylims[1] = 10 ** (np.log10(ylims[1]) * 1.1)
        else:
            ylims[1] *= 1.1
        ax.set_ylim(ylims[0], ylims[1])
+
+        # Draw a vertical line at expected value, if given.  If histDensity


Double space after period.

leeskelvin · 2022-12-20T19:30:20Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+        yAnchor0=0.0,
+        nth_row=0,
+        nth_col=0,
+        ncols=1,


This ncols arg doesn't seem to be used in this method?

Good catch!

leeskelvin · 2022-12-20T20:09:07Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+                nth_col -= 1
+            # Set some font sizes based on number of panels being plotted.
+            label_font_size = max(6, 10 - nrows)
+            legend_font_size = max(4, int(8 - len(self.panels[panel].hists) / 2 - nrows // 2))  # type: ignore


Does this need a # type: ignore here? It looks fine on my side.

I get:

$ mypy python/lsst/analysis/tools/actions/plot/histPlot.py
python/lsst/analysis/tools/actions/plot/histPlot.py:196: error: <nothing> has no attribute "hists" [attr-defined]
Found 1 error in 1 file (checked 1 source file)

if I take it out, so leaving it as is.

leeskelvin · 2022-12-20T20:09:58Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+            label_font_size = max(6, 10 - nrows)
+            legend_font_size = max(4, int(8 - len(self.panels[panel].hists) / 2 - nrows // 2))  # type: ignore


Why do both label_font_size and legend_font_size need to be set inside this for loop? Can they be set once outside it?

Ah, I missed the self.panels[panel].hists - that certainly does need to be assigned for each panel. Still not sure about label_font_size however.
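To make the distinction concrete, here is a small sketch using the two formulas from the diff. The function name compute_font_sizes and the hists_per_panel mapping (panel name to number of histograms) are hypothetical, introduced only for this example. It shows that label_font_size depends only on nrows and so can be computed once, while legend_font_size has to be evaluated per panel.

```python
def compute_font_sizes(nrows, hists_per_panel):
    """Return (label_font_size, {panel_name: legend_font_size}).

    nrows: number of panel rows in the figure.
    hists_per_panel: hypothetical mapping of panel name -> histogram count.
    """
    # Depends only on nrows, so it can live outside the per-panel loop.
    label_font_size = max(6, 10 - nrows)

    # Depends on the number of histograms in each panel, so it is per panel.
    legend_font_sizes = {}
    for panel, n_hists in hists_per_panel.items():
        legend_font_sizes[panel] = max(4, int(8 - n_hists / 2 - nrows // 2))

    return label_font_size, legend_font_sizes
```

For example, `compute_font_sizes(2, {"panel_flux": 3, "panel_sn": 2})` gives one shared label size and a separate legend size for each of the two panels.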

leeskelvin · 2022-12-20T20:17:01Z

python/lsst/analysis/tools/actions/plot/histPlot.py


-    def _makePanel(self, data, panel, ax, col, **kwargs):
+    def _makePanel(self, data, panel, ax, colors, label_font_size=9, legend_font_size=7, ncols=1, **kwargs):


As far as I can see, it doesn't look like these kwargs are passed anywhere, so perhaps they can be removed?

Sure thing...this was preexisting, so I didn't remove it, but if you think it should go, I'm on board!

leeskelvin · 2022-12-20T20:39:11Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+        if self.panels[panel].doPercentileRange:
+            panel_range = self._getPercentilePanelRange(data, panel)
+        else:
+            # Set the panel range to be extend 3.5 times the maximum sigmaMad
+            # for the datasets in the panel to the left[right] from the
+            # minimum[maximum] median value of all datasets in the panel.
+            maxMad = np.nanmax(mads)
+            maxMed = np.nanmax(meds)
+            minMed = np.nanmin(meds)
+            panel_range = [minMed - 3.5 * maxMad, maxMed + 3.5 * maxMad]
+            if panel_range[1] - panel_range[0] == 0:
+                log.info(
+                    "NOTE: panel_range for {} based on med/sigMad was 0.  Computing using "
+                    "percentile range instead.".format(panel)
+                )
+                panel_range = self._getPercentilePanelRange(data, panel)


A few comments here:

I think much of this panel_range logic needs to be moved back into the method previously called getPanelRange (renamed here to getPercentilePanelRange). Having all the get-my-plotting-range logic in one method makes more sense to my eye.

We probably also want to add an absolute range user input option, in addition to percentile scaling and sigma-mad scaling. This would help us reproduce plots with a fixed range on the x-axis, to aid in cross-comparisons. To achieve this, I'd recommend replacing doPercentileRange with a string field named something like rangeType. The default for this should be percent, with lower = 0 and upper = 100 (i.e., plotting everything, which probably makes the most sense as a generic default). We could use match/case logic to select the appropriate matching code block. This then allows us to think about:

I think the sigma-mad range also needs to be configurable. To achieve this here, I would recommend changing pLower and pUpper to more generic labels, such as lower and upper. Thus, if the user sets rangeType to sigma, then lower would be the lower sigma range (and similarly for upper).

Big yes to all of the above. Have a look at it now and see what you think.

leeskelvin · 2022-12-20T20:40:06Z

python/lsst/analysis/tools/actions/plot/histPlot.py

@@ -66,6 +73,21 @@ class HistPanel(Config):
        "is plotted, the percentile limit is the maximum value across all input data.",
        default=98.0,
    )
+    expectedValue = Field[float](


Perhaps a better generic name would be referenceValue?

To put this into context somewhat, if I want to add a line to my RA histogram plot to show a particular RA of interest, I'm not sure that expectedValue would seem to best describe that line of reference.

leeskelvin · 2022-12-20T20:48:20Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+            if self.panels[panel].histDensity:
+                expected_label = None
+            else:
+                expected_label = "${{\\mu_{{expected}}}}$: {}".format(self.panels[panel].expectedValue)


Why mu_expected? If I want to add a generic reference line at RA=300 deg, I think it might be confusing to have this black line labeled as mu_expected. Does this need a string label at all, i.e., will just the line and the value suffice?

I've changed expected -> reference. I'm inclined to leave the label in hopes of making sure it's clear that this curve is not data-based.

As head space at the top of these panels is in high demand, would mu_ref work for you? Not a deal breaker if not, but I think it syncs nicely with P_norm opposite.

leeskelvin · 2022-12-20T20:57:31Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+        if self.panels[panel].expectedValue is not None:
+            ax2 = ax.twinx()
+            ax2.axis("off")
+            ax2.set_xlim(ax.get_xlim())
+            ax2.set_ylim(ax.get_ylim())
+
+            if self.panels[panel].histDensity:
+                expected_label = None
+            else:
+                expected_label = "${{\\mu_{{expected}}}}$: {}".format(self.panels[panel].expectedValue)
+            ax2.axvline(
+                self.panels[panel].expectedValue, ls="-", lw=1, c="black", zorder=0, label=expected_label
+            )
+            if self.panels[panel].histDensity:
+                ref_x = np.arange(panel_range[0], panel_range[1], (panel_range[1] - panel_range[0]) / 100.0)
+                ref_mean = self.panels[panel].expectedValue
+                ref_std = 1.0
+                ref_y = (
+                    1.0
+                    / (ref_std * np.sqrt(2.0 * np.pi))
+                    * np.exp(-((ref_x - ref_mean) ** 2) / (2.0 * ref_std**2))
+                )
+                ax2.fill_between(ref_x, ref_y, alpha=0.1, color="black", label="P$_{{norm}}(0,1)$", zorder=-1)
+                # Make sure the y-axis extends beyond the data plotted and that
+                # both axes y-ranges are in sync.
+                y_max = max(max(ref_y), ax2.get_ylim()[1])
+                if ax2.get_ylim()[1] < 1.05 * y_max:
+                    ax.set_ylim(ax.get_ylim()[0], 1.05 * y_max)
+                    ax2.set_ylim(ax.get_ylim())
+            ax2.legend(fontsize=legend_font_size, handlelength=1.5, loc="upper right", frameon=False)


Would it be possible to move everything below if self.panels[panel].expectedValue is not None: into its own separate private method? Something like _addReferenceLines or similar? I think that would help the legibility of this method a fair amount.
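As a sketch of how part of that block could be factored out, the reference-curve arithmetic can live in a small helper so the plotting method only has to draw the result. The names normal_pdf and reference_curve are hypothetical, and this version uses the standard library instead of numpy; it mirrors the unit-normal formula and the 100-step sampling in the diff above.

```python
import math


def normal_pdf(x, mean=0.0, std=1.0):
    """Probability density of a normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std**2)) / (
        std * math.sqrt(2.0 * math.pi)
    )


def reference_curve(x_min, x_max, mean=0.0, std=1.0, num=100):
    """Sample the normal PDF at `num` evenly spaced points in [x_min, x_max).

    Returns (xs, ys) as lists; the upper end point is excluded, as with a
    fixed-step range.
    """
    step = (x_max - x_min) / num
    xs = [x_min + i * step for i in range(num)]
    ys = [normal_pdf(x, mean, std) for x in xs]
    return xs, ys
```

A private method such as the suggested _addReferenceLines could then call a helper like this and pass the returned values to the axis drawing calls, keeping the plotting code and the arithmetic separate.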

leeskelvin · 2022-12-22T03:10:41Z

python/lsst/analysis/tools/actions/plot/histPlot.py

+        "the values of lowerRange and upperRange.",
+        allowed={
+            "percentile": "Upper and lower percentile ranges of the data.",
+            "sigmaMad": "Range is (sigmaMad - lowerRange*sigmiMad, sigmaMad + upperRange*sigmaMad).",


sigmaMad typo

leeskelvin

This LGTM! Only a couple of new comments above - 1 typo and 1 reference to ref renaming suggestion. If you are diving in and feel like resolving that erroneously long line-wrap issue too, please do! Thanks for this, these are great updates!

This updates the histPlot action to allow for finer grained control over the plot ranges. The "rangeType" can now be selected as one of "percentile", "sigmaMad", or "fixed". The lower and upper bounds of the range will then be set accordingly by the values in lowerRange and upperRange as follows:

"percentile": (lowerRange percentile of data, upperRange percentile of data)
"sigmaMad": (min(medians) - lowerRange*max(sigmaMads), max(medians) + upperRange*max(sigmaMads))
"fixed": (lowerRange, upperRange)

It also updates the way the right-hand statistics legends are added such that each entry gets a title (set to the x-axis label) and the legends line up (reasonably well for up to ~6 panels with <~ 4 datasets per panel) with their respective rows.

In order to avoid confusion between col for color vs. column, all references to col refer to the latter, and color is spelled out for the former.

Also overplot a shaded curve representing the idealized Pnorm(0, 1) distribution for reference.

This updates the skyObject and skySource analysisPlot classes to adapt to the new range setting configs in the HistPlot action. This also adds vertical solid black lines at a "reference" value (0.0 here) in both the flux and S/N panels as well as a shaded PDF(0, 1) distribution on the now density normalized S/N plots for reference. This also adds the GaaP 1p0 flux to the skyObject histograms (GaaP fluxes are not measured for the visit-level sky sources).

laurenam force-pushed the tickets/DM-37075 branch 8 times, most recently from 232701a to 14d6fd2 Compare December 10, 2022 23:42

laurenam requested a review from leeskelvin December 14, 2022 23:18

laurenam force-pushed the tickets/DM-37075 branch from 14d6fd2 to f518a3c Compare December 14, 2022 23:54

leeskelvin reviewed Dec 20, 2022

laurenam force-pushed the tickets/DM-37075 branch 5 times, most recently from e967c84 to c72ca5b Compare December 21, 2022 23:32

laurenam force-pushed the tickets/DM-37075 branch from c72ca5b to 6688dc4 Compare December 22, 2022 01:23

leeskelvin reviewed Dec 22, 2022

leeskelvin approved these changes Dec 22, 2022

laurenam added 4 commits December 22, 2022 12:31

Add logging to HistPlot action

0b31894

Add histPlot option to draw line at a reference value

94ff89c

Add option to normalize the histogram as a PDF

0011b21

Also overplot a shaded curve representing the idealized Pnorm(0, 1) distribution for reference.

laurenam force-pushed the tickets/DM-37075 branch from 6688dc4 to d44513f Compare December 22, 2022 20:36

laurenam force-pushed the tickets/DM-37075 branch from d44513f to d6000fe Compare December 23, 2022 00:58

laurenam merged commit fcebaf1 into main Dec 23, 2022

laurenam deleted the tickets/DM-37075 branch December 23, 2022 04:33
