Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-37075: Create sky object plots including GaaP fluxes and band ratios #48

Merged
merged 5 commits into from Dec 23, 2022

Conversation

laurenam
Copy link
Contributor

@laurenam laurenam commented Dec 6, 2022

No description provided.

self.produce.panels["panel_flux"].hists = dict(
hist_psf_flux="psfFlux", hist_ap09_flux="ap09Flux", hist_gaap1p0_flux="gaap1p0Flux"
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you want this newline here do you(?), if you want to keep all of the "panel_flux" produce code block together (as below with the "panel_sn" produce code block).

def validate(self):
super().validate()
if self.histDensity and self.expectedValue is None:
raise FieldValidationError("Must provide expectedValue if histDensity is True")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me at least, this line of code fails, giving this error:

TypeError: FieldValidationError.__init__() missing 2 required positional arguments: 'config' and 'msg'

I'm not super familiar with this error, but grepping around the rest of the stack, it seems you need to raise it by passing in three args:

raise FieldValidationError(field, config, msg)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching that! It's a new error type for me too and I had very good intentions of testing it... 😔

I now get, e.g.:

FieldValidationError: Field 'produce.panels['panel_gaapSn'].referenceValue' failed validation: Must provide referenceValue if histDensity is True
For more information see the Field definition at:
  File pex/config/config.py:104 (__call__) and the Config definition at:
  File analysis/tools/actions/plot/histPlot.py:50 (<module>)

@@ -163,7 +217,7 @@ def _makeAxes(self, fig):
ncols = 2
nrows = int(np.ceil(num_panels / ncols))

gs = GridSpec(nrows, ncols, left=0.13, right=0.99, bottom=0.1, top=0.88, wspace=0.25, hspace=0.45)
gs = GridSpec(nrows, ncols, left=0.12, right=0.99, bottom=0.1, top=0.88, wspace=0.31, hspace=0.45)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably also recommend reformatting this line too:

        gs = GridSpec(
            nrows,
            ncols,
            left=0.12,
            right=0.99,
            bottom=0.1,
            top=0.88,
            wspace=0.31,
            hspace=0.45,
        )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, this one slipped through my net. I had my local black formatter set up incorrectly with regards line length. You can change this back to:

gs = GridSpec(nrows, ncols, left=0.12, right=0.99, bottom=0.1, top=0.88, wspace=0.31, hspace=0.45)

if you'd prefer? Sorry for the mix-up - I thought I'd caught all of these comments and removed them, but missed this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I figured you were giving me your preferred aesthetic for formatting (the PR checks had all passed, so we already knew black was “happy”). I don’t really have a preference on this one…it’s so close to being at max length that having it this way is arguably preferable in case anyone adds to it, so I’m inclined to leave the update?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to get back in there, so I did switch it back...

@@ -66,6 +73,21 @@ class HistPanel(Config):
"is plotted, the percentile limit is the maximum value across all input data.",
default=98.0,
)
expectedValue = Field[float](
doc="Value at which to add a black solid vertical line. Ignored if set to None.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double space after period.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, 'cuz I'm down with the double space after period, and also, from: https://peps.python.org/pep-0008/#comments

You should use two spaces after a sentence-ending period in multi- sentence comments, except after the final sentence.

I could argue that the precedent is set for almost every file in the stack by the template preamble: https://github.com/lsst/templates/blob/main/file_templates/stack_license_preamble_py/example.py
...but I will change to single space if you insist 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stand corrected!! Perhaps the template needs updating to meet our standards 😆 (and I did try to search for this in the dev guide...my key word "period" resulted in me missing this!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the offending double spaces should now be removed!

optional=True,
)
histDensity = Field[bool](
doc="Whether to plot the histogram as a normalized probability distribution. Must also "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double space after period.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto 😆

panel_range = [minMed - 3.5 * maxMad, maxMed + 3.5 * maxMad]
if panel_range[1] - panel_range[0] == 0:
log.info(
"NOTE: panel_range for {} based on med/sigMad was 0. Computing using "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double space after period.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

# add a buffer to the top of the plot to allow headspace for labels
ylims = list(ax.get_ylim())
if ax.get_yscale() == "log":
ylims[1] = 10 ** (np.log10(ylims[1]) * 1.1)
else:
ylims[1] *= 1.1
ax.set_ylim(ylims[0], ylims[1])

# Draw a vertical line at expected value, if given. If histDensity
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double space after period.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

yAnchor0=0.0,
nth_row=0,
nth_col=0,
ncols=1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ncols arg doesn't seem to be being used in this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

nth_col -= 1
# Set some font sizes based on number of panels being plotted.
label_font_size = max(6, 10 - nrows)
legend_font_size = max(4, int(8 - len(self.panels[panel].hists) / 2 - nrows // 2)) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need a # type: ignore here? Looks fine on my side?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get:

$ mypy python/lsst/analysis/tools/actions/plot/histPlot.py 
python/lsst/analysis/tools/actions/plot/histPlot.py:196: error: <nothing> has no attribute "hists"  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

if I take it out, so leaving it as is.

Comment on lines 167 to 196
label_font_size = max(6, 10 - nrows)
legend_font_size = max(4, int(8 - len(self.panels[panel].hists) / 2 - nrows // 2)) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do both label_font_size and legend_font_size need to be set inside this for loop? Can they be set once outside it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the self.panels[panel].hists - that certainly does need to be assigned for each panel. Still not sure about label_font_size however.


def _makePanel(self, data, panel, ax, col, **kwargs):
def _makePanel(self, data, panel, ax, colors, label_font_size=9, legend_font_size=7, ncols=1, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, it doesn't look like these kwargs are passed anywhere, so perhaps they can be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing...this was preexisting, so I didn't remove it, but if you think it should go, I'm on board!

Comment on lines 297 to 312
if self.panels[panel].doPercentileRange:
panel_range = self._getPercentilePanelRange(data, panel)
else:
# Set the panel range to be extend 3.5 times the maximum sigmaMad
# for the datasets in the panel to the left[right] from the
# minimum[maximum] median value of all datasets in the panel.
maxMad = np.nanmax(mads)
maxMed = np.nanmax(meds)
minMed = np.nanmin(meds)
panel_range = [minMed - 3.5 * maxMad, maxMed + 3.5 * maxMad]
if panel_range[1] - panel_range[0] == 0:
log.info(
"NOTE: panel_range for {} based on med/sigMad was 0. Computing using "
"percentile range instead.".format(panel)
)
panel_range = self._getPercentilePanelRange(data, panel)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments here:

  1. I think much of this panel_range logic needs to be moved back into the method previously called getPanelRange (renamed here to getPercentilePanelRange). Having all the get-my-plotting-range logic in one method makes more sense to my eye.

  2. We probably also want to add an absolute range user input option, in addition to percentile scaling and sigma-mad scaling. This would help us reproduce plots with a fixed range on the x-axis, to aid in cross-comparisons. To achieve this, I'd recommend replacing doPercentileRange with a string field named something like rangeType. The default for this should be percent, with lower = 0 and upper = 100 (i.e., plotting everything, which probably makes the most sense as a generic default). We could use match/case logic to select the appropriate matching code block. This then allows us to think about:

  3. I think the sigma-mad range also needs to be configurable. To achieve this here, I would recommend changing pLower and pUpper to more generic labels, such as lower and upper. Thus, if the user sets rangeType to sigma, then lower would be the lower sigma range (and similarly for upper).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big yes to all of the above. Have a look at it now and see what you think.

@@ -66,6 +73,21 @@ class HistPanel(Config):
"is plotted, the percentile limit is the maximum value across all input data.",
default=98.0,
)
expectedValue = Field[float](
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a better generic name would be referenceValue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To put this into context somewhat, if I want to add a line to my RA histogram plot to show a particular RA of interest, I'm not sure that expectedValue would seem to best describe that line of reference.

if self.panels[panel].histDensity:
expected_label = None
else:
expected_label = "${{\\mu_{{expected}}}}$: {}".format(self.panels[panel].expectedValue)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why mu_expected? If I want to add a generic reference line at RA=300 deg, I think it might be confusing to have this black line labeled as mu_expected. Does this need a string label at all, i.e., will just the line and the value suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've change expected -> reference. I'm inclined to leave the label in hopes of making sure it's clear that this curve is not data-based.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As head space at the top of these panels is in high demand, would mu_ref work for you? Not a deal breaker if not, but I think syncs nicely with P_norm opposite.

Comment on lines 351 to 380
if self.panels[panel].expectedValue is not None:
ax2 = ax.twinx()
ax2.axis("off")
ax2.set_xlim(ax.get_xlim())
ax2.set_ylim(ax.get_ylim())

if self.panels[panel].histDensity:
expected_label = None
else:
expected_label = "${{\\mu_{{expected}}}}$: {}".format(self.panels[panel].expectedValue)
ax2.axvline(
self.panels[panel].expectedValue, ls="-", lw=1, c="black", zorder=0, label=expected_label
)
if self.panels[panel].histDensity:
ref_x = np.arange(panel_range[0], panel_range[1], (panel_range[1] - panel_range[0]) / 100.0)
ref_mean = self.panels[panel].expectedValue
ref_std = 1.0
ref_y = (
1.0
/ (ref_std * np.sqrt(2.0 * np.pi))
* np.exp(-((ref_x - ref_mean) ** 2) / (2.0 * ref_std**2))
)
ax2.fill_between(ref_x, ref_y, alpha=0.1, color="black", label="P$_{{norm}}(0,1)$", zorder=-1)
# Make sure the y-axis extends beyond the data plotted and that
# both axes y-ranges are in sync.
y_max = max(max(ref_y), ax2.get_ylim()[1])
if ax2.get_ylim()[1] < 1.05 * y_max:
ax.set_ylim(ax.get_ylim()[0], 1.05 * y_max)
ax2.set_ylim(ax.get_ylim())
ax2.legend(fontsize=legend_font_size, handlelength=1.5, loc="upper right", frameon=False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to move everything below if self.panels[panel].expectedValue is not None: into its own separate private method? Something like _addReferenceLines or similar? I think that would help the legibility of this method a fair amount.

@laurenam laurenam force-pushed the tickets/DM-37075 branch 5 times, most recently from e967c84 to c72ca5b Compare December 21, 2022 23:32
"the values of lowerRange and upperRange.",
allowed={
"percentile": "Upper and lower percentile ranges of the data.",
"sigmaMad": "Range is (sigmaMad - lowerRange*sigmiMad, sigmaMad + upperRange*sigmaMad).",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sigmaMad typo

Copy link
Contributor

@leeskelvin leeskelvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! Only a couple of new comments above - 1 typo and 1 reference to ref renaming suggestion. If you are diving in and feel like resolving that erroneously long line-wrap issue too, please do! Thanks for this, these are great updates!

This updates the histPlot action to allow for finer grained control
over the plot ranges.  The "rangeType" can now be selected as one of
"percentile", "sigmaMad", or "fixed".  The lower and upper bounds of
the range will then be set accordingly by the values in lowerRange and
upperRange as follows:

"percentile": (lowerRange percentile of data,
               upperRange percentile of data)

"sigmaMad": (min(medians) - lowerRange*max(sigmaMads),
             max(medians) + upperRange*max(sigmaMads))

"fixed": (lowerRange, upperRange)

It also updates the way the right-hand statistics legends are added
such that each entry gets a title (set to the x-axis label) and the
legends line up (reasonably well for up to ~6 panels with <~ 4
datasets per panel) with their respective rows.

In order to avoid confusion between col for color vs. column, all
references to col refer to the latter, and color is spelled out for
the former.
Also overplot a shaded curve representing the idealized Pnorm(0, 1)
distribution for reference.
This updates the skyObject and skySource analysisPlot classes to
adapt to the new range setting configs in the HistPlot action.
This also adds vertical solid black lines at a "reference" value
(0.0 here) in both the flux and S/N panels as well as a shaded
PDF(0, 1) distribution on the now density normalized S/N plots
for reference.

This also adds the GaaP 1p0 flux to the skyObject histograms (GaaP
fluxes are not measured for the visit-level sky sources).
@laurenam laurenam merged commit fcebaf1 into main Dec 23, 2022
@laurenam laurenam deleted the tickets/DM-37075 branch December 23, 2022 04:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants