Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-33961: Generalize histogram plotting task to n-panels #19

Merged
merged 1 commit into from Mar 16, 2022

Conversation

leeskelvin
Copy link
Contributor

No description provided.

@leeskelvin leeskelvin force-pushed the tickets/DM-33961 branch 3 times, most recently from c470a5e to c6aa564 Compare March 14, 2022 16:09
for action in actionStruct:
for col in action.columns:
columnNames.add(col)
selectorColumn = [x for x in [x for x in self.config.selectorActions][0].columns][0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can x be something more descriptive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this line to:

selectorColumns = [col for selector in self.config.selectorActions for col in selector.columns]

@@ -237,20 +199,20 @@ def run(self, catPlot, dataId, runName, skymap, tableName, bands, plotName):
# Get useful information about the plot
plotInfo = parsePlotInfo(dataId, runName, tableName, bands, plotName, SN)
# Calculate the corners of the patches and some associated stats
sumStats = generateSummaryStats(plotDf, self.config.summaryStatsLabel, skymap, plotInfo)
sumStats = generateSummaryStats(datPlot, self.config.summaryStatsColumn, skymap, plotInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the name get changed to datPlot? It would be nice to have consistency across all the plot types.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed back to plotDf.

# Corner plot of patches showing summary stat in each
axCorner = plt.gcf().add_subplot(gs[70:96, 3:21])
gs = gridspec.GridSpec(240, 240)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to do this not with gridspec? Then you can just make an n by m set of plots, is it so that they can be different widths?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that doing it without gridspec would make it easier to read and involve less faffing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comment Sophie. I played around with this for a long time, as initially, I agreed with you. In the end however, I discovered that doing what we want to do using subplots is actually more of a faff than using gridspec. For an N-panel figure, with a side panel axis reserved for the corner plot, the logic will always be a little complicated no matter if you use gridspec or subplots. If you don't object too strongly, and as this currently works as expected, I'll recommend keeping this as it is for now.

else:
ncols = 2
nrows = int(np.ceil(len(panels) / ncols))
col_bounds = np.linspace(0+summary_block+panel_pad, 240, ncols+1, dtype=int)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and throughout, did you run PEP8?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire histPlot.py script has been PEP8'ed. With the exception of E501 line too long, all PEP8 violations have been resolved.

for count, (hist, column) in enumerate(zip(hists, columns)):
isfinite = np.isfinite(datPlot[column])
datMed = np.nanmedian(datPlot[column][isfinite])
datMad = sigmaMad(datPlot[column][isfinite])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should omit the NaNs from this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The median values are now calculated further up the script, alongside the x-axis limits. The relevant median line now looks like this:

meds.append(np.nanmedian(hist_data))

I've removed the [isfinite] selection from the data selector, so use of nanmedian is required in case any non-finite values creep into the calculation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I went back to using the isfinite logic. This section now looks like this:

isfinite = np.isfinite(hist_data)
meds.append(np.median(hist_data[isfinite]))
mads.append(sigmaMad(hist_data[isfinite]))
nums.append(np.sum(isfinite))

col_starts, col_stops, row_starts, row_stops = [], [], [], []
for i in range(len(panels)):
col_starts.append(col_bounds[i%ncols]+panel_pad)
if (i == (len(panels)-1)) and (len(panels)%2 == 1):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces

for i in range(len(panels)):
col_starts.append(col_bounds[i%ncols]+panel_pad)
if (i == (len(panels)-1)) and (len(panels)%2 == 1):
col_stops.append(np.max(col_bounds)-panel_pad)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces

if (i == (len(panels)-1)) and (len(panels)%2 == 1):
col_stops.append(np.max(col_bounds)-panel_pad)
else:
col_stops.append(col_bounds[(i%ncols)+1]-panel_pad)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spaces, I know you wanted a comment everytime there is an error but there are too many.

# add legends
leg1 = ax.legend(loc="upper left", fontsize=7, frameon=False, handlelength=0, borderaxespad=0,
handletextpad=0, handleheight=0.7, framealpha=0.4)
_ = plt.gca().add_artist(leg1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need a return thing here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire legend section has now been refactored to something much simpler, using ax.text.

text.set_text(text.get_text().split(":::")[1])
text.set_color(plt.cm.tab10(count))
renderer = fig.canvas.get_renderer()
tlens = [t.get_window_extent(renderer).width for t in leg2.get_texts()]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments here, and in other places, would be useful to know what is going on.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section has been simplified. Additionally, a number of in-line comments have been added explaining the various steps of data processing.

@leeskelvin leeskelvin force-pushed the tickets/DM-33961 branch 3 times, most recently from 245fd5d to 1a73ec3 Compare March 16, 2022 06:14
@leeskelvin
Copy link
Contributor Author

Thanks for the above Sophie, much appreciated. In addition, you left me a couple of comments on Slack too:

If you have generalised this to multiple panels and different things are being plotted on each then does the corner plot make sense anymore?

Good question, and the answer will sometimes certainly be: definitely not. I propose that a future ticket adds a toggleable True/False flag asking if the user wants a corner plot to be added or not. For the purposes of this ticket however, I propose to maintain the functionality of the current task and leave it in.

the fontsize (for stats) could be smaller.

If I make it much smaller, it looks too small on the 1- and 2-panel plots. I could make this user-configurable however?

Thanks again. The whole thing has been pep-8'ed with pycodestyle, has passed the linter here on GH, and is currently running on Jenkins. Will post more on the Jira ticket.

@leeskelvin
Copy link
Contributor Author

Oh, and I forgot the most important question:

is there a way to add star/galaxy information?

Yes. I've now added a selectors property into the HistPlotConfig class which allows the user to specify a selector for each plotted histogram. The global selectors (selectorActions) still exist, and will still trim the data down across the entire plot. However, if the user wants to have unique selectors for each histogram, that is now possible. I'll attach an example of this to the Jira ticket, for reference. I've added the plotting task that produces this figure into the coaddQAPlots.yaml pipeline, for reference.

@leeskelvin
Copy link
Contributor Author

PS, you also mentioned in passing that it would be nice to change the colour scheme. I can also make this a user-configurable option, if desirable?

@leeskelvin leeskelvin force-pushed the tickets/DM-33961 branch 3 times, most recently from 9ebc971 to e53786b Compare March 16, 2022 16:36
@leeskelvin leeskelvin merged commit e8198aa into main Mar 16, 2022
@leeskelvin leeskelvin deleted the tickets/DM-33961 branch March 16, 2022 16:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants