Skip to content

Comparison Tool 4.5: Grid of Pie Charts across Cooling Times#179

Merged
gonuke merged 25 commits intosvalinn:mainfrom
eitan-weinstein:pie_chart_grid
Dec 29, 2025
Merged

Comparison Tool 4.5: Grid of Pie Charts across Cooling Times#179
gonuke merged 25 commits intosvalinn:mainfrom
eitan-weinstein:pie_chart_grid

Conversation

@eitan-weinstein
Copy link
Contributor

Closes #174.

Expanding on pie chart functionality added in #145 to create a grid of pie charts for all cooling times for a given run and response variable. An example of what one of these grids looks like is attached:

f2566abd-cc67-4bfb-8073-c92917a88e47

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely formatted and put together here... I have a number of suggestions to

  • help separate concerns a little and
  • take advantage of dataframes without creating so many new data structures

# Build and populate table
table_ax = fig.add_subplot(gs[0,:])
table_ax.axis('off')
table_data = populate_table_rows(agg, all_nucs, times)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this can't just be a pivoted dataframe, instead of the list of lists that you generate? You could then use pandas.plotting.table() perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason other that I just didn't know about pandas.plotting.table() until now!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full disclosure to reveal my through process: I didn't know about it either. I just had a hunch that once we had a dataframe in the right shape (which pivoting does), it should be possible to sort it easily and then guessed that it was possible to generate a "pretty" version in output, so went hunting/googling for that....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just updated the PR with a bunch of changes, and after having messed around with the pandas.plotting.table() vs the matplotlib.pyplot.table(), I actually think that using the matplotlib version will work better because it allows for custom row and column labels, whereas the pandas version demanded the labels be the same as those from the input DataFrame, which would add unnecessary steps to arrive at the same end.

row_labels = [reformat_isotope(nuc) for nuc in all_nucs]

table_data_float = np.array([
[float(cell.strip('%')) for cell in row] for row in table_data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you add the "%" when you process the data only to remove it here. Why not just put the float data in the table in the first place?

@@ -87,7 +90,7 @@ def preprocess_data(

return times, filtered, piv
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In hindsight, it's not clear to me why return times since those times will be available in filtered (and piv). From a data integrity point of view, it always seems safest to extract the times from those dataframes rather than pass them around as a separate data object


return times, agg

def populate_table_rows(agg, all_nucs, times):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment below, but it's not clear why you need this method when you could just use a dataframe rather than a list of lists?

Comment on lines +356 to +365
# Create table gridspec
left = 0.05
top = 0.95
gs = fig.add_gridspec(
nrows + 1, ncols,
height_ratios=([1.0] + [1]*nrows),
hspace=0.4,
wspace=0.3,
left=left, right=(1-left), top=top, bottom=(1-top)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separation of concerns... I think the GridSpec should be generated at a different level, and then the specific table_ax passed into this method to populate it.

None
'''

times, agg = pie_chart_aggregation(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separation of concerns: could we have the user do this aggregation and then pass the result into the method to generate the pie chart?

Comment on lines +403 to +408
# Color in index
table[(i+1, -1)].set_facecolor(rgba)

# Color all other columns
for j in range(len(times)):
table[(i+1, j)].set_facecolor(rgba)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can be combined into a single loop

Suggested change
# Color in index
table[(i+1, -1)].set_facecolor(rgba)
# Color all other columns
for j in range(len(times)):
table[(i+1, j)].set_facecolor(rgba)
# Color all columns in this row
for j in range(len(times)+1):
table[(i+1, j-1)].set_facecolor(rgba)

table.scale(1.0, 2.0)

# Format table for readability
for i in range(len(row_labels)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer row and col vs i and j... or at least r and c

Comment on lines +426 to +429
if np.any(fractions == 1.0):
for nuc, frac in zip(time_slice['nuclide'], fractions):
labels.append(reformat_isotope(nuc) if frac == 1.0 else '')
return labels
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this edge case already covered by the case below, as long as you handle Other more carefully? Maybe the condition is:

if frac >= threshold or (nuc == 'Other' && frac > 0):

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more things to clarify...


return {lbl: cmap(i % cmap.N) for i, lbl in enumerate(sorted(all_nucs))}

def build_color_map_from_nucs(all_nucs, cmap_name):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious how this is substantially different from build_color_map_from_pivs?

None
'''

table_ax = fig.add_subplot(gs[0,:])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to do this before calling this method, and pass in table_ax. Then it's possible for someone to decide where in the gridspec if goes before calling this

plt.tight_layout(rect=[0, 0, 0.85, 1])
plt.show()

No newline at end of file
plt.close(fig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to close the figure? Is it a scope issue... because we created the figure here in this function?

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small changes

Comment on lines +106 to +115
if np.array(all_nucs).size == 0:
if pivs:
all_nucs = set()
for piv in pivs:
all_nucs.update(piv.index)

else:
raise ValueError(
'Must input either all_nucs or pivs.'
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if np.array(all_nucs).size == 0:
if pivs:
all_nucs = set()
for piv in pivs:
all_nucs.update(piv.index)
else:
raise ValueError(
'Must input either all_nucs or pivs.'
)
all_nucs = set(all_nucs)
if len(all_nucs) == 0:
for piv in pivs:
all_nucs.update(piv.index)
if len(all_nucs) == 0:
raise ValueError(
'Must input either all_nucs or pivs.'
)

times = sorted(agg['time'].unique())
N = len(times)
nrows = int(np.ceil(N / ncols))
fig = plt.figure(figsize=(5 * ncols + 3, 5 * (nrows + 0.7)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these magic numbers 5, 3, 0.7?

Copy link
Contributor Author

@eitan-weinstein eitan-weinstein Dec 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply formatting values that made the plot come out looking nice. Would you prefer they be optional parameters that the user can assign?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or give them some name that evokes some kind of meaning...

padding = 5
vspace = 3
hspace = 4.5


return wedges

def open_with_new_manager(fig):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this function?

Copy link
Member

@gonuke gonuke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks @eitan-weinstein

@gonuke gonuke merged commit dcc9657 into svalinn:main Dec 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Grid of multiple pie charts

2 participants