Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ptable plotters and add ptable_heatmap with diagonally-split tiles #131

Merged
merged 103 commits into from May 4, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Apr 6, 2024

closes #130

Summary

Good news first

  • New implementation for plotting nested ptable of course :) everything is now modular, I'm feeling very happy.
  • The requested evenly-split ptable :)

Breaking changes

  • ptable_plots replaced by ptable_lines and ptable_scatters with the new implementation, kind of good news :)
  • Use *_kwargs over *_kwds for consistency, previously a random mixture of both were used.

Urgently needed fixes:

Not-so-urgent fixes/enhancement:

  • data_preprocessor doesn't support missing value/infinity handling yet.
  • Replace other ptable plotters with new implementation.
  • Finalize data preprocessing tool
  • Add instruction for custom child plotters

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 10, 2024

Can you please give me some comments so far @janosh ? Thanks a lot!

Not sure why SVG Compression / tests (pull_request) failed?

Copy link
Owner

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

impressive work! thanks so much @DanielYang59. 🙏 several comments below

i think the SVG compression action is failing because of this unaddressed limitation
actions/checkout#694

assets/ptable-diags.svg Outdated Show resolved Hide resolved
pymatviz/ptable.py Outdated Show resolved Hide resolved
pymatviz/ptable.py Outdated Show resolved Hide resolved
assets/ptable-diags.svg Outdated Show resolved Hide resolved
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 12, 2024

I think we're almost there @janosh. A few issues left:

  • I just tried to merge this plotter into heatmap but only to find the final code very messy and confusing. The heatmap plotters have extensive code to process single values, if we then make it handle float | list[float], the code seems too complex to maintain. I would suggest still having them side by side (the new plotter could handle no-split case just fine) .
  • I used a random length data in asset to showcase that this new plotter could handle arbitrary separation, but the final plot seems quite messy. Maybe still use 2-split?
  • How to compress the svg? It added thousands of lines. (thought you did it before)
  • We might need to handle special values if we allow single value (like NaN or infinity)? This sounds like something heatmap should handle, do you want to prevent user from giving such values to the new plotter?

pymatviz/ptable.py Outdated Show resolved Hide resolved
@DanielYang59 DanielYang59 marked this pull request as ready for review April 24, 2024 12:30
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Apr 24, 2024

@janosh. Can you please review this? Sorry to say but I'm feeling that I'm currently making this PR too huge for me to efficiently work on. Can we merge this first and I would chop up the remaining work into (multiple) follow-up PRs.

Update: summary moved to the top for visibility.

@DanielYang59
Copy link
Contributor Author

@janosh In case this PR get lost. Thanks for your time!

@janosh
Copy link
Owner

janosh commented May 2, 2024

sorry for the delay. i'll try to get on this today.

@DanielYang59
Copy link
Contributor Author

sorry for the delay. i'll try to get on this today.

No worries at all. No pressure. Thanks!

Copy link
Owner

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think everything in this PR looks great. you left some todos which IIUC you prefer to leave for subsqeuent PRs?

My intention was to make everything modular (grouped by objects of course, colorbar/colormap/element symbol/or such) so that it becomes clearer for users (contrary to packing all args into a single plotter call), and make it easier for us to maintain.

I think by separating everything by object, the args are more self-explanatory. Instead of:

i take your point! i'm still a bit reluctant about the PTableProjector class but maybe you're right. i was originally thinking to extract common logic between ptable_... functions into helper functions like data_preprocessor and handle_missing_and_anomaly and then give each ptable_... function kwargs for each helper function:

ptable_heatmap(data, data_preproces_kwargs=dict(...), handle_missing_kwargs=dict(...), ...)

to simplify the function signature and make it less overwhelming. users could jump between the helper functions to only see the keywords they currently care about while keeping everything as pure functions. maybe worth considering if PTableProjector could be rewritten that way?

pymatviz/ptable.py Outdated Show resolved Hide resolved
pymatviz/ptable.py Outdated Show resolved Hide resolved
@DanielYang59
Copy link
Contributor Author

DanielYang59 commented May 3, 2024

i think everything in this PR looks great. you left some todos which IIUC you prefer to leave for subsqeuent PRs?

Yes. There are too many changes to make for a single PR, I would address them shortly in a follow up PR.

My intention was to make everything modular (grouped by objects of course, colorbar/colormap/element symbol/or such) so that it becomes clearer for users (contrary to packing all args into a single plotter call), and make it easier for us to maintain.
I think by separating everything by object, the args are more self-explanatory. Instead of:

i take your point! i'm still a bit reluctant about the PTableProjector class but maybe you're right. i was originally thinking to extract common logic between ptable_... functions into helper functions like data_preprocessor and handle_missing_and_anomaly and then give each ptable_... function kwargs for each helper function:

ptable_heatmap(data, data_preproces_kwargs=dict(...), handle_missing_kwargs=dict(...), ...)

to simplify the function signature and make it less overwhelming. users could jump between the helper functions to only see the keywords they currently care about while still leaving working only with pure functions. maybe worth considering if PTableProjector could be rewritten that way?

I understand your intention here and I'm not sure which one would be easier for the user though. I think by packing everything into the PTableProjector class, what I was trying to achieve is simply to pack everything together (to make the module look cleaner, instead of having a bunch of helper functions loosely laid everywhere)? I think both should work almost the same because in my class, there isn't much (if any) shared states at all.

@DanielYang59 DanielYang59 requested a review from janosh May 3, 2024 02:24
Comment on lines +348 to +361
self.cmap: Colormap = colormap

# Preprocess data
self.data: pd.DataFrame = data

# Initialize periodic table canvas
n_periods = df_ptable.row.max()
n_groups = df_ptable.column.max()

# Set figure size
plot_kwargs = plot_kwargs or {}
plot_kwargs.setdefault("figsize", (0.75 * n_groups, 0.75 * n_periods))

self.fig, self.axes = plt.subplots(n_periods, n_groups, **plot_kwargs)
Copy link
Owner

@janosh janosh May 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both should work almost the same because in my class, there isn't much (if any) shared states at all.

i would consider any self.<attribute> as state. but rather than belabor the point, let's merge this PR as is and refactor later if needed.

Copy link
Owner

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot @DanielYang59, this is a huge PR.

@janosh janosh merged commit 4a983a5 into janosh:main May 4, 2024
7 of 8 checks passed
@DanielYang59 DanielYang59 deleted the ptable_split branch May 5, 2024 02:07
@DanielYang59
Copy link
Contributor Author

Thanks for reviewing! I would try to finish the remaining several TODOs soon.

I think both should work almost the same because in my class, there isn't much (if any) shared states at all.

i would consider any self.<attribute> as state. but rather than belabor the point, let's merge this PR as is and refactor later if needed.

Yes but nearly zero states are "shared". So it should be pretty easy if we really wish to refactor it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ptable_heatmap with diagonally-split tiles
2 participants