Skip to content

Added doc on how to adapt bin size in peakmap plots#66

Merged
jcharkow merged 4 commits intoOpenMS:mainfrom
Leandro1302:my_branch
Mar 25, 2025
Merged

Added doc on how to adapt bin size in peakmap plots#66
jcharkow merged 4 commits intoOpenMS:mainfrom
Leandro1302:my_branch

Conversation

@Leandro1302
Copy link
Contributor

@Leandro1302 Leandro1302 commented Mar 14, 2025

Added doc on how to adapt bin size in peakmap plots based on their functionality and what was written in the file PeakMap.ipynb.

I also changed the default value of bin_peaks, showed in the parameters table from "False" to "auto" because this is written to be the default value.

Summary by CodeRabbit

  • New Features
    • Introduced a demonstration tool that visualizes peak maps with varying binning configurations.
    • Offers side-by-side comparisons of different binning levels to help users better understand visualization differences.

@coderabbitai
Copy link

coderabbitai bot commented Mar 14, 2025

Walkthrough

The pull request adds a new Python script that demonstrates peak map visualization using multiple binning configurations. The script fetches a dataset from a URL, processes the data into a pandas DataFrame, and uses matplotlib with the "ms_matplotlib" backend to generate subplots for binning levels (10,10), (40,40), and (100,100). The layout is adjusted with individual subplot titles and an overall figure title.

Changes

File Path Change Summary
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py Introduces a script to fetch a dataset via HTTP GET, parse it into a DataFrame using pandas (with tab-separated values), and plot peak maps on three subplots for different binning configurations using matplotlib.

Sequence Diagram(s)

sequenceDiagram
    participant S as Script
    participant R as Requests (HTTP GET)
    participant P as Pandas Parser
    participant M as Matplotlib Plotter

    S->>R: Fetch dataset from URL
    R-->>S: Response text data
    S->>P: Parse data (tab-separated) into DataFrame
    P-->>S: DataFrame ready
    S->>M: Plot peak maps in 3 subplots\n(binning levels: (10,10), (40,40), (100,100))
    M-->>S: Rendered figure with adjusted layout
Loading

Suggested reviewers

  • jcharkow

Poem

In a field of code, I hop with glee,
New plots emerge as far as I see.
Binning magic in subplots bright,
Each data hop brings joy to light.
With a twitch of whiskers and a cheerful thump,
This rabbit celebrates with a happy jump! 🐇

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/Parameters/PeakMap.rst (2)

12-17: Low Binning Example is Well Illustrated

The low binning example using num_x_bins=10 and num_y_bins=10 is clearly presented with a code snippet. Consider verifying that these example values reflect common use cases in real-world scenarios to ensure optimal guidance.


18-23: High Binning Example Effectively Demonstrates Detail Enhancement

The high binning example is consistent with the low binning case and clearly shows how to increase visual detail via higher bin numbers. It would be beneficial to ensure that these examples are synchronized with the behavior of the underlying plotting function in case of any future changes.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40549e8 and fa0a4f9.

⛔ Files ignored due to path filters (1)
  • docs/Parameters/peakMapPlot.tsv is excluded by !**/*.tsv
📒 Files selected for processing (1)
  • docs/Parameters/PeakMap.rst (1 hunks)
🔇 Additional comments (3)
docs/Parameters/PeakMap.rst (3)

7-8: Clear Section Heading for Binning Adaptation Documentation

The new section title "Adapting Bin Size in Peak Map Plots" along with the underline is clearly demarcated. This improves readability and directly addresses the documentation enhancements described in the PR objectives.


10-11: Concise Explanation of Binning Benefits

The paragraph explains that binning is used to optimize visualization and helps users balance performance versus detail. This clear explanation aligns well with the intended documentation improvements.


24-24: Updated Default Explanation for bin_peaks Parameter

The explanation for the bin_peaks parameter now clearly states that using bin_peaks='auto' enables automatic binning for large datasets, while setting it to False disables it. This direct clarification aligns with the PR objective to modify the default parameter value.

@jcharkow
Copy link
Collaborator

For next time please link the corresponding issue in the description. I am assuming this is #63 ?

@jcharkow jcharkow linked an issue Mar 15, 2025 that may be closed by this pull request
@jcharkow jcharkow self-requested a review March 15, 2025 14:11
@Leandro1302
Copy link
Contributor Author

For next time please link the corresponding issue in the description. I am assuming this is #63 ?

Yes, I'm sorry.

Copy link
Collaborator

@jcharkow jcharkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start however I think a more appropriate place to put this is in the gallery so that it can be accompanied by visual examples. Also putting this in the getting started notebook would be an asset. Please see the comments above for more details on how to do this.

Comment on lines +7 to +24
Adapting Bin Size in Peak Map Plots
-----------------------------------

By default, the peak map uses binning to optimize visualization. You can **adjust the binning** level to balance between performance and detail.

- **Low binning** (fewer details, faster rendering):

.. code-block:: python

df_ms_experiment.plot(x="RT", y="mz", z="inty", kind="peakmap", num_x_bins=10, num_y_bins=10)

- **High binning** (more details, slower rendering):

.. code-block:: python

df_ms_experiment.plot(x="RT", y="mz", z="inty", kind="peakmap", num_x_bins=100, num_y_bins=100)

If ``bin_peaks='auto'``, binning is automatically enabled for large datasets exceeding ``num_x_bins * num_y_bins`` peaks. Setting ``bin_peaks=False`` disables binning, at the cost of performance for large datasets.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better place to put this would be in the getting started notebook and creating your own example. so that there are visual examples to go along with this. However linking that section of the getting started notebook here would be helpful.

The getting started notebook can be found here: https://github.com/OpenMS/pyopenms_viz/blob/main/docs/Getting%20Started.ipynb

Adding an example for automatic rendering should be a python script docs/gallery_scripts_template and you should name the file plot_peakmap_binning_demonstration_ms_matplotlib.py so that it can be rendered automatically by the documentation. Pushing to this PR should allow rendering of the documentation through readthedocs so you can test it out. The easiest way to format the script is likely to just let it create one figure in the end showing different types of binning. I would look at https://github.com/OpenMS/pyopenms_viz/blob/main/docs/gallery_scripts_template/plot_spyogenes_subplots_ms_matplotlib.py for inspiration, specifically setting the canvas parameter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, I'll start working on it, if there is a problem I will reach out you, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sir @jcharkow , I've almost done but I'm having this problem in viewing the work done. When I send the command on git bash "python docs/gallery_scripts_template/plot_spyogenes_subplots_ms_matplotlib.py" (that is one of the files already present in the "gallery_scripts_template" folder) for example, it shows me the figure but only the first subplot is filled. The others are empty subplots. And it's doing the same thing with the file too, so I can't see if everything is ok. What could be the problem, since as I said it seems it's not just a problem with my file?

Copy link
Collaborator

@jcharkow jcharkow Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

y

@Leandro1302 Thank you for noticing this! I observe the same bug on my end. I usually render the complete docs so I have not noticed this error in the past. Please open an issue on this (can basically just be the description you have above and include a screenshot of the plot you observe.) I don't know what the problem is of the top of my head but for the time being I would recommend working in a jupyter-lab environment.

Basically copy the entire script into a jupyter lab cell for development and then copy back to a python script when you are happy with the output. In the jupyterlab environment the plot will display twice, this is normal. When you commit the changes you should see it rendering on readthedocs so you can see if it renders correctly online.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you can

  1. "show_plot=False" to the run_df.plot()
  2. replace the end of the script with plt.tight_layout() and plt.show() to the end of the script

Note these changes might effect the documentation rendering though so best just to use them for testing purposes.

I am not sure if tight_layout() will be correctly applied though but it will give you a general idea of what the plot will look like.

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok perfect sir, I will do this. :)

And I also noticed that, if I open the file getting_started.ipynb directly from GitHub, there are some plots which are not displayed but there is only the output "Loading BokehJS ..." . Is it a normal thing?
Because I'm also dealing with the error "Models must be owned by only a single document" of Bokeh, which removes the graphs once i close VSCODE and I don't know if it's the same reason why as I said before the graphs are not shown in the getting_started.ipynb file here on github.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm can you please include a link? It seems that the plots work for me. I'm not sure if the recent rebuild of the documentation solved this?

https://openms.github.io/pyopenms_viz/Getting%20Started.html

https://pyopenms-viz.readthedocs.io/en/latest/Getting%20Started.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant by opening the file getting_started.ipynb right here from github I can't see some plots but only "Loading BokehJS" as you can see from the image below. Instead from the links you sent me I can see both the plots and "Loading BokehJS" and everything is ok.

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately, github's notebook rendering is still buggy as I have had this issue across multiple repositories and I don't think there is much we can do about that.

…for binning example image in the doc and added a new section in the Getting Started.ipynb file to show various types of binning
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1)

8-12: Remove unused import.

The numpy module is imported but not used in this script.

import pandas as pd
import requests
from io import StringIO
-import numpy as np
import matplotlib.pyplot as plt
🧰 Tools
🪛 Ruff (0.8.2)

11-11: numpy imported but unused

Remove unused import: numpy

(F401)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa0a4f9 and a7b472d.

📒 Files selected for processing (1)
  • docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py

11-11: numpy imported but unused

Remove unused import: numpy

(F401)

🔇 Additional comments (5)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (5)

1-6: Clear and informative docstring!

The docstring effectively communicates the purpose of this script, which is to demonstrate the impact of different binning levels on peak map visualization. This aligns well with the PR objective of improving documentation on bin size adaptation in peakmap plots.


16-19: Good data loading approach.

Using requests to fetch the data and raise_for_status() to handle potential HTTP errors is a robust approach. The data loading is well-implemented with proper error handling.


22-25: Well-structured visualization setup.

Creating three subplots with shared axes is an effective way to compare different binning levels side by side. The chosen binning levels (10, 10), (50, 50), and (100, 100) provide a good range to demonstrate the impact of binning on visualization detail.


27-39: Effective implementation of the plot loop.

The loop elegantly iterates through the binning levels and creates consistent visualizations with appropriate titles. The use of the num_x_bins and num_y_bins parameters effectively demonstrates how to adjust the binning level, which is the main focus of this PR.


42-43: Good figure formatting.

The suptitle clearly indicates the purpose of the visualization, and tight_layout() ensures proper spacing between subplots. This improves the overall readability of the demonstration.

@Leandro1302 Leandro1302 requested a review from jcharkow March 23, 2025 19:28
@Leandro1302
Copy link
Contributor Author

@jcharkow Sir I have committed the changes as you told me. Let me know if everything is ok or there is any problem, thank you!

Copy link
Collaborator

@jcharkow jcharkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some minor changes that need addressing. The docs did not render online due to memory consumption, sometimes this happens. I will rerun the build after you address these changes and see if we can get it to build.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments:

For headings added please change from header 5 to header 4 for consistency (this is one level down from the previous header.

Specific headers that require changing.

  • Log-Scaling
  • Adapting Bin Size in Peak Map Plots


for ax, (num_x_bins, num_y_bins) in zip(axs, binning_levels):

df.plot(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to match 898b635
so plotting works on scripts as well.

This means changing show_plot=False and adding plt.show() to the end

df = pd.read_csv(StringIO(response.text), sep="\t")


fig, axs = plt.subplots(1, 3, figsize=(18, 6), sharex=True, sharey=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change to vertical oriented plot as this will render better in the documentation. e.g. plt.subplots(3, 1) ...

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1)

8-14: Remove unused numpy import.

The numpy module is imported but not used in the script.

import pandas as pd
import requests
from io import StringIO
-import numpy as np
import matplotlib.pyplot as plt
🧰 Tools
🪛 Ruff (0.8.2)

11-11: numpy imported but unused

Remove unused import: numpy

(F401)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7b472d and f9a52d9.

⛔ Files ignored due to path filters (1)
  • docs/Parameters/peakMapPlot.tsv is excluded by !**/*.tsv
📒 Files selected for processing (1)
  • docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py

11-11: numpy imported but unused

Remove unused import: numpy

(F401)

🔇 Additional comments (5)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (5)

1-6: Great documentation example for peakmap binning!

The docstring clearly states the purpose of this script, which aligns with the PR objectives to add documentation on how to adapt bin size in peakmap plots.


16-20: LGTM: Robust data loading implementation.

The code correctly fetches data from a URL and handles potential errors with raise_for_status(), which is a good practice for data loading.


22-25: LGTM: Vertical plot layout as requested.

The implementation uses a vertical plot layout (3 rows, 1 column) as suggested in previous review comments.


27-45: LGTM: Plot configuration properly implements previous feedback.

The plotting code correctly sets show_plot = False as requested in the previous review comment. The implementation shows different binning levels which fulfills the main purpose of this demonstration.


47-48: LGTM: Proper subplot adjustment and plt.show() added.

The subplot adjustment improves readability, and plt.show() is added at the end as requested in the previous review.

@Leandro1302 Leandro1302 requested a review from jcharkow March 24, 2025 16:33
@Leandro1302
Copy link
Contributor Author

@jcharkow I just committed these minor changes. I think everything is ok now but let me know if you want to make any other adjustments!

Copy link
Collaborator

@jcharkow jcharkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good ready to merge!

@jcharkow jcharkow merged commit 242c950 into OpenMS:main Mar 25, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add doc on how to adapt bin size in peakmap plots

2 participants