Added doc on how to adapt bin size in peakmap plots#66
Conversation
WalkthroughThe pull request adds a new Python script that demonstrates peak map visualization using multiple binning configurations. The script fetches a dataset from a URL, processes the data into a pandas DataFrame, and uses matplotlib with the "ms_matplotlib" backend to generate subplots for binning levels (10,10), (40,40), and (100,100). The layout is adjusted with individual subplot titles and an overall figure title. Changes
Sequence Diagram(s)sequenceDiagram
participant S as Script
participant R as Requests (HTTP GET)
participant P as Pandas Parser
participant M as Matplotlib Plotter
S->>R: Fetch dataset from URL
R-->>S: Response text data
S->>P: Parse data (tab-separated) into DataFrame
P-->>S: DataFrame ready
S->>M: Plot peak maps in 3 subplots\n(binning levels: (10,10), (40,40), (100,100))
M-->>S: Rendered figure with adjusted layout
Suggested reviewers
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
docs/Parameters/PeakMap.rst (2)
12-17: Low Binning Example is Well IllustratedThe low binning example using
num_x_bins=10andnum_y_bins=10is clearly presented with a code snippet. Consider verifying that these example values reflect common use cases in real-world scenarios to ensure optimal guidance.
18-23: High Binning Example Effectively Demonstrates Detail EnhancementThe high binning example is consistent with the low binning case and clearly shows how to increase visual detail via higher bin numbers. It would be beneficial to ensure that these examples are synchronized with the behavior of the underlying plotting function in case of any future changes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/Parameters/peakMapPlot.tsvis excluded by!**/*.tsv
📒 Files selected for processing (1)
docs/Parameters/PeakMap.rst(1 hunks)
🔇 Additional comments (3)
docs/Parameters/PeakMap.rst (3)
7-8: Clear Section Heading for Binning Adaptation DocumentationThe new section title "Adapting Bin Size in Peak Map Plots" along with the underline is clearly demarcated. This improves readability and directly addresses the documentation enhancements described in the PR objectives.
10-11: Concise Explanation of Binning BenefitsThe paragraph explains that binning is used to optimize visualization and helps users balance performance versus detail. This clear explanation aligns well with the intended documentation improvements.
24-24: Updated Default Explanation forbin_peaksParameterThe explanation for the
bin_peaksparameter now clearly states that usingbin_peaks='auto'enables automatic binning for large datasets, while setting it toFalsedisables it. This direct clarification aligns with the PR objective to modify the default parameter value.
|
For next time please link the corresponding issue in the description. I am assuming this is #63 ? |
Yes, I'm sorry. |
jcharkow
left a comment
There was a problem hiding this comment.
Good start however I think a more appropriate place to put this is in the gallery so that it can be accompanied by visual examples. Also putting this in the getting started notebook would be an asset. Please see the comments above for more details on how to do this.
docs/Parameters/PeakMap.rst
Outdated
| Adapting Bin Size in Peak Map Plots | ||
| ----------------------------------- | ||
|
|
||
| By default, the peak map uses binning to optimize visualization. You can **adjust the binning** level to balance between performance and detail. | ||
|
|
||
| - **Low binning** (fewer details, faster rendering): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| df_ms_experiment.plot(x="RT", y="mz", z="inty", kind="peakmap", num_x_bins=10, num_y_bins=10) | ||
|
|
||
| - **High binning** (more details, slower rendering): | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| df_ms_experiment.plot(x="RT", y="mz", z="inty", kind="peakmap", num_x_bins=100, num_y_bins=100) | ||
|
|
||
| If ``bin_peaks='auto'``, binning is automatically enabled for large datasets exceeding ``num_x_bins * num_y_bins`` peaks. Setting ``bin_peaks=False`` disables binning, at the cost of performance for large datasets. |
There was a problem hiding this comment.
I think a better place to put this would be in the getting started notebook and creating your own example. so that there are visual examples to go along with this. However linking that section of the getting started notebook here would be helpful.
The getting started notebook can be found here: https://github.com/OpenMS/pyopenms_viz/blob/main/docs/Getting%20Started.ipynb
Adding an example for automatic rendering should be a python script docs/gallery_scripts_template and you should name the file plot_peakmap_binning_demonstration_ms_matplotlib.py so that it can be rendered automatically by the documentation. Pushing to this PR should allow rendering of the documentation through readthedocs so you can test it out. The easiest way to format the script is likely to just let it create one figure in the end showing different types of binning. I would look at https://github.com/OpenMS/pyopenms_viz/blob/main/docs/gallery_scripts_template/plot_spyogenes_subplots_ms_matplotlib.py for inspiration, specifically setting the canvas parameter.
There was a problem hiding this comment.
Perfect, I'll start working on it, if there is a problem I will reach out you, thanks!
There was a problem hiding this comment.
Sir @jcharkow , I've almost done but I'm having this problem in viewing the work done. When I send the command on git bash "python docs/gallery_scripts_template/plot_spyogenes_subplots_ms_matplotlib.py" (that is one of the files already present in the "gallery_scripts_template" folder) for example, it shows me the figure but only the first subplot is filled. The others are empty subplots. And it's doing the same thing with the file too, so I can't see if everything is ok. What could be the problem, since as I said it seems it's not just a problem with my file?
There was a problem hiding this comment.
y
@Leandro1302 Thank you for noticing this! I observe the same bug on my end. I usually render the complete docs so I have not noticed this error in the past. Please open an issue on this (can basically just be the description you have above and include a screenshot of the plot you observe.) I don't know what the problem is of the top of my head but for the time being I would recommend working in a jupyter-lab environment.
Basically copy the entire script into a jupyter lab cell for development and then copy back to a python script when you are happy with the output. In the jupyterlab environment the plot will display twice, this is normal. When you commit the changes you should see it rendering on readthedocs so you can see if it renders correctly online.
There was a problem hiding this comment.
Alternatively, you can
- "show_plot=False" to the run_df.plot()
- replace the end of the script with plt.tight_layout() and plt.show() to the end of the script
Note these changes might effect the documentation rendering though so best just to use them for testing purposes.
I am not sure if tight_layout() will be correctly applied though but it will give you a general idea of what the plot will look like.
There was a problem hiding this comment.
Ok perfect sir, I will do this. :)
And I also noticed that, if I open the file getting_started.ipynb directly from GitHub, there are some plots which are not displayed but there is only the output "Loading BokehJS ..." . Is it a normal thing?
Because I'm also dealing with the error "Models must be owned by only a single document" of Bokeh, which removes the graphs once i close VSCODE and I don't know if it's the same reason why as I said before the graphs are not shown in the getting_started.ipynb file here on github.
There was a problem hiding this comment.
Hmm can you please include a link? It seems that the plots work for me. I'm not sure if the recent rebuild of the documentation solved this?
https://openms.github.io/pyopenms_viz/Getting%20Started.html
https://pyopenms-viz.readthedocs.io/en/latest/Getting%20Started.html
There was a problem hiding this comment.
There was a problem hiding this comment.
Yes, unfortunately, github's notebook rendering is still buggy as I have had this issue across multiple repositories and I don't think there is much we can do about that.
…for binning example image in the doc and added a new section in the Getting Started.ipynb file to show various types of binning
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1)
8-12: Remove unused import.The
numpymodule is imported but not used in this script.import pandas as pd import requests from io import StringIO -import numpy as np import matplotlib.pyplot as plt🧰 Tools
🪛 Ruff (0.8.2)
11-11:
numpyimported but unusedRemove unused import:
numpy(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py
11-11: numpy imported but unused
Remove unused import: numpy
(F401)
🔇 Additional comments (5)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (5)
1-6: Clear and informative docstring!The docstring effectively communicates the purpose of this script, which is to demonstrate the impact of different binning levels on peak map visualization. This aligns well with the PR objective of improving documentation on bin size adaptation in peakmap plots.
16-19: Good data loading approach.Using
requeststo fetch the data andraise_for_status()to handle potential HTTP errors is a robust approach. The data loading is well-implemented with proper error handling.
22-25: Well-structured visualization setup.Creating three subplots with shared axes is an effective way to compare different binning levels side by side. The chosen binning levels (10, 10), (50, 50), and (100, 100) provide a good range to demonstrate the impact of binning on visualization detail.
27-39: Effective implementation of the plot loop.The loop elegantly iterates through the binning levels and creates consistent visualizations with appropriate titles. The use of the
num_x_binsandnum_y_binsparameters effectively demonstrates how to adjust the binning level, which is the main focus of this PR.
42-43: Good figure formatting.The suptitle clearly indicates the purpose of the visualization, and
tight_layout()ensures proper spacing between subplots. This improves the overall readability of the demonstration.
|
@jcharkow Sir I have committed the changes as you told me. Let me know if everything is ok or there is any problem, thank you! |
jcharkow
left a comment
There was a problem hiding this comment.
Looks good! Just some minor changes that need addressing. The docs did not render online due to memory consumption, sometimes this happens. I will rerun the build after you address these changes and see if we can get it to build.
There was a problem hiding this comment.
Minor comments:
For headings added please change from header 5 to header 4 for consistency (this is one level down from the previous header.
Specific headers that require changing.
- Log-Scaling
- Adapting Bin Size in Peak Map Plots
|
|
||
| for ax, (num_x_bins, num_y_bins) in zip(axs, binning_levels): | ||
|
|
||
| df.plot( |
There was a problem hiding this comment.
Please change to match 898b635
so plotting works on scripts as well.
This means changing show_plot=False and adding plt.show() to the end
| df = pd.read_csv(StringIO(response.text), sep="\t") | ||
|
|
||
|
|
||
| fig, axs = plt.subplots(1, 3, figsize=(18, 6), sharex=True, sharey=True) |
There was a problem hiding this comment.
please change to vertical oriented plot as this will render better in the documentation. e.g. plt.subplots(3, 1) ...
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (1)
8-14: Remove unused numpy import.The numpy module is imported but not used in the script.
import pandas as pd import requests from io import StringIO -import numpy as np import matplotlib.pyplot as plt🧰 Tools
🪛 Ruff (0.8.2)
11-11:
numpyimported but unusedRemove unused import:
numpy(F401)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
docs/Parameters/peakMapPlot.tsvis excluded by!**/*.tsv
📒 Files selected for processing (1)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py
11-11: numpy imported but unused
Remove unused import: numpy
(F401)
🔇 Additional comments (5)
docs/gallery_scripts_template/plot_peakmap_binning_demonstration_ms_matplotlib.py (5)
1-6: Great documentation example for peakmap binning!The docstring clearly states the purpose of this script, which aligns with the PR objectives to add documentation on how to adapt bin size in peakmap plots.
16-20: LGTM: Robust data loading implementation.The code correctly fetches data from a URL and handles potential errors with
raise_for_status(), which is a good practice for data loading.
22-25: LGTM: Vertical plot layout as requested.The implementation uses a vertical plot layout (3 rows, 1 column) as suggested in previous review comments.
27-45: LGTM: Plot configuration properly implements previous feedback.The plotting code correctly sets
show_plot = Falseas requested in the previous review comment. The implementation shows different binning levels which fulfills the main purpose of this demonstration.
47-48: LGTM: Proper subplot adjustment and plt.show() added.The subplot adjustment improves readability, and
plt.show()is added at the end as requested in the previous review.
|
@jcharkow I just committed these minor changes. I think everything is ok now but let me know if you want to make any other adjustments! |
jcharkow
left a comment
There was a problem hiding this comment.
Looks good ready to merge!


Added doc on how to adapt bin size in peakmap plots based on their functionality and what was written in the file PeakMap.ipynb.
I also changed the default value of bin_peaks, showed in the parameters table from "False" to "auto" because this is written to be the default value.
Summary by CodeRabbit