hsplit, vsplit inconsistent with numpy? #29

Closed
ivirshup opened this issue Apr 5, 2024 · 3 comments

ivirshup commented Apr 5, 2024

Description

Hey! While going over scverse/scanpy-tutorials#97 I noticed a couple things and thought I would follow them up here.

Marsilea's definitions of hsplit and vsplit seem inconsistent with what I'd expect coming from numpy. They actually seem to act along the opposite axes.

Example

import numpy as np
import marsilea as mars

X = np.arange(12).reshape(4, 3)
display(X)
mars.Heatmap(X).render()
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
display(np.hsplit(X, [2]))
m = mars.Heatmap(X)
m.hsplit([2], spacing=0.1)
m.render()
[array([[ 0,  1],
        [ 3,  4],
        [ 6,  7],
        [ 9, 10]]),
 array([[ 2],
        [ 5],
        [ 8],
        [11]])]
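
For reference, the NumPy side of the comparison can be checked without any plotting. The sketch below uses the same X as above and only relies on documented NumPy behavior: np.hsplit cuts along axis 1 (columns) and np.vsplit cuts along axis 0 (rows). It makes no claim about Marsilea's internals; per the report above, m.hsplit([2]) appears to cut along the other axis.

```python
import numpy as np

X = np.arange(12).reshape(4, 3)

# np.hsplit cuts along axis 1: column index 2 is the boundary,
# so the pieces sit side by side (left | right).
left, right = np.hsplit(X, [2])
print(left.shape, right.shape)  # (4, 2) (4, 1)

# np.vsplit cuts along axis 0: row index 2 is the boundary,
# so the pieces are stacked (top / bottom).
top, bottom = np.vsplit(X, [2])
print(top.shape, bottom.shape)  # (2, 3) (2, 3)
```

Stacking the pieces back with np.hstack([left, right]) or np.vstack([top, bottom]) recovers X, which is an easy way to remember which axis each function works on.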

Suggestion

Personally, I think the numpy versions make more sense. However, I also mess this up frequently. Since Marsilea only needs to deal with two dimensional grids I would suggest moving to:

  • group_rows / group_columns, since this is quite like a group-by operation without the aggregate
  • split_rows and split_columns (a bit like ComplexHeatmap)

Mr-Milk (Member) commented Apr 7, 2024

Yes, it's different. At first I wanted the API names to be more visually intuitive, but it looks like hsplit/vsplit may confuse others.

Thanks for the suggestions. I think it's a good idea to divide the current split into group_* and split_*. But Marsilea can split non-matrix plots like barplot or violin plot, so the endings in rows and columns may be confusing in these cases.

ivirshup (Author) commented Apr 8, 2024

What would the difference between group_* and split_* be to you? My preference at the moment would be to have only group_*, especially since you can pass essentially the same argument here as you would pass to DataFrame.groupby.

To clarify, previously I was suggesting either split or group. I'm not sure I like "both" since they basically do the same thing, and it's nicer if there's only one way to do it.

> But Marsilea can split non-matrix plots like barplot or violin plot, so the endings in rows and columns may be confusing in these cases.

I think rows and columns still make sense for those plots. I believe you still end up with rows and columns, it's just that each entry can be a violin. Admittedly, I don't think grouping by columns makes sense in a plot like this bar chart: https://marsilea.readthedocs.io/en/stable/auto_examples/plot_oil_well.html#sphx-glr-auto-examples-plot-oil-well-py

Mr-Milk (Member) commented Apr 8, 2024

The signature for hsplit/vsplit is hsplit(cut=None, labels=None, order=None, spacing=0.01). Users can either specify cut to cut the plot using the index of data or specify labels to group the plot. split_* can be used to handle the cut parameter and group_* can be used for labels parameter.
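
The difference between the two parameters can be sketched without Marsilea. The helper names split_by_cut and group_by_labels below are hypothetical and are not part of Marsilea's API; they only illustrate how cut (index positions) and labels (one label per item) could each partition a single axis. The handling of order here is an assumption and may differ from what the library does.

```python
def split_by_cut(n, cut):
    """Partition positions 0..n-1 into consecutive chunks at the given cut indices."""
    bounds = [0, *sorted(cut), n]
    return [list(range(a, b)) for a, b in zip(bounds, bounds[1:])]


def group_by_labels(labels, order=None):
    """Collect positions that share a label, like a group-by with no aggregation."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    if order is None:
        order = list(groups)  # assumption: default to first-appearance order
    return {label: groups[label] for label in order}


print(split_by_cut(4, [2]))  # [[0, 1], [2, 3]]
print(group_by_labels(["a", "b", "a", "b"], order=["b", "a"]))  # {'b': [1, 3], 'a': [0, 2]}
```

With cut, the chunks are always contiguous in the original order; with labels, items belonging to one group may be scattered and have to be gathered, which is why the two cases could reasonably map to separate split_* and group_* methods.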

rows and columns are indeed clearer than horizontal or vertical.
