Order bar charts #295

Closed
Akaban opened this issue Aug 1, 2019 · 5 comments

Comments

@Akaban

Akaban commented Aug 1, 2019

Hello,

I would like to reopen issue 94. I don't think it's very user-friendly that geom_bar offers no automatic sorting. If I understand correctly, we need to create a pandas.Categorical every time we want to define the order of the bars in a graph.

That's something I find really painful. Could we add a keyword argument such as sort_by="xxxx" to geom_bar?

I can do the MR if it's not too complicated; I don't know the plotnine code yet :)

Thanks for the answer

By the way thanks for this incredible package, easily the best python package for graphs available :)

@Akaban
Author

Akaban commented Aug 1, 2019

I think we should at least respect the order of the dataframe; that does not seem to be the case today.

EDIT: Okay, the data is actually sorted before ggplot rendering, but it sorts the x-axis alphabetically. We should then be able to pass something like sort=False to respect the order of the dataframe.

@Akaban
Author

Akaban commented Aug 1, 2019

For readers: just so you know, I found a "hack" to get cool bar plots without much effort.

1/ Sort your dataframe as you wish
2/ Compute a column rank which indicates the rank of each row (you may want to do this with a group by)
3/ Compute a new string column: df["column"] = df["rank"].map(chr) + " - " + df["column"]

This ensures the alphabetical ordering of the column matches the order of rank, thanks to the chr function. It does add an extra useless character to the rendered labels, though.
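The three steps above can be sketched as follows. This is a minimal illustration, not code from the thread: the data and the `value` column are made up, and the rank is offset by `ord("A")` (an assumption added here) so that the prefix is a printable letter rather than a control character.

```python
import pandas as pd

# Hypothetical data; "column" and "value" are placeholder names.
df = pd.DataFrame({"column": ["pear", "apple", "fig"], "value": [10, 3, 7]})

# 1/ Sort the dataframe as you wish (here: largest value first).
df = df.sort_values("value", ascending=False).reset_index(drop=True)

# 2/ Compute the rank of each row in that order (0, 1, 2, ...).
df["rank"] = range(len(df))

# 3/ Prefix each label with a character derived from its rank.
# Assumption: the offset ord("A") keeps the prefix printable (A, B, C, ...).
df["column"] = (df["rank"] + ord("A")).map(chr) + " - " + df["column"]

labels = df["column"].tolist()
print(labels)
```

Because the prefixes sort alphabetically in rank order, an alphabetically sorted axis reproduces the order of the dataframe. With the letter offset used here, the trick only holds for up to 26 rows.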

:) I would still have preferred an explicit way to do this through plotnine

@TyberiusPrime
Contributor

Plotnine has inherited this from ggplot2, which respects the sort order of the datatypes involved, not the order of the data points.
I actually think that's a perfectly valid point of view that leads to fewer unexpected changes in plots.

Turning a column in a sorted dataframe into a categorical isn't that bad, though a bit verbose:
df.assign(column = pd.Categorical(df.column, df.column)).

Guess I'll be adding a convenient method for that in my dppd library...
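A slightly expanded sketch of the categorical approach, using made-up data. The `value` column is a placeholder, and `.unique()` is added here so that the category list stays valid even if the column contains repeated values.

```python
import pandas as pd

# Hypothetical data; "column" and "value" are placeholder names.
df = pd.DataFrame({"column": ["pear", "apple", "fig"], "value": [10, 3, 7]})

# Sort the rows into the order the bars should appear in.
df = df.sort_values("value", ascending=False)

# Use the current row order as the category order.
# Categories must be unique, hence .unique().
ordered = df.assign(
    column=pd.Categorical(df["column"], categories=df["column"].unique())
)

categories = list(ordered["column"].cat.categories)
print(categories)
```

Passing `ordered` instead of `df` to the plot should then give bars in category order rather than alphabetical order, since the scale follows the order of the categories.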

@has2k1
Owner

has2k1 commented Aug 2, 2019

Sorting is standard and is handled automatically by the scale, depending on the datatype (as mentioned by @TyberiusPrime), not by the geom. Plus, the scales are independent: the x scale knows nothing about any other column!

@isabelizimm
Contributor

For those stumbling upon this issue recently, you can do something like

(data
 >> ggplot()
 + geom_col(aes(x='reorder(col1, col2)', y='col2'))
)

after you have reordered your data as you wish.
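A self-contained variant of the snippet above, with made-up data. As far as I understand, `reorder(col1, col2)` orders the categories of `col1` by the median of `col2` in ascending order by default; the pandas line below mirrors that assumption so the expected bar order can be checked without drawing anything. The `make_plot` helper is hypothetical, and plotnine is imported inside it so the data preparation runs even where plotnine is not installed.

```python
import pandas as pd

# Hypothetical data using the column names from the snippet above.
df = pd.DataFrame({"col1": ["pear", "apple", "fig"], "col2": [3, 10, 7]})

# Pandas-only approximation of the assumed ordering:
# categories of col1 sorted by the median of col2, ascending.
expected_order = (
    df.groupby("col1")["col2"].median().sort_values().index.tolist()
)
print(expected_order)


def make_plot(data):
    """Build (but do not draw) a bar chart ordered via reorder()."""
    # Imported lazily so the rest of the sketch does not require plotnine.
    from plotnine import aes, geom_col, ggplot

    return ggplot(data, aes(x="reorder(col1, col2)", y="col2")) + geom_col()
```

Calling `make_plot(df)` in a notebook should display the bars in the order given by `expected_order`, provided the installed plotnine version supports `reorder` inside `aes`.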
