Reworked bar plot #325
It is now possible to:

- Stack bars of different colors. In fact, this is the default behaviour when a color or fill is specified (like R's ggplot).
- Reorder columns in a bar plot by adding a labels parameter, like so: geom_bar(labels=["blip","blop","dyt","båt"]), as discussed here: yhat#315
- Show only a select few of the columns (as above, just don't include all possible x-values).
- Use bar plots when faceting (although faceting overwrites the x-labels for some reason: yhat#319).

This addresses yhat#196

In order to do this I had to make a few changes.

- Most importantly, I added a method '_calculate_global' to stats.py. Subclasses are not required to implement it, but for stat_bin it makes it possible to read in all labels ahead of time and then, when we are processing separate groups, fill the labels not represented with 0's.
- I added a new stat class 'stat_bar', which for simple x/y bar plots (not histograms) takes advantage of '_calculate_global' to deliver consistent bar plots when using facets or when stacking colors.
- I changed geom_bar to use stat_bar by default and geom_histogram to use stat_bin. This differentiates the bar plot and the histogram in a way that makes intuitive sense (I would think at least).
- I disabled sorting the x-axis by default to preserve user-specified label orders. As far as I can see, only geom_area needs the x-axis to be ordered. I've added code to sort the x-axis inside geom_area.
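To make the idea behind '_calculate_global' concrete, here is a minimal sketch in plain Python. It is not the code from this PR: the helper names `collect_labels` and `count_group` are hypothetical, and the sketch only assumes what the description says, namely that all labels are read once up front and that each group then reports 0 for labels it does not contain.

```python
from collections import Counter


def collect_labels(x_values):
    """Global pass: record every x label once, in first-seen order."""
    seen = []
    for label in x_values:
        if label not in seen:
            seen.append(label)
    return seen


def count_group(group_x, all_labels):
    """Per-group pass: count each label, using 0 for labels the group lacks."""
    counts = Counter(group_x)
    return [counts.get(label, 0) for label in all_labels]


# Toy data: one x label and one fill colour per row (made up for illustration).
x = ["a", "b", "a", "c", "b", "a"]
fill = ["red", "red", "blue", "blue", "blue", "red"]

labels = collect_labels(x)

# Split the x column by fill colour, as a grouping step might do.
groups = {}
for label, colour in zip(x, fill):
    groups.setdefault(colour, []).append(label)

# Every group now has one height per global label, so bars line up.
heights = {colour: count_group(xs, labels) for colour, xs in groups.items()}
```

Because every group returns a height for every label, the groups can be stacked or faceted without their bars shifting position.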
This looks really awesome! I'm inclined to merge this PR over my own then work in Could you speak a little more on |
That's great to hear. I was worried the work had been in vain. The idea behind The motivation behind it was a problem I had with Currently I pass the entire data column to Another benefit is that |
It turns out that geom_line also expects x-values to be sorted. This commit moves the function ```_sort_list_types_by_x``` to geom.py and renames it to ```sort_by_x``` and adds a call to it in ```geom_line```
It turns out that geom_line also expects x-values to be sorted. This commit moves the function ```_sort_list_types_by_x``` to geom.py and renames it to ```sort_by_x``` and adds a call to it in ```geom_line``` (amended to no longer break build)
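As an illustration of what sorting by x involves, the following sketch reorders every list-valued entry of a layer by ascending x and leaves scalars untouched. Only the name `sort_by_x` comes from the commit message; the signature, the dict-of-lists input, and the body are assumptions made for this example, not the implementation in geom.py.

```python
def sort_by_x(data):
    """Reorder each list-valued entry of `data` by ascending x.

    Entries that are not lists of the same length as x (for example a
    single colour string) are passed through unchanged.
    """
    order = sorted(range(len(data["x"])), key=lambda i: data["x"][i])
    out = {}
    for key, value in data.items():
        if isinstance(value, list) and len(value) == len(order):
            out[key] = [value[i] for i in order]
        else:
            out[key] = value
    return out


# Hypothetical layer data: x and y are per-point lists, color is a scalar.
layer = {"x": [3, 1, 2], "y": [30, 10, 20], "color": "blue"}
sorted_layer = sort_by_x(layer)
```

Keeping the sort inside the geoms that need it (area and line) is what allows bar plots to keep the label order the user supplied.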
Conflicts: ggplot/geoms/geom.py
If we are to conform with ggplot2, The |
@ericchiang, Sure I'll look into that. Should I strike the @has2k1 I agree with you that I've looked into the issues that you referenced and I realize that another pull request I've made (#327 - reworked legends) might also conflict with the refactoring that I see you and @JanSchulz discussing in #283. At the same time it corrects a lot of other small issues with legends that I think might be useful in any case.
The way to think about it is, Getting the scales sorted out would allow us to add scale training -- which really means find the ranges of the data. The
I'll add a comment at #327. |
@has2k1, thanks for your comment. The code in I wonder though, would it not make sense to have two geoms ( |
That makes sense and I would prefer it that way, but it would be a major break in compatibility with ggplot2. One thing I thought of to bridge the two is to be smart and do a bar plot if the |
I like that solution, but I don't think it's possible currently. |
The distinction would be in |
While there are issues to work out I'd still like to merge in the PR for the greater goal of having the ability to stack bar plots. Can we forego a little compatibility with ggplot2 for that in the meantime? |
Although we get the ability to stack bar plots, it comes at a cost. This PR introduces an internal API that will ultimately need to be replaced if further progress is to be made. This API will most likely be used to fix other bugs. When I considered doing something similar, I counted about 4 bugs that it would solve. So, whenever it gets removed there will be complications and regressions.
In addition the geom_histogram is no longer a separate class
I've just pushed a commit with your suggested changes, @ericchiang. I've reverted the behaviour to use @has2k1, I liked the idea of defaulting to I can also see the argument against adding an internal API just to supersede it with something else. One solution might be to make the introduced API less useful so that it solves just the problem of stacking bar plots, in which case it should lead to fewer regressions.
sorry for the delay |
Any update on |
Rework of bar plot
I realize that @ericchiang just added another pull request for something similar here: #324 so in light of that I'm putting this up here so that we might merge the two. The differences as I can see between the two patches are:
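It is now possible to:

- Stack bars of different colors. In fact, this is the default behaviour when a color or fill is specified (like R's ggplot).
- Reorder columns in a bar plot by adding a labels parameter, like so: geom_bar(labels=["blip","blop","dyt","båt"]), as discussed here: How should users reorder the x axis bins in a bar chart? #315
- Show only a select few of the columns (as above, just don't include all possible x-values).
- Use bar plots when faceting (although faceting overwrites the x-labels for some reason: facet_wrap() removes color legends #319).

This addresses facets with descrete values (e.g. geom_bar) does not work #196

In order to do this I had to make a few changes.

- Most importantly, I added a method '_calculate_global' to stats.py. Subclasses are not required to implement it, but for stat_bin it makes it possible to read in all labels ahead of time and then, when we are processing separate groups, fill the labels not represented with 0's.
- I added a new stat class 'stat_bar', which for simple x/y bar plots (not histograms) takes advantage of '_calculate_global' to deliver consistent bar plots when using facets or when stacking colors.
- I changed geom_bar to use stat_bar by default and geom_histogram to use stat_bin. This differentiates the bar plot and the histogram in a way that makes intuitive sense (I would think at least).
- I disabled sorting the x-axis by default to preserve user-specified label orders. As far as I can see, only geom_area needs the x-axis to be ordered. I've added code to sort the x-axis inside geom_area.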
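For readers wondering how stacking follows from consistent per-group heights, here is a small sketch. The function `stack_offsets` is hypothetical and not part of this patch; it assumes each group already has one height per label (with 0 for missing labels) and computes the baseline each group's bars start from.

```python
def stack_offsets(heights_by_group):
    """Return, per group, the baseline ('bottom') each bar should start from.

    Groups are stacked in dict insertion order: the first group sits on 0,
    and each later group sits on the running total of the groups before it.
    """
    running = None
    bottoms = {}
    for group, heights in heights_by_group.items():
        if running is None:
            running = [0] * len(heights)
        bottoms[group] = list(running)
        running = [r + h for r, h in zip(running, heights)]
    return bottoms


# Hypothetical heights for three x labels and two fill colours.
heights = {"red": [2, 1, 0], "blue": [1, 1, 1]}
bottoms = stack_offsets(heights)
```

Each group's baselines could then be handed to a plotting call such as matplotlib's `bar(..., bottom=...)`. This only works if every group has a value for every label, which is why the zero-filling step matters.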
Examples
(from #196 but modified to show color)