[query] Refactor ggplot to use pandas, support alpha on histograms and bar charts #11317
… to check how many groups there are
… It's a string, but it's not actually a discrete variable
I'm still figuring out what the abstractions should be here. This refactoring has resulted in the …
patrick-schultz
left a comment
Looks good! Just an obsoleted comment.
    for s, i in zip(listified_agg_result, numbering):
        numbered_result.append(s.annotate(group=i))
    listified_agg_result = numbered_result
    df_agg_result["group"] = [numberer[tuple(x)] for _, x in subsetted_to_discrete.iterrows()]
You can leave this for future improvement, but I suspect you could do this more easily and faster with pandas groupby.
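As a rough illustration of the groupby suggestion above, here is a minimal sketch that numbers groups with `groupby(...).ngroup()` instead of iterating over rows. The dataframe contents and the `discrete_columns` list are invented for the example; only the `df_agg_result` name and the `group` column come from the diff, and the PR's real data will look different.

```python
import pandas as pd

# Hypothetical aggregated result: two discrete aesthetics plus one value column.
df_agg_result = pd.DataFrame({
    "color": ["red", "blue", "red", "blue", "red"],
    "shape": ["o", "o", "x", "o", "o"],
    "y": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Assumed list of the columns that hold discrete aesthetics.
discrete_columns = ["color", "shape"]

# ngroup() assigns one integer label per distinct combination of key values,
# so rows sharing the same (color, shape) pair get the same group number.
df_agg_result["group"] = (
    df_agg_result.groupby(discrete_columns, sort=False).ngroup()
)

print(df_agg_result)
```

This avoids both `iterrows()` and a separately maintained lookup dict, since pandas derives the numbering directly from the key columns.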
I am going to do this in a future improvement, since I'm currently in the midst of refactoring this a bit due to a bug report about running out of colors to plot with.
This PR:
- Changes `listify` to return pandas dataframes, and every geom to take in pandas dataframes. This significantly simplifies the code and should also speed things up.
- Changes the `apply_to_fig` method of most of the geoms to rely on a specification dict and then just loop over that. This simplifies adding a new argument / aesthetic. In the future, it may be best to just make all of the geoms take `**kwargs` so that arguments added to that dict will immediately be used for plotting, as right now I have to add something to `geom_bar`, `GeomBar.__init__`, and that dictionary for it to start showing up in plots.
- Adds `identity` as a bar position, which means to plot bars on top of each other. This is useful along with the `alpha` argument added to bars and histograms, which sets the transparency of the bars.
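The specification-dict pattern described above could look roughly like the sketch below. Every name in it (`BAR_AESTHETIC_SPEC`, `build_trace_args`, the trace argument names, and the defaults) is hypothetical and chosen only for illustration; it is not the PR's actual code.

```python
# Hypothetical specification: aesthetic name -> (plotting argument name, default).
BAR_AESTHETIC_SPEC = {
    "fill": ("marker_color", "black"),
    "alpha": ("opacity", 1.0),
}


def build_trace_args(group_aes, spec):
    """Collect plotting keyword arguments by looping over the specification."""
    trace_args = {}
    for aes_name, (trace_arg, default) in spec.items():
        # Use the value supplied for this group, else fall back to the default.
        trace_args[trace_arg] = group_aes.get(aes_name, default)
    return trace_args


# A group that sets only alpha; fill falls back to its default.
args = build_trace_args({"alpha": 0.5}, BAR_AESTHETIC_SPEC)
print(args)
```

With this shape, supporting a new aesthetic means adding one entry to the dict rather than editing the loop body.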