Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH/REF: Overhauled boxplots #2643

Merged
merged 44 commits into from Jan 24, 2014
Merged

Conversation

phobson
Copy link
Member

@phobson phobson commented Dec 3, 2013

In a nut shell here is what has happened:

  1. Created a function (cbook.boxplot_stats) that take any data that axes.boxplot could handle and generates a list of dicts containing the statistics needed to draw a set of boxplots.
  2. Then I created a new method (axes.bxp) that accepts dictionaries from cbook.boxplot_stats and allows the user to customize how the plots are rendering. In other words, dictionaries of style **kwargs can be passed configuring the style of the boxes/whisker/caps, the outliers, the medians, the means (new functionality).
  3. The new bxp method also allows the user to toggle the drawing of any element of the boxplot via **kwargs (i.e., shownotches, showboxes, showfliers, showmeans, showcaps).
  4. The new bxp looks for a label key in the dictionaries. If it's available, it'll use that as the x-tick label. Otherwise, the labels default to the boxplot's position.
  5. And finally, the original boxplot method is functionally unchanged, but now it relies on cbook.boxplot_stats and axes.bxp to do all the work.
  6. Added a bunch of tests

Potential remaining items:

  1. Well, I definitely need to draft up a good example to show all of this off. That's coming shortly (pending feedback on the existing portions of the PR).
  2. We might want to expose the full capability of bxp in boxplot (i.e., the style and show* kwargs). Currently, boxplot's API/capabiltiy is unchanged. I'm fine with this but I'll need to pretty much duplicate all of the bxp tests to make sure nothings gets lost in translation, (done)
  3. Rerunning boilerplate.py didn't seem to pick up on bxp. Do I need to clear it out first?
  4. We might want to add boxplot_stats to pyplot and pylab. (not gonna happen)
  5. And it looks like boilerplate.py added a bunch of unicode strings to the call sigs and now the 3.2 build is failing (dang it). (fixed that)

This PR addresses the following issue (or allows users to address them themselves):

  1. New boxplot features #1199
  2. New features for boxplot #217
  3. Adds option to plot average in boxplot, besides the median #2520
  4. Boxplot: allow whiskers to always cover entire range #1455

There is all #1027 out there. While we don't address that issue directly. I personally agree with matplotlib's approach to boxplots that brings up that issue. In other words, that's a feature, not a bug, IMO

@tacaswell
Copy link
Member

I don't think it is worth re-fitting boxplot to replicate the new functionality. It is important to keep it's interface as-is for backwards compatibility, but point users who want the new features to the new function (which is where the examples come in ;) )

You need to add it to the list _plotcommands in boilerplate.py to get picked up.

There was discussion about doing away with pylab in the last hangout and I don't think it is a good fit in pyplot.

@phobson
Copy link
Member Author

phobson commented Dec 3, 2013

@tacaswell To be fair, I think we can add the new stuff to boxplot without breaking anything. We just have to add bxp's kwargs to boxplot's call signature, document them, and pass them directly to bxp (which already happens in a very limited fashion),

Thanks for the other tips. I'll look into them soon.

def dopatch(xs, ys, **kwargs):
xs, ys = ys, xs # flip X, Y
return patch_list(xs, ys, **kwargs)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to have this logic shuffle/flip the data and only have the plotting once rather than creating inner functions which do both the data munging and the plotting (sorry if it become obvious later why you did it this way).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tacaswell I could definitely rearrange that as you describe. this is just an vestige of the original boxplot code. luckily I think i have enough tests now that I'll know if I break something :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tacaswell, my first reaction probably would be like yours, but looking farther down in the code, I think that using these local functions is actually the cleanest way to encapsulate the axis flip.

@tacaswell
Copy link
Member

I left a few picky comments that other should comment on before you do anything about.

I am also not sure about the * in the doc strings.

@phobson
Copy link
Member Author

phobson commented Dec 13, 2013

@tacaswell thanks for the feedback. As you suggested, I'll wait for more comments. The *s in the docstring are also vestiges of the original code. I think it'll be safe to proceed on converting the docstrings to the numpy format.

@tacaswell
Copy link
Member

Your comment re pandas does raise the interesting point that matplotlib.financial will currently make candlestick plots which should maybe be re-written using this. On the other hand, that module will be pulled out in 1.5.

This looks like a really nice PR.

@phobson
Copy link
Member Author

phobson commented Dec 13, 2013

Happy to take a look at the candlestick, as it should be pretty easy. However, that'll have to happen in a separate PR if the Repo Collabs decide it's worth it. Can you provide some guidance with my examples? Namely, do I need to include the output figures or does the doc-building process take care of that?

@tacaswell
Copy link
Member

pretty sure the doc-build process takes care of that, but I have not done much with it. The easy thing is to run the doc-build locally and see if it takes care of it.

Yes, I was not suggesting you do the candlestick in this PR (or even that you should do it). It would be good to find a person who does finance to take care of that module/toolkit, but that is irrelevant for this PR.

@phobson
Copy link
Member Author

phobson commented Jan 2, 2014

@tacaswell
Just chiming in to say that I built the docs and the new examples were included. Any other feedback, suggestions, requests, etc are welcome!

@tacaswell
Copy link
Member

@phobson This needs a re-base.

It looks good to merge to me, but I don't want to push the button on this much new code without someone else looking at it.

@phobson
Copy link
Member Author

phobson commented Jan 2, 2014

@tacaswell do i need to rebase to squash the commits, to sync with current master, or both?

relies on a new function (:func:`~matplotlib.cbook.boxplot_stats`), which
accepts any data structure currently compatible with
:func:`~matplotlib.pyplot.boxplot`, amd returns a list of dictaries
containing the positions of each of artists for the boxplots. Then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "each of the artists"? This is still not very clear.

tacaswell added a commit that referenced this pull request Jan 24, 2014
@tacaswell tacaswell merged commit cc6ae3d into matplotlib:master Jan 24, 2014
@tacaswell
Copy link
Member

@phobson Merged! Thanks for this.

@phobson
Copy link
Member Author

phobson commented Jan 24, 2014

@tacaswell wooo! thanks for the guidance and patience

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants