Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

User-specified medians and conf. intervals in boxplots #906

Closed
wants to merge 22 commits into from

Conversation

phobson
Copy link
Member

@phobson phobson commented May 27, 2012

First, here's a test script: https://gist.github.com/2814818

I know I've submitted at least one other PR for this, but this time I've think I've got it right (or at least much closer).

Basically, the user provides a list of medians and confidence intervals where (if using numpy arrays)
usermedians.shape = (N,)
conf_intervals.shape = (N,2)

and the data to be plotted has: data.shape = (M,N).

All of this allows the user to compute the medians and its confidence intervals outside of the boxplot function using more statistically robust methods of his or her choice.

I've used a lot of assert statements to verify that the usermedians and conf_intervals inputs are compatible with the data being plotted (x in the axes.boxplot call signature).

Hope this is useful and can be incorporated into the library.

Thanks for all of the hard work
-paul

@pelson
Copy link
Member

pelson commented May 29, 2012

I've been thinking of implementing something similar recently, but you've beat me to it ;-)

I notice your spacing is not looking so hot. What editor are you using? Does it put tabs in instead of 4 spaces? Would it be easy for you be able to correct this before we go much further in the review?

@phobson
Copy link
Member Author

phobson commented May 29, 2012

Yikes! Yes. I'll fix that. I cooked up a quick VM at home this weekend
-- left the work machine at home :-) -- and it looks like i didn't get
my vimrc set up right.

Thanks for looking at the PR.
-paul

On Tue, May 29, 2012 at 7:33 AM, Phil Elson
reply@reply.github.com
wrote:

I've been thinking of implementing something similar recently, but you've beat me to it ;-)

I notice your spacing is not looking so hot. What editor are you using? Does it put tabs in instead of 4 spaces? Would it be easy for you be able to correct this before we go much further in the review?


Reply to this email directly or view it on GitHub:
#906 (comment)

@phobson
Copy link
Member Author

phobson commented May 29, 2012

OK. This is looking good on my system now. Rebuilt matplotlib from scratch, ran my script, and everything worked as expected.

bsIndex = np.random.random_integers(0,M-1,M)
bsData = data[bsIndex]
estimate[n] = mlab.prctile(bsData, 50)
CI = mlab.prctile(estimate, percentile)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just occurred to me that we should use numpy.percentile now (assuming it's available in minimum version of numy that MPL supports.

@mdboom
Copy link
Member

mdboom commented Jun 1, 2012

This looks good. Can you add an example and a unit test?

text_transform= mtransforms.blended_transform_factory(ax.transData,
ax.transAxes)
ax.set_xlabel('treatment')
ax.set_ylabel('response')
ax.set_ylim(-0.2, 1.4)
#ax.set_ylim(-0.2, 1.4)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment serves no purpose other than to make it harder to read. Would you mind just nuking it?

@pelson
Copy link
Member

pelson commented Jun 15, 2012

Would you mind adding a simple unit test?

Other than that, this change gets my thumbs up. Great work!

@phobson
Copy link
Member Author

phobson commented Jun 15, 2012

Phil, I will happily address all of these comments shortly. It has been a
hectic two weeks at the office and I'm on vacation shortly. Thank you so
much for looking at the PR.

On Friday, June 15, 2012, Phil Elson wrote:

Would you mind adding a simple unit test?

Other than that, this change gets my thumbs up. Great work!


Reply to this email directly or view it on GitHub:
#906 (comment)

@pelson
Copy link
Member

pelson commented Jun 15, 2012

@phobson: No problems. Have a great break!

@phobson
Copy link
Member Author

phobson commented Jul 15, 2012

@pelson

I addressed your comments. Sorry for the delay. In summary:

  1. switch the vert/notch kwargs over to true/false (though 1/0 still work as in the demos) and updated the docstring to reflect this
  2. combined my conf_intervals if-logic into a single statement and removed the lines that became extraneous as a result
  3. added a unit test + baseline images
  4. modified a demo to show this functionality
  5. enhanced the docs about the dictionary that is returned

@WeatherGod
Copy link
Member

@pelson, has the OP addressed your concerns here?

msg2 = "usermedians' length must be compatible with x"
if usermedians is not None:
if hasattr(usermedians, 'shape'):
assert len(usermedians.shape) == 1, msg1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do asserts for checking user inputs. Raise a ValueError instead.

@WeatherGod
Copy link
Member

Neat work. We are close to getting this merged in. One more very important thing to add is a note in doc/users/whats_new.rst. Also, you need to run boilerplate.py in the matplotlib's source directory in order to regenerate the pyplot file.

# conf. intervals from user, if available
if conf_intervals is not None and \
conf_intervals[i] is not None:
notch_max = np.max(conf_intervals[i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indentation looks a little funny. Have tabs been used here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looked a little funny since the conditional was wrapped around the second line. python parsed it fine, but i've lined everything up to look nicer. will see with my next commit/push

@pelson
Copy link
Member

pelson commented Jul 22, 2012

@pelson, has the OP addressed your concerns here?

Yes. I am finding it hard to assert that the logic/statistics are identical (thanks to the improvement of layout). But what I can see looks good. Gets my +1.

@WeatherGod
Copy link
Member

@pelson, good to know. Tasks that remain: get pyplot.py regenerated, odd indentation fixed (or explained), and the new entry to the whats_new.rst,

@phobson
Copy link
Member Author

phobson commented Jul 22, 2012

@WeatherGod Sorry to require so much hand-holding, here. I ran boilerplate.py like this:
paul@flint ~/sources/matplotlib $ python boilerplate.py
[prints lots of stuff to terminal, then...]

paul@flint ~/sources/matplotlib $ git status

On branch manual-boxplots-2

nothing to commit (working directory clean)
paul@flint ~/sources/matplotlib $

There don't seem to be any calls to sys.argv, so i'm not sure how to get this to behave properly. Any advice would be much appreciated. Thanks.

@pelson
Copy link
Member

pelson commented Jul 22, 2012

It may be that your branch is based on a version before the bolierplate script was updated. My advice at this stage would be to rebase your branch (the rebase is needed anyway), and then run the boilerplate.py script. The workflow goes something like:

git checkout manual-boxplots-2
git fetch upstream
git rebase upstream/master
# Fix conflicts, if any
git push -f origin manual-boxplots-2

@phobson
Copy link
Member Author

phobson commented Jul 23, 2012

@pelson thanks for the git-fu! worked like a charm.

@@ -670,6 +670,7 @@ def test_hist_log():
ax.set_xticks([])
ax.set_yticks([])

<<<<<<< HEAD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you missed something while resolving conflicts. This means you didn't run the tests after rebasing. Please run the tests to make sure everything looks good. This may also impact the resulting test images because changes may have occurred to the rendering algorithms (line snapping, text anti-aliasing, etc.).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WeatherGod: Just cleared that up. Sorry for the slow response. All tests just passed.

@pelson
Copy link
Member

pelson commented Aug 11, 2012

I've just merged this by hand (447a7cc), squashing the commits and resolving the remaining conflicts.

@phobson: Thanks for doing this work! If you have any more enhancements, I would be more than happy to review them!

@pelson pelson closed this Aug 11, 2012
@phobson
Copy link
Member Author

phobson commented Aug 16, 2012

@pelson Thanks for squashing and merging! Really glad I could contribute back to matplotlib! I also really appreciate the feedback and guidance through the whole process from you and everyone else.

@phobson phobson deleted the manual-boxplots-2 branch January 27, 2014 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants