Boxplot stats w/ equal quartiles #5343
tacaswell
added the
needs_review
label
Oct 29, 2015
mdboom
commented on an outdated diff
Oct 29, 2015
| @@ -1867,39 +1867,41 @@ def delete_masked_points(*args): | ||
| return margs | ||
| -def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None): | ||
| - ''' | ||
| - Returns list of dictionaries of staticists to be use to draw a series of | ||
| - box and whisker plots. See the `Returns` section below to the required | ||
| - keys of the dictionary. Users can skip this function and pass a user- | ||
| - defined set of dictionaries to the new `axes.bxp` method instead of | ||
| - relying on MPL to do the calcs. | ||
| +def boxplot_stats(X, whis=1.5, autorange=False, bootstrap=None, | ||
| + labels=None): | ||
| + """ | ||
| + Returns list of dictionaries of staticists to be use to draw a |
|
|
mdboom
commented on an outdated diff
Oct 29, 2015
| @@ -1867,39 +1867,41 @@ def delete_masked_points(*args): | ||
| return margs | ||
| -def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None): | ||
| - ''' | ||
| - Returns list of dictionaries of staticists to be use to draw a series of | ||
| - box and whisker plots. See the `Returns` section below to the required | ||
| - keys of the dictionary. Users can skip this function and pass a user- | ||
| - defined set of dictionaries to the new `axes.bxp` method instead of | ||
| - relying on MPL to do the calcs. | ||
| +def boxplot_stats(X, whis=1.5, autorange=False, bootstrap=None, | ||
| + labels=None): | ||
| + """ | ||
| + Returns list of dictionaries of staticists to be use to draw a | ||
| + series of box and whisker plots. See the `Returns` section below to | ||
| + the required keys of the dictionary. Users can skip this function | ||
| + and pass a user-defined set of dictionaries to the new `axes.bxp` | ||
| + method instead of relying on MPL to do the calcs. |
|
|
mdboom
commented on an outdated diff
Oct 29, 2015
| - `whis` will be automatically set to 'range' | ||
| - | ||
| - bootstrap : int or None (default) | ||
| - Number of times the confidence intervals around the median should | ||
| - be bootstrapped (percentile method). | ||
| - | ||
| - labels : sequence | ||
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and |
|
|
mdboom
and 1 other
commented on an outdated diff
Oct 29, 2015
| - Number of times the confidence intervals around the median should | ||
| - be bootstrapped (percentile method). | ||
| - | ||
| - labels : sequence | ||
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and | ||
| + max of the data. In the edge case that the 25th and 75th | ||
| + percentiles are equivalent, `whis` can be automatically set to | ||
| + 'range' via the ``autorange`` option. |
phobson
Member
|
mdboom
commented on an outdated diff
Oct 29, 2015
| - be bootstrapped (percentile method). | ||
| - | ||
| - labels : sequence | ||
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and | ||
| + max of the data. In the edge case that the 25th and 75th | ||
| + percentiles are equivalent, `whis` can be automatically set to | ||
| + 'range' via the ``autorange`` option. | ||
| + autorange : bool (default = False) |
|
|
mdboom
commented on an outdated diff
Oct 29, 2015
| - | ||
| - labels : sequence | ||
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and | ||
| + max of the data. In the edge case that the 25th and 75th | ||
| + percentiles are equivalent, `whis` can be automatically set to | ||
| + 'range' via the ``autorange`` option. | ||
| + autorange : bool (default = False) | ||
| + When True and the data are distributed such that the 25th and |
|
|
mdboom
commented on an outdated diff
Oct 29, 2015
| - labels : sequence | ||
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and | ||
| + max of the data. In the edge case that the 25th and 75th | ||
| + percentiles are equivalent, `whis` can be automatically set to | ||
| + 'range' via the ``autorange`` option. | ||
| + autorange : bool (default = False) | ||
| + When True and the data are distributed such that the 25th and | ||
| + 75th percentiles are equal, ``whis`` is set to "range" such that |
|
|
mdboom
commented on an outdated diff
Oct 29, 2015
| - Labels for each dataset. Length must be compatible with dimensions | ||
| - of `X` | ||
| + As a float, determines the reach of the whiskers past the first | ||
| + and third quartiles (e.g., Q3 + whis*IQR, QR = interquartile | ||
| + range, Q3-Q1). Beyond the whiskers, data are considered outliers | ||
| + and are plotted as individual points. This can be set this to an | ||
| + ascending sequence of percentile (e.g., [5, 95]) to set the | ||
| + whiskers at specific percentiles of the data. Finally, `whis` | ||
| + can be the string 'range' to force the whiskers to the min and | ||
| + max of the data. In the edge case that the 25th and 75th | ||
| + percentiles are equivalent, `whis` can be automatically set to | ||
| + 'range' via the ``autorange`` option. | ||
| + autorange : bool (default = False) | ||
| + When True and the data are distributed such that the 25th and | ||
| + 75th percentiles are equal, ``whis`` is set to "range" such that | ||
| + the whisker ends are at the min and max of the data. |
|
|
|
Cool. Much improved. |
|
@mdboom thanks for the comments. I think I got them all as they came in.

...aaaaaand here's the part where I pitch the idea that we add in an option to pass your own bootstrapper. Something like:

    def boxplot_stats(..., bootstrap_fxn=None):
        if bootstrap_fxn is None:
            bootstrap_fxn = _bootstrap_median
        # ...
        def _compute_conf_interval(data, med, iqr, bootstrap):
            if bootstrap is not None:
                # Do a bootstrap estimate of notch locations.
                # get conf. intervals around median
                CI = bootstrap_fxn(data, N=bootstrap)
                notch_min = CI[0]
                notch_max = CI[1]
            else:
                N = len(data)
                notch_min = med - 1.57 * iqr / np.sqrt(N)
                notch_max = med + 1.57 * iqr / np.sqrt(N)
            # yada yada
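
For illustration only, the pluggable-bootstrapper pitch could be sketched as a standalone, stdlib-only function. The names `notch_interval` and `bootstrap_median` are hypothetical and are not matplotlib API; the percentile bootstrap shown is an assumed stand-in for `_bootstrap_median`, whose body is not quoted in this thread. Only the `1.57 * iqr / sqrt(N)` fallback and the `bootstrap_fxn(data, N=bootstrap)` call shape come from the comment.

```python
import math
import random
from statistics import median


def bootstrap_median(data, N=1000, rng=None):
    """Assumed default: percentile bootstrap of the median (roughly 95% CI)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    # Resample with replacement N times and record each resample's median.
    medians = sorted(
        median(rng.choices(data, k=len(data))) for _ in range(N)
    )
    lo_idx = int(0.025 * (N - 1))
    hi_idx = int(math.ceil(0.975 * (N - 1)))
    return medians[lo_idx], medians[hi_idx]


def notch_interval(data, med, iqr, bootstrap=None, bootstrap_fxn=None):
    """Return (notch_min, notch_max) for one dataset.

    bootstrap_fxn is the hypothetical hook: any callable accepting
    (data, N=...) and returning a (low, high) pair.
    """
    if bootstrap_fxn is None:
        bootstrap_fxn = bootstrap_median
    if bootstrap is not None:
        # Bootstrap estimate of the notch locations.
        ci = bootstrap_fxn(data, N=bootstrap)
        return ci[0], ci[1]
    # No bootstrapping requested: use the approximation from the comment.
    n = len(data)
    half_width = 1.57 * iqr / math.sqrt(n)
    return med - half_width, med + half_width
```

A caller could then pass any callable with the same shape, for example `notch_interval(data, med, iqr, bootstrap=1000, bootstrap_fxn=my_fxn)`, without the stats routine needing to know how the interval is estimated.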
|
I am starting to think that box plots are a complicated enough topic that they should be spun off into a sub-project (which is allowed to do things like require pandas and scipy). Probably take violin plots with them too. |
|
Isn't that essentially seaborn? |
|
We would also still want basic boxplot/violinplot functionality for those
|
|
rebased with current master (branch had gotten stale) |
tacaswell
commented on an outdated diff
Nov 24, 2015
| If the function should adjust the xlim and xtick locations. | ||
| + autorange : bool, optional (False) | ||
| + When `True` and the data are distributed such that the 25th and | ||
| + 75th percentiles are equal, ``whis`` is set to ``'range'`` such | ||
| + that the whisker ends are at the minimum and maximum of the | ||
| + data. | ||
| + meanline : bool, optional (False) |
tacaswell
Owner
|
phobson
referenced
this pull request
Jan 26, 2016
Closed
Boxplot with zero IQR sets whiskers to max and min and leaves no outliers #5331
tacaswell
added this to the
1.5.2 (Critical bug fix release)
milestone
Jan 26, 2016
|
tagged this as a bug fix, but I am not sure if that is the correct tag for this. |
|
@tacaswell should I rebase on 1.5.X or 2? |
|
Onto master; then we will back-port the merge to wherever we decide this will go. |
|
It looks like you committed the conflicts in the SVG files:
|
|
gah -- thanks for the heads up. @QuLogic |
|
Any objections to removing the PDF/SVG files for this test entirely? |
|
No objection from me, makes the tests go faster :) |
|
unrelated failure in py3.4
|
phobson
closed this
Jan 27, 2016
phobson
reopened this
Jan 27, 2016
mdboom
added needs_review and removed needs_review
labels
Jan 27, 2016
phobson
added some commits
Oct 29, 2015
|
... another rebase with current master |
QuLogic
commented on an outdated diff
Feb 18, 2016
| If the function should adjust the xlim and xtick locations. | ||
| + autorange : bool, optional (False) | ||
| + When `True` and the data are distributed such that the 25th and | ||
| + 75th percentiles are equal, ``whis`` is set to ``'range'`` such | ||
| + that the whisker ends are at the minimum and maximum of the | ||
| + data. | ||
| + meanline : bool, optional (False) | ||
| + If `True` (and ``showmeans`` is `True`), will try to render | ||
| + the mean as a line spanning the full width of the box | ||
| + according to ``meanprops`` (see below). Not recommended if | ||
| + ``shownotches`` is also True. Otherwise, means will be shown | ||
| + as points. | ||
| + | ||
| + Additional Options | ||
| + --------------------- | ||
| + The following boolean options toogle the drawing of individual |
|
|
QuLogic
commented on an outdated diff
Feb 18, 2016
| + data. | ||
| + meanline : bool, optional (False) | ||
| + If `True` (and ``showmeans`` is `True`), will try to render | ||
| + the mean as a line spanning the full width of the box | ||
| + according to ``meanprops`` (see below). Not recommended if | ||
| + ``shownotches`` is also True. Otherwise, means will be shown | ||
| + as points. | ||
| + | ||
| + Additional Options | ||
| + --------------------- | ||
| + The following boolean options toogle the drawing of individual | ||
| + components of the boxplots: | ||
| + - showcaps: the caps on the ends of whiskers | ||
| + (default is True) | ||
| + - showbox: the central box (default is True) | ||
| + - showfliers: the outlierd beyone the caps (default is True) |
|
|
QuLogic
commented on an outdated diff
Feb 18, 2016
| @@ -1760,39 +1760,42 @@ def delete_masked_points(*args): | ||
| return margs | ||
| -def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None): | ||
| - ''' | ||
| - Returns list of dictionaries of staticists to be use to draw a series of | ||
| - box and whisker plots. See the `Returns` section below to the required | ||
| - keys of the dictionary. Users can skip this function and pass a user- | ||
| - defined set of dictionaries to the new `axes.bxp` method instead of | ||
| - relying on MPL to do the calcs. | ||
| +def boxplot_stats(X, whis=1.5, autorange=False, bootstrap=None, | ||
| + labels=None): | ||
| + """ | ||
| + Returns list of dictionaries of statistics used to draw a series | ||
| + of box and whisker plots. See the `Returns` section below to the |
|
|
QuLogic
commented on an outdated diff
Feb 18, 2016
| + that the 25th and 75th percentiles are equivalent, *whis* | ||
| + will be automatically set to ``'range'``. | ||
| + bootstrap : int, optional | ||
| + Specifies whether to bootstrap the confidence intervals | ||
| + around the median for notched boxplots. If bootstrap==None, | ||
| + no bootstrapping is performed, and notches are calculated | ||
| + using a Gaussian-based asymptotic approximation (see McGill, | ||
| + R., Tukey, J.W., and Larsen, W.A., 1978, and Kendall and | ||
| + Stuart, 1967). Otherwise, bootstrap specifies the number of | ||
| + times to bootstrap the median to determine its 95% | ||
| + confidence intervals. Values between 1000 and 10000 are | ||
| + recommended. | ||
| + usermedians : array-like, optional | ||
| + An array or sequence whose first dimension (or length) is | ||
| + compatible with ``x``. This overrides the medians computed | ||
| + by matplotlib for each element of *usermedians* that is not |
QuLogic
Member
|
QuLogic
commented on an outdated diff
Feb 18, 2016
| + will be automatically set to ``'range'``. | ||
| + bootstrap : int, optional | ||
| + Specifies whether to bootstrap the confidence intervals | ||
| + around the median for notched boxplots. If bootstrap==None, | ||
| + no bootstrapping is performed, and notches are calculated | ||
| + using a Gaussian-based asymptotic approximation (see McGill, | ||
| + R., Tukey, J.W., and Larsen, W.A., 1978, and Kendall and | ||
| + Stuart, 1967). Otherwise, bootstrap specifies the number of | ||
| + times to bootstrap the median to determine its 95% | ||
| + confidence intervals. Values between 1000 and 10000 are | ||
| + recommended. | ||
| + usermedians : array-like, optional | ||
| + An array or sequence whose first dimension (or length) is | ||
| + compatible with ``x``. This overrides the medians computed | ||
| + by matplotlib for each element of *usermedians* that is not | ||
| + `None`. When an element of *usermedians* == None, the median |
|
|
QuLogic
commented on an outdated diff
Feb 18, 2016
| + using a Gaussian-based asymptotic approximation (see McGill, | ||
| + R., Tukey, J.W., and Larsen, W.A., 1978, and Kendall and | ||
| + Stuart, 1967). Otherwise, bootstrap specifies the number of | ||
| + times to bootstrap the median to determine its 95% | ||
| + confidence intervals. Values between 1000 and 10000 are | ||
| + recommended. | ||
| + usermedians : array-like, optional | ||
| + An array or sequence whose first dimension (or length) is | ||
| + compatible with ``x``. This overrides the medians computed | ||
| + by matplotlib for each element of *usermedians* that is not | ||
| + `None`. When an element of *usermedians* == None, the median | ||
| + will be computed by matplotlib as normal. | ||
| + conf_intervals : array-like, optional | ||
| + Array or sequence whose first dimension (or length) is | ||
| + compatible with ``x`` and whose second dimension is 2. When | ||
| + the current element of ``conf_intervals`` is not None, the |
QuLogic
Member
|
|
Bump -- just gave this another rebase to keep it current with master |
|
Build failures are unrelated. Something's wacky with colorbars, e.g., |
|
Yes, this came about due to a merge of a different PR last night, I think.
|
tacaswell
closed this
Feb 18, 2016
tacaswell
reopened this
Feb 18, 2016
tacaswell
added needs_review and removed needs_review
labels
Feb 18, 2016
|
Sorry, I broke all the branches. Merged a PR that passed before we put the zero-tolerance in. There were some regions of those images where the 8-bit blue value changed by 1. |
|
The failure on appveyor is
which is known to be flaky |
|
Bump -- give me a shout if y'all want any changes made to this. |
tacaswell
commented on an outdated diff
Mar 9, 2016
| @@ -1760,39 +1760,42 @@ def delete_masked_points(*args): | ||
| return margs | ||
| -def boxplot_stats(X, whis=1.5, bootstrap=None, labels=None): | ||
| - ''' | ||
| - Returns list of dictionaries of staticists to be use to draw a series of | ||
| - box and whisker plots. See the `Returns` section below to the required | ||
| - keys of the dictionary. Users can skip this function and pass a user- | ||
| - defined set of dictionaries to the new `axes.bxp` method instead of | ||
| - relying on MPL to do the calcs. | ||
| +def boxplot_stats(X, whis=1.5, autorange=False, bootstrap=None, |
tacaswell
Owner
|
|
Read through this, other than my one comment The docstrings are much better. Can this get a note in
|
|
@tacaswell your understanding matches mine. And to make sure I'm clear -- I don't modify |
|
Yes. The individual files help prevent rebase-due-to-doc-conflicts.
|
jenshnielsen
added a commit
that referenced
this pull request
Mar 14, 2016
|
|
jenshnielsen |
455cb92
|
jenshnielsen
merged commit 455cb92
into matplotlib:master
Mar 14, 2016
mdboom
removed the
needs_review
label
Mar 14, 2016
jenshnielsen
added a commit
to jenshnielsen/matplotlib
that referenced
this pull request
Mar 14, 2016
|
|
jenshnielsen + jenshnielsen |
54ec43f
|
|
Backport wasn't clean (conflict in removed svg file) so doing it via #6153 |
jenshnielsen
added a commit
that referenced
this pull request
Mar 14, 2016
|
|
jenshnielsen |
7266b4f
|
phobson
deleted the
phobson:bxp-equal-quartiles branch
Mar 14, 2016
|
thanks for the merge and help, everyone! |
tacaswell
added a commit
to tacaswell/matplotlib
that referenced
this pull request
May 22, 2016
|
|
jenshnielsen + tacaswell |
abe9561
|
phobson commented Oct 29, 2015
See #5331
Addresses the concern raised in the issue above and cleans up the docstring.
Previous behavior available through the autorange kwarg.
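
A minimal sketch of the behavior described here, as read from the PR description and the docstring hunks: with equal quartiles the whiskers no longer jump to the min and max unless autorange is set. This is a stdlib-only illustration, not the matplotlib implementation; `whisker_bounds` is a hypothetical name, and `statistics.quantiles` with the inclusive method is an assumed stand-in for whatever percentile routine the library uses.

```python
from statistics import quantiles


def whisker_bounds(data, whis=1.5, autorange=False):
    """Return (whisker_low, whisker_high, fliers) for one dataset."""
    values = sorted(data)
    # Inclusive quartiles interpolate linearly between data points.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    if iqr == 0 and autorange:
        # Opt-in fallback: stretch the whiskers to the data extremes.
        lo, hi = values[0], values[-1]
    else:
        lo_limit = q1 - whis * iqr
        hi_limit = q3 + whis * iqr
        inside_lo = [v for v in values if v >= lo_limit]
        inside_hi = [v for v in values if v <= hi_limit]
        # Whiskers end at the most extreme points within the limits,
        # but never inside the box itself.
        lo = min(inside_lo) if inside_lo and min(inside_lo) <= q1 else q1
        hi = max(inside_hi) if inside_hi and max(inside_hi) >= q3 else q3

    # Anything beyond the whiskers is reported as an outlier.
    fliers = [v for v in values if v < lo or v > hi]
    return lo, hi, fliers


sample = [1, 5, 5, 5, 5, 5, 5, 5, 9]   # 25th and 75th percentiles are both 5
default_result = whisker_bounds(sample)
autorange_result = whisker_bounds(sample, autorange=True)
```

With this sample, the default call keeps both whiskers at the quartile and reports 1 and 9 as fliers, while `autorange=True` moves the whiskers to 1 and 9 and leaves no fliers, which is the situation issue #5331 describes.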