Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve documentation on boxplot and violinplot #27787

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

timhoffm
Copy link
Member

  • write out defaults coming from rcParams
  • see also-link between boxplot/bxp and violinplot/violin
  • some rewording

@timhoffm timhoffm added the Documentation: API files in lib/ and doc/api label Feb 13, 2024
@timhoffm timhoffm added this to the v3.9.0 milestone Feb 13, 2024
@github-actions github-actions bot removed the Documentation: API files in lib/ and doc/api label Feb 13, 2024
@story645
Copy link
Member

Love love the little sketch.
I'm finding 'statistical parameters' to be a bit vague, so wondering about emphasizing that the difference is computation vs drawing. Maybe hinge the distinction on 'boxplot/violinplot -> takes data as input and computes the statistical parameters", 'bxp/violin' -> draws already computed parameters, e.g.

@timhoffm
Copy link
Member Author

I'm finding 'statistical parameters' to be a bit vague, so wondering about emphasizing that the difference is computation vs drawing. Maybe hinge the distinction on 'boxplot/violinplot -> takes data as input and computes the statistical parameters", 'bxp/violin' -> draws already computed parameters, e.g.

If you have an idea how to fit that into the one-line sentences for summary and See Also, I'm happy to take that. 'statistical parameters' was the best I could come up with.

@timhoffm timhoffm added the Documentation: API files in lib/ and doc/api label Feb 13, 2024
@@ -3958,6 +3958,7 @@ def boxplot(self, x, notch=None, sym=None, vert=None, whis=None,

See Also
--------
.Axes.bxp : Draw a boxplot based on statistical parameters (instead of data).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.Axes.bxp : Draw a boxplot based on statistical parameters (instead of data).
.Axes.bxp : Draw a boxplot based on pre-computed statistics (instead of raw data).

?

Copy link
Member

@story645 story645 Feb 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Draw a boxplot based on pre-computed statistics (.i.e., the bxpstats input dictionary)

I think the instead of data confuses insteads of elucidates. But I like this wording in that the flip works too

.Axes.boxplot: Computes boxplot statistics and draws a boxplot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rcomer. This is a good wording.

@story645, I've removed the instead-comment for this direction. While I see the appeal of "Computes boxplot statistics and draws a boxplot", I'd like to keep the "Draw ..." pattern because that's the primary objective. And I believe people think "here's my data, make a boxplot". It's implicitly understood that you have to calculate statistics, but explicitly mention that may even be distracting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's implicitly understood that you have to calculate statistics, but explicitly mention that may even be distracting.

I was just pulling the pattern we use for histogram https://matplotlib.org/devdocs/api/_as_gen/matplotlib.axes.Axes.hist.html#matplotlib.axes.Axes.hist

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also nit on raw - processed data is fine as input too (and technically the computed stats can be data)

Copy link
Member

@rcomer rcomer Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because the computed stats can be data that I thought the "raw" would help but yes I appreciate that point. Pretty much none of the data I analyse would be considered raw by an instrument scientist. Can't think of a better term right now.

Copy link
Member

@story645 story645 Feb 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was sorta why I was like put the emphasis on "boxplot-you provide the data and it computes the stats and draws, bxp-you provide the stats and it just draws"

Like normally yes the computation is expected but we have a viz method that doesn't do the computation and that's what we're trying to distinguish on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@story645 please check whether the current wording works for you:

Summary lines:

  • boxplot(): Draw a box and whisker plot.
  • bxp(): Draw a box and whisker plot based on pre-computed statistics.

Motivation: Since it's very rare that people have pre-computed statistics, boxplot() (including computation) is the canonical behavior and we don't need to qualify the type of input. bxp() has the addition "base on pre-computed statistics" because that special and needs explicit mentioning.

See also:

  • boxplot() -> bxp(): Draw a boxplot based on pre-computed statistics.
  • bxp() -> boxplot(): Draw a boxplot based on raw data (instead of pre-computed statistics).

Motivation: The first is just the summary line of bxp() (mentioning what's special). The second has the extended explanation, because the docstring is in the context of the bxp() special case, so just an unqualified "Draw a boxplot" seems not sufficient - we have to explicitly state what's different. I know, you nitted "raw", but I haven't found a better term, and I think with the addition "raw data (instead of pre-computed statistics)" it's clear enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I feel bad at how many cycles it's taken, the above seems fine except I'd just go with

data instead of pre-computed statistics.

Dropping the parenthetical b/c it's the important part.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@story645 story645 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny word sub & pulling violin plot into same form but otherwise thanks for your patience here.

@@ -4086,13 +4087,26 @@ def bxp(self, bxpstats, positions=None, widths=None, vert=True,
meanline=False, manage_ticks=True, zorder=None,
capwidths=None):
"""
Drawing function for box and whisker plots.
Draw a box and whisker plot based on pre-computed statistics.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Draw a box and whisker plot based on pre-computed statistics.
Draw a box and whisker plot using pre-computed statistics.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "using" better English here? To me, that sounds like "pre-computed statistics" is some sort of helper, but not the fundamental input.

Copy link
Member

@story645 story645 Feb 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly I think from is probably the right word- based on kinda implies indirection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to from.

lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
lib/matplotlib/axes/_axes.py Outdated Show resolved Hide resolved
@timhoffm
Copy link
Member Author

Merging based on @story645's approval.

@timhoffm timhoffm merged commit e4e6840 into matplotlib:main Feb 20, 2024
39 of 42 checks passed
@timhoffm timhoffm deleted the boxplot-doc branch February 20, 2024 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants