API: Making plot methods more uniform in their options #413

Closed
8 tasks
wesm opened this issue Nov 24, 2011 · 9 comments
Labels
API Design, Enhancement, Refactor (internal refactoring of code), Visualization (plotting)

Comments

@wesm
Member

wesm commented Nov 24, 2011

from @lodagro:

Been thinking a bit about making the signatures for plot, hist, and boxplot uniform, along with what they do and return.
For reference, below is an overview of what pandas and matplotlib.pyplot have.

Some things that come to mind:

  • +1 for what you did with boxplot; by and column are handy, and they could also be used for other kinds of plots.
  • Bar plots go through plot(); why not offer bar() itself?
  • Maybe pie() too?
  • plot() offers control over sharex and sharey; the others do not have this control.
  • Not all functions use the same subplot layout approach. DataFrame.plot() can plot all lines on a single axis or in an n x 1 layout. DataFrame.hist() uses an n x n layout, and DataFrame.boxplot() is clever and can do n x m, but the user has no control. Maybe add nrow and ncol arguments? They would default to None, meaning pandas controls the layout; if only one is defined, pandas should compute the other. This can get tricky for plot(), which would need to do something with the subplots argument.
  • Usage of **kwds: e.g. boxplot accepts it but does not use it. Maybe add subplot_kw and figure_kw arguments for dispatching arguments, like matplotlib does.
  • Series.plot has style, DataFrame.plot does not. Later it could maybe accept a style per column (like formatters in to_string)?
  • rot is not used everywhere.
  • Many pandas users are probably familiar with matplotlib too, so in general, aligning plotting signatures and return objects with matplotlib would be a good thing to do?
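The layout idea in the list above (both arguments default to None, and pandas computes whichever one is missing) could be sketched roughly as follows. The helper name `compute_layout` is hypothetical, and the near-square default is an assumption rather than anything pandas is known to do; the arguments are spelled `nrows` and `ncols` here to match matplotlib's `subplots`.

```python
import math


def compute_layout(nplots, nrows=None, ncols=None):
    """Return (nrows, ncols) with enough cells to hold nplots axes.

    Hypothetical helper: the name and the near-square default are
    assumptions made for illustration, not pandas behaviour.
    """
    if nplots < 1:
        raise ValueError("nplots must be at least 1")
    for name, value in (("nrows", nrows), ("ncols", ncols)):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be at least 1")

    if nrows is None and ncols is None:
        # Nothing specified: pick a roughly square grid.
        ncols = math.ceil(math.sqrt(nplots))
        nrows = math.ceil(nplots / ncols)
    elif nrows is None:
        # Only ncols given: compute the rows needed.
        nrows = math.ceil(nplots / ncols)
    elif ncols is None:
        # Only nrows given: compute the columns needed.
        ncols = math.ceil(nplots / nrows)
    elif nrows * ncols < nplots:
        raise ValueError("layout is too small for the number of plots")
    return nrows, ncols
```

The resulting pair could then be passed on to `matplotlib.pyplot.subplots(nrows=..., ncols=...)`.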

Maybe if i stare at it a bit longer i may have some more ideas, but this is getting long already. What do you think?

for reference

Series:

plot(self, label=None, kind='line', use_index=True, rot=30, ax=None, style='-',
     grid=True, **kwds)

hist(self, ax=None, grid=True, **kwds)

DataFrame:

boxplot(self, column=None, by=None, ax=None, fontsize=None,
            rot=0, grid=True, **kwds)

plot(self, subplots=False, sharex=True, sharey=False, use_index=True,
         figsize=None, grid=True, legend=True, rot=30, ax=None,
         kind='line', **kwds)

hist(self, grid=True, **kwds)

matplotlib.pyplot

figure(num=None, figsize=None, dpi=None, facecolor=None, edgecolor=None,
             frameon=True, FigureClass=<class 'matplotlib.figure.Figure'>,
             **kwargs)

fig, ax = subplots(nrows=1, ncols=1, sharex=False, sharey=False, squeeze=True,
                   subplot_kw=None, **fig_kw)
---> always creates a new figure

plot(*args, **kwargs)
    returns list of matplotlib.lines.Line2D

boxplot(x, notch=0, sym='b+', vert=1, whis=1.5, positions=None,
            widths=None, patch_artist=False, bootstrap=None, hold=None)
    Returns a dictionary, mapping each component of the boxplot
    to a list of the :class:`matplotlib.lines.Line2D`
    instances created.

plt.pie(x, explode=None, labels=None, colors=None, autopct=None,
        pctdistance=0.6, shadow=False, labeldistance=1.1, hold=None)
   Return value:
      If *autopct* is None, return the tuple (*patches*, *texts*):

        - *patches* is a sequence of
          :class:`matplotlib.patches.Wedge` instances

        - *texts* is a list of the label
          :class:`matplotlib.text.Text` instances.

      If *autopct* is not *None*, return the tuple (*patches*, *texts*, *autotexts*)

plt.bar(left, height, width=0.8, bottom=None, hold=None, **kwargs)
    Return value is a list of matplotlib.patches.Rectangle instances
@lodagro
Contributor

lodagro commented Dec 21, 2011

For reference, there is an interesting discussion on the pystatsmodels mailing list about what plotting functions should return.

@lodagro
Contributor

lodagro commented Mar 4, 2012

Idea: remove the groupby functionality from the plotting methods and define plotting methods on the groupby object instead of delegating to the underlying grouped pandas objects. groupby() can group on columns and MultiIndex levels, while the plotting methods can only group on columns.
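As a rough illustration of what plotting from a groupby object buys you, the sketch below groups on a MultiIndex level (something a column-only by argument cannot express) and draws one histogram per group. The data, the level names `site` and `obs`, and the loop itself are assumptions made up for the example; this is not an existing method on pandas groupby objects.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data indexed by a two-level MultiIndex.
idx = pd.MultiIndex.from_product(
    [["north", "south"], [0, 1, 2]], names=["site", "obs"]
)
df = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0, 6.0, 8.0]}, index=idx)

# Group on an index level rather than on a column.
grouped = df.groupby(level="site")

# One axis per group, sharing the y-axis across panels.
fig, axes = plt.subplots(
    nrows=1, ncols=grouped.ngroups, sharey=True, squeeze=False
)
for ax, (name, group) in zip(axes.ravel(), grouped):
    group["value"].plot(kind="hist", ax=ax, title=str(name))
```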

@wesm
Member Author

wesm commented Mar 8, 2012

That is certainly one option. I would support having this in addition to the current functions which take a by argument. I think we are eventually going to end up with some kind of grammar of graphics implementation, to be honest.

@lodagro
Contributor

lodagro commented Mar 8, 2012

Interesting, is this grammar of graphics based on something already existing?

There is indeed no need to remove the by argument; it can be kept as is.

Did some mental scribbling a few days ago. The things I would like to have would require a very big argument list for the plotting methods, which is not practical. So the next idea was a graphical viewer class, with plenty of methods to do stuff.

Maybe "Wouter" needs to go into a cave for a while to implement some of these ideas ... rather unrealistic.

@drewfrank

FWIW, as a new user just figuring out how to use pandas I made the following attempts to achieve the "faceted plot" effect:

  1. Attempted to use the by keyword in hist -- surprised to find it wasn't there.
  2. Attempted to call the hist() method on a DataFrameGroupBy object, just because it seemed natural.
  3. Pivoted to make one column for each intended histogram, and called hist() on the resulting DataFrame. Success!

Basically, +1 to the ideas of making the signatures to the various plotting methods more uniform and adding the arguments you suggested, and also +1 to lodagro's idea of adding plot methods to groupby objects. And, a grammar of graphics implementation would be amazing!
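Step 3 in the list above (pivot so that each intended histogram gets its own column, then call hist() on the result) might look roughly like this. The column names `group`, `value`, and `obs` and the data are made up for illustration; the within-group counter is one assumed way of giving the pivot a unique row index.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Long-format data: one row per observation, labelled by group.
df = pd.DataFrame(
    {
        "group": ["a", "a", "a", "b", "b", "b"],
        "value": [1.0, 2.0, 2.5, 3.0, 3.5, 5.0],
    }
)

# Number the observations within each group so the pivot has a unique index.
df["obs"] = df.groupby("group").cumcount()

# Wide format: one column per group.
wide = df.pivot(index="obs", columns="group", values="value")

# DataFrame.hist() draws one histogram per column.
axes = wide.hist()
```

Later pandas releases also accept a by argument in DataFrame.hist(), which may make the pivot unnecessary, depending on the installed version.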

@ycopin

ycopin commented Jun 25, 2012

@drewfrank would you please explain about your point 3? As a fresh pandas newbie, I went through your steps 1. & 2., and failed ever since...

@waltonjones

I am having a lot of problems styling dataframe.groupby boxplots. I can access the axes just fine, but since that call to the boxplot function returns a dict, how do you access the dict for styling purposes? For example bp['boxes'] or bp['whiskers'] don't work as expected. I get the following error:

TypeError: 'AxesSubplot' object has no attribute '__getitem__'

It also seems that none of the ways of accessing the plot I have tried let me use setp() like the standard matplotlib boxplot does.

I am on the newest stable pandas release.
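For the styling problem described above, one possible route in later pandas releases is the return_type argument of DataFrame.boxplot, which can hand back the matplotlib dict of artists instead of the axes. The sketch below assumes a pandas version that supports return_type="dict" and uses a plain, ungrouped DataFrame with made-up data; the return value of a grouped boxplot is shaped differently and is not covered here.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(
    {
        "x": [1.0, 2.0, 3.0, 4.0, 10.0],
        "y": [2.0, 2.5, 3.0, 3.5, 4.0],
    }
)

# Ask for the dict of artists rather than the axes object.
bp = df.boxplot(column=["x", "y"], return_type="dict")

# The dict can be styled with setp(), as with matplotlib's own boxplot.
plt.setp(bp["boxes"], color="black")
plt.setp(bp["whiskers"], linestyle="--")
```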

@sorenwacker

It would be nice if you could modify the boxplot method to support the sharey and sharex keywords, as in hist.
The y-axis is shared by default and there is no option to turn that off. In contrast, in hist the axes are not shared by default.

@wesm
Member Author

wesm commented Jul 6, 2018

Closing as outdated

@wesm closed this as completed Jul 6, 2018