Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stacked hist with histtype='step' (+ new kwargs for hist) #831

Closed
neggert opened this issue Apr 15, 2012 · 4 comments
Closed

Stacked hist with histtype='step' (+ new kwargs for hist) #831

neggert opened this issue Apr 15, 2012 · 4 comments

Comments

@neggert
Copy link
Contributor

neggert commented Apr 15, 2012

I submitted this to the mailing list, but I've since realized that this is probably a better place for it. So sorry for the double-post.

I've been wanting for some time the option to make stacked histograms using the step or stepfilled histtypes. I've started working on this in a fork, but I wanted to solicit some opinions on how to change the function call.

The most straightforward thing to do would be to add two new histtypes: stepstacked and stepfilledstacked, bringing the total number of possible histtypes to 6. However, that feels kind of clunky.

I'd like to propose that instead of increasing the number of histtypes, I add a new boolean kwarg called stacked which controls whether or not the histogram gets stacked. We then don't need to increase the number of histtypes. The barstacked histtype becomes redundant, as it's identical to histtype='bar', stacked=True, but I'll leave it in for backwards compatibility, to eventually be deprecated.

While I'm at it, filling a step histogram by using a different histtype (stepfilled) seems clunky and inconsistent with how one would deal with a bar histogram. To control the fill for a bar histogram, one uses the fill kwarg, which gets passed to the patch collection. I'd propose that we use the fill kwarg in a similar way for the step histogram and eliminate the stepfilled histtype.

In sumarry, here are the changes I'd like to make, as they would be reflected in the docstring

"""
         *histtype*: [ 'bar'  | 'step' ]
           The type of histogram to draw.

         *fill*:
           If *True*, the histogram will be filled.
           If *False* only the edges of the histogram will be drawn.

         *stacked*:
           If *True*, multiple data are stacked on top of each other
           If *False* multiple data are aranged side by side if
               histtype is 'bar' or on top of each other if histtype is 'step'
"""

We can of course leave in the barstacked and stepfilled histtypes for backwards compatability.

So, are there opinions on wether this proposal seems like a good idea? I'm mostly interested in getting stacked step histograms, which I've already implemented, but it seems like I might as well clean up the function call while I'm at it. If others agree that changing the function call is a good idea, I'll be happy to implement it and include it in my pull request.

@WeatherGod
Copy link
Member

Interesting idea. Note that we are just about to add a stacked plot feature. Maybe we should step back and think about "stackedness" in the big picture and strive for consistency?

@neggert
Copy link
Contributor Author

neggert commented Apr 16, 2012

Hmm, so to keep things consistent, we would want to either want to make this a new function, stackedhist, or change stackedplot to a kwarg for plot. Either approach seems reasonable to me.

@neggert
Copy link
Contributor Author

neggert commented Apr 16, 2012

Actually, now that I think about it, there is another issue I came across that might push us towards just making a new stackedhist function. I was planning on bringing this up later, but I think it's relevant now.

The behavior when norm=True using the current barstacked histtype doesn't make sense to me. Currently, it normalizes each histogram, then stacks them all on top of each other. I'd imagine that this is almost never what the user wants to do.

I think a more sensible thing to do would to be to normalize things so that the integral of all the histograms added together is 1. At least for my use cases, this is what I would want to happen when I normalized a stacked histogram. Is there a use case for the current implentation?

If we moved stacked histograms into a new function, it would be straightforward to default to this style of normalization for stacked histograms and we wouldn't need to put in lots of messy if..else statements looking at the kwargs.

@neggert
Copy link
Contributor Author

neggert commented Apr 21, 2012

Actually, if I created a stackedhist plot type, the code would be almost 100% duplication of the existing code in hist. I've implemented my initial proposal and submitted it as #847

@neggert neggert closed this as completed Sep 12, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants