feat: MonkeyClass pyplot #100
I think this feature needs wider discussion by the Scikit-HEP community. I can see the advantage of monkey-patching the Axes object. We need to discuss the two options then regarding plotting of pre-binned histograms. We overload the normal
|
I don't know the Matplotlib interface well enough to say that monkey-patching is a good idea here (it usually isn't a good idea, but there are exceptions). Note that if you want to support Python 2, the new methods have to be added with The two options @HDembinski suggested are, if I'm reading it right, a choice to make after already having decided that we're going to monkey-patch As for noun or verb, it's good for them to be consistent, and I don't know how consistent Matplotlib is in its method names. The word "plot" can be a noun or a verb (more likely a verb if placed first, as @HDembinski has pointed out), but "show" and "draw" are unambiguous. On the other hand, the thing that distinguishes this histogram method from the standard one is that it draws prebinned data. Is there any way to get the word "prebinned" into it without making the method name long and hard to type? Looking at a list of methods, the existence of I know that |
I agree with @HDembinski that the primary guiding principle here should be
(which is also why I think we should support passing an array of histograms, even though the points raised in the other thread are very valid) I, however, draw the opposite conclusion and think that a separate method name, while keeping the rest of the API as close as possible to Finding a good (short) name is a bit tricky. I understand the verb argument, though matplotlib is not very consistent with/on this. There are
Maybe we could have something of a vote |
@jpivarski Thank you for commenting. I don't like monkey-patching much myself, but I am not against it in this case. You misunderstood option 1. I explicitly wrote in my description that the original pyplot.hist should still work. We keep the original method around, and when you pass a sequence that the original method would have handled, our method simply passes that array to the original method. We just wrap the original hist to also accept pre-binned histograms. This is possible without ambiguity, since the data structures are different. I added a code example to make this clearer. |
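For readers following along, the wrapping idea described in this comment might look roughly like the sketch below. It is not the code from the PR description: the `PreBinned` container is a hypothetical stand-in for whatever histogram type would be accepted, and drawing the pre-binned counts through the original method with `weights` is only one possible way to do it.

```python
import functools
from typing import NamedTuple

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.axes import Axes


class PreBinned(NamedTuple):
    """Hypothetical container marking data as already binned."""

    counts: np.ndarray
    edges: np.ndarray  # len(edges) == len(counts) + 1


_original_hist = Axes.hist  # keep the original method around


@functools.wraps(_original_hist)
def hist(self, x, *args, **kwargs):
    if isinstance(x, PreBinned):
        if args:
            raise TypeError("pass options as keywords for pre-binned input")
        # One entry per bin centre, weighted by the bin content, so the
        # original method reproduces the counts without re-binning raw data.
        centers = 0.5 * (x.edges[:-1] + x.edges[1:])
        kwargs["bins"] = x.edges
        kwargs["weights"] = x.counts
        return _original_hist(self, centers, **kwargs)
    # Anything the original method understood is passed through unchanged.
    return _original_hist(self, x, *args, **kwargs)


Axes.hist = hist  # the monkey-patch itself

fig, ax = plt.subplots()
raw = [1, 2, 2, 3, 3, 3]
counts, edges = np.histogram(raw, bins=3, range=(0.5, 3.5))
n_pre, _, _ = ax.hist(PreBinned(counts, edges))  # new code path
n_raw, _, _ = ax.hist(raw, bins=edges)  # original behaviour
```

Dispatching on a dedicated type rather than on a bare (counts, edges) tuple is a deliberate choice in this sketch: the original method already accepts a sequence of arrays as multiple datasets, so a plain tuple would be ambiguous.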
If you prefer to have a separate name, how about the name As a general rule, functions and methods should be verbs, but pyplot is a bit special and local consistency within pyplot is more important than global consistency, I agree with that. @andrzejnovak Your remark makes sense to me that many methods on pyplot read like "plot something"
So I think the following fits in very well
|
I like the name |
There's a tradeoff between brevity and clarity: clarity is some optimal point in the curve. I think "binned" is a bit too little because "hist" is binned as well. The new thing here is the "pre". |
For what it's worth I agree as well - I always go for clarity even if that means a bit longer a word. Honestly how much of a pain is 3 letters compared to what we have to write/code on a daily basis? It's not even a permille … ;-) |
So it seems that we converged for the name to "prebinned" and perhaps "prebinned2d". I am not sure whether we need the latter though, since pyplot.pcolormesh is doing a decent job of plotting prebinned 2D histograms. Todo:
|
@andrzejnovak I have some time, so is it ok if I make the changes to this branch? |
@HDembinski Sure, this makes sense to me. I am thinking now that maybe the way forward would be to have |
That's a possible compromise, but not a good one. I don't see the value in keeping Again: Design is about reducing interface bloat (and hence implementation bloat), to reduce mental load when learning a new library and to ease maintenance in the long run. One of the design goals of Scikit-HEP is to write small specialized tools that interoperate well with each other and with standard libs like numpy and matplotlib. This means we must resist the temptation to duplicate interfaces and functionality in the name of "convenience". If it can be easily achieved with standard numpy/matplotlib, then we should not implement it. I am writing a lot of words to give you arguments why your choices are not optimal. You should argue for your case as well, not just say: I want |
One thing I think makes sense for |
That's all I ask :) I want to emphasize that I don't disagree with you on the underlying principles, I just don't arrive at the same conclusions. I am much less motivated by the time I've invested in mplhep already than by the time it's saving me on a daily basis (if I think you are spot-on for the paramount principle here: "reduce mental load when learning" and like we agreed in the other thread, in this specific case it's more important to keep local consistency with If Could you elaborate on
|
I am curious what your take is on whether, if we go the monkey-patch route, we should also allow the functions to be used independently (like my original commit here). I know "There should be one-- and preferably only one --obvious way to do it.", but I don't know whether we should gate the functionality behind the monkey patch, especially if we also modify some existing functions like |
I like the monkey patch solution; it seems like the most straightforward.
I took those options from the docs of pyplot.hist, you can look it up there.
That does not work if we pass in the whole histogram, and it is not necessary. If you want a skyline and error bars on top (which is not a recommended plotting style), then users can simply call |
Still one of us must be wrong. Either the logic is flawed or the assumptions are different or we apply different weights to the same values. But anyhow, perhaps we can converge. |
Ok
I thought I am against forcing users to make multiple calls to |
Both can track uncertainties, but don't do it by default, because it costs extra. If you don't call When you want to track uncertainties from weighted fills, you are supposed to use |
We should wrap this PR up to have the monkey patching in place. The discussion about the design of I am currently working on the improvements to patching code that I wrote about. I rebased to the current master and now I cannot run the tests.
I am using matplotlib-3.1.3. It looks like you are using matplotlib-2.x in CI testing. Perhaps |
Yeah it's a
What makes it look like that? I agree, monkey patching machinery and prebinned need not be in the same PR. |
I wanted to work on this patch, and I cannot pytest it, unless this
I tried to find out how this problem could pass CI. |
@HDembinski Why can we not require matplotlib-3.2? The previous solution is still available in the We could check the version on import.
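A version check on import, as suggested here, could be sketched along these lines. The 3.2 threshold comes from the thread; the helper names and the "new"/"legacy" labels are hypothetical and only illustrate the branching, not what mplhep actually does.

```python
import re

MIN_VERSION = (3, 2)  # threshold mentioned in the thread


def version_tuple(version):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def pick_patch_strategy(version):
    """Choose between two hypothetical code paths by matplotlib version."""
    return "new" if version_tuple(version) >= MIN_VERSION else "legacy"


# At import time one would pass matplotlib.__version__ here; literal strings
# keep this sketch independent of the installed version.
strategy_old = pick_patch_strategy("3.1.3")
strategy_new = pick_patch_strategy("3.2.0rc1")
```

Comparing integer tuples avoids the pitfall of comparing version strings lexically, where "3.10" would sort before "3.2".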
I don't understand, the CI obviously uses 3.2, or it would be failing. |
@HDembinski Still planning to take a look at this, or should I put it back on my to-do list? The 3.2 backcomp was added :) |
You should do this by registering a custom projection which lets you control the (sub)class of the This is supported by every version of Matplotlib that is plausible to support. |
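A minimal sketch of the projection approach suggested in this comment follows. `register_projection` and the `name` class attribute are the standard Matplotlib mechanism; the class name, the projection name "prebinned_demo" and the `prebinned` method are hypothetical and chosen only for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.axes import Axes
from matplotlib.projections import register_projection


class PrebinnedAxes(Axes):
    """Axes subclass carrying an extra method; the base class is untouched."""

    name = "prebinned_demo"  # hypothetical projection name

    def prebinned(self, counts, edges, **kwargs):
        counts = np.asarray(counts)
        edges = np.asarray(edges)
        if len(edges) != len(counts) + 1:
            raise ValueError("expected len(edges) == len(counts) + 1")
        # Repeat the last count so the step outline spans the final bin.
        heights = np.append(counts, counts[-1])
        return self.step(edges, heights, where="post", **kwargs)


register_projection(PrebinnedAxes)

# Users opt in per Axes instead of changing every Axes in the process.
fig, ax = plt.subplots(subplot_kw={"projection": "prebinned_demo"})
lines = ax.prebinned([1, 2, 3], [0.5, 1.5, 2.5, 3.5])
```

Compared with patching the `Axes` class, this keeps the new method scoped to Axes that were explicitly created with the custom projection.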
I am not sure I understand. The idea (which we, unfortunately, didn't follow up on) was to have a |
@andrzejnovak I will have a look at it. |
Allows for a more seamless matplotlib-like experience, as well as "patching" some objects in the future like set_xlabel before mpl catches up. One downside is that this updates the mpl Axes class, so while I don't know why you would need to, you cannot use matplotlib.pyplot at the same time. @HDembinski Do you know if this works for jax? If so I'll take a look at how to implement it. Being able to do stuff like this is the main reason why I would prefer histplot not be renamed to hist, but I'd be fine with hist_plot or some other non-conflicting name.