Normalising the y axis of the 1D hist Pdfs diagonals to unity! #86
Comments
Regarding weights, a single weight applies to a single sample (a single sample being a position in an n-dimensional space). So, |
Thank you very much for that. So how do I create such a weight to normalize the y axis of the diagonal 1D PDFs for my samples? |
I'm a bit confused: I thought the y axes of the diagonals are not labelled anyway, so I don't see what normalizing these 1-D PDFs would do (other than perhaps change the y axis tick locations, which don't mean much). Maybe you're using an option to get y labels on the diagonals, or maybe defaults have changed since I last looked at it? |
Ah, I didn't understand that you were plotting two sets of samples. Dan will know better, but I think passing a weight array will indeed affect the relative scaling of the two sets of samples. Assuming you have 1000 samples, try passing |
Yes, it does change the scaling, but it would be great to find something consistent to use for all samples, such that the sum under the PDF is equal to one, or any way to normalize them without guessing at random: 2.0 * np.ones(1000), and if not that then 3.0 * np.ones(1000), etc. Thanks for helping out. |
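One consistent choice, rather than trying constants at random, is to give every sample the weight 1/N, where N is the number of samples in that set. The following is a minimal sketch with plain NumPy, not corner itself; the sample arrays, the helper name `unit_sum_weights`, and the bin choice are all made up for illustration. With these weights the bar heights of each histogram sum to one, whatever the sample count.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical sample sets with different sizes and widths.
samples_a = rng.normal(0.0, 1.0, size=1000)
samples_b = rng.normal(0.5, 2.0, size=4000)


def unit_sum_weights(samples):
    """Return one weight per sample, chosen so that the weights sum to one."""
    return np.ones(len(samples)) / len(samples)


# Shared bin edges spanning both sets, so that no sample falls outside.
lo = min(samples_a.min(), samples_b.min())
hi = max(samples_a.max(), samples_b.max())
bins = np.linspace(lo, hi, 41)

heights_a, _ = np.histogram(samples_a, bins=bins, weights=unit_sum_weights(samples_a))
heights_b, _ = np.histogram(samples_b, bins=bins, weights=unit_sum_weights(samples_b))

# Both sums are one up to floating-point rounding.
print(heights_a.sum(), heights_b.sum())
```

Note that this makes the bar heights sum to one; it does not make the area under the histogram equal to one, which would additionally require dividing by the bin width.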
Is the problem that the two sets have different numbers of samples? If so, setting |
The number of samples is the same, and I tried this but it doesn't help. There must be a way :( I will keep playing around; let me know if you find something. |
If |
Agreed - the blue and red histograms look like good approximations to On Wed, Sep 28, 2016 at 8:57 AM, Dan Foreman-Mackey <
|
So if I want the heights of the 1D histograms on the diagonal (blue and red) to be the same, should I increase the number of bins, or should I use the same bin width? It should be possible to give them the same height! The question is how. Still playing around... |
I think the default behavior should be that the histograms should be
represented so that they appear as approximations to PDFs, which is to say,
each normalised to unit area under the curve - no matter what the chosen
bin size is.
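As a rough illustration of "unit area no matter what the bin size is", here is a sketch using NumPy's `density=True` option on made-up samples. It is not taken from corner's source; it only shows the normalization being described.

```python
import numpy as np

rng = np.random.default_rng(1)
samples = rng.normal(0.0, 1.0, size=5000)  # hypothetical samples from a PDF

areas = []
for n_bins in (10, 40, 160):
    # density=True divides the counts by (total count * bin width).
    density, edges = np.histogram(samples, bins=n_bins, density=True)
    # Area under the histogram: sum of height * width over all bins.
    areas.append(np.sum(density * np.diff(edges)))

# Each entry is one up to floating-point rounding, for every bin count.
print(areas)
```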
Asking for the distributions to be shown with equal peak heights sounds
like different functionality, but which could be provided via a (new?)
kwarg, like `equalize_peak_heights=True`. BTW if such an option were
implemented, you'd want the 2D histograms to be scaled to equal peak
density in the same way (otherwise the panels would become logically
disconnected.)
A more common use case might be to use a (new? and easier to implement?)
kwarg like `normalization=1512`, so that different populations can be
overlaid with their appropriate relative (or absolute) sizes (in this case,
1512 population members). The default would be 1.0. I only ever use
`corner` for samples from PDFs, but maybe others have done this?
|
Great, thanks for that :). Could you point me to those options, `equalize_peak_heights=True` and `normalization=1512`? Are these kwargs for hist or hist2d? |
Those were suggestions for a pull request, if they do not exist already.
The API docs do not show any options like that yet, so you could be about
to become a contributor! :-)
Dan can advise more on exactly where to place your energies, so if I were
you I'd wait for a reply from him - but after that, I'd start extending the
source code to see if I could make the plot I wanted, in a way that would
enable others to follow suit.
|
Cool, I would be very happy to contribute and modify the code, as this routine and emcee have already been a great help in my research. Many thanks to the owner. |
The first thing that we need is a convincing argument for why you want this
|
Well, the only reason is that it gives a much better visualisation when comparing different samples in terms of the shape and width of their distributions. However, if this doesn't seem useful as part of the routine, then that's fine with me. But I would still like to know how to do such a thing for myself. Any ideas? |
Let's get some more details – what are the samples that you're comparing and why is your suggestion better for comparison? If the histograms are properly normalized then a wider distribution will also be "shorter". I think that this actually makes the visualization clearer! I expect that this is also the reason why neither matplotlib nor numpy has native support for this. If you want to mock up a change, it will be easiest to do with
to
Note that I still stand by my opinion that this would lead to a misleading result! |
I think if we are talking about samples from a PDF, then you always want to On Wed, Sep 28, 2016 at 2:04 PM, Dan Foreman-Mackey <
|
Totally! The current default behavior is actually to normalize to the number density and you can add |
That is good to know indeed! I went back and checked the API docs: this In the meantime, Sultan, it sounds as though you can get the behavior you |
I agree that it's worth saying something about the default behavior in the docs – I am actually inclined to change the default to density normalization! I'm not sure what you mean about the "normalization" of the 2D histograms. The contours are always at percentiles of the sample mass. I guess you could choose to have the contours defined in terms of numbers but that's some craziness! I don't want to go there. I also don't think that you'll ever be able to get the requested behavior by changing the sample size because the peak height in each panel actually depends on the bin sizes and the shape of the distribution. That's the whole reason why it's meaningless to give the "peaks" equal heights! |
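To make the point about peak heights concrete, here is a small sketch with NumPy on invented samples (the `peak` helper is hypothetical, not part of corner). With raw counts the peak height changes when the bin width changes, and with density normalization a wider distribution ends up with a lower peak.

```python
import numpy as np

rng = np.random.default_rng(2)

# Two hypothetical sample sets of equal size but different width.
narrow = rng.normal(0.0, 1.0, size=20000)
wide = rng.normal(0.0, 3.0, size=20000)


def peak(samples, n_bins, density):
    """Height of the tallest bar of a 1D histogram of `samples`."""
    heights, _ = np.histogram(samples, bins=n_bins, density=density)
    return heights.max()


# Raw counts: the same samples give a lower peak when the bins are narrower.
counts_coarse = peak(narrow, 20, density=False)
counts_fine = peak(narrow, 80, density=False)
print(counts_coarse, counts_fine)

# Density normalization: the wider distribution has the lower peak.
density_narrow = peak(narrow, 40, density=True)
density_wide = peak(wide, 40, density=True)
print(density_narrow, density_wide)
```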
Yeah - the word "judicious" can cover a lot of fiddling around... :-) I'd support a move to probability density normalization by default, On Wed, Sep 28, 2016 at 4:35 PM, Dan Foreman-Mackey <
|
|
1 - Is there any quick way to do this with corner? If not, it's unclear how to use weights for this. What is the shape of weights? The documentation says [nsamples,]; shouldn't it be [nsamples, ndim], the same shape as the sample array x?
2 - Whenever I try using weights, I get: TypeError: hist() got multiple values for keyword argument 'weights'
Thanks in advance for your help.
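On the shape question: a weight belongs to a sample (a row of x), not to a single coordinate, so the array has shape (nsamples,) even when x has shape (nsamples, ndim). The sketch below uses NumPy's histogram functions, which take weights in the same per-sample way; the data are made up. The second half reproduces this kind of TypeError with a stand-in function named `hist`. That the error comes from `weights` reaching the underlying hist call twice (for example once through corner's `weights` argument and again through a dict such as `hist_kwargs`) is an assumption, not something confirmed in this thread.

```python
import numpy as np

rng = np.random.default_rng(3)
nsamples, ndim = 1000, 3
x = rng.normal(size=(nsamples, ndim))    # hypothetical samples
weights = np.ones(nsamples) / nsamples   # shape (nsamples,): one weight per row

# The same weight array serves a 1D marginal ...
h1, _ = np.histogram(x[:, 0], bins=20, weights=weights)
# ... and a 2D panel, because each weight applies to a whole sample.
h2, _, _ = np.histogram2d(x[:, 0], x[:, 1], bins=20, weights=weights)


def hist(values, weights=None, **kwargs):
    """Stand-in for a plotting call that accepts a `weights` keyword."""
    return np.histogram(values, weights=weights, **kwargs)


# Supplying `weights` both explicitly and inside an unpacked dict fails.
extra_kwargs = {"weights": weights}
try:
    hist(x[:, 0], weights=weights, **extra_kwargs)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)
```

If that assumption holds, passing the weights in only one place avoids the clash.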