
Normalising the y axis of the 1D histogram PDF diagonals to unity! #86

sultan-hassan opened this issue Sep 27, 2016 · 25 comments

@sultan-hassan

1 - Is there any quick way to do this with corner? If not, it's unclear how to do it with weights: what is the shape of weights? The documentation says [nsamples,], but shouldn't it be [nsamples, ndim], the same shape as the sample array x?

2 - Whenever I try using weights, I get: TypeError: hist() got multiple values for keyword argument 'weights'

Thanks in advance for your help.

@kbarbary
Contributor

Regarding weights, a single weight applies to a single sample (a single sample being a position in an n-dimensional space). So, weights[i] is the weight for the sample samples[i, :]. It wouldn't make sense to have different weights for different dimensions of a single sample.
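
For concreteness, here is a minimal sketch of how the shapes line up (the arrays are placeholders):

```python
import numpy as np
import corner

# 1000 samples drawn from a 3-dimensional parameter space (placeholder data)
samples = np.random.randn(1000, 3)

# one weight per sample: shape (1000,), not (1000, 3)
weights = np.ones(len(samples))

fig = corner.corner(samples, weights=weights)
```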

@sultan-hassan
Author

Thank you very much for that. So how do I construct such a weight array to normalize the y axis of the diagonal 1D PDFs for my sample?
Let's assume my sample has a shape of (1000, 3), so the corresponding weights should have shape (1000,), right? I am confused about how to construct such a weight array to do the normalization.

@kbarbary
Contributor

I'm a bit confused: I thought the y axes of the diagonals are not labelled anyway, so I don't see what normalizing these 1-D PDFs would do (other than perhaps changing the y-axis tick locations, which don't mean much).

Maybe you're using an option to get y labels on the diagonals, or maybe defaults have changed since I last looked at it?

@sultan-hassan
Author

sultan-hassan commented Sep 27, 2016

Let me explain more. I am overplotting two sets of samples on top of each other, so without normalizing you can't clearly see the PDFs of the two samples in the 1D histogram diagonals.
In the attached plot, I want to normalize the red and blue histograms so that they have the same height.

[Attached plot: corner plot overlaying the red and blue sample sets]

@kbarbary
Contributor

Ah, I didn't understand that you were plotting two sets of samples.

Dan will know better, but I think passing a weight array will indeed affect the relative scaling of the two sets of samples. Assuming you have 1000 samples, try passing 2.0 * np.ones(1000) for weights for one set and np.ones(1000) for the other set and see if it changes the relative scaling.
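
Roughly like this (a sketch; samples_a and samples_b are placeholders for your two sample sets):

```python
import numpy as np
import corner

samples_a = np.random.randn(1000, 3)        # placeholder sample sets
samples_b = np.random.randn(1000, 3) + 0.5

# overlay the second set on the first via the fig argument and
# see whether the constant weights change the relative scaling
fig = corner.corner(samples_a, color="b", weights=np.ones(1000))
corner.corner(samples_b, color="r", weights=2.0 * np.ones(1000), fig=fig)
```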

@sultan-hassan
Author

Yes, it does change the scaling, but it would be great to find something consistent to use for all samples, such that the area under each PDF equals one, rather than playing around randomly with 2.0 * np.ones(1000), then 3.0 * np.ones(1000), etc. Thanks for helping out.

@kbarbary
Contributor

Is the problem that the two sets have different numbers of samples? If so, setting weights=np.ones(nsamples)/nsamples for each set should make the areas under the PDF the same regardless of the value of nsamples.
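
In code, the idea would be something like this (a sketch with placeholder sample sets of different sizes):

```python
import numpy as np
import corner

samples_a = np.random.randn(1000, 3)   # placeholder sample sets with
samples_b = np.random.randn(500, 3)    # different numbers of samples

# weight each set by 1/nsamples so the 1-D histograms of both sets
# enclose the same total weight, regardless of nsamples
w_a = np.ones(len(samples_a)) / len(samples_a)
w_b = np.ones(len(samples_b)) / len(samples_b)

fig = corner.corner(samples_a, color="b", weights=w_a)
corner.corner(samples_b, color="r", weights=w_b, fig=fig)
```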

@sultan-hassan
Author

nsamples is the same! And I tried this, but it doesn't help. There must be a way :( I will keep playing around; let me know if you find something.

@dfm
Owner

dfm commented Sep 28, 2016

If nsamples is the same then it is normalized! The integral under each of those histograms is the same.

@drphilmarshall

Agreed - the blue and red histograms look like good approximations to normalized PDFs to me!


@sultan-hassan
Author

So if I want the heights of the 1D histogram diagonals (blue and red) to be the same, should I increase the number of bins or use the same bin width? It should be possible to make them the same height! The question is how? Still playing around...

@drphilmarshall

drphilmarshall commented Sep 28, 2016 via email

@sultan-hassan
Author

Great, thanks for that :). Could you point me to the documentation for those options, 'equalize_peak_heights=True' and 'normalization=1512'? Are these kwargs for hist or hist2d?

@drphilmarshall

drphilmarshall commented Sep 28, 2016 via email

@sultan-hassan
Author

Cool, I would be very happy to contribute and modify the code, as this routine + emcee have already been a great help in my research. Many thanks to the owner.

@dfm
Owner

dfm commented Sep 28, 2016

The first thing that we need is a convincing argument for why you want this feature. I'm currently skeptical that it's actually something that we want, and I'm hesitant to add features that might be misleading, so I'd love to hear the specific use case and the story that you're trying to tell.

@sultan-hassan
Author

Well, the only reason is that it gives a much better visualisation for comparing different samples in terms of the shape and width of their distributions. However, if this doesn't seem useful enough to be part of the routine, that's fine with me. But I would still like to know how to do such a thing for myself. Any ideas?

@dfm
Owner

dfm commented Sep 28, 2016

Let's get some more details – what are the samples that you're comparing and why is your suggestion better for comparison? If the histograms are properly normalized then a wider distribution will also be "shorter". I think that this actually makes the visualization clearer! I expect that this is also the same reason why neither matplotlib nor numpy has native support for this.

If you want to mock up a change, it will be easiest to do with weights = np.ones(n) and modifying this line:

y0 = np.array(list(zip(n, n))).flatten()

to

y0 = np.array(list(zip(n, n))).flatten() / np.max(n)

Note that I still stand by my opinion that this would lead to a misleading result!
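
To preview what that change would do without patching corner, here is a rough standalone mock-up in plain numpy/matplotlib (the sample arrays are made up for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

x_a = np.random.randn(1000)              # placeholder 1-D samples
x_b = 0.5 * np.random.randn(1000) + 1.0

for x, color in [(x_a, "b"), (x_b, "r")]:
    n, edges = np.histogram(x, bins=20)
    # divide by the peak so every histogram tops out at 1,
    # mimicking the suggested / np.max(n) change
    plt.step(edges[:-1], n / n.max(), where="post", color=color)

plt.xlabel("x")
plt.ylabel("peak-normalized count")
plt.show()
```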

@drphilmarshall

I think if we are talking about samples from a PDF, then you always want to normalize each histogram to 1. However, if you want to use corner to visualize number density rather than probability density, then I can see how one might want to specify the relative normalizations of the datasets being overlaid. I have never needed to normalize to equal peak height...


@dfm
Owner

dfm commented Sep 28, 2016

Totally! The current default behavior is actually to normalize to the number density and you can add hist_kwargs=dict(normed=True) to get the PDF behavior.
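
For example (a sketch; note that newer matplotlib versions have replaced normed with density, as pointed out at the end of this thread):

```python
import numpy as np
import corner

samples = np.random.randn(1000, 3)

# default: the diagonal histograms show raw counts (number density)
fig_counts = corner.corner(samples)

# normalize the diagonal histograms to unit area (PDF behavior);
# density=True is the modern spelling of normed=True
fig_pdf = corner.corner(samples, hist_kwargs=dict(density=True))
```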

@drphilmarshall

That is good to know indeed! I went back and checked the API docs: this default behavior is not explained anywhere, but perhaps it should be. Also, to get the PDF behavior, do you need to specify hist2d_args=dict(normed=True) as well? I can't think of any reason you would want to normalize the 2D and 1D histograms differently, can you? Maybe we need a normed=True kwarg in corner.corner that turns on both the 1D and 2D normalizations?

In the meantime, Sultan, it sounds as though you can get the behavior you want through judicious choice of sample sizes...

@dfm
Owner

dfm commented Sep 28, 2016

I agree that it's worth saying something about the default behavior in the docs – I am actually inclined to change the default to density normalization!

I'm not sure what you mean about the "normalization" of the 2D histograms. The contours are always at percentiles of the sample mass. I guess you could choose to have the contours defined in terms of numbers but that's some craziness! I don't want to go there.

I also don't think that you'll ever be able to get the requested behavior by changing the sample size because the peak height in each panel actually depends on the bin sizes and the shape of the distribution. That's the whole reason why it's meaningless to give the "peaks" equal heights!

@drphilmarshall

Yeah - the word "judicious" can cover a lot of fiddling around... :-)

I'd support a move to probability density normalization by default, especially since the contour levels are defined in terms of probability mass! However, if someone really was trying to visualize number density, I guess they might want contours in absolute number density, but I agree it's better to wait for that to be requested... In the meantime they could still have the 2D grayscale, 2D scatter plot, and 1D histograms all represent absolute number density (which I think is the current default). I bet they would still find it useful to be able to switch easily from "number density" to "probability density" and back, though.


@sultan-hassan
Author

Well, here's a plot taken from the Greig & Mesinger (2015) 21CMMC paper, where different PDFs are shown with equal heights. I thought this was a good representation for comparing different PDFs and that I might be able to do the same with corner...
[Attached screenshot: PDFs from Greig & Mesinger (2015) plotted with equal peak heights]

@jtlz2

jtlz2 commented Jul 12, 2021

normed is now deprecated in matplotlib; use density instead.

dfm closed this as completed Jul 12, 2021