Conditional PDF for x_1 and c #12

Open
drphilmarshall opened this issue Mar 20, 2015 · 9 comments

Comments

@drphilmarshall

x_1 and c are SN parameters that, in the ensemble analysis, must be assumed to be drawn from some PDF. We can get some idea of how to model that PDF in our hierarchical inference by looking at the scatter plot of all samples from all emcee runs on all real supernovae. This distribution of points will be broader than the PDF for the "true" x_1 and c values, but it might show us whether we need a bivariate function instead of two univariate ones (i.e., we might see some correlation between x_1 and c). We can also plot the posterior means from each emcee run, but this will just make the plot less smooth.
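
A minimal sketch of the pooled-samples check described above, using synthetic data in place of real emcee chains. Everything here is an assumption made for illustration: the population widths, the per-SN measurement errors, and the Gaussian stand-in for each posterior. It only shows that the pooled cloud is broader than the underlying population and how to read off a correlation coefficient from it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: independent Gaussians for x_1 and c.
n_sn, n_samples = 200, 500
true_x1 = rng.normal(0.0, 1.0, n_sn)
true_c = rng.normal(0.0, 0.1, n_sn)

# Stand-in for emcee output: each SN's posterior is modelled as a Gaussian
# blob centred on a noisy estimate of its true values (assumed errors).
err_x1, err_c = 0.5, 0.05
centre_x1 = true_x1 + rng.normal(0.0, err_x1, n_sn)
centre_c = true_c + rng.normal(0.0, err_c, n_sn)
samples_x1 = centre_x1[:, None] + rng.normal(0.0, err_x1, (n_sn, n_samples))
samples_c = centre_c[:, None] + rng.normal(0.0, err_c, (n_sn, n_samples))

# Pool all samples from all SNe into one cloud of points.
pooled_x1 = samples_x1.ravel()
pooled_c = samples_c.ravel()

true_std_x1 = true_x1.std()
pooled_std_x1 = pooled_x1.std()
pooled_corr = np.corrcoef(pooled_x1, pooled_c)[0, 1]

print(f"std(x_1): population {true_std_x1:.2f}, pooled samples {pooled_std_x1:.2f}")
print(f"correlation of pooled (x_1, c) samples: {pooled_corr:+.3f}")
```

With real chains, the synthetic `samples_x1` and `samples_c` arrays would be replaced by the flattened x_1 and c columns of each run.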

Note that in the PGM below I made the simplest possible assignment - single Gaussians all round! But then I started wondering about correlations.

Phil's new PGM

@wmwv

wmwv commented Mar 20, 2015

x0, x1, and c are all SN parameters that should be drawn from population distributions.

Michael


@drphilmarshall
Author

Agreed - but my understanding is that x_0 is going to get replaced by some
combination of M and mu, so I'm saving the population modeling for M. What
do you know about observed correlations between independently fitted x_1
and c pairs in samples of real supernovae? Are they correlated?


@rbiswas4
Owner

My understanding was that those distributions in the PGM are priors. So shouldn't we be able to get away with approximate distributions, even if they don't look exactly like the population distribution?

I did not think the x1 and c population distributions were strongly correlated, and I believe that SN simulations currently use uncorrelated population distributions for x1 and c (I will recheck). I don't know how good that approximation is, and I would like to derive the population distribution from data, as a mixture model, for simulation purposes. But introducing something like that here would complicate the inference (too many variables).

@wmwv

wmwv commented Mar 20, 2015

By construction x_1 and c are meant to be uncorrelated.

Physically, yes, the intrinsic color depends on x_1.

In SALT2, "c" is redefined to be the color with respect to color(x_1).

  • Michael
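
One way to picture "color with respect to color(x_1)" is as a residual about a fitted trend. The toy sketch below is an illustration of that idea only, not the SALT2 training procedure: the linear form of the trend, its slope, and the scatter are all arbitrary assumptions. It shows that a color defined as the residual about a least-squares line in x_1 is uncorrelated with x_1 in the sample used for the fit.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: a "raw" color that depends linearly on x_1, plus scatter.
x1 = rng.normal(0.0, 1.0, 1000)
raw_color = -0.05 * x1 + rng.normal(0.0, 0.1, 1000)

# Fit the mean trend color(x_1) and define c as the offset from that trend.
slope, intercept = np.polyfit(x1, raw_color, 1)
c = raw_color - (slope * x1 + intercept)

r_raw = np.corrcoef(x1, raw_color)[0, 1]
r_c = np.corrcoef(x1, c)[0, 1]

print(f"corr(x_1, raw color) = {r_raw:+.3f}")
print(f"corr(x_1, c)         = {r_c:+.3e}")
```

The residual is uncorrelated with x_1 only for the sample the trend was fitted on. A different sample can still show a correlation.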


@rbiswas4
Owner

@wmwv,

I think you have gone into an area that I don't know about. When you have time, would you mind explaining those statements a little more or adding references? Thanks.

@wmwv

wmwv commented Mar 20, 2015

Guy+2007
"SALT2: using distant supernovae to improve the use of Type Ia supernovae as distance indicators"
http://adsabs.harvard.edu/abs/2007A&A...466...11G
Section 2
"""
As for SALT, the optical depth is expressed using a color offset with respect to the average at the date of maximum luminosity in B-band, c = (B−V)_MAX − ⟨B−V⟩. This parametrization models the part of the color variation that is independent of phase, whereas the remaining color variation with phase is accounted for by the linear components.
"""

("MAX" in the above means "at time of B-band maximum light")

  • Michael


@rbiswas4
Owner

@wmwv

OK, I see what you meant by color(x_1) and now understand the second two parts of the statement. But this could still allow x_1 and c to be correlated, right?

@wmwv

wmwv commented Mar 20, 2015

Not if your data set looks like the set used to train SALT2.

E.g., Betoule14 (http://adsabs.harvard.edu/abs/2014A%26A...568A..22B) retrained SALT2 on the JLA sample. You can take a look at it here:

http://supernovae.in2p3.fr/sdss_snls_jla/ReadMe.html

Copy-and-paste:

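curl -O http://supernovae.in2p3.fr/sdss_snls_jla/jla_likelihood_v6.tgz
tar xvzf jla_likelihood_v6.tgz

python
# Scatter plot of SALT2 x1 against color for the JLA sample
from astropy.io import ascii
import matplotlib.pyplot as plt

# Light-curve parameter table from the unpacked tarball
# (named `path` to avoid shadowing the built-in `file`)
path = 'jla_likelihood_v6/data/jla_lcparams.txt'
jla = ascii.read(path)

plt.scatter(jla['x1'], jla['color'])
plt.xlabel('x1')
plt.ylabel('color')
plt.title('JLA Betoule14')
plt.savefig('JLA_Betoule14_x1_color.pdf')  # save before show() clears the figure
plt.show()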

and you'll get the attached plot, which shows that x1 and c are uncorrelated in the JLA sample.

If your sample is different, then it's possible that there may be some correlation, but we can definitely ignore any correlation between x1 and c for now.

  • Michael
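
To put a number on "uncorrelated" rather than judging the scatter plot by eye, one option is a Pearson coefficient with a rough significance from the Fisher z-transform. This is a sketch: the helper name `correlation_significance` is made up here, the Fisher approximation assumes roughly Gaussian scatter, and the demo uses synthetic data (with a sample size of 740, comparable to JLA) rather than the real table.

```python
import numpy as np

def correlation_significance(x, y):
    """Return (Pearson r, significance in sigma) using the Fisher z-transform.

    The z-transform makes r approximately Gaussian with standard error
    1/sqrt(n - 3); this is a rough guide, not an exact test.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    z = np.arctanh(r)
    sigma_z = 1.0 / np.sqrt(x.size - 3)
    return r, z / sigma_z

# Synthetic demonstration with made-up widths.
rng = np.random.default_rng(2)
x1_demo = rng.normal(0.0, 1.0, 740)
c_indep = rng.normal(0.0, 0.1, 740)                   # independent of x1_demo
c_corr = 0.05 * x1_demo + rng.normal(0.0, 0.1, 740)   # deliberately correlated

r_indep, nsig_indep = correlation_significance(x1_demo, c_indep)
r_corr, nsig_corr = correlation_significance(x1_demo, c_corr)
print(f"independent: r = {r_indep:+.3f} ({nsig_indep:+.1f} sigma)")
print(f"correlated:  r = {r_corr:+.3f} ({nsig_corr:+.1f} sigma)")
```

On the real table, the same helper could be called as `correlation_significance(jla['x1'], jla['color'])` after the `ascii.read` step above.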


[Attached plot: jla_betoule14_x1_color, scatter of color against x1 for the JLA sample]

@drphilmarshall
Author

Lovely - a pair of independent Gaussians it is then!
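
For concreteness, a pair of independent Gaussians amounts to a population log-density like the one sketched below. The function name and the hyperparameter names (`mu_x1`, `sigma_x1`, `mu_c`, `sigma_c`) are placeholders chosen for this illustration, not names taken from the project code.

```python
import numpy as np

def log_population_prior(x1, c, mu_x1, sigma_x1, mu_c, sigma_c):
    """Log density of independent Gaussians in x_1 and c, summed over SNe.

    Independence means the joint density factorises, so the log density
    is just the sum of the two univariate Gaussian terms.
    """
    if sigma_x1 <= 0 or sigma_c <= 0:
        return -np.inf  # widths must be positive
    x1 = np.asarray(x1, dtype=float)
    c = np.asarray(c, dtype=float)
    lp_x1 = (-0.5 * ((x1 - mu_x1) / sigma_x1) ** 2
             - np.log(sigma_x1 * np.sqrt(2.0 * np.pi)))
    lp_c = (-0.5 * ((c - mu_c) / sigma_c) ** 2
            - np.log(sigma_c * np.sqrt(2.0 * np.pi)))
    return float(np.sum(lp_x1 + lp_c))

# Example: three SNe evaluated under made-up hyperparameters.
lp = log_population_prior([0.1, -0.8, 1.2], [0.02, -0.05, 0.10],
                          mu_x1=0.0, sigma_x1=1.0, mu_c=0.0, sigma_c=0.1)
print(lp)
```

If a correlation did turn up later, the two univariate terms could be swapped for a single bivariate Gaussian with one extra correlation hyperparameter.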



3 participants