fix 8 schools example #71

Open
bob-carpenter opened this issue Jul 14, 2016 · 10 comments
@bob-carpenter
Contributor

@wds15 mentioned in an email on stan-dev that

https://github.com/stan-dev/example-models/blob/master/misc/eight_schools/eight_schools.stan

uses the centered parameterization and produces lots of divergences.

We should fix the parameterization so that it works!
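
For context, here is a minimal sketch of the centered parameterization at issue, assuming the classic Rubin (1981) eight schools setup; the exact program in the repo may differ in details:

```stan
// Centered parameterization (sketch): theta is sampled directly on
// its natural scale, which couples theta and tau into a funnel and
// triggers divergences when the data are only weakly informative.
data {
  int<lower=0> J;            // number of schools
  vector[J] y;               // estimated treatment effects
  vector<lower=0>[J] sigma;  // standard errors of the estimates
}
parameters {
  real mu;                   // population mean effect
  real<lower=0> tau;         // population scale
  vector[J] theta;           // per-school effects
}
model {
  theta ~ normal(mu, tau);   // centered hierarchical prior
  y ~ normal(theta, sigma);
}
```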

@jgabry
Member

jgabry commented Jul 14, 2016

I think it's useful to have the centered parameterization too, as a
comparison. So maybe we could have the one called "eight_schools" be the
better parameterization (in this case non-centered) but also have a version
("eight_schools_bad", "eight_schools_naive", or something like that?) to
illustrate what goes wrong if you don't use that parameterization?
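
For comparison, a sketch of the standard non-centered reparameterization being proposed here (the name theta_tilde is illustrative):

```stan
// Non-centered parameterization (sketch): sample standardized
// effects and rescale, decoupling the per-school parameters from
// tau in the prior and removing the funnel geometry.
data {
  int<lower=0> J;
  vector[J] y;
  vector<lower=0>[J] sigma;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] theta_tilde;           // standardized per-school effects
}
transformed parameters {
  vector[J] theta;
  theta = mu + tau * theta_tilde;  // implies theta ~ normal(mu, tau)
}
model {
  theta_tilde ~ normal(0, 1);
  y ~ normal(theta, sigma);
}
```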


@wds15

wds15 commented Jul 14, 2016

As long as the centered one is clearly marked as problematic, and maybe we even provide links to material that explains centered vs. non-centered parameterizations, then that's a good thing, yes.

@betanalpha

We desperately need to avoid names like “good” and “bad”, or
“right” and “wrong”, as the correct parameterization depends
on the model and the data. We have to convince the users
that MCMC can be fragile and that they have to be careful. I know
many don’t want to hear it, but it’s super important.


@jgabry
Member

jgabry commented Jul 15, 2016

Good point. I was thinking of model + data, but yeah, the model code on its
own isn't good or bad without knowledge of the data.

@betanalpha

In general these things are much more complex than many beginning
users want them to be. There’s whether or not expectations are
accurately estimated, there’s whether or not the model is a good
fit to the data, and there are the interactions of these, loosely codified
in the Folk Theorem. I know the reality can scare people away to less
robust tools, but we can’t sugarcoat things forever.


@sakrejda

We could do a case study that exercises this idea: enough/not enough data
for the centered parameterization, and then enough/too much for the
non-centered one. I have some examples from ODSC that come close to this,
but I never got to where I could set a seed and generate a working/failing
data set for each parameterization, and my stuff isn't for 8 schools.


@betanalpha

You can always use the n-schools-ish model I used
for the HMC for hierarchical models paper.


@jgabry
Member

jgabry commented Jul 15, 2016

Maybe three different eight schools examples: the first with the actual data
(NCP better), the second with y scaled up by 10 but sigma the same (CP
better), and the third with both y and sigma scaled up (NCP better again)?


@betanalpha

It’s the ratio that matters, so all you need to do
is scale the measured standard deviations relative to
the measured means, which is exactly what I do in the
test model in the paper (the exact Stan program is in
the appendix).
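
A hypothetical sketch of how this scaling experiment could be wired into the centered program; the `ratio` knob and the name `sigma_scaled` are made up for illustration and are not taken from the paper's appendix:

```stan
// Centered program with a data-supplied scaling knob (hypothetical).
// ratio > 1 shrinks the measurement noise relative to the effects,
// making the data more informative about each theta[j] and favoring
// the centered parameterization; ratio < 1 favors the non-centered one.
data {
  int<lower=0> J;
  vector[J] y;
  vector<lower=0>[J] sigma;
  real<lower=0> ratio;       // hypothetical: scale of effects vs. noise
}
transformed data {
  vector[J] sigma_scaled;
  sigma_scaled = sigma / ratio;
}
parameters {
  real mu;
  real<lower=0> tau;
  vector[J] theta;
}
model {
  theta ~ normal(mu, tau);
  y ~ normal(theta, sigma_scaled);
}
```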


@bgoodri
Contributor

bgoodri commented Feb 6, 2017

see stan-dev/rstan#387
