Update to emmeans support: dpar = "mean" #993
Thank you a lot! That was quick. :-D I moved the PR to be merged with the new |
Thanks @rvlenth for working on this so quickly! :) I have a couple of questions,
Where can I find this example?
|
Look in the help file for `emm_basis.brmsfit` in the patched version.
Previously, dpar = "mu" and dpar = "sigma" were already supported to obtain estimates of those parameters. Note that in the lognormal case, "mu" is the mean of log(Y), whereas "mean" is the mean of Y.
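To make the distinction between "mu" and "mean" concrete, here is a minimal base-R sketch. It is not taken from the PR; the values of `mu` and `sigma` are arbitrary assumptions chosen for illustration. It simulates lognormal data and compares the mean of log(Y) with the mean of Y:

```r
# Illustrative values only: mu and sigma are assumptions, not from the PR.
set.seed(1)
mu    <- 1.0   # mean of log(Y)
sigma <- 0.8   # SD of log(Y)
y <- rlnorm(1e5, meanlog = mu, sdlog = sigma)

mean_log_y  <- mean(log(y))            # estimates mu, the mean of log(Y)
mean_y      <- mean(y)                 # estimates the mean of Y
theory_mean <- exp(mu + sigma^2 / 2)   # lognormal mean of Y

print(c(mean_log_y = mean_log_y, mean_y = mean_y, theory_mean = theory_mean))
```

The simulated mean of Y should land close to `exp(mu + sigma^2/2)` and well away from `mu`, which is the gap that dpar = "mean" is meant to expose.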
On Sep 8, 2020, at 12:50 AM, Rohan Puri <notifications@github.com> wrote:
Thanks @rvlenth for working on this so quickly! :) I have a couple of questions:
1. "Note that I also added a detail to the documentation, and extended the example that was in place. That example serves as a good illustration for dpar = "mean" because the family is lognormal, making for a stark difference between the estimated mu parameter and the posterior_epred values of exp(mu + sigma^2/2)."
   Where can I find this example?
2. Similar to dpar = "mean", theoretically dpar = "variance" or dpar = "sd" could be calculated too based on the mu and sigma parameters (continuing with the lognormal example)?
|
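I see... and in the lognormal case, the "mean" is calculated as `exp(mu + sigma^2/2)`, correct? Similarly, if "sigma" is the standard deviation of log(Y), could `dpar = "sd"` give the standard deviation of Y? According to this website (https://brilliant.org/wiki/log-normal-distribution/#properties-of-the-log-normal-distribution), that would be `(exp(sigma^2) - 1) * mean^2` for a lognormal distribution... |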
No -- dpar = "sd" gets you the SD of log(Y). It would take some extra coding to produce estimates of the SD of Y.
In general, 'dpar' is used to specify a MODEL parameter to estimate, and is an argument in predict() as well as emm_basis.brmsfit. dpar = "mean" is only allowed in the latter, and is handled as a special case. It appears possible that can cause confusion, and if so, something besides dpar should be chosen as the argument name when one wants the results of posterior_epred().
Russ
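For reference, the standard moment formulas for a lognormal response with log-scale parameters mu and sigma are given below. They are a textbook result, not something implemented in this PR. They show that the expression (exp(sigma^2) - 1) * mean^2 is the variance of Y, so the SD of Y would be its square root:

$$
\begin{aligned}
\mathrm{E}[Y] &= \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right) \\
\mathrm{Var}(Y) &= \left(e^{\sigma^2} - 1\right)\exp\!\left(2\mu + \sigma^2\right) = \left(e^{\sigma^2} - 1\right)\mathrm{E}[Y]^2 \\
\mathrm{SD}(Y) &= \mathrm{E}[Y]\,\sqrt{e^{\sigma^2} - 1}
\end{aligned}
$$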
On Sep 8, 2020, at 4:24 PM, Rohan Puri <notifications@github.com> wrote:
Note that in the lognormal case, "mu" is the mean of log(Y), whereas "mean" is the mean of Y.
I see... and in the lognormal case, the "mean" is calculated as exp(mu + sigma^2/2), correct?
Similarly, if "sigma" is the standard deviation of log(Y), could dpar = "sd" give the standard deviation of Y? According to this website (https://brilliant.org/wiki/log-normal-distribution/#properties-of-the-log-normal-distribution), that would be (exp(sigma^2) - 1) * mean^2 for a lognormal distribution...
|
Indeed, that is what I meant to say when I suggested |
Thank you again for this PR! It took me some time to work on this as I was on holiday. I merged it to a branch of brms and will make some changes before merging into master. I have one question @rvlenth with regard to your code:
The for loop over all formulas seems to be dangerous territory for me, especially |
I'm a little unclear on whether this has been resolved or not. But what is needed in this loop is to make sure we include all the variables that will be needed later in |
Ok, thank you. I will change the code so that all data variables are included and see how things work out and then report back. |
Thanks. I think I fixed it now. |
Paul,
We do need all the variables required for predicting the location, scale, and shape parts of the model. But I should have made it clearer that it could be a problem to have other variables NOT needed for that, because the software creates a grid of reference values based on those variables. FYI, that can be observed via emmeans::ref_grid(model)
Russ
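As a rough illustration of why unneeded variables matter, the following base-R sketch uses `expand.grid()` as a stand-in for a reference grid. This is an assumption for illustration only, not how emmeans builds its grid internally, and the variable names and levels are hypothetical:

```r
# Hypothetical predictors; expand.grid() stands in for the reference grid.
needed <- list(
  Treatment = c("A", "B", "C"),   # factor used by the model
  Rainfall  = 50                  # numeric covariate held at one reference value
)
extra <- list(Batch = paste0("b", 1:4))   # a variable no part of the model uses

grid_needed <- do.call(expand.grid, needed)
grid_all    <- do.call(expand.grid, c(needed, extra))

nrow(grid_needed)   # 3 rows: one per Treatment level
nrow(grid_all)      # 12 rows: every Treatment level crossed with every Batch
```

Each extra factor multiplies the number of grid points, and results would then be averaged over levels that play no role in the predictions.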
On Sep 22, 2020, at 2:27 AM, Paul-Christian Bürkner <notifications@github.com> wrote:
Ok, thank you. I will change the code so that all data variables are included and see how things work out and then report back.
|
Which variables, for example, would not be required for location, scale and shape parts? |
I don't know. Maybe I misunderstood something you said earlier about looping over formulas, about additional parameters and unsupported features. My intent in looping over those formulas was to obtain the variables involved in the fixed-effects models for those different features. For example, the location model may involve Treatment and Rainfall, and the scale model may involve Treatment and Location. We need to make sure we have identified all three variables.
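The idea of collecting predictors across the formulas can be sketched in base R as follows. This is a simplified illustration, not the code in the PR; the two formulas and the response name `y` are hypothetical and simply mirror the Treatment, Rainfall, and Location example above:

```r
# Hypothetical formulas for the location and scale parts of a model.
location_formula <- y ~ Treatment + Rainfall
scale_formula    <- ~ Treatment + Location

formulas <- list(location_formula, scale_formula)

# Right-hand-side variables of each formula (response dropped), then their union.
rhs_vars <- lapply(formulas, function(f) all.vars(delete.response(terms(f))))
predictors <- unique(unlist(rhs_vars))

print(predictors)
```

The union contains Treatment, Rainfall, and Location, so all three variables would receive reference levels.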
On Sep 22, 2020, at 7:59 AM, Paul-Christian Bürkner <notifications@github.com> wrote:
Which variables, for example, would not be required for location, scale and shape parts?
|
I see. Yeah, my response earlier was quite cryptic :-D I hope things should be ok right now but I am happy to adjust the code if problems occur. |
This PR adds the option `dpar = "mean"` to the support methods for the emmeans package. When `emmeans()` or another package function is called with `dpar = "mean"`, we obtain the expectation of the posterior predictive distribution at each grid point, rather than one of the model parameters.
The implementation has two basic parts:
1. In `recover_data.brmsfit`, we combine all the predictors involved in modeling all fixed-effect parameters, so that the reference grid includes reference levels for all these predictors.
2. In `emm_basis.brmsfit`, the posterior sample (`post.beta` slot) is obtained using `posterior_epred()`. The `bhat` and `V` slots are the column means and covariance matrix, and the `X` matrix is the identity.
Note that I also added a detail to the documentation, and extended the example that was in place. That example serves as a good illustration for `dpar = "mean"` because the family is lognormal, making for a stark difference between the estimated `mu` parameter and the `posterior_epred` values of `exp(mu + sigma^2/2)`.
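The second part of the implementation can be sketched in base R under stated assumptions. The draws below are simulated rather than produced by `posterior_epred()`, the three grid points and all numeric values are hypothetical, and the object names only mirror the slots named above:

```r
# Simulated stand-in for posterior draws at 3 grid points of a lognormal model.
# In the PR the draws come from posterior_epred(); these values are assumptions.
set.seed(42)
n_draws <- 2000
mu_draws <- cbind(rnorm(n_draws, 1.0, 0.05),
                  rnorm(n_draws, 1.3, 0.05),
                  rnorm(n_draws, 1.6, 0.05))
sigma_draws <- abs(rnorm(n_draws, 0.8, 0.03))   # one sigma per draw

# Expected value of Y per draw and grid point (lognormal mean).
# sigma_draws has length nrow(mu_draws), so it recycles down each column.
post.beta <- exp(mu_draws + sigma_draws^2 / 2)

bhat <- colMeans(post.beta)      # point estimate at each grid point
V    <- cov(post.beta)           # covariance matrix across grid points
X    <- diag(ncol(post.beta))    # identity: one "coefficient" per grid point

# With X equal to the identity, X %*% bhat reproduces bhat.
stopifnot(isTRUE(all.equal(drop(X %*% bhat), bhat)))
```

Because `X` is the identity, each grid point is treated as its own coefficient, so downstream summaries and contrasts operate directly on the expected values.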