Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to emmeans support: dpar = "mean" #993

Merged
merged 5 commits into from
Sep 21, 2020

Conversation

rvlenth
Copy link
Contributor

@rvlenth rvlenth commented Sep 1, 2020

This PR adds the option dpar = "mean" to the support methods for the emmeans package. When emmeans() or other package function is called with dpar = "mean", we obtain the expectation of the posterior predictive distribution at each grid point, rather than one of the model parameters.

The implementation has two basic parts:

  • In recover_data.brmsfit, we combine all the predictors involved in modeling all fixed-effect parameters, so that
    the reference grid includes reference levels for all these predictors.
  • In emm_basis.brmsfit, the posterior sample (post.beta slot) is obtained using posterior_epred().
    The bhat and V slots are the column means and covariance matrix, and the X matrix is the identity.

Note that I also added a detail to the documentation, and extendsd the example that was in place.
That example serves as a good illustration for dpar = "mean" because the family is lognormal, making for
a stark difference between the estimated mu parameter and the posterior_epred values of
exp(mu + sigma^2/2).

@paul-buerkner paul-buerkner changed the base branch from master to emmeans-mean September 2, 2020 06:25
@paul-buerkner
Copy link
Owner

Thank you a lot! That was quick. :-D

I moved the PR to be merged with the new emmeans-mean branch I just created so that I can make changes to the PR before merging into master. I will take a detailed look this week.

@nahorp
Copy link
Sponsor

nahorp commented Sep 8, 2020

Thanks @rvlenth for working on this so quickly! :) I have a couple of questions,

  1. Note that I also added a detail to the documentation, and extendsd the example that was in place.
    That example serves as a good illustration for dpar = "mean" because the family is lognormal, making for
    a stark difference between the estimated mu parameter and the posterior_epred values of
    exp(mu + sigma^2/2).

Where can I find this example?

  1. Similar to dpar = "mean", theoretically dpar = "variance" or dpar = "sd" could be calculated too based on the mu and sigma parameters (continuing with the lognormal example)?

@rvlenth
Copy link
Contributor Author

rvlenth commented Sep 8, 2020 via email

@nahorp
Copy link
Sponsor

nahorp commented Sep 8, 2020

Note that in the lognormal case, "mu" is the mean of log(Y), whereas "mean" is the mean of Y.

I see... and in the lognormal case, the "mean" is calculated as exp(mu + sigma^2/2), correct?

Similarly, if "sigma" is the standard deviation of log(Y), could dpar = "sd" give the standard deviation of Y? According to this website, that would be (exp(sigma^2) - 1) * mean^2 for a lognormal distribution...

@rvlenth
Copy link
Contributor Author

rvlenth commented Sep 8, 2020 via email

@nahorp
Copy link
Sponsor

nahorp commented Sep 9, 2020

It would take some extra coding to produce estimates of the SD of Y.

Indeed, that is what I meant to say when I suggested dpar = "sd" or dpar = "variance" (apologies for the inarticulation). I understand the dpar argument in regards to it being a distributional parameter of the model and I agree that a different argument might be better suited for results of posterior_epred() in the long run.

@paul-buerkner paul-buerkner merged commit c1ad1fc into paul-buerkner:emmeans-mean Sep 21, 2020
@paul-buerkner
Copy link
Owner

Thank you again for this PR! It took me some time to work on this as I was on holiday. I merged it to a branch of brms and will make some changes before merging into master. I have one question @rvlenth with regard to your code:

preds <- all.vars(.extract_par_terms(object, resp, NULL, nlpar)$fe)
for (pf in object$formula$pforms)
    preds <- union(preds, all.vars(pf[-2]))
bterms <- list(fe = reformulate(preds))

The for loop over all formulas seems to be dangerous territory for me, especially all.vars(pf[-2]). This extracts all variables (on the RHS of the formula) and puts it into preds. Accordingly, it may contain a lot of variables of non-yet supported terms, for example, splines or GPs. Further, as it does extract variables without preprocessing by brms, it may even contain variables that are not even relating to any data (but to parameters or pure symbols used for other purposes by brms). Can you explain what the desired behavior should be (that is, what variables you want to extract here) so that I can change the code accordingly?

@rvlenth
Copy link
Contributor Author

rvlenth commented Sep 21, 2020

I'm a little unclear on whether this has been resolved or not. But what is needed in this loop is to make sure we include all the variables that will be needed later in posterior_epred(), in the section of emm_basis.brmsfit() that has the same conditions. If there are more things here than are needed, then I think that means that you need to use something besides dpar and nlpar to select what model params are in play. Note that, previous to this PR, the emm_basis.brmsfit() allowed the user to specify resp, dpar, and nlpar without any restrictions. So if there are restrictions on what the user should specify, those restrictions need to be imposed in both recover_data.brmsfit() and emm_basis.brmsfit().

@paul-buerkner
Copy link
Owner

Ok, thank you. I will change the code so that all data variables are included and see how things work out and then report back.

@paul-buerkner
Copy link
Owner

Thanks. I think I fixed it now.

@rvlenth
Copy link
Contributor Author

rvlenth commented Sep 22, 2020 via email

@paul-buerkner
Copy link
Owner

Which variables, for example, would not be required for location, scale and shape parts?

@rvlenth
Copy link
Contributor Author

rvlenth commented Sep 22, 2020 via email

@paul-buerkner
Copy link
Owner

I see. yeah my response earlier was quite cryptic :-D I hope thinks should be ok right now but I am happy to adjust the code if problems occur.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants