Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: transform get_prior, make_stancode and make_standata into S3 methods #1604

Merged
merged 8 commits into from Mar 3, 2024

Conversation

venpopov
Copy link
Contributor

Hi @paul-buerkner, related to the recent discussion on the Stan forums about packages that build on top of brms, I'd like to propose the following.

Proposal

Transform get_prior(), make_stancode() and make_standata() into generic S3 functions, with the current functions assigned to a get_prior.brmsformula(), make_stancode.brmsformula() and make_standata.brmsformula() (and equivalent aliases for .formula and .mvbrmsformula)

Why?

Some feature shared by a few of the packages that build on top of brms in the list I posted, is that they:

  • generate custom stancode to pass to brms
  • provide default priors for their models
  • define a custom formula class (bmmformula, brm_formula

To deal with the above, some package also provide analogues to brms`s get_prior(), make_stancode() and make_standata() that work with the models, formulas and data transformation they have provided.

For example, in bmm, we have:

Similarly, @jsocolar's flocker package has analogues get_flocker_prior() and flocker_stancode(), that also take a formula as the first argument. This is the case in some other packages, but I haven't made a full list yet.

I've been thinking about more general package development based on brms, and this would be really streamlined if some of the user-facing functions like those mentioned, can be used across packages in a consistent way - to allow a general development framework, and to make it easier for users.

We were thinking of overloading these methods within bmm by providing a get_prior.default(), make_stancode.default() etc methods that call the current brms functions, so that when users load our package, they can still use the get_prior() with both brms and bmm.

This is made possible by the fact that we have a custom formula class, something present in other similar packages.

But rather than us overloading brms functions, I think it would be nicer, and more sustainble and generalizable, to transform these functions natively within brms

Example implementation

In this PR I've implemented as an example the S3 method for the get_prior() function. I have tested it both within brms and bmm and all the existing tests pass - when called with a formula, brmsformula or mvbrmsformula object, it applies the brms get_prior function, and when it is applied with a bmmformula object, it applies our method.

Let me know what you think - If you approve of the idea, I can do the other functions too.

Towards a meta-development framework around brms

With these changes, it becomes easier to work towards a broader development framework for packages that build on top of brms. A package of this type would generally have:

  • A custom pkgformula class, which allows all these methods to be adjusted as needed (in our case, bmmformula, but I've seen bayesnecformula, brm_formula, and a few others). Then they have some internal methods that translate their subject-matter formulas into valid brmsformulas
  • A custom pkgfit class, which is applied in addition to the brmsfit class, allowing for customizing the base generic functions such as summary, update, etc. For example, in bmm we needed to write a new update method, to make sure that all the transofmrations and preprocessing is reapplied before submitting to update.brmsfit.
  • A set of internal functions and objects that:
    • take the pkgformula, pkgfamily, the user data, to generate valid brms arguments, including setting default priors informed by subject knowledge, transformations of the data.

image

@paul-buerkner
Copy link
Owner

paul-buerkner commented Feb 28, 2024 via email

@venpopov
Copy link
Contributor Author

venpopov commented Feb 28, 2024

Cool! Yes, I initially had it coded like that, wasn't sure whether you'd prefer it to be more explicit. If the current function is the default, in principle there is no need for the aliases *.formula, *.brmsformula and *.mvbrmsformula, as all would be handled by the default. Or do you prefer to have them explicitly handled as aliases? I can change that and apply it to the other functions as well

@paul-buerkner
Copy link
Owner

Only the default method. No aliases

- use \code{\link[brms:get_prior.default]{get_prior}} ueverywhere where [brms:get_prior] was previously used to ensure links point to the default method documentation rather than the S3 method
- assign previous function to make_standata.default
- update links in docs to \code{\link[brms:make_standata.default]{make_standata}
- assign previous function to make_stancode.default
- update links in docs to \code{\link[brms:make_stancode.default]{make_stancode}
- fix typo on line99 of make_standata.R incorrectly referring to make_stancode instead of make_standata
@venpopov
Copy link
Contributor Author

venpopov commented Feb 28, 2024

Ok, should be done now! Do you want me to add an entry to NEWS.md if you approve the changes?

Following the example of how you handled documentation of other methods such as summary and update, I replaced all links from other docs to

\code{\link[brms:make_standata.default]{make_standata}}

and equivalent for the other two functions. That way it displays the name of the generic, but it links to the documentation of the brms default, so users always get that info


I also tested it with the now updated methods in our package, and it works great! 😃

@paul-buerkner
Copy link
Owner

Great, thank you! I will check it out in the next couple of days and make perhaps make some tiny edits before merging. Will then also update NEWS.

@paul-buerkner
Copy link
Owner

I have worked on this PR further and decided to make a bit more bold changes. Specifically, I changeed make_stancode and make_standata to be aliases of stancode and standata, respectively. This requires your package to now define methods for stancode and standata. The advantages of this is (a) nicer names IMO and (b) the possibility of using object as first argument name rather than formula (which is awkward for non-formula objects), and (c) no break in backwards compatibility.

I would appericiate if you could try it out and let me know if everything is working as expected. Also, it would be nice if you checked my changes to the documentation (consistency, correct links etc.) Before merging, I will then make some more edits such as changes some examples, tests, and file names, but nothing that is more than a search and replace.

@codecov-commenter
Copy link

codecov-commenter commented Mar 1, 2024

Codecov Report

Attention: Patch coverage is 96.96970% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 82.06%. Comparing base (81d9233) to head (40694b1).

❗ Current head 40694b1 differs from pull request most recent head baeaa20. Consider uploading reports for the commit baeaa20 to get more accurate results

Files Patch % Lines
R/priors.R 92.85% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1604      +/-   ##
==========================================
+ Coverage   82.02%   82.06%   +0.03%     
==========================================
  Files          70       70              
  Lines       19888    19886       -2     
==========================================
+ Hits        16314    16320       +6     
+ Misses       3574     3566       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@venpopov
Copy link
Contributor Author

venpopov commented Mar 2, 2024

OK, will do today

@venpopov
Copy link
Contributor Author

venpopov commented Mar 3, 2024

Done. And tested it with changes to our package, it works. Thanks!

The only things that I didn't change were:

  • the readme, because I'm not sure what wording you would prefer (line 262+):
### How can I extract the generated Stan code?

If you have already fitted a model, just apply the `stancode` method on the
fitted model object. If you just want to generate the Stan code without any
model fitting, use the `make_stancode` function.
  • the tests. They work with make_stancode and make_standata, and these just call the methods, so it's probably fine to leave them as is.

Maybe you also want to rename the R files to stancode.R and standata.R? Up to you

@paul-buerkner
Copy link
Owner

Great! Will make the last changes and then merge. I will do the file renaming on main after merging so we see the actual changes in this PR

@paul-buerkner paul-buerkner merged commit 2c00f71 into paul-buerkner:master Mar 3, 2024
4 of 5 checks passed
@venpopov venpopov deleted the get_prior_s3 branch March 3, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants