v0.1.0 - A major restructuring of the package #41

venpopov · 2024-02-01T20:23:25Z

Summary

A major restructuring of the package to support stable and generalizable development of future models. Prior to this, models were coded as a long series of if else statements in a single fit_model() function. This was unsustainable and prone to breakage when adding new models. The current pull request resolves this and closes #20.

The fit_model() function is abstracted away from the specifics of the models. It calls a series of new generic functions for:

setting model fitting options: configure_options()
validating the requested model: check_model()
validating the user-supplied formula: check_formula()
validating the user-supplied data, and performing data-preprocessing according to the selected model requirements: check_data()
constructing the model family, formula and prior according to the selected model: configure_model()
combining the user prior with the default prior for the model: combine_prior()

To achieve this, models are coded as S3 classes of the form .model_modelname(). Each model has several class attributes, ordered from most generic to most specific.

The check_model() function retrieves the corresponding model class based on a character input (e.g. for input model='IMMfull' it retrieves the object created by .model_IMMfull()).
When this object is passed to the check_data() and configure_model() S3 generics, they dispatch to class-specific methods for the data preprocessing and model configuration.
Code redundancy is reduced by having models share certain classes (e.g., part of the data preprocessing and argument checking for non_targets is the same for the 3p model and the IMM family of models).
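
As a rough illustration of the shared-class idea, here is a minimal sketch using toy models. The constructors .model_toy3p(), .model_toyIMM() and .model_toy2p(), the class names, and the body of the check are hypothetical and are not the package's actual code. They only show how one S3 method can serve every model that carries a common class.

```r
# Hypothetical toy constructors. Class vectors run from most generic to
# most specific, as described above.
.model_toy3p  <- function() structure(list(), class = c("vwm", "nontargets", "toy3p"))
.model_toyIMM <- function() structure(list(), class = c("vwm", "nontargets", "toyIMM"))
.model_toy2p  <- function() structure(list(), class = c("vwm", "toy2p"))

# S3 generic: dispatches on the class vector of `model`.
check_data <- function(model, data, ...) UseMethod("check_data")

# Fallback for models without any class-specific checks.
check_data.default <- function(model, data, ...) data

# Shared method: used by every model that carries the "nontargets" class.
check_data.nontargets <- function(model, data, non_targets, ...) {
  missing_cols <- setdiff(non_targets, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing non-target columns: ", paste(missing_cols, collapse = ", "))
  }
  data
}

d <- data.frame(resp = 0.3, nt1 = 1.2)
checked_3p  <- check_data(.model_toy3p(), d, non_targets = "nt1")
checked_imm <- check_data(.model_toyIMM(), d, non_targets = "nt1")
checked_2p  <- check_data(.model_toy2p(), d)
```

Both toy3p and toyIMM reach check_data.nontargets() without duplicating the check, while toy2p falls through to the default method.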

Additionally, multiple new utility functions improve code readability and stability.

Adding new models to the package will now only require you to:

create the model class (e.g. .model_mynewmodel())
create the S3 methods check_data.mynewmodel() and configure_model.mynewmodel()
add unit tests for the new model

This should ensure that any new model additions will not affect the existing ones.

Other

deprecated model_type in favor of the model argument to fit_model()
added the supported_models() function to list currently supported models
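
One common way to handle an argument deprecation like this is sketched below. The helper name resolve_model_arg() and the exact messages are assumptions for illustration. The PR does not show how fit_model() handles the deprecated argument internally.

```r
# Hypothetical helper: accept the deprecated `model_type` for now, but
# warn and steer users towards `model`.
resolve_model_arg <- function(model = NULL, model_type = NULL) {
  if (!is.null(model_type)) {
    warning("The 'model_type' argument is deprecated. Please use 'model' instead.",
            call. = FALSE)
    # Only fall back to the old argument if the new one was not supplied.
    if (is.null(model)) model <- model_type
  }
  if (is.null(model)) {
    stop("Please specify a model via the 'model' argument.", call. = FALSE)
  }
  model
}

new_style <- resolve_model_arg(model = "2p")
old_style <- suppressWarnings(resolve_model_arg(model_type = "3p"))
```

If both arguments are supplied, this sketch lets model take precedence over model_type.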

Tests

Added extensive new unit tests for most of the functionality in the package to ensure stability.

[x] Confirm that all tests passed
[x] Confirm that devtools::check() produces no errors

Release notes

New framework for future model development ensures package stability
model_type argument to fit_model() is deprecated. Please use model instead.
View which models are currently supported by bmm via supported_models()

- Add functions for listing available models - Add name and domain attributes of models - Dynamically update README to list available models - Add helper functions from brms with copyright and license notice

GidonFrischkorn · 2024-02-02T07:58:21Z

R/configure-models.R

+  # if there is setsize 1 in the data, set constant prior over thetant for setsize1
+  if ((1 %in% data$ss_numeric) && !is.numeric(data[[setsize_var]])) {
+    prior <- prior +
+      brms::prior_("constant(-100)", class="b", coef = paste0(setsize_var, 1), nlpar="thetant")


There is no error here, and given the current specification of the models using the 0 + setsize coding this should work fine. But we will have to find a more general solution should users specify the formula including an intercept and still have set size 1 in their data.
Off the top of my head, I could not come up with a simple and generalizable solution, because a lot of this depends on the contrasts that are defined on the set size variable. For now, we should probably check whether an intercept is included when set sizes vary, and throw a warning if so. What do you think?
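
A minimal sketch of such a check, assuming it operates on a plain R formula for a single parameter rather than on the full brms formula object. The helper names has_intercept() and warn_intercept_with_setsize1() are hypothetical.

```r
# TRUE if the right-hand side of the formula includes an intercept.
has_intercept <- function(formula) {
  attr(stats::terms(formula), "intercept") == 1
}

# Hypothetical helper: warn when an intercept is combined with data that
# contain set size 1 alongside other set sizes. Returns the flag invisibly.
warn_intercept_with_setsize1 <- function(formula, setsize) {
  risky <- has_intercept(formula) &&
    (1 %in% setsize) &&
    length(unique(setsize)) > 1
  if (risky) {
    warning("The formula includes an intercept and the data contain set size 1. ",
            "The constant prior for set size 1 assumes the '0 + setsize' coding.",
            call. = FALSE)
  }
  invisible(risky)
}

ok_coding    <- warn_intercept_with_setsize1(thetant ~ 0 + setsize, c(1, 2, 4))
risky_coding <- suppressWarnings(
  warn_intercept_with_setsize1(thetant ~ 1 + setsize, c(1, 2, 4))
)
```

This only detects the presence of an intercept. It does not address the dependence on the contrasts mentioned above.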

GidonFrischkorn · 2024-02-02T07:58:38Z

R/configure-models.R

+  # if there is setsize 1 in the data, set constant prior over thetant for setsize1
+  if ((1 %in% data$ss_numeric) && !is.numeric(data[[setsize_var]])) {
+    prior <- prior +
+      brms::prior_("constant(0)", class="b", coef = paste0(setsize_var, 1), nlpar="a")


Same as the comment above.

GidonFrischkorn · 2024-02-02T07:59:10Z

R/configure-models.R

+  # if there is setsize 1 in the data, set constant prior over thetant for setsize1
+  if ((1 %in% data$ss_numeric) && !is.numeric(data[[setsize_var]])) {
+    prior <- prior +
+      brms::prior_("constant(0)", class="b", coef = paste0(setsize_var, 1), nlpar="s")


Again, we should probably throw the same warning regarding the formula specification.

GidonFrischkorn · 2024-02-02T07:59:27Z

R/configure-models.R

+  if ((1 %in% data$ss_numeric) && !is.numeric(data[[setsize_var]])) {
+    prior <- prior +
+      brms::prior_("constant(0)", class="b", coef = paste0(setsize_var, 1), nlpar="a")+
+      brms::prior_("constant(0)", class="b", coef = paste0(setsize_var, 1), nlpar="s")


GidonFrischkorn · 2024-02-02T08:21:29Z

R/data-helpers.R

+#' @export
+check_data.vwm <- function(model, data, formula, ...) {
+  resp_name <- get_response(formula$formula)
+  if (max(abs(data[[resp_name]]), na.rm=T) > 10) {


We already discussed this, and I just wanted to note it down here: we wanted to come up with a more general way of testing the scaling of the response variable.
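
One possible shape for a shared helper is sketched below. The function name exceeds_limit() and its arguments are assumptions. The default of 10 simply mirrors the threshold used in the diff.

```r
# Hypothetical helper: TRUE if any value in the given columns exceeds
# `limit` in absolute value. NAs are ignored, as in the original checks.
exceeds_limit <- function(data, cols, limit = 10) {
  values <- unlist(data[cols], use.names = FALSE)
  stopifnot(is.numeric(values))
  max(abs(values), na.rm = TRUE) > limit
}

d <- data.frame(
  resp = c(-3.1, 0.5, NA),
  nt1  = c(1, 2, 3),
  nt2  = c(170, -20, 5)
)
resp_flag <- exceeds_limit(d, "resp")
nt_flag   <- exceeds_limit(d, c("nt1", "nt2"))
```

The same call would then cover the response variable, non_targets and spaPos, so the threshold lives in one place.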

GidonFrischkorn

From my perspective these changes all look good. I made some minor comments and created some issues for enhancements and generalizations.

Thanks for the great work!

GidonFrischkorn · 2024-02-02T08:21:50Z

R/data-helpers.R

+  }
+
+  non_targets <- dots$non_targets
+  if (max(abs(data[,non_targets]), na.rm=T) > 10) {


Same as above.

GidonFrischkorn · 2024-02-02T08:22:55Z

R/data-helpers.R

+  }
+
+  spaPos <- dots$spaPos
+  if (max(abs(data[,spaPos]), na.rm=T) > 10) {


Again, generalize scaling tests.

GidonFrischkorn · 2024-02-02T08:30:27Z

R/models-list.R

+#' @param model A string with the name of the model supplied by the user
+#' @return A function of type .model_*
+#' @details the returned object is a function. To get the model object, call the
+#'   returned function, e.g. `get_model("2p")()`


I think we should add this as an additional function to make it easy to print out some basic information about the models implemented in the package.
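
A rough sketch of what such a function could look like, built on the get_model("2p")() calling pattern from the documentation above. The toy constructor, the storage of name and domain as list elements, and the get_model_info() signature are all assumptions. The real get_model() implementation is not shown in this diff.

```r
# Hypothetical toy model carrying a name and a domain.
.model_toy2p <- function() {
  structure(
    list(name = "Toy two-parameter model", domain = "Toy domain"),
    class = c("vwm", "toy2p")
  )
}

# Sketch of a lookup that returns the constructor function, not the object.
get_model <- function(model) {
  get(paste0(".model_", model), mode = "function")
}

# Hypothetical info function: builds the model object and summarises it.
get_model_info <- function(model) {
  obj <- get_model(model)()
  paste0(obj$name, " (domain: ", obj$domain, ")")
}

info <- get_model_info("toy2p")
```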

GidonFrischkorn · 2024-02-02T08:33:24Z

R/prior.R

+combine_prior <- function(config_args, user_prior) {
+  if (!is.null(user_prior)) {
+    default_prior <- config_args$prior
+    combined_prior <- dplyr::anti_join(default_prior, user_prior, by=c('class', 'dpar','nlpar','coef','group','resp'))


We should clearly specify how anti_join deals with duplicate specifications of priors. In its current implementation, does it prioritize default priors over user priors or vice versa? I would probably delete default priors as soon as user priors are specified for a certain parameter.

In this case, we might want to throw a warning message about the deleted default prior and communicate to users that in some cases this might compromise model identification. Or do you think this is unnecessary?
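
For reference, dplyr::anti_join(x, y) keeps only the rows of x that have no match in y, so with the defaults as x, any default prior matched by a user prior is dropped. The sketch below reproduces that behavior in base R on plain data frames (not brmsprior objects) and adds the suggested warning. The function name combine_prior_sketch() and the step that appends the user priors are assumptions, since the diff shows only the anti_join line.

```r
prior_keys <- c("class", "dpar", "nlpar", "coef", "group", "resp")

# Hypothetical sketch: drop every default prior that the user has
# overridden, warn about it, then append the user priors.
combine_prior_sketch <- function(default_prior, user_prior) {
  key <- function(df) {
    do.call(paste, c(unname(as.list(df[prior_keys])), list(sep = "|")))
  }
  overridden <- key(default_prior) %in% key(user_prior)
  if (any(overridden)) {
    warning(sum(overridden), " default prior(s) replaced by user-specified priors. ",
            "This may affect model identification.", call. = FALSE)
  }
  # Equivalent to an anti-join of the defaults against the user priors.
  rbind(user_prior, default_prior[!overridden, , drop = FALSE])
}

default_prior <- data.frame(
  prior = c("normal(0, 1)", "normal(2, 1)"),
  class = "b", dpar = "", nlpar = c("kappa", "thetat"),
  coef = "", group = "", resp = "", stringsAsFactors = FALSE
)
user_prior <- data.frame(
  prior = "normal(0, 5)",
  class = "b", dpar = "", nlpar = "kappa",
  coef = "", group = "", resp = "", stringsAsFactors = FALSE
)
combined <- suppressWarnings(combine_prior_sketch(default_prior, user_prior))
```

In this sketch the user prior for kappa replaces the default, while the untouched default for thetat is kept.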

GidonFrischkorn · 2024-02-02T08:35:22Z

README.Rmd

+
+```{r}
+bmm::supported_models()
+```


Once we have a get_model_info function, we should mention here that users can print basic information about a model with it to get an idea of its specification, and maybe also add a reference to the tutorial that describes its use.

venpopov added 17 commits February 1, 2024 17:17

Merge branch 'develop' of https://github.com/venpopov/bmm into develop

132c01b

initial restructuring of fit_model() - skeleton code

168f5c8

deprecate 'model_type' argument. Use 'model' instead

674b21a

Add functions for listing available models

1ac2c14

- Add functions for listing available models - Add name and domain attributes of models - Dynamically update README to list available models - Add helper functions from brms with copyright and license notice

add get_model() function

d95bb80

generalize check_data() with S3 methods

e17a879

make check_data S3 methods consistent to pass check()

029195b

complete the generalization of check_data()

bcfd847

fix how the deprecation of model_type is handled

21692e0

Add tests for check_data() and check_model()

96e6791

Finish generalization of fit_model()

0a1609f

Fix argument chains not found in configure_options

5f56bef

Remove copy of do_call, fix config_options

4d99da9

add tests for fit_model

5400612

small fixes to README

c5d82be

extensive documentation of restructured functions

d74e316

update package version to 0.1.0

41ee974

venpopov added the "PR - minor" label (Pull-request should update minor version) Feb 1, 2024

venpopov added this to the 1.0.0 milestone Feb 1, 2024

venpopov linked an issue Feb 1, 2024 that may be closed by this pull request

Rework the fit_model function #20

Closed

Update readm with info about supported_models()

990475b

venpopov linked an issue Feb 1, 2024 that may be closed by this pull request

Write automatic tests for functions #23

Closed

venpopov mentioned this pull request Feb 1, 2024

generalize the model specific arguments #43

Closed

GidonFrischkorn reviewed Feb 2, 2024

GidonFrischkorn approved these changes Feb 2, 2024

GidonFrischkorn merged commit 4a90040 into develop Feb 2, 2024

GidonFrischkorn deleted the feature/issue-20-rework-the-fit_model-function branch February 2, 2024 08:59

venpopov mentioned this pull request Feb 3, 2024

v0.1.1 - Generalize how model-specific arguments are provided #48

Merged
