
find_ (get_?) algorithm #38

Open
DominiqueMakowski opened this issue Feb 12, 2019 · 11 comments
Labels
Help us 👀 Extra attention is needed

Comments

@DominiqueMakowski
Member

Although the fitting algorithm plays an important role, it is often unreported or overlooked. Surprisingly, accessing it is not straightforward.

What do you think about a function that does that?

Here's a draft:

#' @export
find_algorithm <- function(model, ...) {
  UseMethod("find_algorithm")
}


#' @export
find_algorithm.merMod <- function(model, ...) {
  # lme4 stores the REML indicator in the response module:
  # 0 for ML fits, positive for REML fits
  if (model@resp$REML == 0) {
    algorithm <- "ML"
  } else {
    algorithm <- "REML"
  }

  list(
    algorithm = algorithm,
    optimizer = as.character(model@optinfo$optimizer)
  )
}



#' @export
find_algorithm.stanreg <- function(model, ...) {
  # rstanarm keeps the sampler settings in the embedded stanfit object
  info <- model$stanfit@sim

  list(
    algorithm = model$algorithm,
    chains = info$chains,
    iterations = info$iter,
    warmup = info$warmup
  )
}
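A quick usage sketch for the merMod method (assumes lme4 is installed; the comment shows the expected shape of the output, not literal values):

library(lme4)
m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
find_algorithm(m)   # list(algorithm = "REML", optimizer = <name of lme4's optimizer>)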
@strengejacke
Member

Yes, that would be a function that fits into insight. I bet it's less straightforward for brms models... For which models would this make sense?

@DominiqueMakowski
Member Author

DominiqueMakowski commented Feb 12, 2019

Especially Bayesian models (distinguishing between MCMC, fullrank, and meanfield), and frequentist models where the estimation is customizable (lme4). For fixed algorithms, we could hard-code the algorithm used (for instance, "OLS" for lm).
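For such fixed-algorithm models, the method could simply return a constant (a sketch):

#' @export
find_algorithm.lm <- function(model, ...) {
  # lm() always fits by ordinary least squares, so hard-code it
  list(algorithm = "OLS")
}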

However, my view is much narrower than yours regarding the different packages and models, so I am not sure about the other cases of application.

But I still think it's worth starting with a few supported models, and then eventually expanding depending on time, demand, and so on...

@strengejacke
Member

I had especially mixed models in mind, so functions like glmmTMB, lmer, glmer, lme, mixed_model, glmmPQL?!?
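For some of these, the information looks directly retrievable. For instance, nlme::lme stores the estimation method in the fitted object; a sketch (glmmPQL fits inherit from class "lme", so the same method should cover them):

#' @export
find_algorithm.lme <- function(model, ...) {
  # nlme stores "REML" or "ML" in the method component
  list(algorithm = model$method)
}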

@DominiqueMakowski
Member Author

Well, for lme4, from what I understood, lmer uses either ML or REML, while glmer uses ML only.
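For example (an illustrative sketch with lme4's sleepstudy data):

library(lme4)
m_reml <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)                # REML (default)
m_ml   <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)  # ML
m_reml@resp$REML   # positive for REML fits, 0 for ML fits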

For the others, I don't know...

And there are apparently additional differences:
[embedded screenshot not preserved]

@strengejacke
Member

I guess it's mainly the optimizers that differ, not so much the algorithm.

@strengejacke
Member

Ok, I implemented a basic draft, but I have the feeling we should ask some mixed-models experts about what might be important to return.

@DominiqueMakowski
Member Author

That's super cool, great work! Maybe we could post an issue on lme4 and glmmTMB to ask for confirmation and thoughts?

@strengejacke strengejacke modified the milestones: 0.2, later Mar 18, 2019
@strengejacke
Member

I think we can take the current implementation for now, and then later check #38 (comment) in more detail.

@alexpghayes

I believe the solution here is to differentiate between estimands, estimators, and estimation algorithms.

  • Estimands are the things you are trying to compute. For example, model parameters.

  • Estimators are how you are trying to compute those things. For example, in a simple linear regression, when trying to estimate the model parameters, we might use the OLS estimator, the ridge estimator, the lasso estimator, the ML estimator, or the REML estimator. You can think of estimators as functions: you input data, and out pops an estimate. For example:

beta_mle = argmax (log-likelihood of normal linear model)
  • Estimation algorithms tell us the step-by-step sequence of instructions to follow to evaluate the estimator at a particular point. For example, we can evaluate the OLS estimator at our data using a QR decomposition, or gradient descent, or stochastic gradient descent, etc. (see the sketch after this list).
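To make the estimator/algorithm distinction concrete, here is an illustrative sketch (not part of the proposal) evaluating the same OLS estimator with two different algorithms:

# One estimator (OLS), two estimation algorithms
set.seed(1)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
X <- cbind(1, x)
beta_qr <- coef(lm(y ~ x))                       # lm() uses a QR decomposition
beta_ne <- drop(solve(t(X) %*% X, t(X) %*% y))   # solving the normal equations directly
all.equal(unname(beta_qr), unname(beta_ne))      # TRUE: same estimate, different algorithms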

Now, the situation with mixed models is somewhat more complex because we start approximating things. We start by picking either a REML or an ML estimator. But calculating these things out exactly isn't feasible or desirable for some reason, so instead we come up with a new estimator that approximates the original estimator. Whether you want to think about these approximations as the same as the original estimator or a new separate thing is sort of hazy. The approximations have different properties than the original estimator, but morally they're trying to be the same thing.
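A concrete instance of this (an illustrative sketch using lme4's built-in cbpp data): glmer() approximates the ML estimator with either a Laplace approximation or adaptive Gauss-Hermite quadrature, controlled by the nAGQ argument:

library(lme4)
f <- cbind(incidence, size - incidence) ~ period + (1 | herd)
m_laplace <- glmer(f, data = cbpp, family = binomial, nAGQ = 1)   # Laplace approximation (default)
m_agq     <- glmer(f, data = cbpp, family = binomial, nAGQ = 9)   # adaptive Gauss-Hermite quadrature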

Anyway, if someone told me they fit a mixed model, I would want to know:

  • What parameters they were trying to estimate (the estimand)
  • Which estimator they were using (i.e. ML or REML)
  • Which algorithm they used to evaluate the estimator at the data.
  • Additionally: Which approximation to the estimator they were using if they weren't doing an exact computation. You might call the approximation they use the "algorithm". I don't know of any standard language for this but I'll ask around some. In a lot of cases, there is one best approximation to the estimator that everyone uses, but for mixed models, different approximations may be more appropriate at different times, so you get more into the thick of it.

I have a paper draft that goes into much more detail that I would be happy to share if you'd like.

@DominiqueMakowski
Member Author

@alexpghayes Thanks for the clarification!

From that it seems that our find_algorithm function currently returns the estimator rather than the estimation algorithm. On the practical side, with regard to insight, I wonder whether changes like the following could address this terminological discrepancy:

  1. find_algorithm as the master function:

    • Renaming the current find_algorithm -> find_estimator
    • Adding find_estimation to attempt to retrieve the estimation algorithm when possible
    • find_algorithm would become a "general" function returning a list containing both the estimator and the estimation algorithm (sketched after this list).
  2. find_estimation as the master function: same as above, but find_estimation is the general function and find_algorithm the function specific to the estimation algorithm.
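A minimal sketch of option 1, assuming the renamed sub-functions exist as proposed:

#' @export
find_algorithm <- function(model, ...) {
  list(
    estimator = find_estimator(model, ...),   # e.g. "ML", "REML", "OLS"
    estimation = find_estimation(model, ...)  # e.g. the estimation algorithm / optimizer
  )
}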

However, these are breaking changes, so they must be carefully considered and thoroughly documented.

  • For mixed models, from what I understand, it comes down to a design decision on our side as to how (if at all) we want to classify the approximation aspect. In order to maintain some continuity between the models (contributing to the unified vision proposed by the package), I would personally tend to "omit" it, and classify things based on the desired/philosophical/"moral" resemblance. In other words, I would classify the approximated ML estimator for a mixed model as ML, since the fact that it is approximated is implied by the nature of the model itself (the fact that it is a mixed model). Nevertheless, we could also make it explicit by adding a variable to the list returned by the master function, e.g. approximated = TRUE or approximation = "approx-method" (see the sketch after this list).
  • Importantly, we must also account for the case of Bayesian models. @alexpghayes, how, in your opinion, does the Bayesian sampling "algorithm" fit into this categorization? Is MCMC the estimator? The estimation algorithm? Or a separate category of "sampling algorithm" that does not overlap with the previous ones?
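For a glmer() fit (whose default approximation is Laplace), the explicit version could return something like this (hypothetical output):

list(
  estimator     = "ML",
  approximated  = TRUE,
  approximation = "Laplace"
)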

I have a paper draft that goes into much more detail that I would be happy to share if you'd like.

That's great, please do so :)

@alexpghayes

From that it seems that our find_algorithm function currently returns the estimator rather than the estimation algorithm. On the practical side, with regard to insight, I wonder whether changes like the following could address this terminological discrepancy: ...

I think there are lots of reasonable ways to split the functions, but I think in the end users will want to know both the estimator and the estimation algorithm. I would probably return both of these pieces of information in a list from a function estimation_details() if I were to implement this myself.

For mixed models, from what I understand, it comes down to a design decision on our side as to how (if at all) we want to classify the approximation aspect. In order to maintain some continuity between the models (contributing to the unified vision proposed by the package), I would personally tend to "omit" it, and classify things based on the desired/philosophical/"moral" resemblance. In other words, I would classify the approximated ML estimator for a mixed model as ML, since the fact that it is approximated is implied by the nature of the model itself (the fact that it is a mixed model). Nevertheless, we could also make it explicit by adding a variable to the list returned by the master function, e.g. approximated = TRUE or approximation = "approx-method".

I think going by moral resemblance is very reasonable for mixed models. I like the idea of explicitly telling the user the approximation method as well.

Importantly, we must also account for the case of Bayesian models. @alexpghayes, how, in your opinion, does the Bayesian sampling "algorithm" fit into this categorization? Is MCMC the estimator? The estimation algorithm? Or a separate category of "sampling algorithm" that does not overlap with the previous ones?

I don't know enough about Bayes to distinguish between estimators and estimation algorithms in the MCMC world. I imagine someone from the Stan crew could clarify pretty quickly, though.

I have a paper draft that goes into much more detail that I would be happy to share if you'd like.

Will you shoot me an email at alexpghayes@gmail.com and I'll send the draft.

@strengejacke strengejacke removed this from the 0.3 milestone Jun 28, 2020