What is a model? #19
Coffee-addled rant here, but bear with me. I think it'll be really valuable as the design discussion moves forward.
In classic statistics land, if you have some data, you write down a model for the process that generated it, and that model is indexed by a set of parameters.
(Aside: calling a particular
To me, a key question is how we express this idea in code. For example, if we write out a linear model, `y = Xβ + ε`, the parameter space is the space of coefficient vectors `β`, and OLS, ridge, the LASSO, and MCP all produce estimates living in that same space.
Sure, for the penalized regression methods you have to estimate the penalization parameter, but that is a hyperparameter: broadly, a parameter that we have to estimate but whose particular value we don't really care about. So these methods all share the same parameter space while having different hyperparameter spaces. Another way to express the same idea is that what differentiates MCP from the LASSO from OLS, and so on, is not that they are different models, but rather that they are different techniques for estimating the same model.
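To make that concrete, here's a toy sketch (my own notation, not parsnip code) with a single no-intercept predictor. OLS and the LASSO both return an estimate in the same one-dimensional parameter space; the penalty `lam` is the only thing that differs, and it is a hyperparameter:

```python
import math

def ols_slope(x, y):
    """OLS estimate for y ~ beta * x (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / sxx

def lasso_slope(x, y, lam):
    """LASSO estimate of the *same* parameter.

    Exact 1-D solution of min 0.5*||y - beta*x||^2 + lam*|beta|:
    soft-threshold the OLS numerator. `lam` is a hyperparameter --
    we must pick it, but its value is not itself of interest.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return math.copysign(max(abs(sxy) - lam, 0.0), sxy) / sxx

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
b_ols = ols_slope(x, y)             # OLS estimate of beta
b_lasso = lasso_slope(x, y, lam=6.0)  # LASSO estimate of the same beta
```

With a large enough penalty the LASSO estimate shrinks to exactly zero, but zero is still a point in the same parameter space.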
(Aside: one interesting question is whether or not hierarchical models belong on the list above. I think it depends on whether you care about the group-level parameters, in which case you are now in a new parameter space. OLS with heteroskedasticity-consistent (HC) errors is another interesting case to think about. Here the model is still the linear model, but we are more explicitly declaring that we want to estimate the covariance matrix, and also that we are going to use, say, HC1 to do so. I'd still call this a linear model, but only if the original definition of the linear model specified the covariance as an estimand.)
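A minimal toy sketch of the HC case (my own example, single no-intercept predictor): the slope estimate is the ordinary OLS one; the HC1 step only adds an estimate of its variance via the degrees-of-freedom-adjusted sandwich formula:

```python
def ols_with_hc1(x, y):
    """OLS slope for y ~ beta * x plus an HC1 variance estimate.

    The parameter estimate is unchanged; HC1 just adds the covariance
    (here a scalar variance) as an explicit estimand.
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx   # same OLS estimate
    resid = [yi - beta * xi for xi, yi in zip(x, y)]
    # HC1 sandwich with k = 1 predictor: (n / (n - k)) * meat / bread^2
    meat = sum((xi * ei) ** 2 for xi, ei in zip(x, resid))
    var_beta = (n / (n - 1)) * meat / sxx ** 2
    return beta, var_beta

beta, var_beta = ols_with_hc1([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
```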
If I'm going to actually implement things in code, I want to work with an object that specifies the estimation method, which likely is closely tied to a hyperparameter space.
I think that a parsnip model specification shouldn't correspond to the classical-stats sense of a model as we've defined it above, but rather should encapsulate everything you need to do to get parameters back. Parsnip is already doing a lot of this, but I think there's a lot of value in being very clear about what a parsnip object should specify. In my mind this includes, at the minimum:
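As a thought experiment (the names and fields here are hypothetical, not parsnip's actual interface), such a specification might look something like: the classical-stats model, the estimation technique, and the associated hyperparameter values bundled into one object.

```python
from dataclasses import dataclass, field

# Hypothetical sketch, not parsnip's real API: a specification bundles
# everything needed to go from data to parameter estimates.
@dataclass
class ModelSpec:
    model: str                 # classical-stats model, e.g. "linear"
    estimator: str             # estimation technique, e.g. "ols", "lasso"
    hyperparameters: dict = field(default_factory=dict)  # e.g. {"penalty": 0.1}

# OLS and the LASSO are two specifications of the *same* model:
ols_spec = ModelSpec(model="linear", estimator="ols")
lasso_spec = ModelSpec(model="linear", estimator="lasso",
                       hyperparameters={"penalty": 0.1})
```

The design point is that `model` alone identifies the parameter space, while `estimator` plus `hyperparameters` identify how we get a point in it.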
For now I think it makes sense to call this a model specification.
I strongly believe that a good interface very clearly differentiates between a group of estimation techniques and the single model they all estimate.
Related issue: canonical modelling examples
A related issue is to find canonical modelling examples that are sufficient to develop our intuition about what the code objects should look like. OLS is too simple because it doesn't need a lot of the machinery that other models need. I think that a good starting place is to have one canonical example where we can employ the submodel trick (penalized regression seems like a good place to start), and one where we can't (maybe SVMs here?). Another way to think about this: we should have one canonical example where there is exploitable structure in the hyperparameter space, and one canonical example where there isn't.
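A toy sketch of the exploitable structure (my own example, not parsnip code): for single-predictor ridge regression, the sufficient statistics are computed once and the whole penalty grid is then evaluated almost for free, which is the essence of the submodel trick. An SVM has no analogous shared computation, so each cost value requires a full refit.

```python
def ridge_path(x, y, penalties):
    """Fit y ~ beta * x under every ridge penalty in one pass.

    The sums below are the expensive part; they are shared across the
    entire hyperparameter grid, so each extra penalty costs almost nothing.
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))  # shared across all penalties
    sxx = sum(xi * xi for xi in x)              # shared across all penalties
    return {lam: sxy / (sxx + lam) for lam in penalties}

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
path = ridge_path(x, y, penalties=[0.0, 1.0, 10.0])
# path[0.0] is the OLS fit; larger penalties shrink the slope toward zero.
```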
I'll take a bit of time to digest this in detail. It's a great point, and @lionel- and I have had similar discussions.
My current headspace on this:
Anyway, I'll comment in more detail later about some specific things above.