Evaluating parametric terms with factor by-variable #68

Excidion · 2020-05-12T20:02:05Z

When calling evaluate_parametric_term on a term made up of two factor variables, the information on which values belong to which combination of levels seems to get lost.

Imagine a model with a formula that includes y~factorvar1:factorvar2. When calling

evaluate_parametric_term(model, "factorvar1:factorvar2")

the resulting table contains no information on which row belongs to which combination of levels of the factors. The column "term" only contains the entry "factorvar1:factorvar2" in every row.

Also the table has the same number of rows as my dataset, which is not the behavior of evaluate_parametric_term or evaluate_smooth that I am used to.

Am I missing something or have I been interpreting the purpose of these functions wrong? Or is this indeed a bug?


gavinsimpson · 2020-05-13T17:10:44Z

There's a bug here somewhere (either by design or by problem with the code). I'll take a look. Thanks for letting me know.

gavinsimpson · 2020-05-14T14:56:10Z

@Excidion I figured out what the issue was but fixing this means you won't be able to do what you wanted with the form of interaction. R treats f1:f2 as an order 2 term even if it is the only term (beyond the constant) in the model, and I'm using the order according to R to stop evaluate_parametric_term() with an error.
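A quick way to see the term order that R assigns, sketched with base R only. The names y, f1, f2 and f12 are placeholders, and this illustrates the idea rather than the check the package itself performs:

```r
# terms() can be applied to a bare formula; no data are needed.
tt <- terms(y ~ f1:f2)
attr(tt, "term.labels")   # label of the single term in the formula
attr(tt, "order")         # order of that term as R sees it

# A single factor (here a placeholder f12) is an order 1 term.
tt1 <- terms(y ~ f12)
attr(tt1, "order")
```
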

You could achieve what you want using interaction(f1, f2, drop = TRUE) to create a single factor from the interaction of f1 and f2 in the data before fitting the model. R would consider that term order 1.
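A minimal sketch of that workaround, using simulated data and lm() as a stand-in for the GAM so the example only needs base R. The data frame df and the column name f12 are made up for illustration:

```r
set.seed(1)

# Toy data: two factors whose six combinations all occur.
df <- data.frame(
  f1 = factor(rep(c("a", "b"), each = 6)),
  f2 = factor(rep(c("x", "y", "z"), times = 4))
)
df$y <- rnorm(nrow(df))

# Collapse the two factors into one; drop = TRUE removes
# combinations that never occur in the data.
df$f12 <- interaction(df$f1, df$f2, drop = TRUE)
levels(df$f12)

# Fit with the single combined factor; R now sees an order 1 term.
m <- lm(y ~ f12, data = df)
attr(terms(m), "order")
```

The same f12 column could be used in the formula passed to the GAM fit instead of f1:f2.
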

I'd welcome some input on what you would like to have happen if evaluate_parametric_term() were to support interaction terms. The original behaviour was to plot the partial effect of a term (as plot.gam() would and as per termplot()) and it seems that handling interactions this way isn't something that makes sense.

I could literally return the contribution to the fitted model for the f1:f2 term only, and in the value column return either

- the usual R-generated labels you see for combinations of factor levels in interactions: f1Level1:f2Level2, or
- just the pair of levels, one per factor, concatenated into a string: Level1-Level2.
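To make the two labelling options concrete, here is a small base R sketch that builds both kinds of label by hand for each observed combination of levels. The data frame and the column names label_r and label_short are hypothetical and not part of the package:

```r
# Toy factors; all six combinations of levels occur.
df <- data.frame(
  f1 = factor(rep(c("a", "b"), each = 6)),
  f2 = factor(rep(c("x", "y", "z"), times = 4))
)

# One row per observed combination of levels.
combos <- unique(df[c("f1", "f2")])

# Option 1: R-style labels, variable name followed by level.
combos$label_r <- paste0("f1", combos$f1, ":f2", combos$f2)

# Option 2: just the two levels joined into a single string.
combos$label_short <- paste(combos$f1, combos$f2, sep = "-")

combos
```
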

The other option, which would be a departure from what evaluate_smooth does, would be to return what are often called the estimated marginal means; i.e. return predictions for all combinations of levels in the data. This would no longer be a partial effect for a single term; I'd need to include the main effects of the factors involved in the interaction. They'd also be conditional on some values for the other terms in the model as I have to provide something for the other terms in the model to be able to use predict(). I feel there are good existing ways to do this (emmeans), however.
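A rough sketch of that second option, again using lm() on simulated data as a stand-in for the GAM so that it runs with base R alone. The covariate x and the choice to hold it at its mean are assumptions made for illustration; emmeans does this kind of thing more carefully:

```r
set.seed(42)

# Toy data: two factors plus one continuous covariate.
df <- expand.grid(
  f1  = factor(c("a", "b")),
  f2  = factor(c("x", "y", "z")),
  rep = 1:5
)
df$x <- runif(nrow(df))
df$y <- rnorm(nrow(df))

# Main effects and interaction of the factors, plus the covariate.
m <- lm(y ~ f1 * f2 + x, data = df)

# All combinations of levels, with the other term fixed at some
# value (here the mean of x) so that predict() can be used.
newd <- expand.grid(
  f1 = levels(df$f1),
  f2 = levels(df$f2),
  x  = mean(df$x)
)
newd$fit <- predict(m, newdata = newd)
newd
```

Each fitted value here includes the intercept, both main effects and the interaction, so it is a conditional prediction rather than a partial effect of the f1:f2 term alone.
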

Thoughts on what behaviour you were expecting?

Excidion · 2020-05-14T16:27:39Z

I am definitely more fond of returning the contributions to the fitted model. This is, at least for me, closer to one of the core reasons why I value GAMs: the intuitive interpretability of their results.

On how to handle the entries of the value column: the first bullet point seems very R-ish and more in line with how evaluate_smooth() handles this (e.g. a term with s(var):fvar has s(var):f1level1 in the smooth column and an extra by_variable column).

Hope I picked up on your questions in the right way and my answers can be helpful.

gavinsimpson self-assigned this May 13, 2020

gavinsimpson added the bug label May 13, 2020

gavinsimpson changed the title ~~Evaluating paremetric terms with factory by-variable~~ Evaluating parametric terms with factor by-variable May 13, 2020

gavinsimpson added a commit that referenced this issue May 14, 2020

final fixes for #68

94c65b8

gavinsimpson added a commit that referenced this issue May 14, 2020

Merge branch 'issue-#68': fixes the main issue of #68

ca9fe68
