Make `predict` easier? #134

nilshg · 2020-10-22T15:11:11Z

predict currently raises an error, advising the user to estimate the model with the save = :fe kwarg and the access the fixed effects using fe().

If I'm not mistaken, this advise only covers the retrieval of fixed effects, but doesn't actually give a prediction. I'm currently doing:

fem = reg(df, @formula(y ~ x1 + x2 + fe(x3) + fe(x4)), save = :fe)

[df.x1 df.x2] * coef(fem) .+ fe(fem).fe_x3 .+ fe(fem).fe_x4

which I believe produces the correct predictions from the model (do correct me if I'm getting something wrong here!)

Could this be turned into a predict method, which accepts the fem object directly provided it has been estimated with save = :fe?

The text was updated successfully, but these errors were encountered:

nilshg · 2020-10-22T16:49:00Z

FWIW I'm now using this function:

function predict(m::FixedEffectModel, df)
	!isempty(m.fe) || error("Model object does not contain fixed effects, to predict estimate the model with save = :fe")
	Matrix(df[!, m.coefnames]) * coef(m) .+ sum(Matrix(m.fe), dims = 2)
end

which seems to do the job, but clearly is lacking any sort of uncertainty estimates / prediction intervals

matthieugomez · 2020-10-31T01:14:36Z

Yes, would be nice to define predict in this case.
The issue is that predict should work on any dataset, not only the one that was originally passed to the function. Yet, your solution only works on the original dataframe. A better way would be to do a join on the variables used to define fixed effects.

You'd need to modify this function

FixedEffectModels.jl/src/FixedEffectModel.jl

Line 72 in 8e79116

function StatsBase.predict(x::FixedEffectModel, df)

nilshg · 2020-10-31T18:07:59Z

I'll have a look at this and try to get a PR together

nilshg · 2020-11-07T16:13:42Z

Okay I got stuck and am wondering whether I'm missing something:

predict is supposed to accept two arguments, the fitted model and some data for the covariates (could be old or new)
The only information we have on the fixed effects is in the fe field of the struct, which is an n-by-fe DataFrame

Based on this, it seems to me that it's not in general possible to make a prediction from the model object and some new data - the only way to know which number in fe belongs to which level of a fixed effect is by having the old data available (to match up the columns in fe with the columns holding the original categories.

If this is correct, it seems to me there are two options here to allow predictions from new data:

Require passing the old data to predict as well to identify which estimates belong to which category for each fixed effects; or
Change the construction of augmentdf in reg() to include the info about the original categories. This could be done by either just including those columns as well and make fe n-by-2*fe, or by changing this more drastically to have a smaller DataFrame or Dict with just the mapping category => estimate of fixed effect instead.

I suppose (2) is potentially breaking to the extent that people have relied on the format of the fe field, while (1) could be implemented without changes to the layout of the fields in the model, but at the cost of diverging somewhat from the usual predict API.

Any comments appreciated - happy to implement either of these or both, or indeed none of them if I've missed something and there's a better way!

matthieugomez · 2021-04-12T18:59:41Z

Sorry for the late response — you're exactly right. I think Option2 is best. fe(x::FixedEffectModel, keepkeys = true) now returns a DataFrame with the original categories (#164). It is not breaking, and it should allow you to write a better predict using DataFrames leftjoin

matthieugomez · 2021-11-24T19:01:14Z

Fixed by #185

junder873 mentioned this issue Apr 2, 2021

Missing constant coefficient when fixed effects used? #87

Open

matthieugomez closed this as completed Nov 24, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make `predict` easier? #134

Make `predict` easier? #134

nilshg commented Oct 22, 2020

nilshg commented Oct 22, 2020 •

edited

Loading

matthieugomez commented Oct 31, 2020

nilshg commented Oct 31, 2020

nilshg commented Nov 7, 2020

matthieugomez commented Apr 12, 2021 •

edited

Loading

matthieugomez commented Nov 24, 2021

Make predict easier? #134

Make predict easier? #134

Comments

nilshg commented Oct 22, 2020

nilshg commented Oct 22, 2020 • edited Loading

matthieugomez commented Oct 31, 2020

nilshg commented Oct 31, 2020

nilshg commented Nov 7, 2020

matthieugomez commented Apr 12, 2021 • edited Loading

matthieugomez commented Nov 24, 2021

Make `predict` easier? #134

Make `predict` easier? #134

nilshg commented Oct 22, 2020 •

edited

Loading

matthieugomez commented Apr 12, 2021 •

edited

Loading