predict for fixed effects #243
Comments
It does not work if there are missing values in the original dataframe or if the fixed effects are of the form fe(id)&fe(year) (i.e. id-year fixed effects). It would be awesome if you could write code that handles these two cases. Here is some background: #204 |
Setting the
julia> using DataFrames, FixedEffectModels
julia> df = let
           halfX = allcombinations(DataFrame, :a => 1:3, :b => 10:10:30)
           X = vcat(halfX, halfX)
           d = DataFrame(X)
           d.y = rand(nrow(d))
           d
       end
18×3 DataFrame
Row │ a b y
│ Int64 Int64 Float64
─────┼─────────────────────────
1 │ 1 10 0.634415
2 │ 2 10 0.10137
3 │ 3 10 0.619162
4 │ 1 20 0.308558
5 │ 2 20 0.673735
6 │ 3 20 0.0323582
7 │ 1 30 0.0197685
8 │ 2 30 0.22085
9 │ 3 30 0.875045
10 │ 1 10 0.747533
11 │ 2 10 0.150399
12 │ 3 10 0.82051
13 │ 1 20 0.259925
14 │ 2 20 0.728193
15 │ 3 20 0.340064
16 │ 1 30 0.983969
17 │ 2 30 0.376881
18 │ 3 30 0.799643
julia> m = FixedEffectModels.reg(df, @formula(y ~ fe(a) * fe(b)), save = true)
FixedEffectModel
==============================================================
Number of obs: 18 Converged: true
dof (model): 0 dof (residuals): 3
R²: 0.668 R² adjusted: -0.880
F-statistic: NaN P-value: NaN
R² within: -0.000 Iterations: 3
==============================================================
Estimate Std. Error t-stat Pr(>|t|) Lower 95% Upper 95%
──────────────────────────────────────────────────────────────
==============================================================
julia> m.fe
18×5 DataFrame
Row │ a b fe_a fe_b fe_a&fe_b
│ Int64 Int64 Float64? Float64? Float64?
─────┼────────────────────────────────────────────────
1 │ 1 10 0.487636 0.0146608 0.188678
2 │ 2 10 0.429074 0.0146608 -0.31785
3 │ 3 10 0.53202 0.0146608 0.173155
4 │ 1 20 0.487636 -0.046219 -0.157175
5 │ 2 20 0.429074 -0.046219 0.318109
6 │ 3 20 0.53202 -0.046219 -0.29959
7 │ 1 30 0.487636 0.0315582 -0.0173249
8 │ 2 30 0.429074 0.0315582 -0.161766
9 │ 3 30 0.53202 0.0315582 0.273766
10 │ 1 10 0.487636 0.0146608 0.188678
11 │ 2 10 0.429074 0.0146608 -0.31785
12 │ 3 10 0.53202 0.0146608 0.173155
13 │ 1 20 0.487636 -0.046219 -0.157175
14 │ 2 20 0.429074 -0.046219 0.318109
15 │ 3 20 0.53202 -0.046219 -0.29959
16 │ 1 30 0.487636 0.0315582 -0.0173249
17 │ 2 30 0.429074 0.0315582 -0.161766
18 │ 3 30 0.53202 0.0315582 0.273766
julia> unique(m.fe)
9×5 DataFrame
Row │ a b fe_a fe_b fe_a&fe_b
│ Int64 Int64 Float64? Float64? Float64?
─────┼────────────────────────────────────────────────
1 │ 1 10 0.487636 0.0146608 0.188678
2 │ 2 10 0.429074 0.0146608 -0.31785
3 │ 3 10 0.53202 0.0146608 0.173155
4 │ 1 20 0.487636 -0.046219 -0.157175
5 │ 2 20 0.429074 -0.046219 0.318109
6 │ 3 20 0.53202 -0.046219 -0.29959
7 │ 1 30 0.487636 0.0315582 -0.0173249
8 │ 2 30 0.429074 0.0315582 -0.161766
9 │ 3 30 0.53202 0.0315582 0.273766
julia> fes = leftjoin(df, unique(m.fe); on=m.fekeys, makeunique=true)
18×6 DataFrame
Row │ a b y fe_a fe_b fe_a&fe_b
│ Int64 Int64 Float64 Float64? Float64? Float64?
─────┼───────────────────────────────────────────────────────────
1 │ 1 10 0.634415 0.487636 0.0146608 0.188678
2 │ 2 10 0.10137 0.429074 0.0146608 -0.31785
3 │ 3 10 0.619162 0.53202 0.0146608 0.173155
4 │ 1 20 0.308558 0.487636 -0.046219 -0.157175
5 │ 2 20 0.673735 0.429074 -0.046219 0.318109
6 │ 3 20 0.0323582 0.53202 -0.046219 -0.29959
7 │ 1 30 0.0197685 0.487636 0.0315582 -0.0173249
8 │ 2 30 0.22085 0.429074 0.0315582 -0.161766
9 │ 3 30 0.875045 0.53202 0.0315582 0.273766
10 │ 1 10 0.747533 0.487636 0.0146608 0.188678
11 │ 2 10 0.150399 0.429074 0.0146608 -0.31785
12 │ 3 10 0.82051 0.53202 0.0146608 0.173155
13 │ 1 20 0.259925 0.487636 -0.046219 -0.157175
14 │ 2 20 0.728193 0.429074 -0.046219 0.318109
15 │ 3 20 0.340064 0.53202 -0.046219 -0.29959
16 │ 1 30 0.983969 0.487636 0.0315582 -0.0173249
17 │ 2 30 0.376881 0.429074 0.0315582 -0.161766
18 │ 3 30 0.799643 0.53202 0.0315582 0.273766
julia> combine(fes, AsTable(Not(m.fekeys)) => sum => :prediction)
18×1 DataFrame
Row │ prediction
│ Float64
─────┼────────────
1 │ 1.32539
2 │ 0.227254
3 │ 1.339
4 │ 0.592799
5 │ 1.3747
6 │ 0.218569
7 │ 0.521638
8 │ 0.519716
9 │ 1.71239
10 │ 1.43851
11 │ 0.276283
12 │ 1.54035
13 │ 0.544166
14 │ 1.42916
15 │ 0.526274
16 │ 1.48584
17 │ 0.675747
18 │ 1.63699
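A side note on the last step: `Not(m.fekeys)` selects every non-key column, which in `fes` includes `y` as well as the three fixed-effect columns. That is why the printed `prediction` for row 1 is 0.634415 + 0.487636 + 0.0146608 + 0.188678 ≈ 1.32539. Below is a minimal sketch of a helper that sums only the fixed-effect columns. The name `sum_fixed_effects` and its arguments are hypothetical and not part of FixedEffectModels.jl; the sketch assumes a table shaped like `m.fe` above (key columns plus one column per fixed effect) and uses only DataFrames.jl calls.

```julia
using DataFrames

# Hypothetical helper, not part of FixedEffectModels.jl.
# `fe_table` is assumed to look like `m.fe`: the key columns plus one
# column of estimates per fixed effect. `fekeys` names the key columns.
function sum_fixed_effects(newdata::AbstractDataFrame,
                           fe_table::AbstractDataFrame,
                           fekeys)
    keycols = Symbol.(fekeys)
    # Every non-key column of the estimates table is treated as a fixed effect.
    fe_cols = setdiff(propertynames(fe_table), keycols)

    # One row per key combination; rows with missing entries are dropped
    # so that each combination appears at most once in the lookup.
    lookup = dropmissing(unique(fe_table))

    # Keep only the key columns of `newdata`, so the response (e.g. `y`)
    # can never end up in the sum. The row index restores the input order.
    left = select(newdata, keycols)
    left.__row = collect(1:nrow(left))
    joined = sort!(leftjoin(left, lookup; on = keycols), :__row)

    # Element-wise sum across the fixed-effect columns; key combinations
    # without an estimate propagate `missing`.
    return reduce(+, eachcol(joined[!, fe_cols]))
end
```

With the model above this would be called as `sum_fixed_effects(df, m.fe, m.fekeys)`. Key combinations that have no row in the estimates table come back as `missing` instead of raising an error.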
|
Hmm... maybe what was missing was an interaction with a continuous variable, like y & fe(a)? |
I had completely forgotten about #204 and the discussion had died down after my suggestion for dealing with the |
predict is not implemented for models with fixed effects but I would like to use this functionality.
FixedEffectModels.jl/src/FixedEffectModel.jl, Lines 132 to 139 in 851eca9
That code looks okay to me but the comment says it's wrong, so I'm reluctant to try implementing it myself lest I get it wrong. What is the problem with this code?