I think I may have stumbled across a bug. If you run the code below, you'll see that with a small number of groups in a single fixed effect, partial_out correctly demeans the variables within each fixed effect group and the fixed effect regression returns the correct coefficient. With a larger number of groups, both the demeaning and the regression are incorrect. Interestingly, once the results go wrong, the output appears to be the same for any number of groups (test(50) gives the same result as test(16)). The switch happens when there are more than 15 groups within the fixed effect. I assume this is actually a problem within FixedEffects.jl. If you have any suggestions for where to start digging, I'm happy to poke around. Or let me know if I'm using this incorrectly. Thanks!
using DataFrames, FixedEffectModels, Distributions, Random
# m is the number of groups
function test(m)
    Random.seed!(123)
    n = 1000 # number of observations per individual
    ids = repeat(1:m, inner=n)
    fes = repeat(rand(Normal(0,1), m), inner=n)
    x = rand(Normal(), m * n) .+ fes
    y = 2*x .+ fes .+ rand(m * n)
    df = DataFrame(x=x, y=y, ids=ids, fes=fes)
    # residualize y, x and fes on the ids fixed effect
    df2 = partial_out(df, @formula(y + x + fes ~ fe(ids)))[1]
    df2[!, :ids] = df[!, :ids]
    # group means of the residualized variables
    println(aggregate(groupby(df2,:ids), mean))
    println(reg(df, @formula(y~x+fe(ids))))
end
# Variables should be mean 0 and coef on x should equal 2
test(5) # correct
test(15) # correct
test(16) # wrong
test(50) # wrong but = test(16)
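For comparison, here is a minimal sketch of a by-hand check that does not go through partial_out or reg at all: it demeans x and y within each group and computes the slope directly. The names demean_by_group and manual_check are hypothetical helpers written for this post, not part of FixedEffectModels.jl, and it draws with randn rather than Normal() from Distributions, so the simulated values need not match test(m) exactly.

```julia
using Random, Statistics

# Hypothetical helper: subtract each group's mean from v.
function demean_by_group(v, ids)
    out = float.(v)                      # fresh floating-point copy of v
    for g in unique(ids)
        idx = findall(==(g), ids)        # positions belonging to group g
        out[idx] .-= mean(v[idx])
    end
    return out
end

# Same data-generating process as test(m), checked by hand.
function manual_check(m; n = 1000)
    Random.seed!(123)
    ids = repeat(1:m, inner = n)
    fes = repeat(randn(m), inner = n)
    x = randn(m * n) .+ fes
    y = 2 .* x .+ fes .+ rand(m * n)
    xd = demean_by_group(x, ids)
    yd = demean_by_group(y, ids)
    beta = sum(xd .* yd) / sum(xd .^ 2)  # within-group OLS slope on x
    max_group_mean = maximum(abs(mean(xd[ids .== g])) for g in 1:m)
    return beta, max_group_mean
end

beta, max_group_mean = manual_check(50)
println("slope on x: ", beta)
println("largest |group mean| of demeaned x: ", max_group_mean)
```

If the by-hand slope stays close to 2 and the group means stay at zero up to floating-point error for m above 15, that would point at the package rather than at the simulated data.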