I have replicated the problem I was having in #95 in a smaller example.
On
DataFrames v0.19.4
FixedEffectModels v0.7.4
FixedEffects v0.1.2
running
using Random, DataFrames, FixedEffectModels, BenchmarkTools
Random.seed!(0)
n = 5_000_000
df = DataFrame(y = rand(Float64, n), x = rand(Float64, n), z = rand(Float64, n), fe1 = rand(collect(0:1_000), n), fe2 = rand(collect(0:1_000), n))
df[:fe1] = categorical(df[:fe1]);
df[:fe2] = categorical(df[:fe2]);
@btime rr1s1 = reg(df, @model(y ~ x , fe = fe1*z + fe2*z), save = :residuals)
yields a benchmark time of 2.223s.
On the other hand, on
DataFrames v0.20.2
FixedEffectModels v0.10.5
FixedEffects v0.7.2
running
using Random, DataFrames, FixedEffectModels, BenchmarkTools
Random.seed!(0)
n = 5_000_000
df = DataFrame(y = rand(Float64, n), x = rand(Float64, n), z = rand(Float64, n), fe1 = rand(collect(0:1_000), n), fe2 = rand(collect(0:1_000), n))
df[:fe1] = categorical(df[:fe1]);
df[:fe2] = categorical(df[:fe2]);
@btime rr1s1 = reg(df, @formula(y ~ x + fe(fe1)*z + fe(fe2)*z), save = :residuals)
yields a benchmark time of 2.95s.
Both runs use nthreads of 4 (the timings are similar with 1).
I've also tried this on larger examples (500_000 and 10_000 levels of fe1 and fe2 respectively), and the slowdown is similar: 350s on the older versions vs 495s on the newer ones.
Any sense of what the reason could be?