predict.fixest() does not use offset from newdata with models fit with femlm #309

Closed
jl-flores opened this issue May 17, 2022 · 1 comment

Hello again!

I've realized that when predict.fixest() is used on a model fit with femlm(), it ignores the offset values in newdata whenever the offset is supplied inside the fmla argument. Predictions come out as if kms = 1, i.e. as if the offset log(kms) were 0.

library(MASS)
library(fixest)
data(Seatbelts)

# fit models using base/standard functions
fit_glm.pois <- glm(DriversKilled ~ law + PetrolPrice + offset(log(kms)), data = Seatbelts, family = "poisson")
fit_glm.nb <- glm.nb(DriversKilled ~ law + PetrolPrice + offset(log(kms)), data = Seatbelts)

# fit models using fixest functions (with the offset in the formula)
fit_fixest.pois.1 <- femlm(DriversKilled ~ law + PetrolPrice + offset(log(kms)), data = Seatbelts, family = "poisson")
fit_fixest.nb.1 <- femlm(DriversKilled ~ law + PetrolPrice + offset(log(kms)), data = Seatbelts, family = "negbin")

# model predictions do not match between base and fixest calls
predict(fit_glm.pois, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
        1         2         3         4         5 
 78.03598  66.54794  86.49766  96.08875 103.57159 
predict(fit_fixest.pois.1, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
[1] 0.008614193 0.008659459 0.008681889 0.008771223 0.008760178

predict(fit_glm.nb, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
        1         2         3         4         5 
 80.68902  68.81155  89.44058  99.36119 107.09844 
predict(fit_fixest.nb.1, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
[1] 0.008907056 0.008954008 0.008977274 0.009069940 0.009058483

# predict.fixest() behaves as if kms were always 1, i.e. as if the offset log(kms) were 0
seatbelts_offset1 <- as.data.frame(head(Seatbelts, 5))
seatbelts_offset1$kms <- 1

# only showing poisson here, but same behaviour is observed with negative binomial
predict(fit_glm.pois, newdata = seatbelts_offset1, type = "response")
          1           2           3           4           5 
0.008614193 0.008659459 0.008681889 0.008771223 0.008760178 
predict(fit_fixest.pois.1, newdata = seatbelts_offset1, type = "response")
[1] 0.008614193 0.008659459 0.008681889 0.008771223 0.008760178
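
For reference, a minimal base-R sketch of the arithmetic behind this mismatch. With a log link, the offset enters the linear predictor additively, so dropping it divides the predicted mean by kms. The names `seatbelts_df`, `new_df`, `rate` and `manual` are illustrative and not part of the original report; only `glm()` from base R is used, so this says nothing about fixest internals.

```r
# Seatbelts ships with the datasets package; convert it once to a data frame.
seatbelts_df <- as.data.frame(Seatbelts)
fit <- glm(DriversKilled ~ law + PetrolPrice + offset(log(kms)),
           data = seatbelts_df, family = "poisson")
new_df <- head(seatbelts_df, 5)

# Linear predictor WITHOUT the offset: X %*% beta
X <- model.matrix(~ law + PetrolPrice, data = new_df)
rate <- exp(drop(X %*% coef(fit)))  # prediction if the offset is dropped

# Adding log(kms) on the link scale multiplies the mean by kms
manual <- rate * new_df$kms
reference <- predict(fit, newdata = new_df, type = "response")

# Compare the hand-computed means with predict.glm()
all.equal(unname(reference), unname(manual))
```

Here `rate` plays the role of the offset-free predictions shown above, and `manual` the role of the predictions that account for kms.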

However, the problem does not occur when I pass the offset through the offset argument of femlm(): predict.fixest() then returns the same values as the base/MASS models.

# now fit models with the offset given as a separate argument
fit_fixest.pois.2 <- femlm(DriversKilled ~ law + PetrolPrice, data = Seatbelts, 
                           family = "poisson", offset = ~log(kms))
fit_fixest.nb.2 <- femlm(DriversKilled ~ law + PetrolPrice, data = Seatbelts, 
                         family = "negbin", offset = ~log(kms))

# model predictions match now
predict(fit_glm.pois, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
        1         2         3         4         5 
 78.03598  66.54794  86.49766  96.08875 103.57159 
predict(fit_fixest.pois.2, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
[1]  78.03598  66.54794  86.49766  96.08875 103.57159

predict(fit_glm.nb, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
        1         2         3         4         5 
 80.68902  68.81155  89.44058  99.36119 107.09844 
predict(fit_fixest.nb.2, newdata = as.data.frame(head(Seatbelts, 5)), type = "response")
[1]  80.68902  68.81155  89.44058  99.36119 107.09844

Maybe this is simply the intended behaviour (since offsets do work when supplied via the offset = argument)? However, I believe it is a bug: femlm() does recognize the offset term inside fmla at fit time, given that the coefficients are identical across all three fits.

coefficients(fit_glm.pois)
(Intercept)         law PetrolPrice 
 -3.8679098  -0.3680158  -8.6085139 
coefficients(fit_fixest.pois.1)
(Intercept)         law PetrolPrice 
 -3.8679098  -0.3680158  -8.6085139 
coefficients(fit_fixest.pois.2)
(Intercept)         law PetrolPrice 
 -3.8679098  -0.3680158  -8.6085139
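
As a sanity check on that reasoning, a small base-R sketch showing that leaving the offset out at fit time changes the estimates, so matching coefficients do suggest the offset was used during fitting. The names `seatbelts_df`, `fit_with` and `fit_without` are illustrative and not part of the original report.

```r
# Seatbelts ships with the datasets package.
seatbelts_df <- as.data.frame(Seatbelts)

# Same Poisson model, once with and once without the offset
fit_with <- glm(DriversKilled ~ law + PetrolPrice + offset(log(kms)),
                data = seatbelts_df, family = "poisson")
fit_without <- glm(DriversKilled ~ law + PetrolPrice,
                   data = seatbelts_df, family = "poisson")

# Side-by-side coefficients: the two rows should differ
rbind(with_offset = coef(fit_with), without_offset = coef(fit_without))
```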

I think this could be related to issue #270. I am using version 0.10.4. Thanks in advance for any attention to this issue.

PS: thanks a lot for publishing this package! It has been very helpful in speeding up some model fitting in my work :)

lrberge closed this as completed in af4fb63 Feb 9, 2024

lrberge (Owner) commented Feb 9, 2024

Thanks for the report and the very clear reproducible example! Very helpful! :-)
And apologies for the immense delay!
