Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feols(): divergence of the fixed-effects algorithm without warning #323

Closed
ja-ortiz-uniandes opened this issue Jun 18, 2022 · 4 comments
Closed

Comments

@ja-ortiz-uniandes
Copy link

ja-ortiz-uniandes commented Jun 18, 2022

Hello, I am using feols() to estimate a relatively simple model with 3 types of fixed effects. Here is my call:

feols(y ~ x | fe1 + fe2 + fe3, data = my_data)

However after doing this estimation I get the following results when using screenreg() (results are the same with etable()):

=============================================================
                       Model 1                               
-------------------------------------------------------------
x                                                   -2.80 ***
                                                    (0.00)   
-------------------------------------------------------------
Num. obs.                                       270563       
Num. groups: fe1                                    17       
Num. groups: fe2                                     9       
Num. groups: fe3                                   120       
R^2 (full model)       -751725941916247967602464286020.00    
R^2 (proj model)                                     1.00    
Adj. R^2 (full model)  -752126242693695971924244602284.00    
Adj. R^2 (proj model)                                1.00    
=============================================================
*** p < 0.001; ** p < 0.01; * p < 0.05

I believe this is problematic since the model I have has no restrictions for the slope or the intercept and as such I don't see why I'm producing a negative R2. Possibly, I'm not understanding something.

I also tried using lfe::felm() and got quite a different result in much less time. Here is my call to felm():

lfe::felm(y ~ x | fe1 + fe2 + fe3, data = my_data)

And here are the results (using screenreg()):

====================================
                       Model 1      
------------------------------------
x                           0.07 ***
                           (0.01)   
------------------------------------
Num. obs.              270563       
R^2 (full model)            0.05    
R^2 (proj model)            0.00    
Adj. R^2 (full model)       0.05    
Adj. R^2 (proj model)      -0.00    
Num. groups: fe1           17       
Num. groups: fe2            9       
Num. groups: fe3          120       
====================================
*** p < 0.001; ** p < 0.01; * p < 0.05

As for the execution time I've been using Sys.time() and while felm() takes around 1.3 seconds, feols() takes over a minute (anywhere between 1.02 mins to 1.4 mins).

Finally, I will say that when using etable() I get the following warning: In FUN(X[[i]], ...) : probable complete loss of accuracy in modulus however this is not always the case, as different model specifications still yield negative R2s but no error messages. The necessary steps to reproduce these results are bellow.

@ja-ortiz-uniandes ja-ortiz-uniandes changed the title R^2 (full model) estimates to -2788552965008500550244446466.00 R^2 estimates to -2788552965008500550244446466.00 Jun 18, 2022
@ja-ortiz-uniandes ja-ortiz-uniandes changed the title R^2 estimates to -2788552965008500550244446466.00 feols() sometimes takes too long and produces incorrect results Jun 18, 2022
@ja-ortiz-uniandes
Copy link
Author

ja-ortiz-uniandes commented Jun 18, 2022

Here is how to replicate the problem with this data set:

# Packages
library(texreg)
library(fixest)
library(tidyverse)
library(data.table)

# Import data
my_data <- fread("my_data.csv")

# Model with feols()
start <- Sys.time()
feols(y ~ x | fe1 + fe2 + fe3, data = my_data) %>% screenreg
Sys.time() - start

# Model with felm()
start <- Sys.time()
lfe::felm(y ~ x | fe1 + fe2 + fe3, data = my_data) %>% screenreg
Sys.time() - start

@ja-ortiz-uniandes ja-ortiz-uniandes changed the title feols() sometimes takes too long and produces incorrect results IMPORTANT - feols() sometimes takes too long and produces incorrect results Jun 20, 2022
@lrberge lrberge changed the title IMPORTANT - feols() sometimes takes too long and produces incorrect results feols(): divergence of the fixed-effects algorithm without warning Jun 21, 2022
@statzhero
Copy link

I was curious by this issue given my recent question: #363 Your example replicates for me.

I don't have an answer except that demeaning goes wrong somewhere, but I'll post my thought process.

  • My first suspicion was that the FEs are almost a perfect linear combination. But there is no obvious clear evidence. The t-statistics for fe1 are huge though and you can solve the issue by removing the variable.
cor(my_data, method = "spearman")
est <- lm(y ~ x + factor(fe1) + factor(fe2) + factor(fe3), data = my_data)
car::vif(est)

                  GVIF  Df GVIF^(1/(2*Df))
x             5.336705   1        2.310131
factor(fe1) 156.353203  16        1.171024
factor(fe2) 185.266706   8        1.385917
factor(fe3) 333.291358 119        1.024708

est_res <- feols(y ~ x | fe2 + fe3, data = my_data) 

est_res |> etable()
                           est_res
Dependent Var.:                  y
                                  
x               0.0736*** (0.0104)
Fixed-Effects:  ------------------
fe2                            Yes
fe3                            Yes
_______________ __________________
S.E.: Clustered            by: fe2
Observations               270,563
R2                         0.04208
Within R2                  0.00026
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
  • My second try was to remove the most extreme observation and the feols() coefficient changes to -3.1 but no change for the OLS model. Again the feols algorithm seems to be not stable in this case.
est2 <- update(est, data = my_data[-48710,])
  • I can get better point estimates by combining some bins, e.g. (and there is only one singleton)
my_data$fe2_b <- floor(my_data$fe2 / 2)

feols(y ~ x | fe1 + fe2_b + fe3, data = my_data, fixef.rm = 'both') |> 
  etable()
  • A look at the demeaned variables
est_d <- feols(y ~ x | fe1 + fe2 + fe3, data = my_data, 
      demeaned = TRUE) 

collapse::qsu(est_t[c("y_demeaned", "X_demeaned")])
                 N             Mean              SD
y_demeaned  270563   1.53867429e+13  5.47790209e+26
X_demeaned  270563  -2.61835466e+13  1.95315321e+26
                        Min             Max
y_demeaned  -2.19166815e+28  1.47794708e+28
X_demeaned  -5.26963981e+27  7.81442171e+27

Lastly, fixest::feols() has parameters that change the algorithm (collin.tol and so forth) but I haven't played around with them.

@lrberge
Copy link
Owner

lrberge commented Feb 14, 2024

Dear @ja-ortiz-uniandes, thanks a lot for this issue and especially the reproducible example.

Two things:

  1. there was a bug preventing a warning to pop when the demeaning algorithm didn't converge
  2. thanks to your example I could improve the internal algorithm. Originally, for some reason, the algorithm did not converge with your example data. I reworked the algorithm and now it works.

To be clear, I have added a few tricks in the internal demeaning algorithm to help it get out of difficult cases. You can access the new settings with the (new) function demeaning_algo. But with the new defaults, your use case just works now.

Thanks again all and very sorry for the immense delay.

@ja-ortiz-uniandes
Copy link
Author

ja-ortiz-uniandes commented Feb 14, 2024

Dear @lrberge thank you. Your package has made science more open and accessible. As I have expressed many times before, I believe there is no better way to do FE estimations than with fixest. Your continued work on this package is what has made fixest into the amazing package it is today. Thanks for taking care of the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants