felm: centering threads hang-up #23

Open
averydo opened this issue Aug 22, 2019 · 0 comments

averydo commented Aug 22, 2019

Hello, I'm currently running a model on gridded/tabularized spatial data where each row corresponds to, for example, a 5x5 km square area in a given country and year. I'm running a felm model estimating the occurrence of a particular kind of conflict event (ACLED battles, protests, civil unrest, etc.) with cell and country-year fixed effects:

```r
# just an example; requires lfe and magrittr (for %>%)
library(lfe)
library(magrittr)

# felm formula parts: outcome ~ covariates | fixed effects | IV (none) | cluster
fmla <- "any_acled_battles ~ any_infrastructure + any_electrification |
  cell_number_5x5 + country_year | 0 | cell_number_5x5" %>% as.formula()

felm(fmla, data = my_data)
```

This model runs perfectly fine at the 5x5 km level. My issue with the centering threads (demeaning) occurs at the 10x10 km level. When I aggregate up, the model seems to run indefinitely, which is peculiar, since aggregating up means fewer rows to process. I haven't attempted to leave it overnight because it drives my CPU temperatures to 90C+ after ~20 minutes (I hear the CPU fans running loudly, and I'm on a higher-spec 2017 iMac). When I cancel, the console returns something like "..stopping centering threads..", which I assume means the process got stuck while attempting to demean something. By contrast, the 5x5 outputs take about 2 minutes at most to complete. Looking through the documentation, I've been able to successfully run the 10x10 models by loosening the tolerance with options(lfe.eps = 1e-2); they then complete in a reasonable amount of time (similar to, or a little less than, the 5x5 equivalent at the default 1e-8).
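Concretely, the workaround looks like this (a sketch; my_data_10x10 is a stand-in name for our aggregated data, which I can't share):

```r
library(lfe)

# loosen the centering tolerance from the default 1e-8 so the
# iterative demeaning terminates; this trades precision for convergence
options(lfe.eps = 1e-2)

felm(fmla, data = my_data_10x10)  # fmla as above, with 10x10 cell identifiers
```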

Considering that precision is important to our team's work, we gave demeanlist() a shot and ran a model on the demeaned values. With demeanlist() we can set the eps tolerance to anything; 1e-8 works, and in fact the resulting objects are identical whether the tolerance is set to, say, 1e-1 or 1e-50. However, when feeding the demeaned dataset back into felm(), we still have to set options(lfe.eps = 1e-2) globally or else we get computation hangs. Note: running felm() on the default/original dataset produces slightly different estimates than running it on the demeanlist() version; the differences mostly show up beyond the fourth decimal place, but they are real.
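For concreteness, here's a sketch of our demeanlist() workflow (column and object names are stand-ins for our proprietary data, and I'm fitting the centered columns with lm() just to illustrate; per Frisch-Waugh-Lovell, the coefficients should match the felm() fit up to the centering tolerance):

```r
library(lfe)

# the two fixed-effect dimensions, as a list of factors
fl <- list(cell = factor(my_data_10x10$cell_number_10x10),
           cy   = factor(my_data_10x10$country_year))

# center the outcome and covariates on both sets of fixed effects
demeaned <- demeanlist(my_data_10x10[, c("any_acled_battles",
                                         "any_infrastructure",
                                         "any_electrification")],
                       fl, eps = 1e-8)

# fit on the pre-centered columns; drop the intercept, since
# centering has already removed the group means
fit <- lm(any_acled_battles ~ any_infrastructure + any_electrification - 1,
          data = demeaned)
```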

Would you have any recommendations as to what's going on, or insight into the best way to move forward, namely whether these tolerances really matter if demeanlist() is properly outputting a demeaned dataset? I apologize in advance if this is a vague or unhelpful illustration of our problem; the data is proprietary, so it's tricky to produce a reproducible example. I can provide more information and potentially anonymize the data if anybody thinks actual examples would help (from the outset, I have no idea what's going on, so I couldn't easily re-create the issue synthetically). Many thanks for your time.

[Screenshot: left: modeling default data; right: modeling "demeaned" data]