You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
First of all, I would like to thank you for making this package available.
Some time ago I wrote a system solution to replicate the examples in the book on Structural Gravity in R. I recently discovered the existence of the library you wrote. So I was trying to replicate your example on R.
However, I am finding some differences in the first step (estimation). I tested using three different functions in R and the three generate slightly different results than those generated in your package.
One detail I noticed is that your estimate uses 870 observations, while the database has 900 observations. It was not clear to me at what point these observations are eliminated.
Below is the code I am using. Thanks!
library(readr)
#> Warning: package 'readr' was built under R version 4.0.3
library(fixest)
#> Warning: package 'fixest' was built under R version 4.0.5
library(dplyr)
#> #> Attaching package: 'dplyr'#> The following objects are masked from 'package:stats':#> #> filter, lag#> The following objects are masked from 'package:base':#> #> intersect, setdiff, setequal, union
library(estimatr)
#> Warning: package 'estimatr' was built under R version 4.0.3data<- read_csv("https://gist.githubusercontent.com/peter-herman/13b056e52105008c53faa482db67ed4a/raw/83898713b8c695fc4c293eaa78eaf44f8e880a85/sample_gravity_data.csv")
#> #> -- Column specification --------------------------------------------------------#> cols(#> exporter = col_character(),#> importer = col_character(),#> year = col_double(),#> trade = col_double(),#> Y = col_double(),#> E = col_double(),#> pta = col_double(),#> contiguity = col_double(),#> common_language = col_double(),#> lndist = col_double(),#> international = col_double()#> )
head(data)
#> # A tibble: 6 x 11#> exporter importer year trade Y E pta contiguity common_language#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>#> 1 GBR AUS 2006 4310 9.26e5 362227 0 0 1#> 2 FIN AUS 2006 514 1.43e5 362227 0 0 0#> 3 USA AUS 2006 16619 5.02e6 362227 1 0 1#> 4 IND AUS 2006 763 5.49e5 362227 0 0 1#> 5 SGP AUS 2006 8756 3.30e5 362227 1 0 1#> 6 CHE AUS 2006 1318 3.81e5 362227 0 0 1#> # ... with 2 more variables: lndist <dbl>, international <dbl>fit<-fixest::fepois(
trade~pta+contiguity+common_language+lndist+international|exporter+importer,
data=data
)
summary(fit, se="hetero")
#> Poisson estimation, Dep. Var.: trade#> Observations: 900 #> Fixed-effects: exporter: 30, importer: 30#> Standard-errors: Heteroskedasticity-robust #> Estimate Std. Error z value Pr(>|z|) #> pta 0.471138 0.111639 4.220200 2.4e-05 ***#> contiguity 0.891577 0.137645 6.477400 9.33e-11 ***#> common_language 0.032625 0.087180 0.374225 0.708237 #> lndist -0.389862 0.075690 -5.150800 2.59e-07 ***#> international -3.412600 0.223079 -15.298000 < 2.2e-16 ***#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> Log-Likelihood: -1,648,634.3 Adj. Pseudo R2: 0.975414#> BIC: 3,297,703.1 Squared Cor.: 0.994273fit_alpaca<-alpaca::feglm(
trade~pta+contiguity+common_language+lndist+international|exporter+importer,
data=data,
family= poisson(),
control=list(dev.tol=1e-10)
)
summary(fit_alpaca, type="sandwich")
#> poisson - log link#> #> trade ~ pta + contiguity + common_language + lndist + international | #> exporter + importer#> #> Estimates:#> Estimate Std. error z value Pr(> |z|) #> pta 0.47114 0.10760 4.379 1.19e-05 ***#> contiguity 0.89158 0.13266 6.721 1.81e-11 ***#> common_language 0.03262 0.08402 0.388 0.698 #> lndist -0.38986 0.07295 -5.344 9.09e-08 ***#> international -3.41258 0.21501 -15.872 < 2e-16 ***#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> #> residual deviance= 3288638.61,#> null deviance= 134106681.53,#> n= 900, l= [30, 30]#> #> Number of Fisher Scoring Iterations: 8fit_glm<- glm(
trade~-1+pta+contiguity+common_language+lndist+international+exporter+importer,
data=data,
family= quasipoisson()
)
lmtest::coeftest(
fit_glm,
vcov.=sandwich::vcovHC(fit_glm, type="HC1")
)[1:5,]
#> Estimate Std. Error z value Pr(>|z|)#> pta 0.47113830 0.11164055 4.2201359 2.441551e-05#> contiguity 0.89157690 0.13764612 6.4773123 9.337080e-11#> common_language 0.03262497 0.08718028 0.3742242 7.082376e-01#> lndist -0.38986231 0.07569223 -5.1506252 2.596195e-07#> international -3.41258441 0.22308201 -15.2974433 7.951958e-53
Thanks for reaching out about this. I spent sometime looking into this issue and have identified the problem.
It is arising as a result of the pre-estimation diagnostics that the gme package does as part of the estimate() method. Part of that routine appears to not have been as robust as intended because it is failing here. The issue is that after it flags one fixed effect as collinear ('importer_ZAF' in this case because there is a full set of 2*N fixed effects and it is alphabetically the last one), it is incorrectly determining that there is insufficient variation in ZAF imports. The problem is that this second determination is erroneously triggering because there are NO zero-valued trade flows when it should only trigger when they are ALL zero-values. This was simply an oversight in how it was coded. The result is that the 30 observations with ZAF as the importer get dropped.
I touched base with a colleague of mine that wrote that routine in gme and it has apparently already been fixed but we have not yet released the update. I'm going to try to see that that update gets released soon, which should resolve the issue.
Hi, Peter!
First of all, I would like to thank you for making this package available.
Some time ago I wrote a system solution to replicate the examples in the book on Structural Gravity in R. I recently discovered the existence of the library you wrote. So I was trying to replicate your example on R.
However, I am finding some differences in the first step (estimation). I tested using three different functions in R and the three generate slightly different results than those generated in your package.
One detail I noticed is that your estimate uses 870 observations, while the database has 900 observations. It was not clear to me at what point these observations are eliminated.
Below is the code I am using. Thanks!
Created on 2021-05-05 by the reprex package (v2.0.0)
The text was updated successfully, but these errors were encountered: