# Lecture 14

Our data comes from *An Advanced Guide to Trade Policy Analysis: The Structural Gravity Model*. We will estimate the gravity model using optimal transport and, for comparison, using Poisson regression.

In [5]:
library(tictoc)

In [1]:
thepath = getwd()
tradedata = read.csv("1_TraditionalGravity_from_WTO_book.csv")   
head(tradedata)

exporter,importer,pair_id,year,trade,DIST,CNTG,LANG,CLNY,ln_trade,...,IMPORTER_TIME_FE407,IMPORTER_TIME_FE408,IMPORTER_TIME_FE409,IMPORTER_TIME_FE410,IMPORTER_TIME_FE411,IMPORTER_TIME_FE412,IMPORTER_TIME_FE413,IMPORTER_TIME_FE414,X_est_fes,X_est_ppml
LKA,ARG,115,1986,0.006416,15078.428,0,0,0,-5.04896,...,0,0,0,0,0,0,0,0,1,1
GRC,ARG,56,1986,0.0988345,11772.039,0,0,0,-2.314308,...,0,0,0,0,0,0,0,0,1,1
QAT,ARG,203,1986,0.0,13482.186,0,0,0,,...,0,0,0,0,0,0,0,0,0,1
HKG,ARG,60,1986,5.704937,18685.815,0,0,0,1.741332,...,0,0,0,0,0,0,0,0,1,1
FIN,ARG,45,1986,14.3477831,12969.024,0,0,0,2.663595,...,0,0,0,0,0,0,0,0,1,1
BRA,ARG,8,1986,386.8603245,2391.846,1,0,0,5.958064,...,0,0,0,0,0,0,0,0,1,1


Let's prepare the data so that we can use it. We want to construct:
* $D_{ni,t}^k$, the $k$th pairwise regressor (log distance, contiguity, common language, colonial tie) between importer $n$ and exporter $i$ at time $t$

* $X_{n,t}$, the total value of expenditure of importer $n$ at time $t$

* $Y_{i,t}$, the total value of production of exporter $i$ at time $t$
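
As a small illustration of the last two objects, the sketch below computes expenditure and production totals for a single year. The data frame `toy` and its numbers are made up for this example and are not taken from the WTO dataset.

```r
# Hypothetical three-country example for one year (illustrative numbers only)
toy = data.frame(
    exporter = c("A", "A", "B", "B", "C", "C"),
    importer = c("B", "C", "A", "C", "A", "B"),
    trade    = c(10, 5, 2, 8, 4, 6)
)

# X_n: total expenditure of importer n (sum of everything n imports)
X_n = tapply(toy$trade, toy$importer, sum)
# Y_i: total production of exporter i (sum of everything i exports)
Y_i = tapply(toy$trade, toy$exporter, sum)

print(X_n)
print(Y_i)
```

Both vectors add up to the same total trade, which is what lets us normalize them into two probability distributions over countries.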

In [2]:
# Unique list of importers
countrylist = sort(unique(tradedata$importer))
# Unique list of exporters
exportercountrylist = sort(unique(tradedata$exporter))
if (!identical(countrylist, exportercountrylist)) {
    stop("exporter and importer country lists do not coincide")
}

# regressorsIndices = 4:13
regressorsIndices = c("ln_DIST", "CNTG", "LANG", "CLNY")
yearslist = c(1986, 1990, 1994, 1998, 2002, 2006)

regressors_raw = tradedata[regressorsIndices]
regressorsNames = names(regressors_raw)
flow_raw = tradedata$trade

nbt = length(yearslist)  # number of years
nbk = dim(regressors_raw)[2]  # number of regressors
nbi = length(countrylist)  # number of countries
yearsIndices = 1:nbt

Dnikt = array(0, dim = c(nbi, nbi, nbk, nbt))  # basis functions
Xhatnit = array(0, dim = c(nbi, nbi, nbt))  # trade flows from n to i

missingObs = array(0, dim = c(0, 2, nbt))

for (year in 1:nbt) {
    theYear = yearslist[year]
    # print(theYear)
    for (dest in 1:nbi) {
        theDest = as.character(countrylist[dest])
        # print(theDest)
        for (orig in 1:nbi) {
            if (orig != dest) {
                theOrig = as.character(countrylist[orig])
                extract = (tradedata$exporter == theOrig) & (tradedata$importer == 
                  theDest) & (tradedata$year == theYear)
                line = regressors_raw[extract, ]
                
                if (dim(line)[1] == 0) {
                  missingObs = rbind(missingObs, c(theOrig, theDest))
                }
                
                if (dim(line)[1] > 1) {
                  stop("Several lines with year, exporter and importer.")
                }
                
                if (dim(line)[1] == 1) {
                  Dnikt[orig, dest, , year] = as.numeric(line)
                  Xhatnit[orig, dest, year] = flow_raw[extract]
                }                
            }
        }
    }
}
if (length(missingObs) > 0) {
    stop("Missing observations")
}
Xnt = apply(X = Xhatnit, MARGIN = c(1, 3), FUN = sum)
Yit = apply(X = Xhatnit, MARGIN = c(2, 3), FUN = sum)

We will solve this model with two nested loops. For a fixed $\beta$, the inner loop solves the matching problem using IPFP (the iterative proportional fitting procedure). The outer loop then updates $\beta$ by gradient descent until the model moments match the empirical moments.
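
To make the inner loop concrete, here is a minimal sketch of IPFP on a toy 3 x 3 problem. It is written in the plain scaling form rather than the log-potential form used in this lecture, and the function `ipfp_toy`, the matrix `Phi`, and the margins `p` and `q` are all assumptions made up for the illustration.

```r
# Toy IPFP (Sinkhorn) iteration; every name and number here is illustrative.
ipfp_toy = function(K, p, q, tol = 1e-12, maxiter = 1000) {
    b = rep(1, length(q))
    for (iter in 1:maxiter) {
        a = p / as.vector(K %*% b)       # rescale rows so row sums equal p
        b = q / as.vector(t(K) %*% a)    # rescale columns so column sums equal q
        pi_mat = outer(a, b) * K         # current matching: a_n * K_ni * b_i
        # columns match q exactly after the b update, so check the rows
        if (max(abs(rowSums(pi_mat) - p)) < tol) {
            return(list(pi = pi_mat, iterations = iter))
        }
    }
    stop("maximum number of iterations reached")
}

sigma = 1
# Surplus: minus a toy "distance" between three locations
Phi = -matrix(c(0, 1, 2,
                1, 0, 1,
                2, 1, 0), nrow = 3, byrow = TRUE)
K = exp(Phi / sigma)
p = c(0.5, 0.3, 0.2)   # row margins
q = c(0.2, 0.3, 0.5)   # column margins

res = ipfp_toy(K, p, q)
print(round(res$pi, 4))
print(rowSums(res$pi))
print(colSums(res$pi))
```

The row and column sums of the result reproduce `p` and `q`. The outer loop only ever changes `Phi` through $\beta$, so each gradient step amounts to rerunning this kind of fixed-point iteration.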

In [3]:
sigma = 1  # sigma for IPFP
maxiterIpfp = 1000  # max numbers of iterations
tolIpfp = 1e-12  # tolerance for IPFP
tolDescent = 1e-12  # tolerance for gradient descent

totmass_t = rep(sum(Xnt)/nbt, nbt)  # total mass
p_nt = t(t(Xnt)/totmass_t)  # proportion of importer expenditure
q_nt = t(t(Yit)/totmass_t)  # proportion of exporter productions
IX = rep(1, nbi)
tIY = matrix(rep(1, nbi), nrow = 1)

f_nit = array(0, dim = c(nbi, nbi, nbt))
g_nit = array(0, dim = c(nbi, nbi, nbt))
pihat_nit = array(0, dim = c(nbi, nbi, nbt))

sdD_k = rep(1, nbk)
meanD_k = rep(0, nbk)

for (t in 1:nbt) {
    f_nit[, , t] = p_nt[, t] %*% tIY
    g_nit[, , t] = IX %*% t(q_nt[, t])
    pihat_nit[, , t] = Xhatnit[, , t]/totmass_t[t]
}

for (k in 1:nbk) {
    meanD_k[k] = mean(Dnikt[, , k, ])
    sdD_k[k] = sd(Dnikt[, , k, ])
    Dnikt[, , k, ] = (Dnikt[, , k, ] - meanD_k[k])/sdD_k[k]
}


v_it = matrix(rep(0, nbi * nbt), nbi, nbt)
beta_k = rep(0, nbk)

t_s = 0.03  # step size for the prox grad algorithm (or grad descent when lambda=0)
iterCount = 0

tic()

while (TRUE) {
    thegrad = rep(0, nbk)
    pi_nit = array(0, dim = c(nbi, nbi, nbt))
    
    for (t in 1:nbt) {
        D_ij_k = matrix(Dnikt[, , , t], ncol = nbk)
        Phi = matrix(D_ij_k %*% matrix(beta_k, ncol = 1), nrow = nbi)
        contIpfp = TRUE
        iterIpfp = 0
        v = v_it[, t]
        f = f_nit[, , t]
        g = g_nit[, , t]
        K = exp(Phi/sigma)
        diag(K) = 0
        gK = g * K
        fK = f * K
        
        
        while (contIpfp) {
            iterIpfp = iterIpfp + 1
            u = sigma * log(apply(gK * exp((-IX %*% t(v))/sigma), 1, sum))
            vnext = sigma * log(apply(fK * exp((-u %*% tIY)/sigma), 2, sum))
            error = max(abs(apply(gK * exp((-IX %*% t(vnext) - u %*% tIY)/sigma), 
                1, sum) - 1))
            if ((error < tolIpfp) || (iterIpfp >= maxiterIpfp)) {
                contIpfp = FALSE
            }
            v = vnext
        }
        v_it[, t] = v
        pi_nit[, , t] = f * gK * exp((-IX %*% t(v) - u %*% tIY)/sigma)
        if (iterIpfp >= maxiterIpfp) {
            stop("maximum number of iterations reached")
        }
        
        thegrad = thegrad + c(c(pi_nit[, , t] - pihat_nit[, , t]) %*% D_ij_k)
        
    }
    # take one gradient step
    beta_k = beta_k - t_s * thegrad
    
    theval = sum(thegrad * beta_k) - sigma * sum(pi_nit[pi_nit > 0] * log(pi_nit[pi_nit > 
        0]))
    
    iterCount = iterCount + 1
    
    if (iterCount > 1 && abs(theval - theval_old) < tolDescent) {
        break
    }
    theval_old = theval
}

beta_k = beta_k/sdD_k

toc()
print(beta_k)

   user  system elapsed 
  91.16    0.05   92.61 

[1] -0.8409237  0.4374486  0.2474767 -0.2224904
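
The final step `beta_k = beta_k/sdD_k` converts the coefficients back from standardized regressors to the original units. The sketch below illustrates the same rescaling with ordinary least squares as a stand-in, not with the gravity model itself; the simulated `x` and `y` are assumptions for the example. In the gravity model the mean shift is absorbed by the potentials, much as it is absorbed by the intercept here.

```r
# Illustration with OLS (a stand-in, not the gravity estimator):
# a slope estimated on a standardized regressor, divided by the
# regressor's standard deviation, equals the slope on the raw regressor.
set.seed(42)
x = rnorm(50, mean = 8, sd = 2)            # hypothetical raw regressor
y = 1 - 0.8 * x + rnorm(50, sd = 0.1)      # hypothetical outcome

x_std = (x - mean(x)) / sd(x)              # same standardization as Dnikt

b_raw  = unname(coef(lm(y ~ x))["x"])          # slope in original units
b_std  = unname(coef(lm(y ~ x_std))["x_std"])  # slope per standard deviation
b_back = b_std / sd(x)                         # rescaled to original units

print(c(raw = b_raw, rescaled = b_back))
```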


## Comparison

We can compare the results and speed of our computation with those of Poisson regression packages. Be warned: they give the same results, but they take much longer to run.

First, we can fit the Poisson regression with the `glm` function, which ships with base R in the `stats` package.

In [4]:
tic()

glm_pois = glm(as.formula(
        paste("trade ~ ", 
               paste(grep("PORTER_TIME_FE", names(tradedata), value=TRUE), collapse=" + "), 
               " + ln_DIST + CNTG + LANG + CLNY")),
        family = quasipoisson,
        data=subset(tradedata, exporter!=importer) )

toc()

glm_pois$coefficients[regressorsIndices]

This gives the same results but is much slower.
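
One way to see why the two approaches can agree: when a Poisson regression includes exporter and importer dummies, its first-order conditions force the fitted flows to reproduce the observed totals for each exporter and each importer, which is the same margin constraint that IPFP imposes. The sketch below checks this on simulated data; the country labels, `ln_dist`, and the coefficients used to simulate `trade` are all hypothetical.

```r
# Simulated five-country example (all values are made up for illustration)
set.seed(1)
ctry = c("A", "B", "C", "D", "E")
toy = expand.grid(exporter = ctry, importer = ctry, stringsAsFactors = FALSE)
toy = toy[toy$exporter != toy$importer, ]          # drop internal flows
toy$ln_dist = runif(nrow(toy), 5, 9)
toy$trade = exp(8 - 0.8 * toy$ln_dist + rnorm(nrow(toy), sd = 0.3))

# Poisson pseudo-likelihood with exporter and importer fixed effects
fit = glm(trade ~ factor(exporter) + factor(importer) + ln_dist,
          family = quasipoisson, data = toy)

# Fitted and observed totals by exporter and by importer
fit_exp = tapply(fitted(fit), toy$exporter, sum)
obs_exp = tapply(toy$trade, toy$exporter, sum)
fit_imp = tapply(fitted(fit), toy$importer, sum)
obs_imp = tapply(toy$trade, toy$importer, sum)

print(rbind(fitted = fit_exp, observed = obs_exp))
print(rbind(fitted = fit_imp, observed = obs_imp))
```

The fitted and observed rows should coincide up to the convergence tolerance of `glm`.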

We can also use the `ppml` function from the `gravity` package.

In [15]:
#install.packages("gravity")
library(gravity)
tic()

grav_pois = ppml('trade', 'DIST', c(grep("PORTER_TIME_FE", names(tradedata), value=TRUE), 'CNTG', 'LANG', 'CLNY'),
    vce_robust = FALSE, data = subset(tradedata, exporter!=importer))

toc()

grav_pois$coefficients[c("dist_log", "CNTG", "LANG", "CLNY"), 1]

383.08 sec elapsed


Again, this gives the same results but is much slower!