Generalized Method of Moments
=============================

*Generalized method of moments* (GMM) is an estimation principle that
extends *method of moments*. It seeks the parameter that minimizes a
quadratic form of the moments. It is particularly useful in estimating
structural models in which moment conditions can be derived from
economic theory. GMM has emerged as one of the most popular estimators
in modern econometrics, and it includes conventional methods such as
two-stage least squares (2SLS) and three-stage least squares (3SLS) as
special cases.
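
As a sketch in standard textbook notation (the symbols $g$, $w_i$, $W$ and $\theta_0$ are introduced here for illustration only and are not used elsewhere in this document), suppose the model implies the moment conditions $E[g(w_i, \theta_0)] = 0$. The GMM estimator then minimizes a quadratic form in the sample moments:

$$
\hat{\theta} = \arg\min_{\theta} \, \bar{g}_n(\theta)' \, W \, \bar{g}_n(\theta),
\qquad
\bar{g}_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} g(w_i, \theta),
$$

where $W$ is a positive-definite weighting matrix. When there are as many moments as parameters, the sample moments can typically be set exactly to zero and the estimator reduces to the method of moments, whatever $W$ is.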

**R Example**

The CRAN package [gmm](http://cran.r-project.org/web/packages/gmm/index.html) provides an interface for GMM estimation. In this document we demonstrate it in a nonlinear model.

[Bruno Rodrigues](http://www.brodrigues.co/pages/aboutme.html) shared [his example](http://www.brodrigues.co/blog/2013-11-07-gmm-with-rmd/) with detailed instructions and discussion.
(Update: as of Aug 19, 2018, the data link in his post no longer works. I tracked down the original dataset and converted it so that the example runs.)
Unfortunately, I find that his example does not reflect the essence of GMM: it treats the *method of moments* as the *generalized method of moments*. He worked with a just-identified model, in which the choices of **type** and **wmatrix** in his call
```
my_gmm <- gmm(moments, x = dat, t0 = init, type = "iterative", crit = 1e-25, wmatrix = "optimal", method = "Nelder-Mead", control = list(reltol = 1e-25, maxit = 20000))
```
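are simply irrelevant. In a just-identified model the sample moments can generally be set exactly to zero, so the weighting matrix does not affect the minimizer. Experimenting with different options of **type** and **wmatrix**, we find exactly the same point estimates and variances.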
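
A minimal sketch of this irrelevance, assuming a simulated linear IV model rather than his nonlinear one. The data-generating process and the helper `linear_gmm` are hypothetical and use base R only, not the **gmm** package. With two instruments and two parameters, the closed-form linear GMM estimator gives the same answer for any weighting matrix and coincides with the simple IV estimator.

```r
set.seed(1)
n <- 500
z <- cbind(1, rnorm(n))              # instruments: constant + one IV
v <- rnorm(n)
x <- cbind(1, 0.8 * z[, 2] + v)      # regressors: constant + one endogenous variable
y <- x %*% c(1, 2) + 0.5 * v + rnorm(n)

# Linear GMM estimator for a given weighting matrix W:
# beta(W) = (X'Z W Z'X)^{-1} X'Z W Z'y
linear_gmm <- function(y, x, z, W) {
  xz <- crossprod(x, z)              # X'Z
  solve(xz %*% W %*% t(xz), xz %*% W %*% crossprod(z, y))
}

W_identity <- diag(ncol(z))          # identity weighting
W_2sls <- solve(crossprod(z) / n)    # 2SLS-type weighting, (Z'Z/n)^{-1}

b1 <- linear_gmm(y, x, z, W_identity)
b2 <- linear_gmm(y, x, z, W_2sls)
b_iv <- solve(crossprod(z, x), crossprod(z, y))  # simple IV: (Z'X)^{-1} Z'y

cbind(b1, b2, b_iv)                  # the three columns agree up to rounding error
```

The weighting matrix starts to matter only once there are more instruments than parameters, which is the case illustrated in the rest of this document.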

Below I illustrate the nonlinear GMM in an over-identified system. First we import the data and add a constant. 

```
# load the data
library(Ecdat, quietly = TRUE, warn.conflicts = FALSE)
data(Benefits)
g <- Benefits

g$const <- 1 # add the constant
# keep the outcome (ui), the constant, and the covariates used below
g1 <- g[, c("ui", "const", "age", "dkids", "dykids", "head", "sex", "married", "rr")]

head(g)

# to change the factors (ui, dkids, dykids, head, sex, married) into numbers
for (j in c(1, 4, 5, 6, 7, 8)) {
  g1[, j] <- as.numeric(g1[, j]) - 1
}
```

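| stateur | statemb | state | age | tenure | joblost | nwhite | school12 | sex | bluecol | smsa | married | dkids | dykids | yrdispl | rr | head | ui | const |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4.5 | 167 | 42 | 49 | 21 | other | no | no | male | yes | yes | no | no | no | 7 | 0.290631 | yes | yes | 1 |
| 10.5 | 251 | 55 | 26 | 2 | slack_work | no | no | male | yes | yes | no | yes | yes | 10 | 0.520202 | yes | no | 1 |
| 7.2 | 260 | 21 | 40 | 19 | other | no | yes | female | yes | yes | yes | no | no | 10 | 0.4324895 | yes | yes | 1 |
| 5.8 | 245 | 56 | 51 | 17 | slack_work | yes | no | female | yes | yes | yes | no | no | 10 | 0.5 | no | yes | 1 |
| 6.5 | 125 | 58 | 33 | 1 | slack_work | no | yes | male | yes | yes | yes | yes | yes | 4 | 0.390625 | yes | no | 1 |
| 7.5 | 188 | 11 | 51 | 3 | other | no | no | male | yes | yes | yes | no | no | 10 | 0.4822007 | yes | yes | 1 |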


R's OLS function **lm** adds the intercept by default. In contrast, we have to specify the moments from scratch in **gmm**. The constant, a column of ones, must be included explicitly in the data matrix.

Next, we define the logistic function and the moment conditions. 

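```
# logistic link: maps the index data %*% theta into (0, 1)
logistic <- function(theta, data) {
  return(1 / (1 + exp(-data %*% theta)))
}

# moment function: row i is z_i * (y_i - logistic(x_i theta))
moments <- function(theta, data) {
  y <- as.numeric(data[, 1])               # outcome: ui
  x <- data.matrix(data[, c(2:3, 6:8)])    # regressors: const, age, head, sex, married
  z <- data.matrix(data[, c(2, 4, 5:9)])   # IVs: const, dkids, dykids, head, sex, married, rr
  # more IVs (7) than regressors (5). Over-identified.
  m <- z * as.vector((y - logistic(theta, x)))
  return(cbind(m))
}
```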

Here I naively adapt Bruno Rodrigues's example and specify the moments as
$$
E[z_i \epsilon_i] = E[ z_i ( y_i - \mathrm{ logistic }(x_i \beta ) )] = 0
$$
However, such a specification can hardly be motivated by the economic theory of random utility models.


Finally, we call the GMM function and display the results. A numerical optimization algorithm needs an initial value. In general, it is advisable to try many initial values (at least dozens) unless one can show that the minimizer is unique in the model.

```
library(gmm)  # load the library "gmm"

# starting values from an OLS fit
init <- (lm(ui ~ age + dkids + head + sex, data = g1))$coefficients
my_gmm <- gmm(moments, x = g1, t0 = init, type = "twoStep", wmatrix = "optimal")
summary(my_gmm)
```

```
Loading required package: sandwich

Call:
gmm(g = moments, x = g1, t0 = init, type = "twoStep", wmatrix = "optimal")

Method:  twoStep 

Kernel:  Quadratic Spectral(with bw =  0.38316 )

Coefficients:
             Estimate     Std. Error   t value      Pr(>|t|)   
(Intercept)   1.5557e-01   2.6093e-01   5.9621e-01   5.5103e-01
age           1.6488e-02   7.6263e-03   2.1620e+00   3.0619e-02
dkids        -1.4357e-01   8.4954e-02  -1.6900e+00   9.1036e-02
head         -6.6180e-02   8.5809e-02  -7.7125e-01   4.4056e-01
sex           2.8576e-01   7.0287e-02   4.0656e+00   4.7903e-05

J-Test: degrees of freedom is 2 
                J-test    P-value 
Test E(g)=0:    5.160301  0.075763

Initial values of the coefficients
(Intercept)         age       dkids        head         sex 
 0.24913747  0.01357652 -0.11525498 -0.08022626  0.28346400 

#############
Information related to the numerical optimization
Convergence code =  0 
Function eval. =  486 
Gradian eval. =  NA 
```

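In the summary, the $J$ statistic is 5.16 with 2 degrees of freedom (7 moment conditions minus 5 parameters), and its p-value is 0.076. The over-identifying restrictions are therefore rejected at the 10% level, though not at the 5% level, which casts doubt on the moment conditions. The model requires further modification. Note also that the coefficient labels in the summary are inherited from the names of **init**, whereas the regressors used inside **moments** are the constant, age, head, sex and married, so the labels do not line up with the regressors.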
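
The reported p-value can be reproduced by hand. The sketch below assumes only the numbers printed in the summary and the dimensions of the moment function (7 instruments, 5 parameters). The variable names are illustrative.

```r
j_stat <- 5.160301        # J statistic reported in the summary
n_moments <- 7            # number of instruments (columns of z in moments)
n_params <- 5             # number of parameters in theta
df <- n_moments - n_params  # degrees of over-identification

# upper-tail probability of a chi-squared distribution with df degrees of freedom
p_value <- pchisq(j_stat, df = df, lower.tail = FALSE)
round(p_value, 4)         # about 0.0758, matching the summary
```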

P.S.: In my personal experience, caution must be exercised when using **gmm** in R for nonlinear models. Sometimes the estimates can be unreliable, perhaps because of the shape of the criterion function when there are several parameters. Simulation experiments are strongly recommended before trusting the estimates.
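
One simple safeguard is to restart the optimizer from many initial values and keep the best run. The sketch below is a minimal illustration with base R's `optim`. The one-parameter `criterion` is a toy function with two local minima standing in for a GMM objective, and `multi_start` is a hypothetical helper, not part of the **gmm** package.

```r
# Toy criterion with two local minima (hypothetical stand-in for a GMM objective)
criterion <- function(theta) (theta^2 - 1)^2 + 0.3 * theta

# Run optim() from each starting value and keep the run with the lowest criterion
multi_start <- function(fn, inits, ...) {
  runs <- lapply(inits, function(t0) optim(par = t0, fn = fn, method = "BFGS", ...))
  values <- vapply(runs, function(r) r$value, numeric(1))
  list(best = runs[[which.min(values)]], values = values)
}

inits <- seq(-2, 2, length.out = 9)   # a grid of starting values
res <- multi_start(criterion, inits)

res$best$par           # minimizer with the lowest criterion value
round(res$values, 4)   # criterion value reached from each start
```

If the runs end at clearly different criterion values, as they do here, a single starting value may stop at a local minimum and its estimate should not be trusted.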