The following question about GMM estimation where we know
\begin{equation*}
\hat{\sigma} = \frac{\left( 1\;\; 2\frac{1}{n}\sum_{i=1}^n x_i  \right) \hat{\mathbf{W}} \left( \begin{array} \frac{1}{n}\sum_{i=1}^n x_i \\\ \frac{1}{n}\sum_{i=1}^n x^2_i \end{array} \right)}{\left( 1\;\; 2\frac{1}{n}\sum_{i=1}^n x_i  \right) \hat{ \mathbf{W}} \left( \begin{array}{c} 1 \\\ 2\frac{1}{n}\sum_{i=1}^n x_i \end{array} \right)}
\end{equation*}
and the asymptotic distribution is
\begin{equation*}
\sqrt{n}(\sigma - \hat{\sigma}) \xrightarrow{d} (0, \mathbf{(Q'WQ)^{-1} Q'W \Omega WQ)(Q'WQ)^{-1}})
\end{equation*}
with 
\begin{equation*}
\Omega = \left( \begin{array} \mathbf{E}[(x-\sigma)^2] \;\; \mathbf{E}[(x-\sigma)^3] \\\ \mathbf{E}[(x-\sigma)^3] \;\; \mathbf{E}[(x-\sigma)^4 - \sigma^{4}] \end{array} \right) 
\\\
\\\
\mathbf{Q} = (-1 \;\; -2\mathbf{E}[X])
\end{equation*}
Q1. comparing standard error and 95% confidence interval of using identity matrix or optimal weight matrix as W to estimate enclosed dataset.

Q2. perform overidentification test and conclude whether the model is correctly specified.

\begin{equation*}
\mathbf{H}_0 : \left( \begin{array} \mathbf{E}[x - \sigma] \\\ \mathbf{E}[x^2 - 2x\sigma] \end{array} \right) = \left( \begin{array}{c}0  \\\ 0 \end{array} \right)
\end{equation*}

In [156]:
import math
import numpy as np
from numpy.linalg import inv
from scipy.stats import chi2

In [103]:
#load data
df = np.loadtxt('6_2.txt')
df.shape, df.ndim

((500,), 1)

In [115]:
#define GMM estimator
def gmm(x, W):
    #generate elements
    x_bar = x.mean()
    x_squared_bar = np.square(x).mean()
    first_matrix = np.matrix([1, 2*x_bar])
    second_matrix = np.matrix([x_bar, x_squared_bar]).T
    nominator = first_matrix * W * second_matrix
    denominator = first_matrix * W * first_matrix.T
    #calculate sigma
    sigma = nominator / denominator
    #calculate omega
    x_cent = x-sigma
    omega = np.matrix([[np.power(x_cent, 2).mean(), np.power(x_cent, 3).mean()], [np.power(x_cent, 3).mean(), (np.power(x_cent, 4) - sigma**4).mean()]])
    #calculate asymptotic variance  
    var_n = inv(nominator) * first_matrix * W * omega * W * first_matrix.T * inv(nominator)
    #eliminate effect of n^(1/2)
    var = var_n / x.shape[0]
    return sigma[0, 0], omega, var[0, 0]

In [116]:
#pass in idntity matrix
sigma_i, omega, var_i = gmm(df, np.identity(2))

since optimal weight matrix is $\Omega ^{-1}$, 
we pass inversed omega into gmm function to get optimal estimation

In [118]:
sigma_o, omega_o, var_o = gmm(df, n(2))

## answer to Q1

In [148]:
def ci(sigma, var):
    std = math.sqrt(var)
    ci_l = sigma - 1.96 * std
    ci_h = sigma + 1.96 * std
    print(f"mean : {sigma:0.2f}, 95% CI : ({ci_l:.2f}, {ci_h:.2f})")


In [149]:
print('identity matrix')
ci(sigma_i, var_i)
print('optimal weight matrix')
ci(sigma_o, var_o)

identity matrix
mean : 1.05, 95% CI : (0.91, 1.20)
optimal weight matrix
mean : 1.02, 95% CI : (0.93, 1.11)


## answer to Q2

In [161]:
g_n = np.matrix([df.mean() - sigma_o, np.square(df).mean() - 2*df.mean()*sigma_o])
J = df.shape[0] * g_n * inv(omega) * g_n.T
print(f'J statistic is {J[0, 0]:0.2f}')
#chisquare distribution probability of 95%, IV 2, variable 1, degree of freedom = 2 - 1
print(f'the critical value is {chi2.ppf(0.95, 2-1):0.2f}')

J statistic is 0.38
the critical value is 3.84


Since J statistic is smaller than the critical value, We can not reject the null hypothesis at 5% level.