Home
JanDitzen edited this page Nov 22, 2018
-------------------------------------------------------------------------------------------------
help xtdcce2 v. 1.33 - 24. August 2018
-------------------------------------------------------------------------------------------------
Title
xtdcce2 - estimating heterogeneous coefficient models using common correlated effects in a
dynamic panel with a large number of observations over groups and time periods.
Syntax
xtdcce2 depvar [indepvars] [varlist2 = varlist_iv] [if] [in] , crosssectional(varlist) [
pooled(varlist) cr_lags(string) nocrosssectional ivreg2options(string) e_ivreg2
ivslow noisily lr(varlist) lr_options(string) pooledconstant reportconstant
noconstant trend pooledtrend jackknife recursive nocd showindividual fullsample fast
NOOMITted]
where varlist2 are endogenous variables and varlist_iv the instruments.
Data has to be xtset before using xtdcce2; see tsset. varlists may contain time-series
operators, see tsvarlist, or factor variables, see fvvarlist.
xtdcce2 requires the moremata package.
Contents
Description
Options
Econometric and Empirical Model
Saved Values
Postestimation commands
Examples
References
About
Description
xtdcce2 estimates a heterogeneous coefficient model in a large panel with dependence between
cross sectional units. A panel is large if the number of cross-sectional units (or groups)
and the number of time periods are going to infinity.
It fits the following estimation methods:
i) The Mean Group Estimator (MG, Pesaran and Smith 1995),
ii) The Common Correlated Effects Estimator (CCE, Pesaran 2006), and
iii) The Dynamic Common Correlated Effects Estimator (DCCE, Chudik and Pesaran 2015).
For a dynamic model, several methods to estimate long run effects are possible:
a) The Pooled Mean Group Estimator (PMG, Shin et al. 1999) based on an Error Correction
Model,
b) The Cross-Sectional Augmented Distributed Lag (CS-DL, Chudik et al. 2016) estimator
which directly estimates the long run coefficients from a dynamic equation, and
c) The Cross-Sectional ARDL (CS-ARDL, Chudik et al. 2016) estimator using an ARDL model.
For a further discussion see Ditzen (2018b). Additionally xtdcce2 tests for cross sectional
dependence (see xtcd2) and supports instrumental variable estimations (see ivreg2).
Options
crosssectional(varlist) defines the variables which are added as cross sectional averages to
the equation. Variables in crosssectional() may be included in pooled(),
exogenous_vars(), endogenous_vars() and lr(). Variables in crosssectional() are
partialled out; their coefficients are neither estimated nor reported.
crosssectional(_all) adds all variables as cross sectional averages. No cross sectional
averages are added if crosssectional(_none) is used, which is equivalent to
nocrosssectional. crosssectional() is a required option but can be substituted by
nocrosssectional.
pooled(varlist) specifies variables whose estimated coefficients are constrained to be equal
across all cross sectional units. Variables may occur in indepvars. Variables in
exogenous_vars(), endogenous_vars() and lr() may be pooled as well.
cr_lags(integers) sets the number of lags of the cross sectional averages. If
not defined but crosssectional() contains a varlist, then only
contemporaneous cross sectional averages are added but no lags. cr_lags(0)
is the equivalent. The number of lags can be different for different
variables, where the order is the same as defined in cr(). For example if
cr(y x) and only contemporaneous cross-sectional averages of y but 2 lags of
x are added, then cr_lags(0 2).
nocrosssectional suppresses adding any cross sectional averages. Results will be
equivalent to the Mean Group estimator.
pooledconstant restricts the constant term to be the same across all cross
sectional units.
reportconstant reports the constant term. If not specified the constant is
partialled out.
noconstant suppresses the constant term.
xtdcce2 supports IV regressions using ivreg2. The IV specific options are:
ivreg2options(string) passes further options to ivreg2, see ivreg2, options.
e_ivreg2 posts all available results from ivreg2 in e() with prefix ivreg2_, see ivreg2,
macros.
noisily displays output of ivreg2.
ivslow: For the calculation of standard errors for pooled coefficients an auxiliary
regression is performed. In case of an IV regression, xtdcce2 runs a simple IV
regression for the auxiliary regression, which is faster. If the option ivslow is used,
then xtdcce2 calls ivreg2 for the auxiliary regression. This is advisable as soon as
ivreg2 specific options are used.
xtdcce2 is able to estimate long run coefficients. Three models are supported: the pooled
mean group model (Shin et al. 1999), similar to xtpmg, the CS-DL method (see xtdcce2, csdl)
and the CS-ARDL method (see xtdcce2, ardl), both developed in Chudik et al. (2016). No options
for the CS-DL model are necessary.
lr(varlist) specifies the variables to be included in the long-run cointegration vector.
The first variable(s) is/are the error-correction speed of adjustment term. The default
is to use the pmg model. In this case each estimated coefficient is divided by the
negative of the long-run cointegration vector (the first variable). If the option ardl
is used, then the long run coefficients are estimated as the sum over the coefficients
relating to a variable, divided by the sum of the coefficients of the dependent variable.
lr_options(string), options for the long run coefficients. Options are:
ardl estimates the CS-ARDL estimator. For further details see xtdcce2, ardl.
nodivide coefficients are not divided by the error correction speed of adjustment vector. Equation (7) is estimated, see xtdcce2, pmg.
xtpmgnames coefficient names in e(b_p_mg) (or e(b_full)) and e(V_p_mg) (or e(V_full)) match the name convention from xtpmg.
trend adds a linear unit specific trend. May not be combined with pooledtrend.
pooledtrend adds a linear common trend. May not be combined with trend.
Two methods for small sample time series bias correction are supported:
jackknife applies the jackknife bias correction method. May not be combined with
recursive.
recursive applies the recursive mean adjustment method. May not be combined with
jackknife.
nocd suppresses calculation of CD test. For details about the CD test see xtcd2.
showindividual reports unit individual estimates in output.
fullsample uses entire sample available for calculation of cross sectional
averages. Any observations which are lost due to lags will be included
calculating the cross sectional averages (but are not included in the
estimation itself).
fast omits calculation of unit specific standard errors.
noomitted suppresses the omitted variable checks.
Econometric and Empirical Model
Econometric Model
Assume the following dynamic panel data model with heterogeneous coefficients:
(1) y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + x(i,t-1)*b3(i) + u(i,t)
u(i,t) = g(i)*f(t) + e(i,t)
where f(t) is an unobserved common factor, g(i) a heterogeneous factor loading,
x(i,t) is a (1 x K) vector and b2(i) and b3(i) the coefficient vectors. The error e(i,t) is
iid and the heterogeneous coefficients b1(i), b2(i) and b3(i) are randomly distributed around
a common mean. It is assumed that x(i,t) is strictly exogenous. In the case of a static
panel model (b1(i) = 0) Pesaran (2006) shows that the mean of the coefficients b0, b2 and b3 (for
example for b2(mg) = 1/N sum(b2(i))) can be consistently estimated by adding cross sectional
means of the dependent and all independent variables. The cross sectional means approximate
the unobserved factors. In a dynamic panel data model (b1(i) <> 0) pT lags of the cross
sectional means are added to achieve consistency (Chudik and Pesaran 2015). The mean group
estimates for b1, b2 and b3 are consistently estimated as long as N,T and pT go to infinity.
This implies that the number of cross sectional units and time periods is assumed to grow
with the same rate. In an empirical setting this can be interpreted as N/T being constant.
A dataset with one dimension being large in comparison to the other would lead to
inconsistent estimates, even if both dimensions are large in numbers. For example a financial
dataset on stock markets returns on a monthly basis over 30 years (T=360) of 10,000 firms
would not be sufficient. While individually both dimensions can be interpreted as large, they
do not grow with the same rate and the ratio would not be constant. Therefore an estimator
relying on fixed T asymptotics and large N would be appropriate. On the other hand a dataset
with, say, N = 30 and T = 34 would qualify as appropriate, if N and T grow with the same
rate.
The variance of the mean group coefficient b1(mg) is estimated as:
var(b1(mg)) = 1/N sum(i=1,N) (b1(i) - b1(mg))^2
or, for the vector pi(mg) = (b0(mg),b1(mg)), as:
var(pi(mg)) = 1/N sum(i=1,N) (pi(i) - pi(mg))(pi(i) - pi(mg))'
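The two scalar formulas above can be sketched in a few lines of plain Python. This is only an illustration of the formulas exactly as printed in this help file (unweighted average, and the sum of squared deviations scaled by 1/N); it is not taken from the Mata code of xtdcce2, and the function names and input values are hypothetical.

```python
def mean_group(coefs):
    """Mean group estimate: unweighted average of the unit-specific coefficients."""
    if not coefs:
        raise ValueError("at least one unit-specific coefficient is required")
    return sum(coefs) / len(coefs)


def mean_group_variance(coefs):
    """Variance as printed in the text: 1/N * sum_i (b(i) - b(mg))^2."""
    b_mg = mean_group(coefs)
    return sum((b - b_mg) ** 2 for b in coefs) / len(coefs)


# Hypothetical unit-specific estimates b1(i) for N = 3 cross sectional units.
b1_units = [0.25, 0.5, 0.75]
b1_mg = mean_group(b1_units)
var_b1_mg = mean_group_variance(b1_units)
print(b1_mg, var_b1_mg)
```

The same averaging applies element by element to the vector pi(i); the outer product in the vector formula then yields a full variance/covariance matrix instead of a scalar.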
Empirical Model
The empirical model of equation (1) without the lag of variable x is:
(2) y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + sum[d(i)*z(i,s)] + e(i,t),
where z(i,s) is a (1 x K+1) vector including the cross sectional means at time s and the sum
is over s=t...t-pT. xtdcce2 supports several different specifications of equation (2).
xtdcce2 partials out the cross sectional means internally. For consistency of the cross
sectional specific estimates, the matrix z = (z(1,1),...,z(N,T)) has to be of full column
rank. This condition is checked for each cross section. xtdcce2 will return a warning if z
is not full column rank. It will, however, continue estimating the cross sectional specific
coefficients and then calculate the mean group estimates. The mean group estimates will be
consistent. For further reading, see Chudik and Pesaran (2015, Journal of Econometrics),
Assumption 6 and page 398.
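To make the construction of z(i,s) concrete, the following plain Python sketch computes the cross sectional mean of a single variable in a balanced panel and then builds its lags 0 to pT. It is a simplified illustration, not the internal routine of xtdcce2: the dictionary layout, the function names and the example values are assumptions, and missing leading observations are marked with None.

```python
def cross_sectional_means(panel):
    """Average a variable over all units for each time period.

    panel maps a unit id to a list of values on a common, balanced time index.
    """
    n_units = len(panel)
    n_periods = len(next(iter(panel.values())))
    return [
        sum(series[t] for series in panel.values()) / n_units
        for t in range(n_periods)
    ]


def lagged(series, lag):
    """Shift a series back by `lag` periods; the first `lag` entries are None."""
    return ([None] * lag + list(series))[: len(series)]


def build_z(panel, p_t):
    """Return the cross sectional means and their lags 1..p_t, keyed by lag."""
    means = cross_sectional_means(panel)
    return {lag: lagged(means, lag) for lag in range(p_t + 1)}


# Hypothetical balanced panel with two units and four time periods.
panel = {"A": [1.0, 2.0, 3.0, 4.0], "B": [3.0, 4.0, 5.0, 6.0]}
z = build_z(panel, p_t=2)
print(z[0])  # contemporaneous cross sectional means
print(z[1])  # first lag
print(z[2])  # second lag
```

Each added lag costs one observation at the start of the sample, which is why the lag order matters in short panels.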
i) Mean Group
If no cross sectional averages are added (d(i) = 0), then the estimator is the Mean Group
Estimator as proposed by Pesaran and Smith (1995). The estimated equation is:
(3) y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + e(i,t).
Equation (3) can be estimated by using the nocrosssectional option of xtdcce2. The model can be either
static (b1(i) = 0) or dynamic (b1(i) <> 0).
See example
ii) Common Correlated Effects
The model in equation (3) does not account for unobserved common factors between units. To
do so, cross sectional averages are added in the fashion of Pesaran (2006):
(4) y(i,t) = b0(i) + x(i,t)*b2(i) + d(i)*z(i,t) + e(i,t).
Equation (4) is the default equation of xtdcce2. Including the dependent and independent
variables in crosssectional() and setting cr_lags(0) leads to the same result.
crosssectional() defines the variables to be included in z(i,t). Note that b1(i) is set to
zero.
See example
iii) Dynamic Common Correlated Effects
If a lag of the dependent variable is added, endogeneity occurs and adding solely
contemporaneous cross sectional averages is not sufficient any longer to achieve consistency.
Chudik and Pesaran (2015) show that consistency is gained if pT lags of the cross sectional
averages are added:
(5) y(i,t) = b0(i) + b1(i)*y(i,t-1) + x(i,t)*b2(i) + sum [d(i)*z(i,s)] + e(i,t).
where s = t,...,t-pT. Equation (5) is estimated if the option cr_lags() contains a positive
number.
See example
iv) Pooled Estimators
Equations (3) - (5) can be constrained so that the parameters are the same across units. Hence
the equations become:
(3-p) y(i,t) = b0 + b1*y(i,t-1) + x(i,t)*b2 + e(i,t),
(4-p) y(i,t) = b0 + x(i,t)*b2 + d(i)*z(i,t) + e(i,t),
(5-p) y(i,t) = b0 + b1*y(i,t-1) + x(i,t)*b2 + sum [d(i)*z(i,s)] + e(i,t).
Variables with pooled (homogenous) coefficients are specified using the pooled(varlist)
option. The constant is pooled by using the option pooledconstant. In case of a pooled
estimation, the standard errors are obtained from a mean group regression. This regression
is performed in the background. See Pesaran (2006).
See example
v) Instrumental Variables
xtdcce2 supports estimations of instrumental variables by using the ivreg2 package.
Endogenous variables (to be instrumented) are defined in varlist2 and their instruments are
defined in varlist_iv.
See example
vi) Error Correction Models (Pooled Mean Group Estimator)
As an intermediate between the mean group and a pooled estimation, Shin et al. (1999)
differentiate between homogenous long run and heterogeneous short run effects. Therefore the
model includes mean group as well as pooled coefficients. Equation (1) (without the lag of
the explanatory variable x and for a better readability without the cross sectional averages)
is transformed into an ARDL model:
(6) y(i,t) = phi(i)*(y(i,t-1) - w0(i) - x(i,t)*w2(i)) + g1(i)*[y(i,t)-y(i,t-1)] + [x(i,t) - x(i,t-1)] * g2(i) + e(i,t),
where phi(i) is the error correction speed of adjustment term, w(i) captures the long run
effects and g1(i) and g2(i) the short run effects. Shin et al. (1999) estimate the long run
coefficients by ML and the short run coefficients by OLS. xtdcce2 estimates a slightly
different version by OLS:
(7) y(i,t) = o0(i) + phi(i)*y(i,t-1) + x(i,t)*o2(i) + g1(i)*[y(i,t)-y(i,t-1)] + [x(i,t) - x(i,t-1)] * g2(i) + e(i,t),
where w2(i) = - o2(i) / phi(i) and w0(i) = - o0(i)/phi(i). Equation (7) is estimated by
including the levels of y and x as long run variables using the lr(varlist) and pooled(
varlist) options and adding the first differences as independent variables. xtdcce2
estimates equation (7) but automatically calculates estimates for w(i) = (w0(i),...,wk(i)).
The advantage of estimating equation (7) by OLS is that it is possible to use IV regressions and
add cross sectional averages to account for dependencies between units. The
variance/covariance matrix is calculated using the delta method; for further discussion,
see Ditzen (2018).
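The step from equation (7) back to the long run coefficient, w2(i) = - o2(i) / phi(i), and a first-order delta method variance for it can be sketched as follows. This is a minimal illustration for a single coefficient of a single unit under the assumption that the variances and the covariance of o2 and phi are given; the function name and all numbers are hypothetical, and the actual computation in xtdcce2 works on the full coefficient vector.

```python
def long_run_with_delta(o2, phi, var_o2, var_phi, cov_o2_phi=0.0):
    """Long run coefficient w2 = -o2/phi and its delta method variance."""
    if phi == 0:
        raise ValueError("phi must be non-zero to recover a long run coefficient")
    w2 = -o2 / phi
    # Gradient of w2 with respect to (o2, phi).
    d_o2 = -1.0 / phi
    d_phi = o2 / phi ** 2
    # First-order approximation: gradient' * V * gradient.
    var_w2 = (
        d_o2 ** 2 * var_o2
        + d_phi ** 2 * var_phi
        + 2.0 * d_o2 * d_phi * cov_o2_phi
    )
    return w2, var_w2


# Hypothetical estimates from equation (7) for one unit.
w2, var_w2 = long_run_with_delta(o2=0.25, phi=-0.5, var_o2=0.04, var_phi=0.01)
print(w2, var_w2)
```

The same transformation with o0(i) in place of o2(i) gives w0(i). With the option nodivide this step is skipped and the coefficients of equation (7) are reported as estimated.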
See example
vii) Cross-Section Augmented Distributed Lag (CS-DL)
Chudik et al. (2016) show that the long run effect of variable x on variable y in equation
(1) can be directly estimated. Therefore they fit the following model, based on equation
(1):
(8) y(i,t) = w0(i) + x(i,t) * w2(i) + delta(i) * (x(i,t) - x(i,t-1)) + sum [d(i)*z(i,s)] + e(i,t)
where w2(i) is the long run effect and sum [d(i)*z(i,s)] the cross-sectional averages with an
appropriate number of lags. To account for the lags of the dependent variable, the
corresponding number of first differences are added. If the model is an ARDL(1,1), then only
the first difference of the explanatory variable is added. In the case of an ARDL(1,2)
model, the first and the second difference are added. The advantage of the CS-DL approach
is that no short run coefficients need to be estimated.
A general ARDL(py,px) model is estimated by:
(8) y(i,t) = w0(i) + x(i,t) * w2(i) + sum(l=0,px-1) delta(i,l) * (x(i,t-l) - x(i,t-l-1)) + sum [d(i)*z(i,s)] + e(i,t)
The mean group coefficients are calculated as the unweighted averages of all cross-sectional
specific coefficient estimates. The variance/covariance matrix is estimated as in the case
of a Mean Group Estimation.
See example
viii) Cross-Section Augmented ARDL (CS-ARDL)
As an alternative approach the long run coefficients can be estimated by first estimating the
short run coefficients and then the long run coefficients. For a general ARDL(py,px) model
including cross-sectional averages such as:
(9) y(i,t) = b0(i) + sum(l=1,py) b1(i,l) y(i,t-l) + sum(l=0,px) b2(i,l) x(i,t-l) + sum [d(i)*z(i,s)] + e(i,t),
the long run coefficients for the independent variables are calculated as:
(10) w2(i) = sum(l=0,px) b2(i,l) / ( 1 - sum(l=1,py) b1(i,l))
and for the dependent variable as:
(11) w1(i) = 1 - sum(l=1,py) b1(i,l).
This is the CS-ARDL estimator in Chudik et al. (2016). The variables belonging to the same
long run coefficient need to be enclosed in parentheses, or time-series operators (see
tsvarlist) need to be used. For example coding lr(y x L.x) is equivalent to lr(y (x lx)),
where lx is a variable containing the first lag of x (lx = L.x).
The disadvantage of this approach is that py and px need to be known. The
variance/covariance matrix is calculated using the delta method, see Ditzen (2018b).
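Equations (10) and (11) amount to simple sums over the estimated short run coefficients. The plain Python sketch below shows the arithmetic for one unit and one explanatory variable; the function name and the coefficient values are hypothetical and only serve as an illustration.

```python
def cs_ardl_long_run(b1_lags, b2_lags):
    """Long run coefficients from ARDL(py,px) short run estimates.

    b1_lags: coefficients on y(i,t-1), ..., y(i,t-py)
    b2_lags: coefficients on x(i,t), ..., x(i,t-px)
    Returns (w1, w2) as in equations (11) and (10).
    """
    w1 = 1.0 - sum(b1_lags)          # equation (11)
    if w1 == 0:
        raise ValueError("sum of the lag coefficients of y equals one")
    w2 = sum(b2_lags) / w1           # equation (10)
    return w1, w2


# Hypothetical ARDL(2,2) short run estimates for one unit.
w1, w2 = cs_ardl_long_run(b1_lags=[0.5, 0.25], b2_lags=[0.125, 0.125, 0.25])
print(w1, w2)
```

A sum of the lag coefficients of y close to one makes the denominator small, so the long run coefficient becomes very sensitive to estimation error in the short run coefficients.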
See example
Saved Values
xtdcce2 stores the following in e():
Scalars
e(N) number of observations
e(N_g) number of groups (cross sectional units)
e(T) number of time periods
e(K_mg) number of regressors (excluding variables partialled out)
e(N_partial) number of partialled out variables
e(N_omitted) number of omitted variables
e(N_pooled) number of pooled (homogenous) coefficients
e(mss) model sum of squares
e(rss) residual sum of squares
e(F) F statistic
e(rmse) root mean squared error
e(df_m) model degrees of freedom
e(df_r) residual degree of freedom
e(r2) R-squared
e(r2_a) R-squared adjusted
e(cd) CD test statistic
e(cdp) p-value of CD test statistic
e(Tmin) minimum time (only unbalanced panels)
e(Tbar) average time (only unbalanced panels)
e(Tmax) maximum time (only unbalanced panels)
e(cr_lags) number of lags of cross sectional averages
Macros
e(tvar) name of time variable
e(idvar) name of unit variable
e(depvar) name of dependent variable
e(indepvar) name of independent variables
e(omitted) omitted variables
e(lr) variables in long run cointegration vector
e(pooled) pooled (homogenous) coefficients
e(cmd) command line
e(cmdline) command line including options
e(insts) instruments (exogenous) variables (only IV)
e(istd) instrumented (endogenous) variables (only IV)
e(version) xtdcce2 version, if xtdcce2, version used.
Matrices
e(b) coefficient vector
e(V) variance-covariance matrix
e(bi) coefficient vector of individual and pooled coefficients
e(Vi) variance-covariance matrix of individual and pooled coefficients
Estimated long run coefficients of the ARDL model are marked with the prefix lr_.
Functions
e(sample) marks estimation sample
Postestimation Commands
predict and estat can be used after xtdcce2.
predict
The syntax for predict is:
predict [type] newvar [if] [in] [, xb stdp residuals cfresiduals coefficients se partial replace ]
Options Description
-------------------------------------------------------------------------------------------------
xb calculate linear prediction
stdp calculate standard error of the prediction
residuals calculate residuals (e(i,t))
cfresiduals calculate residuals including the common factors (u(i,t))
coefficients a variable with the estimated cross section specific values for all coefficients is created. The name of the new variable is newvar_varname.
se as coefficient, but with standard error instead.
partial create new variables with the partialled out values.
replace replace the variable if existing.
-------------------------------------------------------------------------------------------------
xtdcce2 is able to calculate both residuals from equation (1). predict newvar , residuals
calculates e(i,t). That is, the residuals of the regression with the cross sectional
averages partialled out. predict newvar , cfresiduals calculates u(i,t) = g(i)*f(t) +
e(i,t). That is, the residuals including the common factors, approximated by the cross
sectional averages. Internally, the fitted values are calculated and then subtracted
from the dependent variable. Therefore it is important to note that, if a constant is
used, the constant needs to be reported using the xtdcce2 option reportconstant.
Otherwise u(i,t) includes the constant as well (u(i,t) = b0(i) + g(i)*f(t) + e(i,t)).
estat
estat can be used to create a box, bar or range plot. The syntax is:
estat graphtype [varlist] [if] [in] [, combine(string) individual(string) nomg cleargraph]
graphtype Description
-------------------------------------------------------------------------------------------------
box box plot; see graph box
bar bar plot; see graph bar
rcap range plot; see twoway rcap
-------------------------------------------------------------------------------------------------
Options Description
-------------------------------------------------------------------------------------------------
individual(string) passes options for individual graphs (only bar and rcap); see twoway_options
combine(string) passes options for combined graphs; see twoway_options
nomg mean group point estimate and confidence interval are not included in bar and range plot graphs
cleargraph clears the option of the graph command and is best used in combination with the combine() and individual() options
-------------------------------------------------------------------------------------------------
The name of the combined graph is saved in r(graph_name).
Examples
An example dataset of the Penn World Tables 8 is available for download here. The dataset
contains yearly observations from 1960 until 2007 and is already tsset. To estimate a growth
equation the following variables are used: log_rgdpo (real GDP), log_hc (human capital),
log_ck (physical capital) and log_ngd (population growth + break even investments of 5%).
Mean Group Estimation
To estimate equation (3), the option nocrosssectional is used. In order to obtain estimates
for the constant, the option reportconstant is enabled.
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , nocross reportc
Omitting reportconstant leads to the same result, however the constant is partialled out:
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , nocross
Common Correlated Effect
Common Correlated effects (static) models can be estimated in several ways. The first
possibility is to add all variables as cross sectional averages using cr(_all):
xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , cr(_all) reportc
Note that, as this is a static model, the lagged dependent variable does not occur and only
contemporaneous cross sectional averages are used. Defining all independent and dependent
variables in crosssectional(varlist) leads to the same result:
xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , reportc cr(d.log_rgdpo log_hc log_ck log_ngd)
The default for the number of cross sectional lags is zero, implying only contemporaneous
cross sectional averages are used. Finally the number of lags can be specified as well using
the cr_lags option.
xtdcce2 d.log_rgdpo log_hc log_ck log_ngd , reportc cr(d.log_rgdpo log_hc log_ck log_ngd) cr_lags(0)
All three command lines are equivalent and lead to the same estimation results.
Dynamic Common Correlated Effect
The lagged dependent variable is added to the model again. To estimate the mean group
coefficients consistently, the number of lags is set to 3:
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , reportc cr(d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd) cr_lags(3)
Using predict
predict newvar, [options] can be used to obtain the linear prediction, the residuals, the
coefficients and the partialled out variables. To predict the residuals, the option residuals is used:
predict residuals, residuals
The residuals do not contain the partialled out factors, that is they are e(i,t) in equation
(1) and (2). To estimate u(i,t), the error term containing the common factors, option
cfresiduals is used:
predict uit, cfresiduals
In a similar fashion, the linear prediction (option xb, the default) and the standard error
of the prediction can be obtained. The unit specific estimates for each variable and the
standard error can be obtained using options coefficients and se. For example, obtain the
coefficients for log_hc from the regression above and calculate the mean, which should be the
same as the mean group estimate:
predict coeff, coefficients
sum coeff_log_hc
The partialled out variables can be obtained using
predict partial, partial
Then a regression on the variables would lead to the same results as above.
If the option replace is used, then the newvar is replaced if it exists.
Pooled Estimations
All coefficients can be pooled by including them in pooled(varlist). The constant is pooled
by using the pooledconstant option:
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd , reportc cr(d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd) pooled(L.log_rgdpo log_hc log_ck log_ngd) cr_lags(3) pooledconstant
Instrumental Variables
Endogenous variables can be instrumented by using options endogenous_vars(varlist) and
exogenous_vars(varlist). Internally ivreg2 estimates the individual coefficients. Using the
lagged level of physical capital as an instrument for the contemporaneous level, leads to:
xtdcce2 d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd (log_ck = L.log_ck), reportc cr(d.log_rgdpo L.log_rgdpo log_hc log_ck log_ngd) cr_lags(3) ivreg2options(nocollin noid)
Further ivreg2 options can be passed through using ivreg2options. Stored values in e() from
ivreg2 can be posted using the option e_ivreg2.
Error Correction Models (Pooled Mean Group Estimator)
Variables of the long run cointegration vector are defined in lr(varlist), where the first
variable is the error correction speed of adjustment term. To ensure homogeneity of the long
run effects, the corresponding variables have to be included in the pooled(varlist) option.
The following example is taken from Blackburne and Frank (2007) and uses the jasa2 dataset
(the dataset is available here from Pesaran's webpage):
xtdcce2 d.c d.pi d.y if year >= 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2)
xtdcce2 internally estimates equation (7) and then recalculates the long run coefficients,
such that estimation results for equation (6) are obtained. Equation (7) can be estimated by
adding nodivide to lr_options(). A second option is xtpmgnames in order to match the naming
convention from xtpmg.
xtdcce2 d.c d.pi d.y if year >= 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2) lr_options(nodivide)
xtdcce2 d.c d.pi d.y if year >= 1962 , lr(L.c pi y) p(L.c pi y) cr(_all) cr_lags(2) lr_options(xtpmgnames)
Cross-Section Augmented Distributed Lag (CS-DL)
Chudik et al. (2013) estimate the long run effects of public debt on output growth (the data
is available here on Kamiar Mohaddes' personal webpage). In the dataset, the dependent
variable is d.y and the independent variables are the inflation rate (dp) and debt to GDP
ratio (d.gd). For an ARDL(1,1,1) model only the first differences of dp and d.gd are added as
further covariates. Only the contemporaneous cross-sectional average (i.e. cr_lags(0)) of
the dependent variable and three lags of the cross-sectional averages of the independent
variables are added. The lag structure is implemented by defining a numlist rather than a
number in cr_lags(). For the example here cr_lags(0 3 3) is used, where the first number
refers to the first variable defined in cr(), the second to the second etc.
To replicate the results in Table 18, the following command line is used:
xtdcce2 d.y dp d.gd d.(dp d.gd), cr(d.y dp d.gd) cr_lags(0 3 3) fullsample
For an ARDL(1,3,3) model the first and second lags of the first differences are added by
putting L(0/2) in front of the d.(dp d.gd):
xtdcce2 d.y dp d.gd L(0/2).d.(dp d.gd), cr(d.y dp d.gd) cr_lags(0 3 3) fullsample
Note, the fullsample option is used to reproduce the results in Chudik et al. (2013).
Cross-Section Augmented ARDL (CS-ARDL)
Chudik et al. (2013) estimate a CS-ARDL model besides the CS-DL model. To estimate this
model all variables are treated as long run coefficients and thus added to varlist in lr(
varlist). xtdcce2 first estimates the short run coefficients and then calculates the long
run coefficients, following equation (10). The option lr_options(ardl) is used to invoke the
estimation of the long run coefficients. Variables with the same base (i.e. forming the same
long run coefficient) need to be either enclosed in parentheses or tsvarlist operators need
to be used. In Table 17 an ARDL(1,1,1) model is estimated with three lags of the
cross-sectional averages:
xtdcce2 d.y , lr(L.d.y dp L.dp d.gd L.d.gd) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample
xtdcce2 calculates the long run effects identifying the variables by their base. For example
it recognizes that dp and L.dp relate to the same variable. If the lag of dp is called ldp,
then the variables need to be enclosed in parentheses.
Estimating the same model but as an ARDL(3,3,3) and with enclosed parentheses reads:
xtdcce2 d.y , lr((L(1/3).d.y) (L(0/3).dp) (L(0/3).d.gd) ) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample
which is equivalent to coding without parentheses:
xtdcce2 d.y , lr(L(1/3).d.y L(0/3).dp L(0/3).d.gd) lr_options(ardl) cr(d.y dp d.gd) cr_lags(3) fullsample
References
Baum, C. F., M. E. Schaffer, and S. Stillman. 2007. Enhanced routines for instrumental
variables/generalized method of moments estimation and testing. Stata Journal 7(4):
465-506.
Chudik, A., K. Mohaddes, M. H. Pesaran, and M. Raissi. 2013. Debt, Inflation and Growth:
Robust Estimation of Long-Run Effects in Dynamic Panel Data Model.
Chudik, A., and M. H. Pesaran. 2015. Common correlated effects estimation of heterogeneous
dynamic panel data models with weakly exogenous regressors. Journal of Econometrics
188(2): 393-420.
Chudik, A., K. Mohaddes, M. H. Pesaran, and M. Raissi. 2016. Long-Run Effects in Large
Heterogeneous Panel Data Models with Cross-Sectionally Correlated Errors. Essays in Honor
of Aman Ullah: 85-135.
Ditzen, J. 2018. xtdcce2: Estimating Dynamic Common Correlated Effects in Stata. The Stata
Journal, forthcoming.
Ditzen, J. 2018b. Estimating long run effects in models with cross-sectional dependence using
xtdcce2.
Blackburne, E. F., and M. W. Frank. 2007. Estimation of nonstationary heterogeneous panels.
Stata Journal 7(2): 197-208.
Eberhardt, M. 2012. Estimating panel time series models with heterogeneous slopes. Stata
Journal 12(1): 61-71.
Feenstra, R. C., R. Inklaar, and M. Timmer. 2015. The Next Generation of the Penn World
Table. American Economic Review. www.ggdc.net/pwt
Jann, B. 2005. moremata: Stata module (Mata) to provide various functions. Available from
http://ideas.repec.org/c/boc/bocode/s455001.html.
Pesaran, M. 2006. Estimation and inference in large heterogeneous panels with a multifactor
error structure. Econometrica 74(4): 967-1012.
Pesaran, M. H., and R. Smith. 1995. Estimating long-run relationships from
dynamic heterogeneous panels. Journal of Econometrics 68: 79-113.
Shin, Y., M. H. Pesaran, and R. P. Smith. 1999. Pooled Mean Group Estimation of Dynamic
Heterogeneous Panels. Journal of the American Statistical Association 94(446): 621-634.
Author
Jan Ditzen (Heriot-Watt University)
Email: j.ditzen@hw.ac.uk
Web: www.jan.ditzen.net
I am grateful to Arnab Bhattacharjee, David M. Drukker, Markus Eberhardt, Tullio Gregori,
Erich Gundlach and Mark Schaffer, to the participants of the 2016 Stata Users Group
meeting in London and two anonymous referees of The Stata Journal for many valuable
comments and suggestions.
The routine to check for positive definite or singular matrices was provided by Mark
Schaffer, Heriot-Watt University, Edinburgh, UK.
xtdcce2 was formerly called xtdcce.
Please cite as follows:
Ditzen, J. 2018. xtdcce2: Estimating dynamic common correlated effects in Stata. The
Stata Journal. Forthcoming.
The latest versions can be obtained via net from http://www.ditzen.net/Stata and beta
versions including a full history of xtdcce2 from net from
http://www.ditzen.net/Stata/xtdcce2_beta.
Changelog
This version: 1.33 - 22. August 2018
Version 1.32 to Version 1.33
- bug in if statements fixed.
- noomitted added, bug in cr(_all) fixed.
- added option "replace" and "cfresiduals" to predict.
- CS-DL and CS-ARDL method added.
- Output as in Stata Journal Version.
Version 1.31 to Version 1.32
- bug number of groups fixed
- predict, residual produced different results within xtdcce2 and after if panel
unbalanced or trend used (thanks to Tullio Gregori for the pointer).
- check for rank condition.
- several bugs fixed.
Version 1.2 to Version 1.31
- code for regression in Mata
- corrected standard errors for pooled coefficients, option cluster not necessary any
longer. Please rerun estimations if used option pooled()
- Fixed errors in unbalanced panel
- option post_full removed, individual estimates are posted in e(bi) and e(Vi)
- added option ivslow.
- legacy control for endogenous_var(), exogenous_var() and residuals().
Also see
See also: xtcd2, ivreg2, xtmg, xtpmg, moremata