Ironed out minor typos in the documentation.
segsell committed Jan 29, 2020
1 parent 968b643 commit 287f38b
Showing 2 changed files with 8 additions and 8 deletions.
8 changes: 4 additions & 4 deletions docs/source/tutorial.rst
@@ -106,17 +106,17 @@ In most empirical applications, bandwidth choices between 0.2 and 0.4 are approp
:cite:`Fan1994` find that a gridsize of 400 is a good default for graphical analysis.
For data sets with less than 400 observations, we recommend a gridsize equivalent to the maximum number of observations that
remain after trimming the common support.
-If the data set of size *N* is large enough, a gridsize of 400 should be considered as the minimal number of evaluation points.
+If the data set of size N is large enough, a gridsize of 400 should be considered as the minimal number of evaluation points.
Since *grmpy*'s algorithm is fast enough, gridsize can be easily increased to *N* evaluation points.

The "rbandwidth", which is 0.05 by default, specifies the bandwidth for the LOWESS (Locally Weighted Scatterplot Smoothing) regression of
:math:`X`, :math:`X \ \times \ p`, and :math:`Y` on :math:`\widehat{P}(Z)`. If the sample size is small (N < 400),
the user may need to increase "rbandwidth" to 0.1. Otherwise *grmpy* will throw an error.

-Note that the MTE identified by LIV consists of wo components: $\overline{x}(\beta_1 - \beta_0)$ (which does not depend on :math:`P(Z) = p) and :math:`k(p)`
-(which does depend on :math:`p`). The latter is estimated nonparametrically. The section "p_range" in the initialization file specifies the interval
+Note that the MTE identified by LIV consists of wo components: :math:`\overline{x}(\beta_1 - \beta_0)` (which does not depend on :math:`P(Z) = p)` and :math:`k(p)`
+(which does depend on :math:`p`). The latter is estimated nonparametrically. The key "p_range" in the initialization file specifies the interval
over which :math:`k(p)` is estimated. After the data outside the overlapping support are trimmed, the locally quadratic kernel estimator
-uses the remaining data to predict $k(p)$ over the entire "p_range" specified by the user. If "p_range" is larger than the common support, *grmpy*
+uses the remaining data to predict :math:`k(p)` over the entire "p_range" specified by the user. If "p_range" is larger than the common support, *grmpy*
extrapolates the values for the MTE outside this region. Technically speaking, interpretations of the MTE are only valid within the common support.
In our empirical applications, we set "p_range" to :math:`[0.005,0.995]`.
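
As a reading aid for the tutorial text in this hunk, the gridsize and "rbandwidth" guidance can be sketched in Python. The helper names `suggest_gridsize` and `suggest_rbandwidth` are hypothetical and not part of *grmpy*; the sketch only encodes the rules of thumb stated above (400 evaluation points as the minimum for large samples, one point per remaining observation otherwise, and an "rbandwidth" of 0.1 instead of 0.05 when N < 400, which the text says the user "may need").

```python
# Hypothetical helpers; none of these names are part of grmpy's API.

def suggest_gridsize(n_after_trimming, default=400):
    """Return the smallest gridsize the tutorial text recommends."""
    if n_after_trimming < 1:
        raise ValueError("need at least one observation after trimming")
    # Below 400 observations: one evaluation point per remaining observation.
    # Otherwise: 400 is the minimum (it may be raised up to N).
    return min(n_after_trimming, default)


def suggest_rbandwidth(n_obs, default=0.05, small_sample=0.1, threshold=400):
    """Return the LOWESS bandwidth suggested for a sample of size n_obs."""
    return small_sample if n_obs < threshold else default


print(suggest_gridsize(250), suggest_rbandwidth(250))
print(suggest_gridsize(1785), suggest_rbandwidth(1785))
```

The returned gridsize is a lower bound for large samples; the text notes that it can be raised as far as *N* because the estimation is fast.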

@@ -143,8 +143,8 @@
"For the semiparametric estimation, we need information on the following sections:\n",
"\n",
"* __ESTIMATION__: Specify the dependent (wage) and indicator variable (treatment dummy) of the input data frame.\n",
-"For the estimation of the propensity score $P(Z)$, we choose a probability model, here logit. Furthermore, we select 30 bins to determine the common support in the treated and untreated subsamples. For the locally quadratic regression, we follow the specification of [Carneiro et al. (2011)](https://pubs.aeaweb.org/doi/pdfplus/10.1257/aer.101.6.2754) and choose a bandwidth of 0.322. The respective gridsize for the locally quadratic regression is set to 500. [Fan and Marron (1994)](https://www.tandfonline.com/doi/abs/10.1080/10618600.1994.10474629) find that a gridsize of 400 is a good default for graphical analysis. Since the data set is large (1785 observations) and the kernel regression function has a very low runtime, I increase the gridsize to 500. Setting it to the default or increasing it even more does not affect the final MTE. <br>\n",
-"Note that the MTE identified by LIV consists of two components: $\\overline{x}(\\beta_1 - \\beta_0)$ (which does not depend on $P(Z) = p$) and $k(p)$ (which does depend on $p$). The latter is estimated nonparametrically. The section \"p_range\" in the initialization file specifies the interval over which $k(p)$ is estimated. After the data outside the overlapping support are trimmed, the locally quadratic kernel estimator uses the remaining data to predict $k(p)$ over the entire \"p_range\" specified by the user. If \"p_range\" is larger than the common support, *grmpy* extrapolates the values for the MTE outside this region. Technically speaking, interpretations of the MTE are only valid within the common support. Here, we set \"p_range\" to [0.005, 0.995]. <br>\n",
+"For the estimation of the propensity score $P(Z)$, we choose a probability model, here logit. Furthermore, we select 30 bins to determine the common support in the treated and untreated subsamples. For the locally quadratic regression, we follow the specification of [Carneiro et al. (2011)](https://pubs.aeaweb.org/doi/pdfplus/10.1257/aer.101.6.2754) and choose a bandwidth of 0.322. The respective gridsize for the locally quadratic regression is set to 500. [Fan and Marron (1994)](https://www.tandfonline.com/doi/abs/10.1080/10618600.1994.10474629) find that a gridsize of 400 is a good default for graphical analysis. Since the data set is large (1785 observations) and the kernel regression function has a very low runtime, we increase the gridsize to 500. Setting it to the default or increasing it even more does not affect the final MTE. <br>\n",
+"Note that the MTE identified by LIV consists of two components: $\\overline{x}(\\beta_1 - \\beta_0)$ (which does not depend on $P(Z) = p$) and $k(p)$ (which does depend on $p$). The latter is estimated nonparametrically. The key \"p_range\" in the initialization file specifies the interval over which $k(p)$ is estimated. After the data outside the overlapping support are trimmed, the locally quadratic kernel estimator uses the remaining data to predict $k(p)$ over the entire \"p_range\" specified by the user. If \"p_range\" is larger than the common support, *grmpy* extrapolates the values for the MTE outside this region. Technically speaking, interpretations of the MTE are only valid within the common support. Here, we set \"p_range\" to [0.005, 0.995]. <br>\n",
"The other parameters in this section are set by default and, normally, do not need to be changed.\n",
"\n",
"\n",
@@ -461,7 +461,7 @@
}
],
"source": [
-"rslt = fit('tutorial_semipar.yml', semipar=True)"
+"rslt = fit('files/tutorial_semipar.yml', semipar=True)"
]
},
{
@@ -546,7 +546,7 @@
}
],
"source": [
-"mte, quantiles = plot_semipar_mte(rslt, 'tutorial_semipar.yml', nbootstraps=250)"
+"mte, quantiles = plot_semipar_mte(rslt, 'files/tutorial_semipar.yml', nbootstraps=250)"
]
},
{

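Both changed files describe how "p_range" relates to the common support: points of the evaluation grid that fall outside the common support are extrapolated, and the MTE is only interpretable inside it. A minimal sketch of that distinction follows. The function `split_p_range`, the evenly spaced grid, and the common-support bounds (0.1, 0.9) are assumptions made for illustration; they are not taken from *grmpy* or from the tutorial data.

```python
# Hypothetical illustration; split_p_range is not a grmpy function.

def split_p_range(p_range, common_support, gridsize):
    """Split an evenly spaced grid over p_range into points inside and
    outside the common support (the latter would be extrapolated)."""
    if gridsize < 2:
        raise ValueError("gridsize must be at least 2")
    lo, hi = p_range
    s_lo, s_hi = common_support
    step = (hi - lo) / (gridsize - 1)
    grid = [lo + i * step for i in range(gridsize)]
    inside = [p for p in grid if s_lo <= p <= s_hi]
    outside = [p for p in grid if p < s_lo or p > s_hi]
    return inside, outside


# p_range and gridsize as in the notebook text; the support bounds are made up.
inside, outside = split_p_range((0.005, 0.995), (0.1, 0.9), gridsize=500)
print(len(inside), "grid points lie within the assumed common support")
print(len(outside), "grid points would rely on extrapolation")
```

If the second count is large relative to the first, narrowing "p_range" toward the common support keeps the plotted MTE within the region where it can be interpreted.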