Commit c1495d4

Merge pull request #253 from DoubleML/jh-logistic
Added documentation for LPLR model.
2 parents 0acb4fa + 9d47550 commit c1495d4

10 files changed

+406
-4
lines changed

doc/api/datasets.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -31,6 +31,7 @@ Dataset Generators
3131

3232
plm.datasets.make_plr_CCDDHNR2018
3333
plm.datasets.make_plr_turrell2018
34+
plm.datasets.make_lplr_LZZ2020
3435
plm.datasets.make_pliv_CHS2015
3536
plm.datasets.make_pliv_multiway_cluster_CKMS2021
3637
plm.datasets.make_confounded_plr_data

doc/api/dml_models.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@ doubleml.plm
1616
:template: class.rst
1717

1818
DoubleMLPLR
19+
DoubleMLLPLR
1920
DoubleMLPLIV
2021

2122

doc/conf.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -274,6 +274,8 @@
274274
"https://doi.org/10.1097%2FEDE.0b013e3181f74493",
275275
# Valid DOI; Causes 403 Client Error: Forbidden for url:...
276276
"https://doi.org/10.3982/ECTA15732",
277+
# Valid DOI; Causes 403 Client Error: Forbidden for url:...
278+
"https://doi.org/10.1093/ectj/utab019"
277279
]
278280

279281
# To execute R code via jupyter-execute one needs to install the R kernel for jupyter

doc/examples/index.rst

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,10 +22,11 @@ General Examples
2222
py_double_ml_sensitivity.ipynb
2323
py_double_ml_apo.ipynb
2424
py_double_ml_irm_vs_apo.ipynb
25+
py_double_ml_lplr.ipynb
26+
py_double_ml_ssm.ipynb
2527
py_double_ml_learner.ipynb
2628
py_double_ml_firststage.ipynb
2729
py_double_ml_multiway_cluster.ipynb
28-
py_double_ml_ssm.ipynb
2930
py_double_ml_sensitivity_booking.ipynb
3031
learners/py_tabpfn.ipynb
3132
py_double_ml_basic_iv.ipynb

doc/examples/py_double_ml_lplr.ipynb

Lines changed: 291 additions & 0 deletions
Large diffs are not rendered by default.

doc/guide/models/plm/lplr.rst

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
**Logistic partially linear regression (LPLR)** models take the form
2+
3+
.. math::
4+
5+
\mathbb{E} [Y | D, X] = \mathbb{P} (Y=1 | D, X) = \text{expit} \{\beta_0 D + r_0 (X) \}
6+
7+
where :math:`Y` is the binary outcome variable and :math:`D` is the policy variable of interest.
8+
The high-dimensional vector :math:`X = (X_1, \ldots, X_p)` consists of confounding covariates and
9+
:math:`\text{expit}` is the logistic function (the inverse of the logit link)
10+
11+
.. math::
12+
\text{expit} ( x ) = \frac{1}{1 + e^{-x}}
13+
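As a minimal numerical sketch of the model equation above (plain NumPy, not the DoubleML implementation; the helper names `expit`, `logit` and `lplr_prob` are illustrative assumptions), the logistic function and the implied outcome probability can be written as:

```python
import numpy as np


def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def logit(p):
    """Inverse of expit: log-odds of a probability p in (0, 1)."""
    return np.log(p / (1.0 - p))


def lplr_prob(d, r_x, beta):
    """P(Y = 1 | D = d, X) under the LPLR model, given the value r_x = r_0(X)."""
    return expit(beta * d + r_x)


# Hypothetical values: treatment d = 1, nuisance value r_0(X) = 0, beta_0 = 0.5.
p = lplr_prob(d=1.0, r_x=0.0, beta=0.5)
print(p)
```

Here `r_x` stands in for the unknown function :math:`r_0(X)` evaluated at one observation; in practice it has to be estimated.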

doc/guide/models/plm/plm_models.inc

Lines changed: 37 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -64,6 +64,40 @@ Partially linear regression model (PLR)
6464
dml_plr_obj$fit()
6565
print(dml_plr_obj)
6666

67+
.. _lplr-model:
68+
69+
Logistic partially linear regression model (LPLR)
70+
*************************************************
71+
72+
.. include:: /guide/models/plm/lplr.rst
73+
74+
.. include:: /shared/causal_graphs/plr_irm_causal_graph.rst
75+
76+
``DoubleMLLPLR`` implements LPLR models. Estimation is conducted via its ``fit()`` method.
77+
78+
.. note::
79+
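Note that treatment effects are not additive in this model: the partially linear term :math:`\beta_0 D + r_0(X)` enters through the logistic function, so :math:`\beta_0` is an effect on the log-odds scale rather than on the probability scale.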
80+
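The point of the note can be illustrated with a small NumPy sketch. The values of `beta` and `r_values` below are made up for illustration and are not taken from the library. For a binary change in the treatment from 0 to 1, the difference in log-odds is the same at every value of :math:`r_0(X)`, whereas the difference in probabilities varies:

```python
import numpy as np


def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


beta = 0.5                              # hypothetical treatment coefficient
r_values = np.array([-2.0, 0.0, 2.0])   # hypothetical values of r_0(X)

p0 = expit(r_values)          # P(Y = 1 | D = 0, X)
p1 = expit(beta + r_values)   # P(Y = 1 | D = 1, X)

# Effect on the probability scale depends on r_0(X) ...
risk_diff = p1 - p0
# ... while the effect on the log-odds scale is constant and equal to beta.
log_odds_diff = np.log(p1 / (1 - p1)) - np.log(p0 / (1 - p0))

print(risk_diff)
print(log_odds_diff)
```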
81+
.. tab-set::
82+
83+
.. tab-item:: Python
84+
:sync: py
85+
86+
.. ipython:: python
87+
88+
import numpy as np
89+
import doubleml as dml
90+
from doubleml.plm.datasets import make_lplr_LZZ2020
91+
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
92+
from sklearn.base import clone
93+
np.random.seed(3141)
94+
ml_t = RandomForestRegressor(n_estimators=100, max_features=15, max_depth=15, min_samples_leaf=5)
95+
ml_m = RandomForestRegressor(n_estimators=100, max_features=15, max_depth=15, min_samples_leaf=5)
96+
ml_M = RandomForestClassifier(n_estimators=100, max_features=15, max_depth=15, min_samples_leaf=5)
97+
obj_dml_data = make_lplr_LZZ2020(alpha=0.5, n_obs=1000, dim_x=15)
98+
dml_lplr_obj = dml.DoubleMLLPLR(obj_dml_data, ml_M, ml_t, ml_m)
99+
dml_lplr_obj.fit().summary
100+
67101

68102
.. _pliv-model:
69103

@@ -91,12 +125,12 @@ Estimation is conducted via its ``fit()`` method:
91125
from sklearn.ensemble import RandomForestRegressor
92126
from sklearn.base import clone
93127

94-
learner = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
128+
learner = RandomForestRegressor(n_estimators=100, max_features=5, max_depth=5, min_samples_leaf=5)
95129
ml_l = clone(learner)
96130
ml_m = clone(learner)
97131
ml_r = clone(learner)
98132
np.random.seed(2222)
99-
data = make_pliv_CHS2015(alpha=0.5, n_obs=500, dim_x=20, dim_z=1, return_type='DataFrame')
133+
data = make_pliv_CHS2015(alpha=0.5, n_obs=500, dim_x=5, dim_z=1, return_type='DataFrame')
100134
obj_dml_data = dml.DoubleMLData(data, 'y', 'd', z_cols='Z1')
101135
dml_pliv_obj = dml.DoubleMLPLIV(obj_dml_data, ml_l, ml_m, ml_r)
102136
print(dml_pliv_obj.fit())
@@ -120,4 +154,4 @@ Estimation is conducted via its ``fit()`` method:
120154
obj_dml_data = DoubleMLData$new(data, y_col="y", d_col = "d", z_cols= "Z1")
121155
dml_pliv_obj = DoubleMLPLIV$new(obj_dml_data, ml_l, ml_m, ml_r)
122156
dml_pliv_obj$fit()
123-
print(dml_pliv_obj)
157+
print(dml_pliv_obj)
Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
For the LPLR model implemented in ``DoubleMLLPLR`` one can choose between
2+
``score='nuisance_space'`` and ``score='instrument'``.
3+
4+
``score='nuisance_space'`` implements the score function:
5+
6+
.. math::
7+
8+
\psi(W; \beta, \eta) := \psi(X) \{Y e^{-\beta D} -(1-Y)e^{r_0(X)} \} \{ D - m_0(X)\}
9+
10+
with nuisance elements :math:`\eta = \{ r(\cdot), m(\cdot), \psi(\cdot) \}`, where
11+
12+
.. math::
13+
14+
r_0(X) = t_0(X) - \breve \beta a_0(X),
15+
16+
m_0(X) = \mathbb{E} [D | X, Y=0],
17+
18+
\psi(X) = \text{expit} (-r_0(X)).
19+
20+
For the estimation of :math:`r_0(X)`, we further need a preliminary estimate :math:`\breve \beta` and an estimate of
21+
:math:`M (D, X) = \mathbb{P} [Y=1 | D, X]`, as described in `Liu et al. (2021) <https://doi.org/10.1093/ectj/utab019>`_,
22+
as well as the following conditional expectations:
23+
24+
.. math::
25+
26+
t_0(X) = \mathbb{E} [\text{logit}(M (D, X)) | X],
27+
28+
a_0(X) = \mathbb{E} [D | X].
29+
30+
31+
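The following sketch checks the defining property of the ``nuisance_space`` score numerically. It writes the first term as :math:`Y e^{-\beta D}`, the sign under which the score has conditional mean zero given :math:`(D, X)` when :math:`\mathbb{P}(Y=1 | D, X) = \text{expit}(\beta_0 D + r_0(X))`. All function names and input values are illustrative assumptions; this is not the code used inside ``DoubleMLLPLR``.

```python
import numpy as np


def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def nuisance_space_score(y, d, beta, r_x, m_x):
    """psi(X) * {Y e^{-beta D} - (1 - Y) e^{r(X)}} * {D - m(X)} with psi(X) = expit(-r(X))."""
    psi_x = expit(-r_x)
    return psi_x * (y * np.exp(-beta * d) - (1.0 - y) * np.exp(r_x)) * (d - m_x)


def conditional_mean_score(d, beta, beta_true, r_x, m_x):
    """Exact E[score | D, X] when Y | D, X ~ Bernoulli(expit(beta_true * D + r(X)))."""
    p = expit(beta_true * d + r_x)
    return (p * nuisance_space_score(1.0, d, beta, r_x, m_x)
            + (1.0 - p) * nuisance_space_score(0.0, d, beta, r_x, m_x))


# Hypothetical treatment values and nuisance function values at three observations.
d = np.array([-1.0, 0.5, 2.0])
r_x = np.array([0.3, -0.7, 1.1])
m_x = np.array([0.2, 0.1, -0.4])

at_truth = conditional_mean_score(d, beta=0.5, beta_true=0.5, r_x=r_x, m_x=m_x)
off_truth = conditional_mean_score(d, beta=0.0, beta_true=0.5, r_x=r_x, m_x=m_x)
print(at_truth)   # numerically zero at the true coefficient
print(off_truth)  # generally non-zero at a wrong coefficient
```

Because the conditional mean is zero for any value of `m_x`, the factor :math:`D - m_0(X)` does not affect identification; its role in the paper is to make the score insensitive to errors in the nuisance estimates.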
32+
``score='instrument'`` implements the score function:
33+
34+
.. math::
35+
36+
\psi(W; \beta, \eta) := \{Y - \text{expit} (\beta D + r_0(X)) \} Z_0
37+
38+
39+
with :math:`Z_0 = D - m_0(X)` and :math:`\eta = \{ r(\cdot), m(\cdot), \psi(\cdot) \}`, where
40+
41+
.. math::
42+
43+
r_0(X) = t_0(X) - \breve \beta a_0(X),
44+
45+
m_0(X) = \mathbb{E} [D | X].
46+
47+
and :math:`r_0(X)` is computed as for ``score='nuisance_space'``.
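To make the ``instrument`` score concrete, the sketch below solves the sample moment condition :math:`\frac{1}{n} \sum_i \{Y_i - \text{expit}(\beta D_i + r_0(X_i))\} \{D_i - m_0(X_i)\} = 0` for :math:`\beta` by bisection on simulated data. It assumes the true nuisance functions are known and uses neither machine learning estimates nor cross-fitting, so it illustrates the moment condition only, not the estimator implemented in ``DoubleMLLPLR``. The data-generating process and all names are hypothetical.

```python
import numpy as np


def expit(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))


def instrument_score(y, d, beta, r_x, m_x):
    """{Y - expit(beta * D + r(X))} * {D - m(X)} evaluated per observation."""
    return (y - expit(beta * d + r_x)) * (d - m_x)


def solve_beta(y, d, r_x, m_x, lo=-5.0, hi=5.0, tol=1e-8):
    """Bisection for the root of the mean score; assumes a sign change on [lo, hi]."""
    f_lo = np.mean(instrument_score(y, d, lo, r_x, m_x))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = np.mean(instrument_score(y, d, mid, r_x, m_x))
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)


# Simulated data from a hypothetical LPLR design with known nuisance functions.
rng = np.random.default_rng(42)
n, beta_true = 20000, 0.5
x = rng.normal(size=n)
r_x = 0.5 * x                      # r_0(X)
m_x = 0.3 * x                      # m_0(X) = E[D | X]
d = m_x + rng.normal(size=n)
y = rng.binomial(1, expit(beta_true * d + r_x))

beta_hat = solve_beta(y, d, r_x, m_x)
print(beta_hat)
```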

doc/guide/scores/plm/plm_scores.inc

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,12 @@ Partially linear regression model (PLR)
77

88
.. include:: /guide/scores/plm/plr_score.rst
99

10+
.. _lplr-score:
11+
12+
Logistic partially linear regression (LPLR)
13+
===========================================
14+
15+
.. include:: /guide/scores/plm/lplr_score.rst
1016

1117
.. _pliv-score:
1218

doc/literature/literature.rst

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -136,6 +136,12 @@ Double Machine Learning Literature
136136
:octicon:`link` :bdg-link-dark:`URL <https://jmlr.org/papers/volume21/19-827/19-827.pdf>`
137137
|hr|
138138

139+
- Molei Liu, Yi Zhang, Doudou Zhou |br|
140+
**Double/Debiased Machine Learning for Logistic Partially Linear Model** |br|
141+
*The Econometrics Journal, 24(3), Pages 559–588, 2021* |br|
142+
:octicon:`link` :bdg-link-dark:`URL <https://doi.org/10.1093/ectj/utab019>`
143+
|hr|
144+
139145
- Yusuke Narita, Shota Yasui, Kohei Yata |br|
140146
**Debiased Off-Policy Evaluation for Recommendation Systems** |br|
141147
*RecSys '21: Fifteenth ACM Conference on Recommender Systems, 372–379, 2021* |br|
