2 changes: 2 additions & 0 deletions doc/conf.py
@@ -195,6 +195,8 @@
'https://glmnet.stanford.edu/index.html',
# Valid URL Failed to establish a new connection: [Errno 111] Connection refused' ...
'https://glmnet.stanford.edu/reference/cv.glmnet.html',
# Valid URL (error not replicable), Causes 409 Client Error: Too Many Requests for url
'http://dx.doi.org/10.2139/ssrn.3619201',
]

# To execute R code via jupyter-execute one needs to install the R kernel for jupyter
78 changes: 55 additions & 23 deletions doc/examples/py_double_ml_apo.ipynb
@@ -6,13 +6,13 @@
"source": [
"# Python: Average Potential Outcome (APO) Models\n",
"\n",
"In this example, we illustrate how the [DoubleML](https://docs.doubleml.org/stable/index.html) package can be used to estimate average potential outcomes (APOs) in a [DoubleMLIRM](https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm) model.\n",
"In this example, we illustrate how the [DoubleML](https://docs.doubleml.org/stable/index.html) package can be used to estimate average potential outcomes (APOs) in an interactive regression model (see [DoubleMLIRM](https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm)).\n",
"\n",
"The goal is to estimate the average potential outcome\n",
"\n",
" $$\\theta_0 =\\mathbb{E}[Y(a)]$$\n",
" $$\\theta_0 =\\mathbb{E}[Y(d)]$$\n",
"\n",
"for a given treatment level $a$ and and discrete valued treatment $D$."
"for a given treatment level $d$ and a discrete-valued treatment $D$."
]
},
{
@@ -46,9 +46,11 @@
"metadata": {},
"source": [
"At first, let us generate data according to the [make_irm_data_discrete_treatments](https://docs.doubleml.org/dev/api/generated/doubleml.datasets.make_irm_data_discrete_treatments.html#doubleml.datasets.make_irm_data_discrete_treatments) data generating process. The process generates data with a continuous\n",
"treatment variable and contains the true individual treatment effects (ITEs) with respect to option of not getting treated (the underlying continuous treatment is not required for the model).\n",
"treatment variable and contains the true individual treatment effects (ITEs) with respect to the option of not getting treated.\n",
"\n",
"According to the continuous treatment variable, the treatment is discretized into multiple levels, based on quantiles. Using the oracle values of the model let us estimate the true APOs and averate treatment effects (ATEs) for the different levels of the treatment variable.\n"
"According to the continuous treatment variable, the treatment is discretized into multiple levels, based on quantiles. Using the *oracle* ITEs enables a comparison with the true APOs and average treatment effects (ATEs) for the different levels of the treatment variable.\n",
"\n",
"**Remark:** The average potential outcome model does not require an underlying continuous treatment variable. The model will work identically if the treatment variable is discrete by design."
]
},
{
@@ -170,7 +172,16 @@
"source": [
"## Single Average Potential Outcome Models (APO)\n",
"\n",
"Further, we have to specify machine learning algorithms. As in the [DoubleMLIRM](https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm) model, we have to set ''ml_m'' as a classifier and ''ml_g'' as a regressor (since the outcome is continuous)."
"Further, we have to specify machine learning algorithms. As in the [DoubleMLIRM](https://docs.doubleml.org/stable/guide/models.html#interactive-regression-model-irm) model, we have to set ``ml_m`` as a classifier and ``ml_g`` as a regressor (since the outcome is continuous).\n",
"The classifier ``ml_m`` is used to estimate the conditional probability of receiving treatment level $d$ given the covariates $X$\n",
"\n",
"$$m_{0,d}(X) = \\mathbb{E}[1\\{D=d\\}|X]$$\n",
"\n",
"and the regressor ``ml_g`` is used to estimate the conditional expectation of the outcome $Y$ given the covariates $X$ and the treatment $D$\n",
"\n",
"$$g_{0}(D, X) = \\mathbb{E}[Y|X,D].$$\n",
"\n",
"As the DGP is linear, we will use a linear regression model for the regressor and a logistic regression model for the classifier."
]
},
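The role of the two nuisance functions can be illustrated with a small NumPy sketch. It assumes the standard doubly robust (AIPW-type) score for the APO, in which the regression prediction $g_0(d, X)$ is corrected by the inverse-probability-weighted residual $1\{D=d\}(Y - g_0(d, X)) / m_{0,d}(X)$. The helper name `apo_aipw` and the simulated data are hypothetical; the sketch plugs in oracle nuisances and skips cross-fitting, so it is not the DoubleML implementation.

```python
import numpy as np


def apo_aipw(y, d, level, g_hat, m_hat):
    """Doubly robust APO estimate for one treatment level.

    g_hat: predictions of E[Y | X, D=level]
    m_hat: predictions of P(D=level | X)
    Returns the point estimate and its plug-in standard error.
    """
    indicator = (d == level).astype(float)
    # Regression prediction plus inverse-probability-weighted residual correction
    psi_b = g_hat + indicator * (y - g_hat) / m_hat
    theta = psi_b.mean()
    se = psi_b.std(ddof=1) / np.sqrt(len(y))
    return theta, se


# Hypothetical toy data: three treatment levels and a linear outcome model
rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
# Treatment probabilities depend on x through a softmax over the three levels
logits = np.column_stack([np.zeros(n), 0.5 * x, -0.5 * x])
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
# Draw the discrete treatment by inverting the cumulative probabilities
u = rng.random(n)
d = (u[:, None] > probs.cumsum(axis=1)[:, :2]).sum(axis=1)
y = 1.0 + x + 2.0 * d + rng.normal(size=n)

# Oracle nuisances for level 1: g_0(1, X) = 3 + X and m_{0,1}(X) = probs[:, 1]
theta_1, se_1 = apo_aipw(y, d, level=1, g_hat=3.0 + x, m_hat=probs[:, 1])
print(f"APO(1) estimate: {theta_1:.3f} (se {se_1:.3f})")
```

In this toy design the true APO for level 1 is $\mathbb{E}[1 + X + 2] = 3$, so the estimate can be checked against a known target.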
{
@@ -187,7 +198,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Further, the [DoubleMLAPO](https://docs.doubleml.org/dev/api/generated/doubleml.DoubleMLAPO.html#doubleml.DoubleMLAPO) model requires a specification of the treatment value for which the APOs should be estimated. In this example, we will loop over all treatment levels."
"Further, the [DoubleMLAPO](https://docs.doubleml.org/dev/api/generated/doubleml.DoubleMLAPO.html#doubleml.DoubleMLAPO) model requires a specification of the treatment level $d$ for which the APOs should be estimated. In this example, we will loop over all treatment levels."
]
},
{
@@ -232,6 +243,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The table above displays the estimated values in the ``theta`` column and the corresponding oracle values in the ``apo`` column.\n",
"\n",
"Again, let us summarize the results in a plot of the APOs with confidence intervals."
]
},
@@ -291,7 +304,7 @@
"ci_pointwise = dml_obj.confint(level=0.95)\n",
"\n",
"df_apos_ci = pd.DataFrame(\n",
" {'treatment_level': range(n_levels + 1),\n",
" {'treatment_level': treatment_levels,\n",
" 'apo': apos,\n",
" 'theta': thetas,\n",
" 'ci_lower': ci_pointwise.values[:, 0],\n",
@@ -332,6 +345,25 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For joint confidence intervals, the ``bootstrap`` method can be used."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dml_obj.bootstrap(n_rep_boot=2000)\n",
"ci_joint = dml_obj.confint(level=0.95, joint=True)\n",
"\n",
"ci_joint"
]
},
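The idea behind joint intervals can be sketched in plain NumPy. The sketch assumes a Gaussian multiplier bootstrap over the maximum absolute t-statistic; `joint_critical_value` is a hypothetical helper and the score matrix is simulated, so this illustrates the technique rather than DoubleML's internal code.

```python
import numpy as np


def joint_critical_value(psi, n_rep_boot=2000, level=0.95, seed=0):
    """Critical value for simultaneous intervals from centered scores.

    psi: array of shape (n_obs, n_params) holding centered influence values.
    """
    rng = np.random.default_rng(seed)
    n_obs = psi.shape[0]
    sigma = psi.std(axis=0, ddof=1)
    # Each bootstrap draw reweights the scores with i.i.d. standard normal multipliers
    weights = rng.normal(size=(n_rep_boot, n_obs))
    boot_t = (weights @ psi) / (np.sqrt(n_obs) * sigma)
    # Simultaneous coverage uses the quantile of the maximum absolute statistic
    return np.quantile(np.abs(boot_t).max(axis=1), level)


# Hypothetical centered scores for four treatment levels
rng = np.random.default_rng(1)
psi = rng.normal(size=(1000, 4))
psi -= psi.mean(axis=0)

crit_joint = joint_critical_value(psi)
theta = np.zeros(4)  # placeholder point estimates, for illustration only
se = psi.std(axis=0, ddof=1) / np.sqrt(psi.shape[0])
ci_joint = np.column_stack([theta - crit_joint * se, theta + crit_joint * se])
print(f"joint critical value: {crit_joint:.2f}")
```

Because the critical value is taken over the maximum across all parameters, it exceeds the pointwise normal quantile, which makes the joint intervals wider than the pointwise ones.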
{
"cell_type": "markdown",
"metadata": {},
@@ -378,16 +410,16 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Causal Contrats\n",
"### Causal Contrasts\n",
"\n",
"The [DoubleMLAPOS](https://docs.doubleml.org/dev/api/generated/doubleml.DoubleMLAPOS.html#doubleml.DoubleMLAPOS) model also allows for the estimation of causal contrasts. \n",
"The contrast is defined as the difference in the average potential outcomes between the treatment levels $a_i$ and $a_j$ where\n",
"The contrast is defined as the difference in the average potential outcomes between the treatment levels $d_i$ and $d_j$ where\n",
"\n",
"$$ \\theta_0 = \\mathbb{E}[Y(a_i)] - \\mathbb{E}[Y(a_{j})]$$\n",
"$$ \\theta_{0,ij} = \\mathbb{E}[Y(d_i)] - \\mathbb{E}[Y(d_{j})]$$\n",
"\n",
"and will be calculated for all defined treatment levels $i$ and reference levels $j$.\n",
"\n",
"In this example, we will estimate the causal contrast between the treatment level $0$ and all other treatment levels, as the treatment level $0$ correspondonds to no treatment at all whereas the the other levels are based on the treatment dosage.\n",
"In this example, we will estimate the causal contrast between the treatment level $0$ and all other treatment levels, as the treatment level $0$ corresponds to no treatment at all, whereas the other levels are based on the treatment dosage.\n",
"\n",
"Therefore we have to specify ``reference_levels=0``."
]
@@ -399,17 +431,7 @@
"outputs": [],
"source": [
"causal_contrast_model = dml_obj.causal_contrast(reference_levels=0)\n",
"\n",
"ates = causal_contrast_model.thetas\n",
"ci_ates = causal_contrast_model.confint(level=0.95)\n",
"\n",
"df_ates = pd.DataFrame(\n",
" {'treatment_level': treatment_levels[1:],\n",
" 'ate': ates,\n",
" 'ci_lower': ci_ates.iloc[:, 0].values,\n",
" 'ci_upper': ci_ates.iloc[:, 1].values}\n",
")\n",
"df_ates"
"print(causal_contrast_model.summary)"
]
},
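As a rough illustration of what a contrast involves, the following NumPy sketch computes the difference of two APO estimates from per-observation score values and derives the standard error from the differenced scores, which accounts for the correlation between the two estimates. The helper `causal_contrast` below is a hypothetical stand-alone function with simulated inputs; it is an assumption about the general approach, not the code behind the `causal_contrast` method.

```python
import numpy as np


def causal_contrast(psi_level, psi_reference):
    """Contrast of two APOs from per-observation score values.

    Each input holds uncentered score values whose mean is the APO estimate.
    """
    diff = psi_level - psi_reference
    theta = diff.mean()
    # Differencing per observation keeps the covariance between both estimates
    se = diff.std(ddof=1) / np.sqrt(len(diff))
    return theta, se


# Hypothetical score values for reference level 0 and treatment level 1
rng = np.random.default_rng(7)
n = 2000
common = rng.normal(size=n)  # shared component makes both scores correlated
psi_0 = 1.0 + common + 0.5 * rng.normal(size=n)
psi_1 = 3.0 + common + 0.5 * rng.normal(size=n)

ate_10, se_10 = causal_contrast(psi_1, psi_0)
ci = (ate_10 - 1.96 * se_10, ate_10 + 1.96 * se_10)
print(f"contrast: {ate_10:.3f}, 95% CI: [{ci[0]:.3f}, {ci[1]:.3f}]")
```

With positively correlated scores, the standard error of the contrast is smaller than the one obtained by treating the two APO estimates as independent.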
{
@@ -425,6 +447,16 @@
"metadata": {},
"outputs": [],
"source": [
"ates = causal_contrast_model.thetas\n",
"ci_ates = causal_contrast_model.confint(level=0.95)\n",
"\n",
"df_ates = pd.DataFrame(\n",
" {'treatment_level': treatment_levels[1:],\n",
" 'ate': ates,\n",
" 'ci_lower': ci_ates.iloc[:, 0].values,\n",
" 'ci_upper': ci_ates.iloc[:, 1].values}\n",
")\n",
"\n",
"# Plotting\n",
"plt.figure(figsize=(10, 6))\n",
"# Plot Estimate with 95% CI\n",
32 changes: 29 additions & 3 deletions doc/release/release.rst
@@ -7,10 +7,38 @@ Release notes

.. tab-item:: Python

.. dropdown:: DoubleML 0.8.2
.. dropdown:: DoubleML 0.9.0
:class-title: sd-bg-primary sd-font-weight-bold
:open:

- **Release highlight:** Average potential outcomes for multiple discrete treatments
via ``DoubleMLAPO`` and ``DoubleMLAPOS`` classes
`#250 <https://github.com/DoubleML/doubleml-for-py/pull/250>`_

- Update User Guide and Example Gallery
`#188 <https://github.com/DoubleML/doubleml-docs/pull/188>`_
`#195 <https://github.com/DoubleML/doubleml-docs/pull/195>`_

- Add sensitivity analysis to ``DoubleMLFramework``
`#249 <https://github.com/DoubleML/doubleml-for-py/pull/249>`_

- Maintenance package
`#264 <https://github.com/DoubleML/doubleml-for-py/pull/264>`_
`#265 <https://github.com/DoubleML/doubleml-for-py/pull/265>`_
`#266 <https://github.com/DoubleML/doubleml-for-py/pull/266>`_

- Maintenance documentation
`#182 <https://github.com/DoubleML/doubleml-docs/pull/182>`_
`#184 <https://github.com/DoubleML/doubleml-docs/pull/184>`_
`#186 <https://github.com/DoubleML/doubleml-docs/pull/186>`_
`#193 <https://github.com/DoubleML/doubleml-docs/pull/193>`_
`#194 <https://github.com/DoubleML/doubleml-docs/pull/194>`_
`#196 <https://github.com/DoubleML/doubleml-docs/pull/196>`_
`#197 <https://github.com/DoubleML/doubleml-docs/pull/197>`_

.. dropdown:: DoubleML 0.8.2
:class-title: sd-bg-primary sd-font-weight-bold

- **API Update**: Change nuisance evaluation for classifiers.
The corresponding properties are renamed ``nuisance_loss`` instead of ``rmses``.
`#254 <https://github.com/DoubleML/doubleml-for-py/pull/254>`_
@@ -40,7 +68,6 @@ Release notes

.. dropdown:: DoubleML 0.8.1
:class-title: sd-bg-primary sd-font-weight-bold
:open:

- Increment package requirements and update workflows for python 3.9 (add tests for python 3.12)
`#247 <https://github.com/DoubleML/doubleml-for-py/pull/247>`_
@@ -55,7 +82,6 @@ Release notes

.. dropdown:: DoubleML 0.8.0
:class-title: sd-bg-primary sd-font-weight-bold
:open:

- **Release highlight:** Sample-selections models as ``DoubleMLSMM`` class (by `Michaela Kecskésová <https://github.com/mychaelka>`_)
`#231 <https://github.com/DoubleML/doubleml-for-py/pull/231>`_