
TMLE & Machine Learning #109

Closed
pzivich opened this issue Jun 27, 2019 · 6 comments


pzivich commented Jun 27, 2019

TMLE is not guaranteed to attain nominal coverage when used with machine learning. A simulation paper showing major problems is: https://arxiv.org/abs/1711.07137
As a result, I don't think TMLE with machine learning can continue to be supported, especially since the simulations indicate the confidence intervals can be far too narrow (sometimes resulting in 0% coverage). I know this is a divergence from R's tlverse, but I would rather enforce best practices/standards than allow incorrect use of methods.

Due to this issue, I will be dropping support for TMLE with machine learning. In its place, I plan on adding CrossfitTMLE, which will support machine learning approaches. The cross-fitting will result in valid confidence intervals / inference.

Tentative plan:

  • In v0.8.0, TMLE will throw a warning when the custom_model argument is used (a sketch of what such a warning could look like follows this list).

  • Once Crossfit-AIPW and Crossfit-TMLE are available (v0.9.0), TMLE will lose that functionality. If users want to use TMLE with machine learning, they will need to use a prior version.
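For reference, a minimal sketch of what that warning could look like; the helper name and message here are hypothetical, not zEpid's actual implementation:

```python
import warnings

def _custom_model_warning(custom_model):
    # Hypothetical helper: warn whenever a user-supplied machine learning
    # model is passed via the custom_model argument
    if custom_model is not None:
        warnings.warn("TMLE with a custom machine learning model is not "
                      "guaranteed to attain nominal confidence interval "
                      "coverage. Consider the cross-fit estimators once "
                      "they become available.", UserWarning)
```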

@pzivich pzivich added the bug, change, Short-term, and Causal inference labels Jun 27, 2019
@pzivich pzivich self-assigned this Jun 27, 2019
@pzivich pzivich mentioned this issue Jun 27, 2019
@pzivich pzivich pinned this issue Jul 9, 2019
@pzivich pzivich changed the title from "TMLE dropping support for Machine Learning" to "TMLE dropping some support for Machine Learning" Oct 2, 2019

pzivich commented Oct 2, 2019

Getting into the semiparametric theory behind the estimator, some machine learning estimators are "smooth enough" (i.e., Donsker class) to work with TMLE. As a result, TMLE will not actually lose support for machine learning.

Rather, TMLE will keep that functionality, but I would like to write a check for whether the user is supplying an estimator that is not Donsker class (e.g., random forests). This would ideally trigger a warning about the confidence intervals. I should also update the docs to thoroughly explain this concept and when to use the cross-fitting procedure for TMLE instead.

As a future note: nuisance function estimators like LASSO and GAM are Donsker class and should provide appropriate coverage with TMLE. However, I would still push users toward the cross-fit estimators over TMLE with machine learning (once implemented and available).

@pzivich pzivich unpinned this issue Oct 4, 2019

pzivich commented Oct 4, 2019

I am thinking something like

```python
import warnings

def _check_donsker_(est, non_donsker_estimators):
    # Warn if the user-supplied estimator is in the set of known non-Donsker estimators
    if est in non_donsker_estimators:
        warnings.warn("Donsker class issue. Use a crossfit estimator instead")
```

However, I need to decide whether to check that the input is in a set of known Donsker estimators or in a set of known non-Donsker estimators.

I am leaning towards checking whether the input is in a set of Donsker estimators instead. This is a little more careful and will handle estimators with unknown Donsker class properties, at least until I can run some heuristic simulations. I am okay with mistakenly directing users toward the cross-fit estimators for potentially Donsker class estimators, since the cross-fit estimators are valid for both Donsker and non-Donsker classes. I would rather encourage more care in the use of machine learning in these causal inference estimators than what is currently implemented elsewhere.
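A minimal sketch of that allow-list version; which estimator classes count as Donsker below is an illustrative assumption (guided by the earlier note that LASSO is Donsker class), not a definitive classification:

```python
import warnings
from sklearn.linear_model import Lasso, LogisticRegression

# Illustrative allow-list of estimators believed to be Donsker class
KNOWN_DONSKER = (Lasso, LogisticRegression)

def _check_donsker_(est):
    # Warn unless the estimator is on the known-Donsker allow-list
    if not isinstance(est, KNOWN_DONSKER):
        warnings.warn("Estimator may not be Donsker class, so confidence "
                      "intervals may be too narrow. Consider a crossfit "
                      "estimator instead.")
```

Estimators with unknown properties (e.g., random forests) then fall through to the warning by default, which matches the more careful behavior described above.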

This is a lot more work for me, but I think it fits with the semiparametric theory regarding what works and what doesn't. Also, I am not enforcing a strict rule (like the original plan for v0.9.0 was); users can still turn off the warning, ignore my recommendations, and use random forests in regular TMLE.

@pzivich pzivich changed the title from "TMLE dropping some support for Machine Learning" to "TMLE & Machine Learning" Nov 21, 2019
@emaadmanzoor

Thank you for this amazing package! I had a few questions regarding this issue, and was wondering if you could help:

  1. Until version 0.9: if I split my sample manually and fit my ML method on a different sample than the one used for the rest of the estimation, will the confidence intervals be valid?

  2. Do you know how TMLE + cross-fitting differs from "double machine learning" (Microsoft Research has some code implementing it)?

  3. Is there any situation where the computational overhead of cross-fitting justifies falling back on a Donsker class check instead? I would personally always cross-fit.


pzivich commented Dec 23, 2019

Sure thing! There is a lot I don't fully understand yet, but below is what I currently know.

  1. The way the architecture for TMLE is currently set up wouldn't directly allow the sample splitting you described. Behind the scenes, TMLE takes an unfit ML model and fits it to the full data set. It then generates predictions from the ML model and stores them as self.g1w, self.g0w, self.QA1W, self.QA0W, and self.QAW. You could "trick" TMLE into using the sample splits by changing the stored arrays to correspond to the desired sample split. However, at that point you might as well code the cross-fit TMLE procedure by hand (see the sketch after this list). My plan is to release the cross-fit estimators early in 2020. I have code that I believe implements them as intended, but I have to find a way to test some of the features.

  2. Double machine learning (based on the definition in Chernozhukov et al. 2016, the same paper referenced by Microsoft Research) differs slightly from the cross-fit procedure (based on the definition in Newey & Robins). Whereas double machine learning requires a minimum of one split, cross-fitting requires two splits. The unique part of the cross-fit estimators is that they are designed for doubly robust estimators like AIPW and TMLE. Since these estimators have two nuisance functions, we need to estimate those nuisance functions in different splits. Double machine learning, on the other hand, allows for 'cross-over' between nuisance function samples, which theoretically causes a problem for the doubly robust estimators. However, I have not seen a direct comparison between these approaches. Since the cross-fit estimators are preferable in theory for AIPW and TMLE, I plan on adding them soon.

  3. I would generally recommend cross-fitting, but there are important exceptions. Cross-fitting is unnecessary for estimators like GAM or LASSO; both are sufficiently smooth for valid confidence intervals with standard TMLE. While they would still be valid when used with cross-fitting, three potential problems could arise:

     (1) Computation time for the cross-fit estimator can be intense. Since how the splits are allotted can result in different estimates in finite samples, we need to repeat the procedure over many different splits and take the average (100 different splits has been recommended, and my simulations support that). A single run of a cross-fit procedure may therefore require fitting 600 GAMs, whereas standard TMLE only requires 2.

     (2) For small sample sizes, cross-fitting doesn't work too well because of the sample splitting procedure. TMLE with GAM may have a sufficient sample size for the GAM to flexibly estimate the functional form, but Crossfit-TMLE with GAM may leave too small a sample for the GAM to converge to the correct functional form.

     (3) Confidence interval width is a little wider for cross-fit estimators (due to their calculation). If we only use GAM, then we are unnecessarily sacrificing precision in our estimate.
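To make point 1 concrete, here is a minimal, hand-rolled sketch of the split structure for a doubly robust cross-fit estimator. I use an AIPW-style estimating equation since it is the simplest to write out; the function name, placeholder models, and data layout are my own assumptions, not the zEpid implementation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def double_crossfit_aipw(X, A, Y, rng):
    # Three splits: one for estimation, one per nuisance model,
    # rotated so every observation is used for estimation once
    n = len(Y)
    folds = rng.permutation(n) % 3
    pseudo = np.zeros(n)
    for s in range(3):
        est = folds == s                  # estimation split
        g_s = folds == (s + 1) % 3        # split for the propensity model
        q_s = folds == (s + 2) % 3        # split for the outcome model
        g = LogisticRegression().fit(X[g_s], A[g_s])
        q1 = LinearRegression().fit(X[q_s][A[q_s] == 1], Y[q_s][A[q_s] == 1])
        q0 = LinearRegression().fit(X[q_s][A[q_s] == 0], Y[q_s][A[q_s] == 0])
        pr = g.predict_proba(X[est])[:, 1]
        m1, m0 = q1.predict(X[est]), q0.predict(X[est])
        # AIPW pseudo-outcomes for the average treatment effect
        pseudo[est] = (A[est] * (Y[est] - m1) / pr + m1
                       - (1 - A[est]) * (Y[est] - m0) / (1 - pr) - m0)
    return pseudo.mean()
```

In line with point 3, a full run would repeat this over many random fold assignments (e.g., 100) and average the resulting point estimates.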

Based on the points in 3, I think a Donsker check for some estimators (that may be reasonably believed to be Donsker) is worthwhile. I would love to hear your thoughts though.

Another note: I have not seen it discussed how a cross-fit estimator would work with censored (missing outcome) observations, so CrossfitTMLE won't support that when initially released, although TMLE does.

As a final note, the cross-fit procedure technically does not allow for any arbitrary ML algorithm. The algorithm must still meet certain convergence criteria (though those criteria are weaker than for the non-cross-fit case). For extremely slowly converging algorithms (like random forests), I don't think there is yet a way to obtain valid confidence intervals (but I may be wrong).


pzivich commented Dec 28, 2019

@emaadmanzoor I saw you asked van der Laan about this point on Twitter, and he wrote a blog post about it: https://vanderlaan-lab.org/2019/12/24/cv-tmle-and-double-machine-learning/

@emaadmanzoor

Thank you for the detailed response! Yes, I just saw his response today, but it will take a while to parse completely. I'm hoping to write some simulations/benchmarks over the holidays comparing DML/TMLE under different scenarios so we can understand this work better. I'll post back here with my findings. These methods are crucial in my work on causal inference with text, and implementations such as yours really pave the way for practitioners. Exciting times!
