
TMLE & Machine Learning #109

Closed
pzivich opened this issue Jun 27, 2019 · 6 comments


pzivich commented Jun 27, 2019

TMLE is not guaranteed to attain nominal coverage when used with machine learning. A simulation paper showing major problems is: https://arxiv.org/abs/1711.07137
As a result, I don't think TMLE with machine learning can continue to be supported, especially since the simulations indicate the confidence intervals can be far too narrow (sometimes resulting in 0% coverage). I know this is a divergence from R's tlverse, but I would rather enforce best practices/standards than allow incorrect use of methods.

Due to this issue, I will be dropping support for TMLE with machine learning. In its place, I plan on adding CrossfitTMLE, which will support machine learning approaches. The cross-fitting will result in valid confidence intervals / inference.

Tentative plan:

  • In v0.8.0, TMLE will throw a warning when the custom_model argument is used (a sketch of what such a warning could look like follows this list).

  • Once Crossfit-AIPW and Crossfit-TMLE are available (v0.9.0), TMLE will lose that functionality. If users want to use TMLE with machine learning, they will need to use a prior version.
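For reference, a minimal sketch of what that warning could look like; the helper name and message here are hypothetical, not zEpid's actual implementation:

```python
import warnings

def _custom_model_warning(custom_model):
    # Hypothetical helper: warn whenever a user-supplied machine learning
    # model is passed via the custom_model argument
    if custom_model is not None:
        warnings.warn("TMLE with a custom machine learning model is not "
                      "guaranteed to attain nominal confidence interval "
                      "coverage. Consider the cross-fit estimators once "
                      "they become available.", UserWarning)
```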

@pzivich pzivich added the bug, change, Short-term, and Causal inference labels Jun 27, 2019
@pzivich pzivich self-assigned this Jun 27, 2019
@pzivich pzivich mentioned this issue Jun 27, 2019
@pzivich pzivich pinned this issue Jul 9, 2019
@pzivich pzivich changed the title from "TMLE dropping support for Machine Learning" to "TMLE dropping some support for Machine Learning" Oct 2, 2019

pzivich commented Oct 2, 2019

Getting into the semiparametric theory behind the estimator, some machine learning estimators are "smooth enough" (i.e., Donsker class) to work with TMLE. As a result, TMLE will not actually lose support for machine learning.

Rather, TMLE will keep that functionality, but I would like to write a check for whether the user is supplying an estimator that is not Donsker class (e.g., random forests). This would ideally trigger a warning about the confidence intervals. I should also update the docs to thoroughly explain this concept and when to use the cross-fitting procedure for TMLE instead.

As a future note: nuisance function estimators like LASSO and GAM are Donsker class and should provide appropriate coverage with TMLE. However, I would still push users toward the cross-fit estimators over TMLE with machine learning (once implemented and available).

@pzivich pzivich unpinned this issue Oct 4, 2019

pzivich commented Oct 4, 2019

I am thinking something like

```python
import warnings

def _check_donsker_(est, non_donsker_estimators):
    # Warn if the user-supplied estimator is in the set of known non-Donsker estimators
    if est in non_donsker_estimators:
        warnings.warn("Donsker class issue. Use a crossfit estimator instead")
```

However, I need to decide whether to check that the input is in a set of known Donsker estimators or in a set of known non-Donsker estimators.

I am leaning towards checking whether the input is in a set of Donsker estimators instead. This is a little more careful and will handle estimators with unknown Donsker class properties, at least until I can run some heuristic simulations. I am okay with mistakenly directing users toward the cross-fit estimators for potentially Donsker class estimators, since the cross-fit estimators are valid for both Donsker and non-Donsker classes. I would rather encourage more care in the use of machine learning in these causal inference estimators than what is currently implemented elsewhere.
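A minimal sketch of that allow-list version; which estimator classes count as Donsker below is an illustrative assumption (guided by the earlier note that LASSO is Donsker class), not a definitive classification:

```python
import warnings
from sklearn.linear_model import Lasso, LogisticRegression

# Illustrative allow-list of estimators believed to be Donsker class
KNOWN_DONSKER = (Lasso, LogisticRegression)

def _check_donsker_(est):
    # Warn unless the estimator is on the known-Donsker allow-list
    if not isinstance(est, KNOWN_DONSKER):
        warnings.warn("Estimator may not be Donsker class, so confidence "
                      "intervals may be too narrow. Consider a crossfit "
                      "estimator instead.")
```

Estimators with unknown properties (e.g., random forests) then fall through to the warning by default, which matches the more careful behavior described above.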

This is a lot more work for me, but I think it fits with the semiparametric theory regarding what works and what doesn't. Also, I am not enforcing a strict rule (like the original plan for v0.9.0 was); users can still turn off the warning, ignore my recommendations, and use random forests in regular TMLE.

@pzivich pzivich changed the title from "TMLE dropping some support for Machine Learning" to "TMLE & Machine Learning" Nov 21, 2019
@emaadmanzoor

Thank you for this amazing package! I had a few questions regarding this issue, and was wondering if you could help:

  1. Until version 0.9: if I split my sample manually and fit my ML method on a different sample than the one used for the rest of the estimation, will the confidence intervals be valid?

  2. Do you know how TMLE + cross-fitting differs from "double machine learning" (Microsoft Research has some code implementing it)?

  3. Is there any situation where the computational overhead of cross-fitting justifies falling back on a Donsker class check instead? I would personally always cross-fit.


pzivich commented Dec 23, 2019

Sure thing! There is a lot I don't fully understand yet, but below is what I currently know.

  1. The way the architecture for TMLE is currently set up wouldn't directly allow the sample splitting you described. Behind the scenes, TMLE takes an unfit ML model and fits it to the full data set. It then generates predictions from the ML model and stores them as self.g1w, self.g0w, self.QA1W, self.QA0W, and self.QAW. You could "trick" TMLE into using the sample splits by changing the stored arrays to correspond to the desired sample split. However, at that point you might as well code the cross-fit TMLE procedure by hand (see the sketch after this list). My plan is to release the cross-fit estimators early in 2020. I have code that I believe implements them as intended, but I have to find a way to test some of the features.

  2. Double machine learning (based on the definition in Chernozhukov et al. 2016, the same paper referenced by Microsoft Research) differs slightly from the cross-fit procedure (based on the definition in Newey & Robins). Whereas double machine learning requires a minimum of one split, cross-fitting requires two splits. The unique part of the cross-fit estimators is that they are designed for doubly robust estimators like AIPW and TMLE. Since these estimators have two nuisance functions, we need to estimate those nuisance functions in different splits. Double machine learning, on the other hand, allows for 'cross-over' between nuisance function samples, which theoretically causes a problem for the doubly robust estimators. However, I have not seen a direct comparison between these approaches. Since the cross-fit estimators are preferable in theory for AIPW and TMLE, I plan on adding them soon.

  3. I would generally recommend cross-fitting, but there are important exceptions. Cross-fitting is unnecessary for estimators like GAM or LASSO; both are sufficiently smooth for valid confidence intervals with standard TMLE. While they would still be valid when used with cross-fitting, three potential problems could arise:

     (1) Computation time for the cross-fit estimator can be intense. Since how the splits are allotted can result in different estimates in finite samples, we need to repeat the procedure over many different splits and take the average (100 different splits has been recommended, and my simulations support that). A single run of a cross-fit procedure may therefore require fitting 600 GAMs, whereas standard TMLE only requires 2.

     (2) For small sample sizes, cross-fitting doesn't work too well because of the sample splitting procedure. TMLE with GAM may have a sufficient sample size for the GAM to flexibly estimate the functional form, but Crossfit-TMLE with GAM may leave too small a sample for the GAM to converge to the correct functional form.

     (3) Confidence interval width is a little wider for cross-fit estimators (due to their calculation). If we only use GAM, then we are unnecessarily sacrificing precision in our estimate.
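To make point 1 concrete, here is a minimal, hand-rolled sketch of the split structure for a doubly robust cross-fit estimator. I use an AIPW-style estimating equation since it is the simplest to write out; the function name, placeholder models, and data layout are my own assumptions, not the zEpid implementation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def double_crossfit_aipw(X, A, Y, rng):
    # Three splits: one for estimation, one per nuisance model,
    # rotated so every observation is used for estimation once
    n = len(Y)
    folds = rng.permutation(n) % 3
    pseudo = np.zeros(n)
    for s in range(3):
        est = folds == s                  # estimation split
        g_s = folds == (s + 1) % 3        # split for the propensity model
        q_s = folds == (s + 2) % 3        # split for the outcome model
        g = LogisticRegression().fit(X[g_s], A[g_s])
        q1 = LinearRegression().fit(X[q_s][A[q_s] == 1], Y[q_s][A[q_s] == 1])
        q0 = LinearRegression().fit(X[q_s][A[q_s] == 0], Y[q_s][A[q_s] == 0])
        pr = g.predict_proba(X[est])[:, 1]
        m1, m0 = q1.predict(X[est]), q0.predict(X[est])
        # AIPW pseudo-outcomes for the average treatment effect
        pseudo[est] = (A[est] * (Y[est] - m1) / pr + m1
                       - (1 - A[est]) * (Y[est] - m0) / (1 - pr) - m0)
    return pseudo.mean()
```

In line with point 3, a full run would repeat this over many random fold assignments (e.g., 100) and average the resulting point estimates.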

Based on the points in 3, I think a Donsker check for some estimators (that may be reasonably believed to be Donsker) is worthwhile. I would love to hear your thoughts though.

Another note: I have not seen it discussed how a cross-fit estimator would work with censored (missing outcome) observations, so CrossfitTMLE won't support that when initially released, although TMLE does.

As a final note, the cross-fit procedure technically does not allow for any arbitrary ML algorithm. The algorithm must still meet certain convergence criteria (though those criteria are weaker than for the non-cross-fit case). For extremely slowly converging algorithms (like random forests), I don't think there is yet a way to obtain valid confidence intervals (but I may be wrong).


pzivich commented Dec 28, 2019

@emaadmanzoor I saw you asked van der Laan about this point on Twitter, and he wrote a blog post about it: https://vanderlaan-lab.org/2019/12/24/cv-tmle-and-double-machine-learning/

@emaadmanzoor

Thank you for the detailed response! Yes, I just saw his response today, but it will take a while to parse completely. I'm hoping to write some simulations/benchmarks over the holidays comparing DML/TMLE under different scenarios so we can understand this work better. I'll post back here with my findings. These methods are crucial in my work on causal inference with text, and implementations such as yours really pave the way for practitioners. Exciting times!
