Difference between predict() and average_treatment_effect() for calculating CATEs in honest causal_forest #1031

njawadekar · 2021-08-10T01:22:59Z

Hi,
I am writing to better understand the predict() and average_treatment_effect() functions, particularly when one should be used over the other to estimate conditional average treatment effects (CATEs) in an honest causal forest. Additional details are listed below:

(1) Research Goal:
After building an honest causal_forest on my dataset, I would now like to calculate Conditional Average Treatment Effects (CATEs) within specific strata of covariates on the same dataset.

(2) Initial Plan:
Based on this application paper by Athey & Wager, it seems that I should be using predict() to estimate these CATEs with an "honest" approach. According to the documentation for predict(), by default the treatment effect for each observation is estimated using only the trees in the forest that did not use that observation during training, i.e., out-of-bag estimation.

(3) Question:
However, I understand that there is also an average_treatment_effect() function, which can supposedly also estimate conditional average treatment effects from a causal forest in a doubly robust fashion. I would like to better understand the differences between these two functions (predict() and average_treatment_effect()), and the circumstances in which one should be used over the other to estimate CATEs on data within an honest causal forest. Evidently, the math behind each function differs, as shown in the attached code that I ran on a mock dataset. This R code reproduces the very different conditional average treatment effects that I calculated for a specific subset of individuals using predict() vs. average_treatment_effect().

In addition to explaining which function is better for calculating conditional average treatment effects in various circumstances, could someone also please explain in layman's terms what each function is doing behind the scenes? For example, does average_treatment_effect() use all of the trees to estimate the treatment effects, rather than only the out-of-bag trees? Also, how are propensity scores used by average_treatment_effect()?

Thanks!

Steps to reproduce
Please find the attached code.
cates_cf.txt

GRF version
2.0.2

erikcs · 2021-08-14T16:22:42Z

Hi, predict(causal.forest) gives estimates of the CATE, E[Y(1) - Y(0) | X = x], whereas average_treatment_effect gives an estimate of the ATE, E[Y(1) - Y(0)], based on augmented inverse probability weighting (equation (8) in https://arxiv.org/pdf/1902.07409.pdf).
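
A minimal sketch of this distinction on simulated data. The data-generating process, seed, and variable names below are illustrative assumptions and are not taken from the attached cates_cf.txt:

```r
library(grf)

set.seed(1)
n <- 1000; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)            # randomized binary treatment (assumption)
tau <- pmax(X[, 1], 0)            # true CATE varies with the first covariate
Y <- tau * W + X[, 2] + rnorm(n)

forest <- causal_forest(X, Y, W, num.trees = 500)

# One out-of-bag estimate per training observation: E[Y(1) - Y(0) | X = x]
tau.hat <- predict(forest)$predictions

# A single AIPW estimate of E[Y(1) - Y(0)], returned with its standard error
ate <- average_treatment_effect(forest, target.sample = "all")
print(ate)
```

Here tau.hat is a vector of length n, while ate is a named vector with entries "estimate" and "std.err".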

erikcs · 2021-09-01T04:28:13Z

Hi again @njawadekar, for a recent overview of suggested practices for HTE estimation, have a look at the August 31 talk at the Online Causal Inference Seminar, which includes a tutorial on estimating heterogeneous treatment effects in R.

njawadekar · 2021-10-09T05:56:22Z

Thanks for sending this information.

After watching the video, I still have a follow-up question to double-check whether I am estimating my CATEs correctly. While it is clear to me how to calculate individual-specific CATEs using the predict function, I am still not clear about how to calculate the CATE (and the corresponding 95% CI) across a group of multiple individuals. For example, let's say that my training dataset contained 100 individuals, 30 of whom were male. If I wanted to calculate the CATE for this specific subgroup (the 30 males), then I believe I would need to follow something like the steps below (but please confirm):

(1) First, I would write the standard predict function to generate predictions on my entire training dataset, e.g.
c.pred <- predict(object, newdata = NULL)

(2) Next, I would subset my c.pred to only those males, e.g.
males_strata <- c.pred %>% filter(gender == "M")  # assumes a gender column has been added to c.pred

(3) Finally, I would calculate the average CATE in the subgroup (the 30 males) by doing this:
mean(males_strata$predictions)

(4) And then, to compute the lower and upper 95% CI, I would just compute it manually after calculating the S.D. and sample size (N) for this stratum: 95% CI = Mean +/- 1.96 * S.D. / sqrt(N)

Does this make sense?

erikcs · 2021-10-11T19:15:21Z

Hi @njawadekar, average_treatment_effect(forest, subset = gender == "M") (see the function's documentation) will give you a doubly robust estimate of the ATE in that subgroup, along with standard errors.
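
A hedged sketch of that call on simulated data, including a normal-approximation 95% CI built from the returned standard error. The gender vector, effect sizes, and seed are assumptions made up for illustration:

```r
library(grf)

set.seed(2)
n <- 1000
gender <- sample(c("M", "F"), n, replace = TRUE)
# The first covariate encodes gender; the rest are noise (illustrative setup)
X <- cbind(as.numeric(gender == "M"), matrix(rnorm(n * 4), n, 4))
W <- rbinom(n, 1, 0.5)
Y <- (1 + X[, 1]) * W + X[, 2] + rnorm(n)   # true effect: 2 for males, 1 for females

forest <- causal_forest(X, Y, W, num.trees = 500)

# Doubly robust subgroup estimate; subset is a logical vector of length n
ate.male <- average_treatment_effect(forest, subset = gender == "M")

# 95% CI as estimate +/- z * std.err
ci.male <- ate.male[["estimate"]] + c(-1, 1) * qnorm(0.975) * ate.male[["std.err"]]
print(ate.male)
print(ci.male)
```

The subset argument is evaluated as an ordinary R expression, so gender must be a vector in scope with one entry per training observation.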

njawadekar · 2021-10-12T04:32:16Z

Thanks @erikcs , and what about if I wanted to compute these CATEs without this doubly robust approach? Are the steps I outlined above (using the predict() function) okay?

erikcs · 2021-10-12T15:28:08Z

The mean CATE will be very similar, but the 95% CI won't give correct coverage.
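
The two approaches can be put side by side in a small simulation. This is a sketch under assumed settings (constant treatment effect of 1, randomized treatment, a hypothetical male indicator). The naive standard error is computed from the spread of the predictions across individuals, which is not the same thing as the sampling uncertainty of the subgroup average, so in a setup like this one it will typically come out much smaller than the AIPW standard error:

```r
library(grf)

set.seed(3)
n <- 1000
male <- rbinom(n, 1, 0.3) == 1              # hypothetical subgroup indicator
X <- cbind(as.numeric(male), matrix(rnorm(n * 4), n, 4))
W <- rbinom(n, 1, 0.5)
Y <- 1 * W + X[, 2] + rnorm(n)              # constant treatment effect of 1 (assumption)

forest <- causal_forest(X, Y, W, num.trees = 500)

# Naive approach: average the out-of-bag predictions within the subgroup
tau.hat.m <- predict(forest)$predictions[male]
naive <- c(estimate = mean(tau.hat.m),
           std.err = sd(tau.hat.m) / sqrt(sum(male)))

# Doubly robust approach on the same subgroup
aipw <- average_treatment_effect(forest, subset = male)

print(rbind(naive = naive, aipw = aipw))
```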

njawadekar · 2021-10-13T02:49:55Z

Ok, thank you for the insights @erikcs !

njawadekar · 2021-10-13T13:39:38Z

Last question @erikcs: does the average_treatment_effect function use out-of-bag estimation (i.e., for each observation, only use trees that were built without that observation), like the predict() function does? I am currently running a simulation study on causal_forest to see how close to the truth it can get under different model parameters. However, if average_treatment_effect does not take an honest approach to estimation, then I'm not sure I should be using it for my analysis.

erikcs · 2021-10-13T15:36:04Z

Yes, average_treatment_effect uses OOB predictions.

njawadekar · 2021-10-13T21:26:18Z

Thanks @erikcs. Related to this, can you also please explain the scenarios in which we would want to use the AIPW approach embedded in average_treatment_effect, as opposed to the quasi-oracle estimation described by Nie & Wager (e.g. the Y.hat and W.hat arguments in the causal_forest function)? I realize that both are doubly robust approaches intended to address confounding. However, I was wondering if your group has any general guidelines on when either or both should be used when implementing a causal forest.

erikcs · 2021-10-13T21:46:43Z

causal_forest is GRF + R-learner (Nie/Wager) by default, i.e. Y.hat and W.hat are by default estimated separately. This "orthogonalization" step helps when treatment assignment is confounded. See the last two columns in Table 1 of https://arxiv.org/pdf/1610.01271.pdf for the empirical performance of causal_forest without and with orthogonalization.
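
As a rough illustration of what "estimated separately" means, the nuisance estimates can be fit by hand with regression forests and passed in explicitly. This sketch uses simulated confounded data (an assumption) and only approximates the default behavior, whose internal settings may differ:

```r
library(grf)

set.seed(4)
n <- 1000; p <- 5
X <- matrix(rnorm(n * p), n, p)
e <- 1 / (1 + exp(-X[, 1]))       # propensity depends on X1, so assignment is confounded
W <- rbinom(n, 1, e)
Y <- 1 * W + 2 * X[, 1] + rnorm(n)

# Nuisance estimates, each fit separately and predicted out-of-bag
Y.hat <- predict(regression_forest(X, Y, num.trees = 500))$predictions   # E[Y | X]
W.hat <- predict(regression_forest(X, W, num.trees = 500))$predictions   # E[W | X]

# The forest is then grown on the centered outcome and treatment
forest <- causal_forest(X, Y, W, Y.hat = Y.hat, W.hat = W.hat, num.trees = 500)

print(average_treatment_effect(forest))
```

Leaving Y.hat and W.hat at their defaults lets causal_forest compute comparable estimates internally.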

njawadekar · 2021-10-13T22:02:03Z

Thanks! So just to clarify, when I calculate the conditional treatment effects using average_treatment_effect on a causal_forest object, these treatment effects have been estimated using two doubly robust approaches (orthogonalization (i.e. R-learner) as well as AIPW)?

erikcs · 2021-10-13T22:20:41Z

Orthogonalization here refers to the centering step in causal forest (the "R-learner"). Double robustness is a "post-fitting" correction to give a "better" estimate of an ATE, which is what average_treatment_effect does (section 2.1 in https://arxiv.org/pdf/1902.07409.pdf; briefly, augmented inverse propensity weighting can "cancel out" estimation errors in W.hat and Y.hat).
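
To make the "post-fitting" correction concrete, here is a sketch that builds AIPW scores by hand from a fitted forest, using the standard construction for a binary treatment. It is intended to mirror what average_treatment_effect does for target.sample = "all", but treat exact agreement as an assumption to verify against your grf version. The simulated data and seed are illustrative:

```r
library(grf)

set.seed(5)
n <- 1000; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 1 / (1 + exp(-X[, 1])))   # confounded binary treatment
Y <- pmax(X[, 2], 0) * W + X[, 1] + rnorm(n)

forest <- causal_forest(X, Y, W, num.trees = 500)

tau.hat <- predict(forest)$predictions      # out-of-bag CATE estimates
Y.hat <- forest$Y.hat                       # out-of-bag estimate of E[Y | X]
W.hat <- forest$W.hat                       # out-of-bag estimate of E[W | X]

# Conditional means under control and treatment implied by the three estimates
mu0.hat <- Y.hat - W.hat * tau.hat
mu1.hat <- Y.hat + (1 - W.hat) * tau.hat

# AIPW score: CATE estimate plus inverse-propensity-weighted residual corrections
scores <- tau.hat +
  W / W.hat * (Y - mu1.hat) -
  (1 - W) / (1 - W.hat) * (Y - mu0.hat)

aipw.manual <- c(estimate = mean(scores), std.err = sd(scores) / sqrt(n))
aipw.grf <- average_treatment_effect(forest, target.sample = "all")
print(rbind(manual = aipw.manual, grf = aipw.grf))
```

The residual terms are where errors in W.hat and Y.hat can offset each other: if the outcome model is accurate the residuals average to roughly zero, and if the propensity model is accurate the weighting corrects for remaining outcome-model error.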

katianak · 2021-10-14T04:01:50Z

I have had the same doubts for a while. However, what does it mean if mean(predict(forest)$predictions) and average_treatment_effect(forest) give different estimates?

njawadekar · 2021-10-18T16:19:00Z

@erikcs: When estimating a conditional ATE using the average_treatment_effect function and the subset argument, which covariates actually go into the treatment model and the outcome model used to construct the AIPW estimate? By default, does it just include all Xi covariates that originally went into the causal forest?

e.g.: average_treatment_effect(cforest, target.sample = "all", subset = (cforest$X[,2] == 1 & cforest$X[,14] == 0))[1]

In addition, do the treatment and outcome models used for the orthogonalization to build the causal forest, by default, also just use all of the Xi covariates?

erikcs · 2021-10-18T18:01:34Z

> by default, does it just include all Xi covariates that originally went into the causal forest?

Yes

> In addition, do the treatment and outcome models used for the orthogonalization to build the causal forest, by default, also just use all of the Xi covariates?

Yes, it's the same treatment and outcome model used above.

erikcs closed this as completed Sep 1, 2021
