Question: How/where does this compare to EconML #88

tszumowski · 2019-12-03T01:10:34Z

I've been experimenting with DoWhy recently and really enjoy the structure. I noticed some other libraries out there, such as EconML (mentioned here) and Uber's CausalML. I'll focus on EconML for this discussion, particularly because I see some recent PRs that brought in EconML's CATE estimator.

I'd just like to confirm the differences and overlap between DoWhy and EconML. Please let me know if I understand this correctly.

My Attempt at Comparisons

Let's start with DoWhy's structure:

model (make assumptions),
identify (find what to estimate given the assumptions),
estimate,
refute (sensitivity and robustness checks).

1. Model

DoWhy: Provides the ability to explicitly define complex causal graphs. Alternatively (though not preferred?), you can define the common confounders to assess.
EconML: I didn't see a means to define the causal graph other than through the variable definitions Y, T, X, W, Z.

2. Identify

DoWhy: Hunts down causal effects using graph analysis and do-calculus
EconML: Not sure I saw this explicitly in the library?

3. Estimate

DoWhy: Backdoor, instrumental variables, and most recently do-sampling (Which is aweeesoome!)
EconML: Seems to be where EconML currently shines. There's a whole slew of approaches, many implementing methods from very recent ML research papers.

4. Refute

DoWhy: Heavy focus on model validation with several methods
EconML: I didn't notice anything explicit.
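
To make the refute step concrete, here is a minimal sketch of one kind of robustness check, a placebo treatment test, written with only the Python standard library. It does not call DoWhy or EconML; the simulated data, the true effect of 1.5, and the helper `diff_in_means` are all assumptions made up for illustration. The idea is that if the treatment column is replaced by a shuffled copy, a sound estimator should report an effect near zero.

```python
import random
from statistics import mean

random.seed(2)

# Hypothetical randomized data with a true treatment effect of 1.5.
treatment = [random.random() < 0.5 for _ in range(20000)]
outcome = [1.5 * t + random.gauss(0, 1) for t in treatment]

def diff_in_means(t_values, y_values):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for t, y in zip(t_values, y_values) if t]
    control = [y for t, y in zip(t_values, y_values) if not t]
    return mean(treated) - mean(control)

original_estimate = diff_in_means(treatment, outcome)

# Placebo check: shuffle the treatment column so it can no longer
# have any causal relationship with the outcome.
placebo_treatment = treatment[:]
random.shuffle(placebo_treatment)
placebo_estimate = diff_in_means(placebo_treatment, outcome)

print(f"original estimate: {original_estimate:.2f}")
print(f"placebo estimate:  {placebo_estimate:.2f}")
```

If the placebo estimate stayed far from zero, that would suggest the estimator is picking up something other than the treatment's effect.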

Where they overlap

It seems to me they overlap most heavily in the Estimation section. That's where I saw some references here on the roadmap in bringing in EconML calls. Is that correct?

Question on estimators and terminology

I see EconML has the following called out for estimators:

Potential Outcomes
Structural Equations
CATE

What is the difference between the estimators listed above and the ones built into DoWhy? I was just struggling to connect the dots there.

Thank you.


amit-sharma · 2019-12-04T12:38:24Z

Thanks @tszumowski for starting this discussion. Yes, you are right: DoWhy and EconML overlap only in the estimate step. DoWhy implements the full process of causal reasoning, including model, identify, estimate, and refute. In comparison, EconML implements only the estimate step.

In designing DoWhy, we kept a focus on the "ideal" process of doing a causal analysis, which includes identification and, more importantly, refutation, so that modeling assumptions can be tested. The estimators in DoWhy are currently the standard estimators for causal inference. As you rightly point out, EconML has much more advanced estimators for the conditional average treatment effect (CATE). This is why we are implementing an interface in DoWhy so that you can call EconML methods directly from DoWhy's estimate function. Here's an experimental Jupyter notebook to see it in action.

For your second question, there are actually two considerations when designing a causal analysis. One is about the modeling framework, and the other is about the target estimand.

Potential outcomes and structural equations are ways to construct a causal model. Another way is to use a structural causal model, which is based on a graphical model. The differences between these frameworks are often a matter of detail (and of big academic debate). In practice, though, both EconML and DoWhy are compatible with these different ways of expressing a causal model. In DoWhy specifically, we use the structural causal model framework in the identify step and rely heavily on methods derived from the potential outcomes framework in the estimate step.
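
As a rough illustration of how a graph-based identify step can feed an estimate step, here is a minimal standard-library sketch. It is not DoWhy's implementation; the graph W -> T, W -> Y, T -> Y, the simulated data, the true effect of 1.0, and the helper `mean_outcome` are all assumptions chosen for the example. The graph tells us that adjusting for W blocks the backdoor path, which gives an expression to estimate; the estimate step then plugs sample averages into that expression.

```python
import random
from statistics import mean

random.seed(1)

# Hypothetical data from the graph W -> T, W -> Y, T -> Y (true effect 1.0).
data = []
for _ in range(40000):
    w = random.random() < 0.5                    # binary confounder
    t = random.random() < (0.8 if w else 0.2)    # W makes treatment more likely
    y = 1.0 * t + 3.0 * w + random.gauss(0, 1)   # W also raises the outcome
    data.append((w, t, y))

def mean_outcome(t_value, w_value=None):
    """Average outcome among units with T == t_value (and W == w_value if given)."""
    return mean(
        y for w, t, y in data
        if t == t_value and (w_value is None or w == w_value)
    )

# Naive contrast: ignores the graph, so it is biased by the confounder W.
naive = mean_outcome(True) - mean_outcome(False)

# Identify: {W} blocks the backdoor path T <- W -> Y, so the estimand is
#   sum_w P(W=w) * (E[Y | T=1, W=w] - E[Y | T=0, W=w]).
# Estimate: replace each term with its sample average (stratification).
adjusted = 0.0
for w_value in (False, True):
    p_w = mean(1.0 if w == w_value else 0.0 for w, _, _ in data)
    adjusted += p_w * (mean_outcome(True, w_value) - mean_outcome(False, w_value))

print(f"naive:    {naive:.2f}")
print(f"adjusted: {adjusted:.2f}")
```

In this simulation the naive contrast lands well above the true effect, while the adjusted estimate should sit close to 1.0.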

The other consideration is the target estimand: do you want an effect for the full population (average treatment effect, ATE) or for a specific subpopulation, e.g., conditioned on "Gender=Female" (conditional average treatment effect, CATE)? EconML methods are designed to estimate CATE, which is a subject of active research. Most of DoWhy's methods so far focus on estimating the ATE, although we are extending some of them to also estimate CATE.
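
The ATE/CATE distinction can be sketched on simulated data with only the Python standard library. Everything here is an assumption for illustration (the randomized treatment, the effect sizes of 2.0 and 1.0, the helper `diff_in_means`), and a simple per-subgroup difference in means stands in for the far more sophisticated CATE estimators that EconML provides.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical population: a randomized treatment whose true effect is
# 2.0 for the "Female" subgroup and 1.0 for everyone else.
rows = []
for _ in range(20000):
    gender = random.choice(["Female", "Male"])
    treated = random.random() < 0.5              # randomized, so no confounding
    effect = 2.0 if gender == "Female" else 1.0
    outcome = effect * treated + random.gauss(0, 1)
    rows.append((gender, treated, outcome))

def diff_in_means(sample):
    """Mean outcome of treated units minus mean outcome of untreated units."""
    treated = [y for _, t, y in sample if t]
    control = [y for _, t, y in sample if not t]
    return mean(treated) - mean(control)

ate = diff_in_means(rows)                                         # full population
cate_female = diff_in_means([r for r in rows if r[0] == "Female"])
cate_male = diff_in_means([r for r in rows if r[0] == "Male"])

print(f"ATE:                  {ate:.2f}")
print(f"CATE (Gender=Female): {cate_female:.2f}")
print(f"CATE (Gender=Male):   {cate_male:.2f}")
```

With equally sized subgroups, the ATE falls roughly midway between the two subgroup effects, which is why a single population-level number can hide meaningful heterogeneity.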

Hope this helps.

tszumowski · 2019-12-05T20:41:04Z

Wow. @amit-sharma thank you for the fantastic summary. I've only recently started ramping up on causal analysis, coming from a stats/ML/Bayesian background. I see what you mean by "big academic debate": I was getting a bit lost among the various methods. This clarified a lot for me.

Regarding the EconML integration, I ran into an issue when trying to run that notebook. I'll create a separate issue for that. Otherwise, this ticket can be closed. Thank you.

tonyabracadabra · 2021-08-12T10:14:41Z

we use the structural causal model framework in the identify step

What does this mean? Is the output of the identify step used as input to the estimate step? How is the prior integrated into the estimation?

amit-sharma added the "discussion" label (Discussion about causal inference and DoWhy's roadmap.) Dec 4, 2019

tszumowski closed this as completed Dec 5, 2019

tszumowski mentioned this issue Dec 5, 2019, in: CATE Notebook Error: econml.dml.DMLCateEstimator is not an existing causal estimator #89 (closed)
