# Prelude: Why Heterogeneous Treatment Effects?
What are the typical use cases of estimating heterogeneous treatment effects?
1. Customer targeting -- we want to _personalize_ our interventions (e.g. decide which users should receive a push notification).
2. Personalized pricing -- we want to estimate personalized price elasticities, and set prices differently for different users.
3. Learning click-through rates (CTRs) -- heterogeneous effect estimates are _not_ a substitute for running multiple A/B tests, but rather a complementary analysis.

# Problem Setup and API Design
We want to estimate the conditional average treatment effect (CATE). For a discrete change between two treatment levels $t_0$ and $t_1$:

$$ \tau(t_0, t_1, x) = E[Y(t_1) - Y(t_0)|X = x] $$

For a continuous treatment, such as price, we instead estimate the marginal effect at a base treatment level $t$:

$$ \partial \tau (t, x) = E[\nabla_t Y(t) | X = x] $$
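
To make the two definitions concrete, here is a minimal numerical sketch. The structural equation in `potential_outcome` and its coefficients are made up for illustration and are not part of EconML. Because we simulate, both potential outcomes are available for the same noise draws, which is never the case with real data.

```python
import numpy as np

rng = np.random.default_rng(0)


def potential_outcome(t, x, noise):
    """Hypothetical structural equation: Y(t) = 2 t^2 + 3 t x + noise."""
    return 2.0 * t**2 + 3.0 * t * x + noise


def cate(t0, t1, x, noise):
    """Monte Carlo estimate of tau(t0, t1, x) = E[Y(t1) - Y(t0) | X = x]."""
    return np.mean(potential_outcome(t1, x, noise) - potential_outcome(t0, x, noise))


def marginal_cate(t, x, noise, h=1e-4):
    """Central finite-difference estimate of E[dY(t)/dt | X = x]."""
    diff = potential_outcome(t + h, x, noise) - potential_outcome(t - h, x, noise)
    return np.mean(diff) / (2.0 * h)


# The same noise draws enter both potential outcomes, so they cancel.
noise = rng.normal(size=10_000)
tau = cate(0.0, 1.0, 0.5, noise)
dtau = marginal_cate(1.0, 0.5, noise)
print(tau, dtau)
```

For this made-up equation the analytic values are $\tau(0, 1, 0.5) = 2 + 3 \cdot 0.5 = 3.5$ and $\partial \tau(1, 0.5) = 4 + 3 \cdot 0.5 = 5.5$, which the sketch reproduces up to floating-point error.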

For EconML, we'll assume we have data of the form

$$ \{Y_i(T_i), T_i, X_i, W_i, Z_i\} $$

where
* $ Y_i(T_i) $ is the observed outcome;
* $ T_i $ is the treatment;
* $ X_i $ are the covariates used for heterogeneity;
* $ W_i $ are other observable covariates used as controls;
* $ Z_i $ are instruments that affect the treatment, but don't affect the outcome except through the treatment.

## API
Each `Estimator` class will have the following main methods:
* `fit`, which estimates the counterfactual model from data;
* `effect`, which estimates the discrete heterogeneous treatment effect between two treatment points;
* `marginal_effect`, which estimates the continuous heterogeneous marginal effect around a base treatment point;
* `effect_interval` and `marginal_effect_interval`, which compute confidence intervals when explicitly requested.
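
As a rough illustration of how these methods fit together, here is a sketch of a toy estimator exposing `fit`, `effect`, and `marginal_effect`. The class names `SketchCateEstimator` and `QuadraticOLS`, their argument orders, and the assumed outcome model are hypothetical and do not reproduce EconML's actual signatures. Instruments and the interval methods are omitted for brevity.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchCateEstimator(ABC):
    """Hypothetical interface with the three core methods described above."""

    @abstractmethod
    def fit(self, Y, T, X, W):
        """Estimate the counterfactual model from data."""

    @abstractmethod
    def effect(self, X, T0, T1):
        """Return tau(T0, T1, x) for each row x of X."""

    @abstractmethod
    def marginal_effect(self, T, X):
        """Return the derivative of the outcome in t at base point T, per row of X."""


class QuadraticOLS(SketchCateEstimator):
    """Toy estimator assuming Y = g*T^2 + d*T*X + W @ c + noise (scalar T and X)."""

    def fit(self, Y, T, X, W):
        T, X, Y = (np.asarray(a, dtype=float).ravel() for a in (T, X, Y))
        design = np.column_stack([T**2, T * X, np.asarray(W, dtype=float)])
        coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
        self.g_, self.d_ = coef[0], coef[1]
        return self

    def effect(self, X, T0, T1):
        X = np.asarray(X, dtype=float).ravel()
        return self.g_ * (T1**2 - T0**2) + self.d_ * (T1 - T0) * X

    def marginal_effect(self, T, X):
        X = np.asarray(X, dtype=float).ravel()
        return 2.0 * self.g_ * T + self.d_ * X


# Small synthetic check with known coefficients g = 1.5, d = -2.0.
rng = np.random.default_rng(0)
n = 2000
W = rng.normal(size=(n, 3))
X = rng.normal(size=n)
T = W @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)
Y = 1.5 * T**2 - 2.0 * T * X + W @ np.array([0.3, 0.2, -0.1]) + rng.normal(size=n)

est = QuadraticOLS().fit(Y, T, X, W)
x_new = np.array([0.0, 1.0])
print(est.effect(x_new, T0=0.0, T1=1.0))    # truth: [1.5, -0.5]
print(est.marginal_effect(T=1.0, X=x_new))  # truth: [3.0, 1.0]
```

The quadratic outcome model is chosen so that the marginal effect depends on the base treatment point, which is why `marginal_effect` takes `T` as an argument while `effect` takes the pair `T0`, `T1`.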

# Example API Use on Toy Data

## Generate Data

In [1]:
import numpy as np

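# Instance parameters
n_controls = 100
n_instruments = 1
n_features = 1
n_treatments = 1

# Fix the seed so the toy data is reproducible
np.random.seed(123)

# Coefficients of the data-generating process
alpha = np.random.normal(size=(n_controls, 1))     # controls -> treatment
beta = np.random.normal(size=(n_instruments, 1))   # instruments -> treatment
gamma = np.random.normal(size=(n_treatments, 1))   # quadratic treatment term
delta = np.random.normal(size=(n_treatments, 1))   # treatment-by-feature interaction
zeta = np.random.normal(size=(n_controls, 1))      # controls -> outcome

# Draw samples
n_samples = 1000
W = np.random.normal(size=(n_samples, n_controls))
Z = np.random.normal(size=(n_samples, n_instruments))
X = np.random.normal(size=(n_samples, n_features))
eta = np.random.normal(size=(n_samples, n_treatments))  # treatment noise
epsilon = np.random.normal(size=(n_samples, 1))         # outcome noise

# Treatment depends on controls and instruments; the outcome is quadratic in T,
# with heterogeneity in X entering through the T * X interaction
T = np.dot(W, alpha) + np.dot(Z, beta) + eta
y = np.dot(T**2, gamma) + np.dot(np.multiply(T, X), delta) + np.dot(W, zeta) + epsilon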
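
For reference, the simulation above implies a closed-form ground truth. The derivation assumes, as in the code, a single treatment and a single feature, so that $\gamma$ and $\delta$ are scalars. The potential outcome is $Y(t) = \gamma t^2 + \delta t X + W^\top \zeta + \epsilon$, so the control and noise terms cancel in any difference of potential outcomes:

$$ \tau(t_0, t_1, x) = \gamma (t_1^2 - t_0^2) + \delta (t_1 - t_0) x $$

$$ \partial \tau (t, x) = 2 \gamma t + \delta x $$

Any estimator fit on this data can be checked against these two expressions.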

## Fit Basic Causal Model

In [15]:
from econml._cate_estimator import BaseCateEstimator
from econml.orf import DMLOrthoForest

In [18]:
cfest = DMLOrthoForest()
# cfest.fit(y, T, X=X, W=W, inference='bootstrap')

NOTE: The commented-out `fit` call above does not work yet. We still need to go through specific examples with the orthogonal CATE estimators (tomorrow).