Add capability to predict the outcomes to causal tree/forest #590
Comments
Hi @winston-zillow, once you train a causalml model, you can predict for both the control and treatment units with the same covariates. Is it different from what you describe here? If so, can you elaborate on what you'd like to achieve? I'd appreciate it if you could provide pseudocode with the APIs you have in mind.
@jeongyoonlee I meant the following:

tree1_ite_pred = tree1.predict(df_test[feature_names].values)
tree2_ite_pred = tree2.predict(df_test[feature_names].values)
df_result = pd.DataFrame(
    {
        'tree_mse_ite': tree1_ite_pred,
        'tree_causal_mse_ite': tree2_ite_pred,
        'outcome': df_test['outcome'],  # <== at inference, we also want to estimate this
        'is_treated': df_test['treatment'],
        'treatment_effect': df_test['treatment_effect']
    }
)

But during inference, given a unit with covariates, we also want the estimated outcome using the same trained model. The GRF in EconML has the following:

# Code for EconML predict_full()
from econml.grf import CausalForest
est = CausalForest(criterion='het', n_estimators=400, min_samples_leaf=5, max_depth=None,
                   min_var_fraction_leaf=None, min_var_leaf_on_val=True,
                   min_impurity_decrease=0.0, max_samples=0.45, min_balancedness_tol=.45,
                   warm_start=False, inference=True, fit_intercept=True, subforest_size=4,
                   honest=True, verbose=0, n_jobs=-1, random_state=1235)
est.fit(X, T, y)
effect_and_Y0 = est.predict_full(X_test, alpha=0.01)

Is this clear? Is there a way to do the same in CausalML already?
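To make the request concrete, here is a minimal sketch of how an "effect and baseline" prediction could be turned into both potential outcomes. It assumes a single binary treatment and that each prediction row holds the pair (effect, Y0), as the variable name `effect_and_Y0` suggests; the helper `split_effect_and_y0` is hypothetical and not part of EconML or CausalML.

```python
from typing import Iterable, List, Sequence, Tuple


def split_effect_and_y0(
    rows: Iterable[Sequence[float]],
) -> Tuple[List[float], List[float], List[float]]:
    """Split (effect, Y0) rows into effect, Y0 and Y1 = Y0 + effect.

    Assumption: one binary treatment, and each row is ordered as
    (treatment effect, control outcome). Check the column order of the
    estimator you actually use before relying on this.
    """
    effects: List[float] = []
    y0s: List[float] = []
    y1s: List[float] = []
    for row in rows:
        effect, y0 = float(row[0]), float(row[1])
        effects.append(effect)
        y0s.append(y0)
        # Under a binary treatment, the treated outcome is the control
        # outcome shifted by the estimated effect.
        y1s.append(y0 + effect)
    return effects, y0s, y1s


# Hypothetical predictions for two units.
effects, y0s, y1s = split_effect_and_y0([(1.0, 2.0), (-0.5, 3.0)])
```

Because the function only iterates over rows, it should also accept a two-column NumPy array, which is the kind of object such a prediction call would typically return.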
I'm closing this issue as it has been addressed in #623.
While we use CausalML to predict the effects, one often wants to know the outcome values of the control and/or treatment given the covariates at the same time. Even though one could build a separate prediction tree/forest for this purpose, that approach is not only more inconvenient and expensive, but it also makes it hard to ensure that the prediction model agrees with the causal model. (It seems that the nodes of `CausalTree`/`CausalRandomForest` already contain the necessary values, e.g. `ct_y_sum`, `ct_count`, etc. What is currently lacking is a way to aggregate them at the API level.)
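The aggregation described here could look roughly like the sketch below. It is not CausalML code: `LeafStats`, `leaf_outcomes` and `forest_outcomes` are hypothetical names, and the treated-side fields `tr_y_sum` and `tr_count` are assumed counterparts of the `ct_y_sum` and `ct_count` values mentioned in this issue.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LeafStats:
    """Per-leaf sufficient statistics (hypothetical container)."""
    ct_y_sum: float  # sum of outcomes over control units in the leaf
    ct_count: int    # number of control units in the leaf
    tr_y_sum: float  # assumed: sum of outcomes over treated units
    tr_count: int    # assumed: number of treated units


def leaf_outcomes(leaf: LeafStats) -> Tuple[float, float, float]:
    """Return (Y0, Y1, effect) estimated from one leaf's group means."""
    if leaf.ct_count == 0 or leaf.tr_count == 0:
        raise ValueError("leaf needs at least one control and one treated unit")
    y0 = leaf.ct_y_sum / leaf.ct_count
    y1 = leaf.tr_y_sum / leaf.tr_count
    return y0, y1, y1 - y0


def forest_outcomes(leaves: List[LeafStats]) -> Tuple[float, float, float]:
    """Average the per-tree leaf estimates for a single unit.

    `leaves` holds the leaf that the unit falls into in each tree.
    """
    if not leaves:
        raise ValueError("need at least one leaf")
    per_tree = [leaf_outcomes(leaf) for leaf in leaves]
    n = len(per_tree)
    y0 = sum(p[0] for p in per_tree) / n
    y1 = sum(p[1] for p in per_tree) / n
    effect = sum(p[2] for p in per_tree) / n
    return y0, y1, effect


# Hypothetical leaves reached by one unit in a two-tree forest.
leaf_a = LeafStats(ct_y_sum=10.0, ct_count=5, tr_y_sum=18.0, tr_count=6)
leaf_b = LeafStats(ct_y_sum=4.0, ct_count=4, tr_y_sum=12.0, tr_count=4)
y0_hat, y1_hat, effect_hat = forest_outcomes([leaf_a, leaf_b])
```

Since the outcome and effect estimates come from the same leaves, they agree by construction (the effect is the difference of the two group means), which is the consistency a separately trained prediction model cannot guarantee.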