Issue associated with Causal Forest DML when calling Score() method #760

Closed
Fan981230 opened this issue Apr 23, 2023 · 3 comments

@Fan981230

Hi, thanks for the fascinating work you guys provide!

I'm new to the causal inference field, and I've found this package super helpful for building models. However, I ran into an IndexError when calling model.score() on a causal forest model I built.

The model parameters I used are as follows:

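from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor

# Y, T, X, W are the outcome, treatment, feature, and control arrays (defined elsewhere)
model = CausalForestDML(
    model_y=RandomForestRegressor(n_estimators=150),
    model_t=RandomForestRegressor(n_estimators=150),
    n_jobs=-1,
    min_var_fraction_leaf=0.5,
    min_var_leaf_on_val=True,
    mc_iters=4,
    n_estimators=50,
    subforest_size=5,
)

model.tune(Y=Y, T=T, X=X, W=W)
model.fit(Y=Y, T=T, W=W, X=X, cache_values=False)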

When I call the score method, I get the following error:
[Screenshot of the error, taken 2023-04-22 at 11:00:22 PM]

My guess is that the mc_iters parameter is the cause: when I changed it to mc_iters=2, I could run the score method without any error. I'd really appreciate any advice or help. Thanks!

@kbattocchi
Collaborator

Thanks for reporting this, and sorry for the inconvenience. This certainly looks like a bug, and it is not limited to CausalForestDML. We'll take a look.

@kbattocchi kbattocchi self-assigned this Apr 26, 2023
@Fan981230
Author

Thanks, Keith! May I ask what might be causing the bug? Is it that with multiple mc_iters the model passes multiple results to the method, and that triggers the index error?

@kbattocchi
Collaborator

The nuisances are indexed by both mc_iters and the number of cross-fitting folds, and the logic we use to index into them here isn't correct:

nuisances[it][i * n_iters + j] = nuis

The problem is that i and j should be reversed (or n_splits should be used instead of n_iters).
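
To illustrate the indexing problem described above, here is a minimal, self-contained sketch. It is not the EconML source. It assumes that `i` loops over the `mc_iters` repetitions, that `j` loops over the cross-fitting folds, and that the nuisance buffer has `n_iters * n_splits` slots. The helper names `flat_indices` and `indexing_is_valid` are hypothetical, and the fold count of 2 is an assumption chosen for the example.

```python
def flat_indices(n_iters, n_splits, stride):
    """Flat positions visited when i runs over MC iterations and j over folds.

    `stride` is the multiplier applied to i: the expression quoted in the
    thread uses n_iters, while the suggested correction uses n_splits.
    """
    return [i * stride + j for i in range(n_iters) for j in range(n_splits)]


def indexing_is_valid(n_iters, n_splits, stride):
    """True if every slot of a buffer of size n_iters * n_splits is hit exactly once."""
    size = n_iters * n_splits
    idx = flat_indices(n_iters, n_splits, stride)
    in_range = all(0 <= k < size for k in idx)
    return in_range and len(set(idx)) == size


# Assumed setup: 4 MC iterations, 2 cross-fitting folds (hypothetical values).
n_iters, n_splits = 4, 2

# Multiplying i by n_iters steps past the end of the 8-slot buffer.
buggy_ok = indexing_is_valid(n_iters, n_splits, stride=n_iters)

# Multiplying i by n_splits covers slots 0..7 exactly once.
fixed_ok = indexing_is_valid(n_iters, n_splits, stride=n_splits)

# When n_iters == n_splits the two expressions coincide.
equal_case_ok = indexing_is_valid(2, 2, stride=2)

print(buggy_ok, fixed_ok, equal_case_ok)
```

Under these assumptions, the original expression reaches index 8 at `i = 2`, which is outside an 8-slot buffer and is consistent with the reported IndexError. With `mc_iters=2` and 2 folds, `n_iters` equals `n_splits`, so both expressions give the same indices, which would explain why that setting ran without error.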

kbattocchi added a commit that referenced this issue Apr 26, 2023
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
kgao pushed a commit that referenced this issue May 18, 2023
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: kgao <kevin.leo.gao@gmail.com>
star1327p pushed a commit to star1327p/EconML that referenced this issue Jul 19, 2023
Signed-off-by: Keith Battocchi <kebatt@microsoft.com>
Signed-off-by: star1327p <star1327p@gmail.com>