
Error when using continuous features 'capital_gain' and 'capital_loss' on Adult Income #13

Closed
aivodji opened this issue Apr 10, 2020 · 5 comments


aivodji commented Apr 10, 2020

Hi,
The examples you provided for Adult Income work perfectly.
However, when 'capital_gain' and 'capital_loss' are used in addition to 'age' and 'hours_per_week', DiCE has trouble handling those features. I suspect this is because they are highly skewed. Did you run into the same thing? Is that why they are not used in the examples you provided? Thanks!

Collaborator

raam93 commented Apr 13, 2020

Can you post the error you are getting? It's hard to say anything without looking at the error.

Yes, you are right regarding capital_gain and capital_loss: they are highly skewed, and most of the observations have a value of 0. So we did not use those variables in our experiments. As we mention in the paper, the data preparation is mostly based on the analysis here: https://rpubs.com/H_Zhu/235617.

Author

aivodji commented Apr 18, 2020

Thank you for your feedback on capital_gain and capital_loss.
The error is "RuntimeWarning: divide by zero encountered in double_scalars feature_weights[feature] = round(1/normalized_mads[feature], 2)", and the related file is DiCE/dice_ml/dice_interfaces/dice_tensorflow1.py at line 279.
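
For readers following along, here is a minimal sketch of why a feature that is mostly zeros can end up with a MAD of 0, which is what makes the reciprocal in the quoted line undefined. It assumes MAD means the median absolute deviation from the median; the helper name and the sample values are made up for illustration and are not taken from DiCE or the Adult Income data.

```python
import numpy as np


def median_abs_deviation(values):
    """Median of the absolute deviations from the median (illustrative helper)."""
    values = np.asarray(values, dtype=float)
    return float(np.median(np.abs(values - np.median(values))))


# Made-up samples: a skewed column where 8 of 10 values are 0,
# and a more evenly spread column for comparison.
capital_gain = [0, 0, 0, 0, 0, 0, 0, 0, 5000, 12000]
age = [25, 32, 47, 51, 38, 29, 60, 44, 36, 41]

mad_gain = median_abs_deviation(capital_gain)  # median is 0, so most deviations are 0
mad_age = median_abs_deviation(age)

print("capital_gain MAD:", mad_gain)
print("age MAD:", mad_age)
```

Whenever more than half of a column shares a single value, the median equals that value and the median of the deviations is 0, so an inverse-MAD weight cannot be computed for it without a guard.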

Collaborator

raam93 commented Apr 18, 2020

Yeah, that happens when the MAD is 0. The code has now been updated to replace all such occurrences with 1. So, currently, features with 0 MAD are not given any weights in optimization. Though that might make sense in many cases, we welcome other ways of handling this situation.
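
The replacement described above can be sketched roughly as follows. This is not the actual DiCE code: the function names and the MAD values are hypothetical, and only the `round(1/normalized_mads[feature], 2)` expression comes from the error message quoted in this thread.

```python
def replace_zero_mads(normalized_mads):
    """Return a copy of the MAD dict with every 0 replaced by 1 (hypothetical helper)."""
    return {
        feature: (mad if mad != 0 else 1.0)
        for feature, mad in normalized_mads.items()
    }


def inverse_mad_weights(normalized_mads):
    """Compute inverse-MAD feature weights after guarding against zeros."""
    valid_mads = replace_zero_mads(normalized_mads)
    return {feature: round(1 / mad, 2) for feature, mad in valid_mads.items()}


# Made-up normalized MADs; the two capital_* features are 0, as in the report.
normalized_mads = {
    "age": 0.1,
    "hours_per_week": 0.02,
    "capital_gain": 0.0,
    "capital_loss": 0.0,
}

feature_weights = inverse_mad_weights(normalized_mads)
print(feature_weights)
```

With this guard, a zero-MAD feature ends up with a weight of 1 instead of triggering a division by zero, while the other features keep their inverse-MAD weights.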

raam93 closed this as completed Apr 18, 2020
Author

aivodji commented Apr 20, 2020

Thanks for your feedback. By the way, I think the PrivateData class in private_data_interface.py also needs a get_valid_mads method.

Collaborator

raam93 commented Apr 20, 2020

Ah, good catch! Thanks for pointing that out. I will update the private data notebook in a while.
