Error when using continuous features 'capital_gain' and 'capital_loss' on Adult Income #13
Comments
Can you post the error you are getting? It's hard to say anything without seeing it. Yes, you are right about capital_gain and capital_loss: they are highly skewed, and most observations have a value of 0, so we did not use those variables in our experiments. As we mention in the paper, the data preparation is mostly based on the analysis here: https://rpubs.com/H_Zhu/235617.
Thank you for your feedback on capital_gain and capital_loss.
Yeah, that happens when the MAD (median absolute deviation) is 0. The code is now updated and will replace all such occurrences with 1. So, currently, features with 0 MAD are not given any weights in optimization. Though that might make sense in many cases, we welcome other ways of handling this situation.
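For readers following along, here is a minimal sketch of the fallback described above. It is not DiCE's actual code: the helper name `valid_mads` and the sample values are hypothetical, chosen only to show how a column that is mostly 0 ends up with a MAD of 0 and is then replaced with 1.

```python
from statistics import median


def valid_mads(columns):
    """Return the median absolute deviation (MAD) per feature.

    A MAD of 0 is replaced with 1.0 so that later code can divide by it
    safely. Hypothetical helper for illustration, not the library's API.
    """
    mads = {}
    for name, values in columns.items():
        center = median(values)
        # MAD: median of the absolute deviations from the median.
        mad = median([abs(v - center) for v in values])
        mads[name] = float(mad) if mad > 0 else 1.0
    return mads


# Illustrative values only: a skewed column where most entries are 0.
data = {
    "age": [25, 38, 28, 44, 18, 34],
    "capital_gain": [0, 0, 0, 7688, 0, 0],
}
mads = valid_mads(data)
print(mads)
```

Because more than half of the `capital_gain` values equal the median (0), its MAD is 0 and the fallback value of 1.0 is used instead.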
Thanks for your feedback. By the way, I think the PrivateData class in private_data_interface.py also needs a get_valid_mads method.
Ah, good catch! Thanks for pointing that out. I will update the private data notebook in a while.
Hi,
The examples you provided for Adult Income work perfectly.
However, when 'capital_gain' and 'capital_loss' are used in addition to 'age' and 'hours_per_week', DiCE has trouble dealing with those features. I suspect it may have something to do with the fact that they are highly skewed. Did you experience the same thing? Is that why they are not used in the examples you provided? Thanks!