Gini in quality #33

Closed
Mateko opened this issue Jun 3, 2020 · 9 comments

Comments

@Mateko

Mateko commented Jun 3, 2020

Gini returns bad values: it returns 42-43 for all variables (checked on both binned and non-binned values).

@Secbone
Member

Secbone commented Jun 6, 2020

@Mateko Could you supply a case for more details?

@Mateko
Author

Mateko commented Jun 8, 2020

@Secbone I'm testing this on my data. The IV values match those from my own function, but gini only returns values in the range 42-43 for every variable in the dataset, which is definitely wrong.

@Secbone
Member

Secbone commented Jun 8, 2020

@Mateko The gini value depends on the data, so you can't say that values between 42 and 43 are wrong. Could you supply a case so we can find out whether it is wrong?

BTW, the gini value in quality is conditional gini, not the gini value of the feature.
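
To illustrate the distinction, here is a minimal sketch, assuming "conditional gini" means the Gini impurity of the target averaged over the groups defined by each feature value, weighted by group size. The function names `gini_impurity` and `conditional_gini` and the toy data are hypothetical and are not taken from the library's source.

```python
from collections import Counter, defaultdict


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def conditional_gini(feature, target):
    """Size-weighted average of the target's Gini impurity within each feature value."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    n = len(target)
    return sum(len(ys) / n * gini_impurity(ys) for ys in groups.values())


# Toy data (hypothetical): 10 rows, 30% of them are defaults (target == 1).
target = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
useful = ["a", "a", "a", "b", "b", "b", "b", "b", "b", "b"]  # separates the classes
weak = ["a", "a", "b", "b", "b", "a", "a", "a", "b", "b"]    # barely informative

print(round(gini_impurity(target), 4))             # impurity of the target alone, about 0.42
print(round(conditional_gini(useful, target), 4))  # 0.0: every group is pure
print(round(conditional_gini(weak, target), 4))    # about 0.40: close to the target's own impurity
```

Under this assumption, the conditional value can never exceed the impurity of the target itself, and a weak feature lands just below that ceiling. A set of weak features would therefore all show values in a narrow band, while a lower value indicates a more informative feature.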

@Mateko
Author

Mateko commented Jun 8, 2020

[image]
It looks like a constant value, but it is not. How could a variable with an IV of 0.04 have a gini of 43?
I probably don't understand something.

@Secbone
Member

Secbone commented Jun 8, 2020

@Mateko The formula for the gini value in quality is [formula image], which is used to measure the correlation between the feature and the target. You can see this in the order of the values in your result: the higher the gini value, the less useful the feature.

@Mateko
Author

Mateko commented Jun 9, 2020

If the correlation between a feature and default is higher, then that feature should be more useful.

@Secbone
Member

Secbone commented Jun 9, 2020

@Mateko Yes, you can think of the gini value as a negative correlation value.

@Mateko
Author

Mateko commented Jun 9, 2020

So do you think it is good to name it Gini here? Maybe it would be better to take the default % there and compute a ROC AUC score from it?

@Secbone
Member

Secbone commented Jun 9, 2020

@Mateko Of course it is Gini. ROC and AUC are usually used as model metrics in binary classification, but here we need a value that measures features in many cases, not only binary ones.
So I think Gini is at least not a bad choice here.
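
For comparison, the alternative raised in the question can be sketched as follows, assuming it refers to the Gini coefficient commonly used in credit scoring, `2 * AUC - 1`, with each row scored by the default rate of its bin. The helper names `bin_default_rates`, `roc_auc`, and `gini_from_auc` and the toy data are hypothetical. This version only works for a binary target, which is the limitation mentioned above.

```python
from collections import defaultdict


def bin_default_rates(feature, target):
    """Score each row by the share of defaults (target == 1) in its feature bin."""
    groups = defaultdict(list)
    for x, y in zip(feature, target):
        groups[x].append(y)
    rates = {x: sum(ys) / len(ys) for x, ys in groups.items()}
    return [rates[x] for x in feature]


def roc_auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gini_from_auc(scores, labels):
    """Gini coefficient derived from AUC: 2 * AUC - 1."""
    return 2.0 * roc_auc(scores, labels) - 1.0


# Toy data (hypothetical): 10 rows, 30% defaults.
target = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
useful = ["a", "a", "a", "b", "b", "b", "b", "b", "b", "b"]
weak = ["a", "a", "b", "b", "b", "a", "a", "a", "b", "b"]

print(round(gini_from_auc(bin_default_rates(useful, target), target), 4))  # 1.0
print(round(gini_from_auc(bin_default_rates(weak, target), target), 4))    # 0.2381
```

With this definition a higher value means a more useful feature, which is the opposite direction from the conditional gini discussed in this thread.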

Mateko closed this as completed Jun 15, 2020