
RFE: more average_method's for normalized_mutual_info_score #12484

Open · sam-s opened this issue Oct 29, 2018 · 6 comments

@sam-s (Contributor) commented Oct 29, 2018

Description

It would be nice if normalized_mutual_info_score could compute the uncertainty coefficient, also known as proficiency.

Implementation

For example, average_method could accept 'labels_true', normalizing the mutual information by the entropy of the true labels, which yields proficiency. A sketch of this normalization follows.
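
For concreteness, here is a minimal standalone sketch of the proposed normalization (the helper name proficiency is illustrative, not proposed scikit-learn API): it computes the uncertainty coefficient U(true | pred) = I(true; pred) / H(true).

import numpy as np
from sklearn.metrics import mutual_info_score

def proficiency(labels_true, labels_pred):
    # I(true; pred) in nats
    mi = mutual_info_score(labels_true, labels_pred)
    # H(true) in nats, from the empirical distribution of the true labels
    _, counts = np.unique(labels_true, return_counts=True)
    p = counts / counts.sum()
    h_true = -np.sum(p * np.log(p))
    return mi / h_true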

Expected Results

normalized_mutual_info_score(
    labels_true=[0, 1, 0, 1],
    labels_pred=[0, 1, 0, 2],
    average_method='labels_true')

should evaluate to 1, since labels_pred is a refinement of labels_true and therefore recovers all of the information in it (I(true; pred) = H(true)).

@aishgrt1 (Contributor) commented
@jnothman Shall I take this?

@jnothman (Member) commented
It seems reasonable enough, given that we already support various normalizers, and this one makes the normalization independent of labels_pred. @sam-s, is this used often in practice for clustering evaluation?

@sam-s (Contributor, Author) commented Oct 30, 2018

I use it all the time.
The proficiency metric measures the proportion of information in the target distribution that the classifier recovers, so, unlike all the other normalizations, it has a clear, interpretable meaning.
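
For comparison, a runnable check of the existing normalizers on the same example (note that min also happens to return 1 here because H(true) < H(pred), but min is symmetric and also equals 1 when the prediction is strictly coarser than the target, so it lacks the one-directional "proportion of target information recovered" reading):

from sklearn.metrics import normalized_mutual_info_score

for method in ['min', 'geometric', 'arithmetic', 'max']:
    score = normalized_mutual_info_score(
        [0, 1, 0, 1], [0, 1, 0, 2], average_method=method)
    print(method, round(score, 3))
# min 1.0, geometric 0.816, arithmetic 0.8, max 0.667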

@sam-s (Contributor, Author) commented Oct 30, 2018

BTW, it would be nice to be able to specify the average_method for scoring in cross_validate, e.g. 'normalized_mutual_info_score/average_method=labels_true' instead of make_scorer(lambda y, yh: normalized_mutual_info_score(y, yh, average_method='labels_true')).

@amueller (Member) commented
make_scorer(normalized_mutual_info_score, average_method='labels_true') works, right?
We could add a named scorer; we have done that for the different precision/recall averages.
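
For reference, a runnable usage sketch of that workaround plugged into cross_validate (the dataset and estimator are illustrative; 'arithmetic' stands in for the proposed 'labels_true', which does not exist yet):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, normalized_mutual_info_score
from sklearn.model_selection import cross_validate

# Extra keyword arguments to make_scorer are forwarded to the metric;
# swap in average_method='labels_true' once such an option is available.
scorer = make_scorer(normalized_mutual_info_score,
                     average_method='arithmetic')

X, y = load_iris(return_X_y=True)
cv_results = cross_validate(LogisticRegression(max_iter=1000), X, y,
                            scoring=scorer, cv=5)
print(cv_results['test_score'])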

@sam-s (Contributor, Author) commented Oct 30, 2018

@amueller right, thanks, I missed that.
