[MRG + 1] Add fowlkes-mallows and other supervised cluster metrics to SCORERS dict so they can be used in hyper-param search #8117
This adds all cluster metrics that use supervised evaluation, like fowlkes_mallows_score etc., to the SCORERS dict.
Code to reproduce
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.cluster import KMeans
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> X, y = iris.data, iris.target
>>> km = KMeans(random_state=42)
>>> grid_search = GridSearchCV(km, param_grid=dict(n_clusters=[2, 3, 4, 5]),
...                            scoring='fowlkes_mallows_score')
>>> grid_search.fit(X, y).best_params_['n_clusters']
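For reference, the Fowlkes-Mallows index being wired into SCORERS here compares a predicted labeling against the true one via pair counts; a minimal, self-contained sketch (a hand-rolled illustration, not scikit-learn's implementation) computes it from the contingency table:

```python
from collections import Counter
from math import sqrt

def fowlkes_mallows(labels_true, labels_pred):
    """Fowlkes-Mallows index of two labelings of the same samples."""
    n = len(labels_true)
    # Contingency counts: how often each (true, pred) label pair co-occurs.
    pair = Counter(zip(labels_true, labels_pred))
    row = Counter(labels_true)   # cluster sizes in the true labeling
    col = Counter(labels_pred)   # cluster sizes in the predicted labeling
    # Standard formulation: tk / sqrt(pk * qk), where
    # tk = sum(n_ij^2) - n, pk = sum(a_i^2) - n, qk = sum(b_j^2) - n.
    tk = sum(c * c for c in pair.values()) - n
    pk = sum(c * c for c in row.values()) - n
    qk = sum(c * c for c in col.values()) - n
    if pk == 0 or qk == 0:
        return 0.0
    return tk / sqrt(pk * qk)
```

Note the score is invariant to label permutation, e.g. `fowlkes_mallows([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0, which is why it can score clusterers whose label ids carry no meaning.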
In this branch
Ah so no clustering metric is added :/
This doesn't work:
grid_search = GridSearchCV(km, param_grid=dict(n_clusters=[2, 3, 4]),
                           scoring='fowlkes_mallows_score')
grid_search.fit(X, y)
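Until the string alias lands, a workaround on released versions is to wrap the metric with make_scorer, which builds a scorer(estimator, X, y) that calls estimator.predict(X) and passes the result to the metric; this works for KMeans since it has a predict method. A hedged sketch:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import fowlkes_mallows_score, make_scorer

X, y = load_iris(return_X_y=True)

# Wrap the supervised clustering metric as a scorer object.
fmi_scorer = make_scorer(fowlkes_mallows_score)

grid_search = GridSearchCV(
    KMeans(random_state=42),
    param_grid=dict(n_clusters=[2, 3, 4, 5]),
    scoring=fmi_scorer,
)
grid_search.fit(X, y)
print(grid_search.best_params_)
```

This sidesteps the SCORERS dict entirely, at the cost of importing the metric explicitly instead of passing a string.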
Oh, strange. I don't mind other supervised measures being there, but really we need to deal with the scoring framework for clusterers. (Might be an interesting thing to shape up as a GSoC project??)…
On 27 December 2016 at 09:25, (Venkat) Raghav (Rajagopalan) wrote:
> (All cluster metrics that compare true and predicted labels like a classification metric?)
+1. Maybe we should start a dedicated issue or wiki page well before the GSoC timeline to sketch out the design, so the student can spend less time on API design and more time on implementation...