Evaluation of time series clustering #14

Closed
jonpappalord opened this issue Oct 23, 2017 · 8 comments
Comments

@jonpappalord

It would be nice to allow the user to evaluate the quality of a clustering by providing the equivalent of the silhouette score (or related metric) for time series clustering.

@rtavenar
Member

Hi,

This is a very good idea! I'll do that as soon as I can. If anyone is interested in lending a hand with this, the idea would be to adapt the code from sklearn to tslearn formats (it shouldn't be too hard).

@fonnesbeck

Can scikit-learn's silhouette_score not be used directly with tslearn.metrics.soft_dtw as the metric?

@rtavenar
Member

rtavenar commented Nov 29, 2017

Hmm, a first problem is that tslearn and sklearn do not expect the same format for the parameter X (tslearn uses a 3D array of shape (n, sz, d), while sklearn expects a 2D array), and sklearn checks that the provided X is valid.

Something that might work would be to fool sklearn by providing a reshaped X array and a metric function that reshapes arrays back before computing their similarity. Something like:

n, sz, d = X.shape
sklearn_X = X.reshape((n, -1))
sklearn_metric = lambda x, y: metric_fun(x.reshape((sz, d)), y.reshape((sz, d)))
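
To make the snippet above concrete, here is a minimal, self-contained sketch of the reshape trick passed to scikit-learn's silhouette_score as a callable metric. The names dtw_distance and ts_silhouette are hypothetical, introduced only for illustration; dtw_distance is a plain dynamic time warping stand-in for metric_fun and is not the tslearn implementation.

```python
import numpy as np
from sklearn.metrics import silhouette_score


def dtw_distance(s1, s2):
    """Plain DTW between two series of shape (sz, d).

    Hypothetical stand-in for metric_fun, written only for this sketch.
    """
    n1, n2 = len(s1), len(s2)
    acc = np.full((n1 + 1, n2 + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            # Squared Euclidean cost between the two time steps
            cost = np.sum((s1[i - 1] - s2[j - 1]) ** 2)
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return np.sqrt(acc[n1, n2])


def ts_silhouette(X, labels, metric_fun=dtw_distance):
    """Silhouette score for X of shape (n, sz, d), using the reshape trick."""
    n, sz, d = X.shape
    # sklearn only accepts 2D input, so flatten each series to length sz * d
    sklearn_X = X.reshape((n, -1))

    def sklearn_metric(x, y):
        # Restore the (sz, d) layout before calling the time series metric
        return metric_fun(x.reshape((sz, d)), y.reshape((sz, d)))

    return silhouette_score(sklearn_X, labels, metric=sklearn_metric)


# Toy data: two well-separated groups of univariate series
rng = np.random.RandomState(0)
X = np.concatenate([rng.randn(5, 20, 1), rng.randn(5, 20, 1) + 10.0])
labels = np.array([0] * 5 + [1] * 5)
score = ts_silhouette(X, labels)
print("Silhouette score: %.3f" % score)
```

Note that a Python callable is evaluated once per pair of series, so this approach can be slow on large datasets.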

@rtavenar
Member

I just pushed an attempt to provide this functionality. Could you guys test it and give some feedback about it?

@rtavenar
Member

@jonpappalord @fonnesbeck
Guys, could you tell me whether the implemented feature matches your needs? If so, I will close this issue.

@fonnesbeck

I'm testing it now, but I'm using a gigantic database, so it will take a few hours to complete.

@rtavenar
Member

Since I have had no news, I am closing this issue. Feel free to re-open it if needed.

@ceydaakbulut95

ceydaakbulut95 commented Aug 13, 2022

Hello. Good to see these comments :) I have tried what you did and it worked! Thanks!


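import numpy as np
from sklearn.metrics import silhouette_score

mySeries = np.array(mySeries)  # expected shape: (n, sz, d)
n, sz, d = mySeries.shape
sklearn_X = mySeries.reshape((n, -1))  # flatten each series to length sz * d
# Wrapper around a time series metric (metric_fun is not defined in this snippet)
sklearn_metric = lambda x, y: metric_fun(x.reshape((sz, d)), y.reshape((sz, d)))
km.fit_predict(sklearn_X)  # km is assumed to be a clustering model created earlier
# Note: this scores with plain Euclidean distance on the flattened series;
# pass metric=sklearn_metric instead to score with metric_fun
score = silhouette_score(sklearn_X, km.labels_, metric='euclidean')

print('Silhouette Score: %.3f' % score)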
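
As a possible variant, the pairwise distances can be computed once with a time series metric and handed to silhouette_score with metric="precomputed". The sketch below is only illustrative: pairwise_ts_distances and pointwise_distance are hypothetical helper names, and pointwise_distance is a placeholder that reproduces the Euclidean score on flattened series; swapping in a DTW-style function would give a time-series-aware score.

```python
import numpy as np
from sklearn.metrics import silhouette_score


def pairwise_ts_distances(X, metric_fun):
    """Symmetric (n, n) distance matrix for X of shape (n, sz, d)."""
    n = X.shape[0]
    dists = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is evaluated once and mirrored across the diagonal
            dists[i, j] = dists[j, i] = metric_fun(X[i], X[j])
    return dists


def pointwise_distance(s1, s2):
    # Placeholder metric (an assumption for this sketch): Euclidean distance
    # between two series compared time step by time step, without warping
    return np.linalg.norm(s1 - s2)


# Toy data: two well-separated groups of univariate series
rng = np.random.RandomState(0)
X = np.concatenate([rng.randn(5, 20, 1), rng.randn(5, 20, 1) + 10.0])
labels = np.array([0] * 5 + [1] * 5)

dists = pairwise_ts_distances(X, pointwise_distance)
score_pre = silhouette_score(dists, labels, metric="precomputed")

# Same quantity computed as in the snippet above, on flattened series
score_flat = silhouette_score(X.reshape((len(X), -1)), labels, metric="euclidean")
print("precomputed: %.3f, flattened: %.3f" % (score_pre, score_flat))
```

With the placeholder metric the two scores agree up to floating-point error, which shows that metric='euclidean' on flattened series ignores any temporal alignment.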
