Identical shapelets #590

Closed
Bruno-Hanzen opened this issue Jan 3, 2021 · 6 comments
Labels
module:classification classification module: time series classification

Comments

@Bruno-Hanzen

When fitting shapelets to a set of time series, I get the same shapelet twice (from the same time series, at different positions).
The series contains stretches of constant values. Two of them are 11 positions long, so after normalization they yield two shapelets whose "data" is 11 zeros but whose "start_pos" differs.
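
To illustrate the effect, here is a minimal sketch in plain Python. It assumes the common convention that a window with zero standard deviation is mapped to all zeros during z-normalization. The helper `z_normalise`, the `eps` guard, and the toy series are made up for this illustration and are not sktime code.

```python
from statistics import fmean, pstdev


def z_normalise(window, eps=1e-8):
    """Z-normalise a window; a constant window becomes all zeros (assumed convention)."""
    mean = fmean(window)
    std = pstdev(window)
    if std < eps:
        # Zero variance: dividing by std is undefined, so return zeros instead.
        return [0.0] * len(window)
    return [(value - mean) / std for value in window]


# Toy series with two constant stretches of length 11 at different levels.
series = [3.0] * 11 + [1.0, 4.0, 2.0] + [7.5] * 11

first = z_normalise(series[0:11])    # constant stretch starting at position 0
second = z_normalise(series[14:25])  # constant stretch starting at position 14

# Same normalized data, different start positions.
print(first == second)
```

Both windows collapse to the same 11 zeros even though the raw values differ, so only the start position tells them apart.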

I suggest comparing the data part of each new shapelet, for identity or similarity, with the shapelets already discovered. This would save time in both "fit" and "transform".
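
A rough sketch of what such a check could look like, assuming each candidate carries its normalized data, a start position, and an information gain. `Candidate`, `drop_duplicate_shapelets`, and the `decimals` tolerance are hypothetical names invented for this example, not sktime API, and the sample values are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Candidate:
    """Hypothetical stand-in for a shapelet candidate (not an sktime class)."""
    series_id: int
    start_pos: int
    data: Tuple[float, ...]
    info_gain: float


def drop_duplicate_shapelets(candidates: List[Candidate], decimals: int = 6) -> List[Candidate]:
    """Keep one candidate per distinct data vector, preferring the higher info gain."""
    best: Dict[Tuple[float, ...], Candidate] = {}
    for cand in candidates:
        # Rounding makes near-identical data compare equal (assumed tolerance).
        key = tuple(round(value, decimals) for value in cand.data)
        kept = best.get(key)
        if kept is None or cand.info_gain > kept.info_gain:
            best[key] = cand
    # Return survivors ordered from highest to lowest info gain.
    return sorted(best.values(), key=lambda c: c.info_gain, reverse=True)


candidates = [
    Candidate(series_id=0, start_pos=5, data=(0.0,) * 11, info_gain=0.18),
    Candidate(series_id=0, start_pos=40, data=(0.0,) * 11, info_gain=0.14),
    Candidate(series_id=1, start_pos=12, data=(-1.0, 0.0, 1.0), info_gain=0.10),
]

unique = drop_duplicate_shapelets(candidates)
for cand in unique:
    print(cand.series_id, cand.start_pos, cand.info_gain)
```

Keying on the data rather than on the information gain is the point here: two candidates with identical data are merged even when their scores differ. A similarity check (for example, a distance threshold) would need a pairwise comparison instead of a dictionary lookup.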

@mloning
Contributor

mloning commented Jan 5, 2021

@jasonlines @TonyBagnall any idea?

@TonyBagnall
Contributor

Hi, yes, this is possible, although we do have a threshold to exclude shapelets with identical info gain. STC could do with some work; it's on my list after sorting out distance functions.

@Bruno-Hanzen
Author

Hi, I can confirm that the shapelets were identical, but info gain was different (0.18 and 0.14), if I remember correctly.

@TonyBagnall
Contributor

Hmm, curious. We will look into it, thanks for pointing it out.

@TonyBagnall TonyBagnall added the module:classification classification module: time series classification label Jun 30, 2021
@MatthewMiddlehurst
Contributor

The new shapelet_transform in #1490 removes identical shapelets from the final set. This issue still remains for the original transform(s) in shapelets.py, however. I will leave this open for now.

@TonyBagnall
Contributor

I think we can close it; the original shapelets are no longer needed. OP, feel free to reopen if you really need the original fixed.


4 participants