[ENH] time series classification: shorthands for common sktime pipelines including TsfreshClassifier #1063
Comments
Sure, although this is a little completist. The reason for having RocketClassifier etc. is that there are published results with a specific setup which we want to reproduce. The TSfresh classifier is not at all competitive according to Markus's experiments. MPClassifier is kind of ok (according to my own paper!) and I look forward to testing out the SignatureClassifier in the near future. We call the SevenNumberClassifier the summary stats classifier, but that is not informative enough. If there is not a published version to copy, what would we default the classifier to? Random Forest with 500 trees I guess, although CAWPE would be good from our perspective.
oh, and a version of the SevenNumberClassifier is the first default classifier I ever tried on TSC in 2012 :)
Yes, I'd either: use
I'd invoke here "relevance justifies inclusion" rather than "performance justifies inclusion".
PS: "according to Markus' experiments" (that are nowhere published?) is not a scientific reference. |
pinging @jambo6 since I mis-spelt his name above.
this is not a scientific forum, I was speaking anecdotally. I'll take any bets you want on tsfresh + random forest against anything close to SOTA; I'll run the experiment next week now you have said that :) But yes, all this is fine by me. I have, after all, championed static classifiers against pure composition; having both routes is best.
one question is where to put them. I try to group by the core nature of the transformation involved to provide a basic, inevitably flawed, taxonomy.
I believe you -
indeed, so let's be consistent! 😃
I'd suggest a folder
I'm not sure about that, since they are all based on some form of feature extraction except for distance-based, although distances as features is itself a valid and published approach. I'll have a think about this.
Yes, but not all TSC are of the simple form above; they typically also do something else.
Going to assign myself to this if no one else wants to take it, seems like a good series of small tasks. Can put the Catch22Classifier with them.
yeah ok, that makes sense. We could have methods to create "standard" configurations rather than a class for each. So thinking Signature
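For illustration, a minimal sketch of that "standard configuration" idea, assuming a plain factory function rather than one class per feature extractor. The name `make_feature_classifier` is hypothetical, not an existing sktime API; the 500-tree random forest default echoes the suggestion earlier in this thread.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline, make_pipeline


def make_feature_classifier(extractor, estimator=None) -> Pipeline:
    """Compose a feature-extraction transformer with a tabular sklearn classifier.

    Hypothetical factory: one function producing "standard" configurations
    instead of a dedicated class per feature extractor.
    """
    if estimator is None:
        # default discussed in this thread: random forest with 500 trees
        estimator = RandomForestClassifier(n_estimators=500)
    return make_pipeline(extractor, estimator)
```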
@MatthewMiddlehurst did you complete this? We should write it up as an arXiv/workshop paper.
@TonyBagnall Waiting on #1329 really. After the summary classifier is done and the package is refactored we can close this, I think.
One of our most requested/used time series classifiers is the pipeline of `tsfresh` feature extraction and then an `sklearn` classifier: `TsfreshClassifier` for that (a sketch of this pattern is included after the list below).

We should create a module with simple "feature extraction based" TSC strategies like this, e.g., `feature_extr_based`.

Other shorthands that would be great to have:

- `TabularizerClassifier`, using the tabularizer; a bonus feature would be using its potential "good first issue" extension to be created, the binner: [ENH] Implement `TimeBinner` transformer, regular binning with aggregation for irregular time series #242
- `MatrixProfileClassifier`, using the matrix profile
- `SignatureClassifier`, using signatures - this already exists in @jambo6's work, so just needs to be moved into the same "TSC type" sub-folder
- `SevenNumberClassifier`, using the seven-number summary of the series (quartiles, mean, variance). A nice feature would be the ability to specify which of these to use, or more "sample summaries" like kurtosis or other percentiles; this should not use any sequential features, only features where the order does not matter.

All of these will have parameters:

- `estimator`, an `sklearn` classifier; and any parameters that come from the feature extractor, without an additional nesting level (which you would get from `Pipeline`).
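For illustration, a minimal sketch of the pipeline that a `TsfreshClassifier` shorthand would wrap, composing tsfresh feature extraction with an `sklearn` random forest. This is a sketch of the pattern, not the proposed shorthand API; the import path of `TSFreshFeatureExtractor` and the `load_arrow_head` loader are assumptions that may differ across sktime versions.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

from sktime.datasets import load_arrow_head
from sktime.transformations.panel.tsfresh import TSFreshFeatureExtractor

X_train, y_train = load_arrow_head(split="train", return_X_y=True)
X_test, y_test = load_arrow_head(split="test", return_X_y=True)

# tsfresh feature extraction, then a plain tabular sklearn classifier
clf = make_pipeline(
    TSFreshFeatureExtractor(default_fc_parameters="efficient", show_warnings=False),
    RandomForestClassifier(n_estimators=500),
)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

The shorthand class would hide the `make_pipeline` step and expose the extractor's parameters plus `estimator` directly, without the extra nesting level mentioned above.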