Sktime: Python implementation of HIVE-COTE #842

iuimaki · 2021-04-26T23:38:56Z

Is your feature request related to a problem? Please describe.
I'm new to sktime and time series classification.
I have installed the sktime toolkit and am trying to apply HIVE-COTE 1.0 to my dataset. The code is as follows; x_train, y_train, x_test, and y_test are my prepared dataset. I call HIVE-COTE directly through the module HIVECOTEV1, but I am not sure whether this is the correct way to reproduce the HIVE-COTE algorithm. Does the HIVECOTEV1 module include the ensembling part (like TimeSeriesForestClassifier)?

Another question: my time series dataset is a 3D array in the format (number of samples, time steps, number of features). When I fed the dataset into TimeSeriesForestClassifier, I got the error shown below. After processing the dataset with ColumnConcatenator(), it runs smoothly. Does that mean only 2D arrays can be fed into this algorithm, and into HIVE-COTE as well?
The shapes of x_train, x_test, y_train, and y_test are: (28, 1918, 62) (16, 1918, 62) (28,) (16,)
error information:
Traceback (most recent call last):
  File "h:/scipt/prediction/stacking for classification/HIVECOTE.py", line 73, in
    clf.fit(x_train, y_train)
  File "C:\Anaconda3\envs\lib\site-packages\sktime\series_as_features\base\estimators\interval_based_tsf.py", line 86, in fit
    coerce_to_numpy=True,
  File "C:\envs\ARTC\lib\site-packages\sktime\utils\validation\panel.py", line 187, in check_X_y
    coerce_to_pandas=coerce_to_pandas,
  File "C:\Anaconda3\envs\lib\site-packages\sktime\utils\validation\panel.py", line 87, in check_X
    f"X must be univariate with X.shape[1] == 1, but found: "
ValueError: X must be univariate with X.shape[1] == 1, but found: X.shape[1] == 1918.
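
A note on the array layout behind this error: the message is consistent with axis 1 being read as the number of variables per sample, so an array laid out as (samples, time steps, features) is seen as having 1918 variables. Below is a minimal NumPy sketch of the reshaping, assuming the classifier expects (samples, variables, time steps). The zero-filled array only stands in for the real data, and the concatenation step is a plain-NumPy analogue of joining all features into one long series, not the actual ColumnConcatenator implementation.

```python
import numpy as np

# Placeholder with the layout described above: (samples, time steps, features).
x_train = np.zeros((28, 1918, 62))

# Move the feature axis to position 1: (samples, features, time steps).
x_train_cf = np.transpose(x_train, (0, 2, 1))
print(x_train_cf.shape)  # (28, 62, 1918)

# Option 1: keep a single feature so the data is univariate.
# Indexing with a list keeps the middle axis, giving (28, 1, 1918).
x_train_one = x_train_cf[:, [0], :]
print(x_train_one.shape)

# Option 2: join all features end to end into one long series per sample,
# giving (28, 1, 62 * 1918).
x_train_concat = x_train_cf.reshape(x_train_cf.shape[0], 1, -1)
print(x_train_concat.shape)
```

Option 2 produces very long series (118916 points per sample here), which may be slow for interval- or shapelet-based classifiers.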

Describe the solution you'd like
It would be much appreciated if you could provide a demo of multivariate time series classification with HIVE-COTE in Python, on GitHub or in the sktime documentation. I think that would be very helpful for applying this state-of-the-art algorithm in industry applications.

Remmert-A · 2021-04-29T06:19:53Z

+1
An introduction to the usage and preprocessing requirements of the HIVECOTE module would be greatly appreciated by me as well.

iuimaki · 2021-04-30T01:03:53Z

Another question: how does the class_weight parameter in sktime work?
I want to use TimeSeriesForestClassifier for binary classification on an imbalanced dataset. The classes are labelled 0 (negative) and 1 (positive), and the observed data is in a ratio of about 9:1, with the majority of samples having a negative outcome.
The sktime documentation shows that there is a way to set a class_weight parameter, similar to scikit-learn. But it failed when I ran this: clf = TimeSeriesForestClassifier(n_estimators=100, class_weight={0: 1, 1: w}). The error is: TypeError: __init__() got an unexpected keyword argument 'class_weight'.
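
Since the constructor rejects class_weight here, one generic workaround is to rebalance the training set before fitting. The sketch below uses random oversampling of the minority class in plain NumPy. It is a substitute technique, not an equivalent of class_weight, and the function name oversample_to_balance and the toy data are made up for illustration.

```python
import numpy as np


def oversample_to_balance(X, y, seed=0):
    """Resample with replacement so every class matches the largest class.

    Hypothetical helper for illustration; X is indexed along axis 0 only,
    so 2D and 3D arrays both work.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for label, count in zip(classes, counts):
        idx = np.flatnonzero(y == label)
        # Keep every original sample, then draw the shortfall with replacement.
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    # Shuffle so the classes are not grouped together.
    order = rng.permutation(np.concatenate(keep))
    return X[order], y[order]


# Toy data with the 9:1 imbalance described above.
X = np.random.default_rng(1).normal(size=(100, 1, 50))
y = np.array([0] * 90 + [1] * 10)
X_bal, y_bal = oversample_to_balance(X, y)
print(np.bincount(y_bal))  # [90 90]
```

Only the training split should be resampled; the test split keeps its original class ratio so that the evaluation reflects the real distribution.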

TonyBagnall · 2021-04-30T09:16:28Z

hi, we are working on hive-cote for python, just tidying up and testing some features for the java version first. Once term has finished @MatthewMiddlehurst and I can give it our full attention.

TonyBagnall · 2021-04-30T09:20:09Z

The Python version has all sorts of Pythonesque issues (memory intensive, slow, etc.) which require significant engineering, and we are not really Python programmers, so it is painful. In the short term, if anyone wants to run HIVE-COTE v2, just email me at ajb@uea.ac.uk. We can help you get it running in Java (very easy: we can just give you the jar file and you can run it on the command line or with a script), or we can run it ourselves and just send you the results files.

iuimaki · 2021-04-30T10:02:46Z

> hi, we are working on hive-cote for python, just tidying up and testing some features for the java version first. Once term has finished @MatthewMiddlehurst and I can give it our full attention.

Heya! Thank you for the feedback :)

paulttt · 2021-05-20T20:37:42Z

Hi guys!
FYI, I was also trying to run HIVE-COTE v1 today and ran into the following error. Any help is appreciated, whether this is a known bug or I am doing something wrong here. My data is of shape X_train.shape = (N_samples, 1, time_bins) and y_train.shape = (N_samples,) with y_train = [0, 1].

RuntimeError                              Traceback (most recent call last)
<ipython-input-18-815c2880dcd0> in <module>
----> 1 hc.fit(X_train, y_train)

~/anaconda3/envs/py38/lib/python3.7/site-packages/sktime/classification/hybrid/_hivecote_v1.py in fit(self, X, y)
    101             time_contract_in_mins=60,
    102         )
--> 103         self.stc.fit(X, y)
    104         train_preds = cross_val_predict(
    105             ShapeletTransformClassifier(

~/anaconda3/envs/py38/lib/python3.7/site-packages/sktime/classification/shapelet_based/_stc.py in fit(self, X, y)
    119         self.classes_ = class_distribution(np.asarray(y).reshape(-1, 1))[0][0]
    120 
--> 121         self.classifier_.fit(X, y)
    122 
    123         self._is_fitted = True

~/anaconda3/envs/py38/lib/python3.7/site-packages/sklearn/pipeline.py in fit(self, X, y, **fit_params)
    328         """
    329         fit_params_steps = self._check_fit_params(**fit_params)
--> 330         Xt = self._fit(X, y, **fit_params_steps)
    331         with _print_elapsed_time('Pipeline',
    332                                  self._log_message(len(self.steps) - 1)):

~/anaconda3/envs/py38/lib/python3.7/site-packages/sklearn/pipeline.py in _fit(self, X, y, **fit_params_steps)
    294                 message_clsname='Pipeline',
    295                 message=self._log_message(step_idx),
--> 296                 **fit_params_steps[name])
    297             # Replace the transformer of the step with the fitted
    298             # transformer. This is necessary when loading the transformer

~/anaconda3/envs/py38/lib/python3.7/site-packages/joblib/memory.py in __call__(self, *args, **kwargs)
    350 
    351     def __call__(self, *args, **kwargs):
--> 352         return self.func(*args, **kwargs)
    353 
    354     def call_and_shelve(self, *args, **kwargs):

~/anaconda3/envs/py38/lib/python3.7/site-packages/sklearn/pipeline.py in _fit_transform_one(transformer, X, y, weight, message_clsname, message, **fit_params)
    738     with _print_elapsed_time(message_clsname, message):
    739         if hasattr(transformer, 'fit_transform'):
--> 740             res = transformer.fit_transform(X, y, **fit_params)
    741         else:
    742             res = transformer.fit(X, y, **fit_params).transform(X)

~/anaconda3/envs/py38/lib/python3.7/site-packages/sktime/transformations/base.py in fit_transform(self, Z, X)
     89         else:
     90             # Fit method of arity 2 (supervised transformation)
---> 91             return self.fit(Z, X).transform(Z)
     92 
     93     # def inverse_transform(self, Z, X=None):

~/anaconda3/envs/py38/lib/python3.7/site-packages/sktime/transformations/panel/shapelets.py in transform(self, X, y)
    699         if len(self.shapelets) == 0:
    700             raise RuntimeError(
--> 701                 "No shapelets were extracted in fit that exceeded the "
    702                 "minimum information gain threshold. Please retry with other "
    703                 "data and/or parameter settings."

RuntimeError: No shapelets were extracted in fit that exceeded the minimum information gain threshold. Please retry with other data and/or parameter settings.

MatthewMiddlehurst · 2021-05-21T08:22:05Z

Hi @paulttt, it's hard to know exactly what's going on without knowing a bit about the data. STC is definitely one of the more under-developed classifiers. We are hoping to sort out all our sktime classifiers after this teaching term at UEA ends.
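
When reporting a case like this, a quick summary of the data can help narrow things down. The sketch below is a set of generic sanity checks in NumPy (class counts, NaNs, constant series). These are not known causes of the shapelet error, and the helper name summarize_panel and the random data are illustrative only.

```python
import numpy as np


def summarize_panel(X, y):
    """Return basic facts about a (samples, variables, time steps) array.

    Hypothetical helper; the checks are generic, not specific to sktime.
    """
    labels, counts = np.unique(y, return_counts=True)
    spread = X.max(axis=-1) - X.min(axis=-1)
    return {
        "shape": X.shape,
        "class_counts": dict(zip(labels.tolist(), counts.tolist())),
        "n_nan": int(np.isnan(X).sum()),
        # A series with zero spread over time carries no shape information.
        "n_constant_series": int((spread == 0).sum()),
    }


# Random stand-in data; series 0 is made constant to trigger the last check.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1, 100))
X[0, 0, :] = 3.0
y = np.array([0, 1] * 10)
print(summarize_panel(X, y))
```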

paulttt · 2021-05-21T10:08:32Z

Hi @MatthewMiddlehurst,
Thanks for the feedback! I will try training the shapelet classifier alone on my data and see if I face similar problems.
I use continuous-time signals recorded from 5500 neurons. So far, I had no problems with any classification model. The exact shape is (5500, 1, 1750).

MatthewMiddlehurst · 2021-05-21T14:40:52Z

I'm a bit late to the previous comments on this issue, but when it comes to preprocessing I would look at the data_loading and classification notebooks in the examples folder. HIVE-COTE uses the same data format as other sktime classifiers. HIVE-COTE can currently only take univariate data, not datasets with multiple series per instance.
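
Given the univariate restriction, one common way to still use all variables is to fit one univariate classifier per variable and average the predicted probabilities. The following is a rough sketch of that idea, not how HIVE-COTE or sktime handles multivariate data. PerVariableEnsemble and make_classifier are hypothetical names, and the wrapped estimator is assumed to offer fit and predict_proba on (samples, 1, time steps) arrays, with probability columns ordered by sorted class label.

```python
import numpy as np


class PerVariableEnsemble:
    """Fit one univariate classifier per variable and average probabilities.

    Hypothetical wrapper. make_classifier is a zero-argument callable that
    returns a fresh estimator with fit(X, y) and predict_proba(X).
    """

    def __init__(self, make_classifier):
        self.make_classifier = make_classifier

    def fit(self, X, y):
        # X has shape (samples, variables, time steps).
        self.classes_ = np.unique(y)
        self.models_ = []
        for v in range(X.shape[1]):
            model = self.make_classifier()
            # Slice with v:v+1 to keep the (samples, 1, time steps) layout.
            model.fit(X[:, v : v + 1, :], y)
            self.models_.append(model)
        return self

    def predict_proba(self, X):
        probas = [
            model.predict_proba(X[:, v : v + 1, :])
            for v, model in enumerate(self.models_)
        ]
        # Unweighted mean over variables; assumes identical column order.
        return np.mean(probas, axis=0)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```

With 62 variables this means 62 separate fits, so the run time grows accordingly.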

TonyBagnall · 2021-07-14T11:48:29Z

update on this, Matt now has equivalence with tsml on DrCIF, TDE and Arsenal. Just STC to sort out now, which is a summer objective for my group

MatthewMiddlehurst · 2021-10-13T13:38:41Z

Close to completion, see #1504

MatthewMiddlehurst · 2021-10-15T20:20:00Z

The sktime master branch has an implementation of HIVE-COTE 2.0 and an updated version of HIVE-COTE 1.0 after the merge of #1504. If any issues arise with these, they can be raised as a separate issue.

mloning assigned TonyBagnall Apr 29, 2021

TonyBagnall added the module:classification label (classification module: time series classification) Jun 20, 2021

TonyBagnall assigned MatthewMiddlehurst Jul 14, 2021

MatthewMiddlehurst closed this as completed Oct 15, 2021
