This repository has been archived by the owner on Apr 10, 2024. It is now read-only.

[BUG] BadZipFile and ValueError on Wafer Dataset #49

Closed
SewoongLee opened this issue Jul 12, 2023 · 2 comments
Labels
bug (Something isn't working), dependency issues

Comments

@SewoongLee

Describe the bug
The 'Wafer' dataset fails with a data loading error, while 'GunPoint' loads fine.
[screenshot]

To Reproduce
You can reproduce the error by running the file at the following link:
https://1drv.ms/f/s!AuU5Lmr0utymk4IKdA1WNoxzkD0tNg

Expected behavior
Accuracy should be 1 on the Wafer dataset.

Code example
Same as above.

Environment

  • OS: Google Colab
  • Version of the convst package: latest (0.3.0)


@SewoongLee SewoongLee added the bug (Something isn't working) label Jul 12, 2023
@baraline
Owner

baraline commented Jul 12, 2023

Hi, thanks for raising the issue. Next time you create an issue, please paste your code in markdown using ```python your code```. Don't worry, the screenshot does the trick for this simple issue.

I can reproduce the issue on my side. The problem seems to be linked to an update of the source of the data (https://www.timeseriesclassification.com/dataset.php); the download link in the aeon package, which I use to pull the data, had not yet been updated.
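As an aside, one plausible way an outdated download link surfaces as `BadZipFile` is that the server answers with something that is not a zip archive (for example an HTML page). That is an assumption, not something confirmed in this thread. A minimal sketch of checking a downloaded payload before extracting it, using only the standard library; `looks_like_zip` and the dummy payloads are hypothetical and not part of aeon:

```python
import io
import zipfile


def looks_like_zip(payload: bytes) -> bool:
    """Hypothetical helper: True if the bytes form a readable zip archive."""
    return zipfile.is_zipfile(io.BytesIO(payload))


# Dummy payload standing in for a non-archive response (assumption).
html_payload = b"<html><body>Not Found</body></html>"

# Build a small valid archive in memory for comparison.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("dummy.txt", "placeholder content")
zip_payload = buffer.getvalue()

print(looks_like_zip(html_payload))
print(looks_like_zip(zip_payload))
```

Passing the first payload to `zipfile.ZipFile` directly would raise `zipfile.BadZipFile`, which matches the error named in the issue title.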

This should be fixed in the next version of aeon. In the meantime, you can edit the source file (in aeon 0.3) aeon/datasets/_data_loaders.py and modify the _load_dataset function. At line 493, you will find:

url = "https://timeseriesclassification.com/Downloads/%s.zip" % name

Change it to:

url = "https://timeseriesclassification.com/ClassificationDownloads/%s.zip" % name

And now this works as expected:

X_train, X_test, y_train, y_test, _ = load_UCR_UEA_dataset_split('Wafer')
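For reference, the edit above only swaps one path segment in the URL template. A minimal sketch of the difference, using the two template strings quoted above; `dataset_url` is a hypothetical helper for illustration and not an aeon function:

```python
# URL templates as quoted above for aeon 0.3 (_load_dataset).
OLD_TEMPLATE = "https://timeseriesclassification.com/Downloads/%s.zip"
NEW_TEMPLATE = "https://timeseriesclassification.com/ClassificationDownloads/%s.zip"


def dataset_url(name: str, template: str = NEW_TEMPLATE) -> str:
    """Hypothetical helper: fill the dataset name into a URL template."""
    return template % name


print(dataset_url("Wafer", OLD_TEMPLATE))  # outdated link
print(dataset_url("Wafer"))                # updated link
```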

Alternatively, the file has already been updated in the current version of aeon on git (see https://github.com/aeon-toolkit/aeon/blob/48075924fd95be0e80cdca131aa71fee22e1017f/aeon/datasets/_data_loaders.py#L452), so you could build aeon from the GitHub sources.

TODO to solve the issue:

  • Update aeon dependency when new version comes out

@baraline
Owner

baraline commented Aug 2, 2023

This should now be solved with the new release. Updating aeon to >=0.4 should have the same effect as the manual URL fix.
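To check whether an environment still needs the manual URL fix, something like the following could be used. It is a rough sketch assuming only that versions below 0.4 are affected, as stated in this thread; `needs_url_workaround` is a hypothetical helper, and the parsing only looks at the leading major.minor numbers:

```python
import re
from importlib.metadata import PackageNotFoundError, version


def needs_url_workaround(aeon_version: str) -> bool:
    """Hypothetical helper: True if the version string is older than 0.4."""
    match = re.match(r"(\d+)\.(\d+)", aeon_version)
    if match is None:
        raise ValueError(f"unrecognised version string: {aeon_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) < (0, 4)


try:
    installed = version("aeon")
except PackageNotFoundError:
    print("aeon is not installed")
else:
    print(installed, "needs workaround:", needs_url_workaround(installed))
```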

@baraline baraline closed this as completed Aug 2, 2023