DemonsP300 has only targets in the first third of data #216

Open
jsosulski opened this issue Jul 9, 2021 · 6 comments

@jsosulski
Collaborator

See this plot for one subject, but all subjects look the same:

The x-axis is the epoch index and the y-axis is the label of that epoch.

[Image: label versus epoch index for one subject]

@sylvchev
Member

Thanks!
@v-goncharenko do you know if there is a data loading problem?

@v-goncharenko
Collaborator

The rest of the ground-truth labels are in a .csv file. Look at the code; they are read there.

Also, it's a good idea not to read the raw unstructured data, but to use the final class (which is an abstraction between the internal format and the common one).

@jsosulski
Collaborator Author

So is this an issue with the dataset implementation?

See MWE using current moabb version on pypi:

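from matplotlib import pyplot as plt
from moabb.datasets import DemonsP300
from moabb.paradigms import P300

paradigm = P300()

dset = DemonsP300()
subject = 0

# Load epochs, labels, and metadata through the paradigm (the default MOABB path)
X, label, meta = paradigm.get_data(dset, [subject])

# Plot the label of each epoch, in epoch order
plt.plot(label)
plt.show()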

This produces the plot in the first post, and it uses the default MOABB way of loading data.
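
To put a number on what the plot shows, one could compute the fraction of target epochs in each third of the label sequence. This is only a minimal sketch: `target_fraction_per_segment` is a hypothetical helper, not part of MOABB, and it assumes the labels are the strings "Target" and "NonTarget". The `demo` array is synthetic and stands in for the `label` array from the MWE.

```python
import numpy as np


def target_fraction_per_segment(labels, n_segments=3, target="Target"):
    """Return the fraction of target epochs in each consecutive segment."""
    labels = np.asarray(labels)
    # np.array_split tolerates lengths that are not divisible by n_segments
    segments = np.array_split(labels == target, n_segments)
    return [float(seg.mean()) for seg in segments]


# Synthetic stand-in for `label`: targets occur only in the first third,
# mimicking the pattern reported in this issue (not real DemonsP300 data).
demo = np.array(["Target", "NonTarget"] * 5 + ["NonTarget"] * 20)
fractions = target_fraction_per_segment(demo)
print(fractions)
```

With real data, passing `label` instead of `demo` should give a nonzero fraction in every segment if targets are spread over the whole recording.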

sylvchev added the bug label Jul 31, 2021
@jsosulski
Collaborator Author

I have noticed that MOABB is being referenced in more and more literature (yay!), although most authors just use it for dataset acquisition, which is still a win I guess, until we have a centralized system for running classification. However, should we start to tag datasets that we have vetted as working correctly? Then new MOABB users could use it as intended, in a fire-and-forget way.

See e.g. this issue or the fixed #96. A new user who just wants to check their classifier's performance on X, y data probably does not want to dig deep into the underlying datasets and check that everything is doing what it should. Currently the documentation gives no hint that there are issues with this dataset.

I could offer to clean up the sanity check script (#184), commit it to moabb, and run it locally for all available P300 datasets, as I am most experienced with ERP data.

@sylvchev
Member

sylvchev commented Dec 3, 2021

Good for the citations ;) I tried to add the papers I found that reference MOABB to this wiki page; it could be useful soon. Feel free to add some papers if you have the time.

I agree that the datasets should be verified, and this issue has been open for quite some time. I'm trying to improve the documentation by adding more information on the datasets. As a ground truth, I maintain a wiki page with dataset metadata that is useful for ML. As you suggest, we could at a minimum include references to the open issues for each dataset.

Best would be to ensure that all datasets are OK before adding them, and your sanity check script could really help. It could be one of the required steps to complete before adding a dataset. If you could run it on P300, that would be nice. Could someone help check the existing MOABB datasets in MI and SSVEP? @Div12345 @ErikBjare @v-goncharenko (I could help)

@sylvchev
Member

sylvchev commented Mar 2, 2022

This issue is stalling, and users could use DemonsP300 without knowing that there is a problem. We could add a warning that references this issue when the dataset is loaded, and we could update the documentation as well.
If the problem with this dataset cannot be fixed, we may have to deprecate it.
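
A warning on load could look roughly like the sketch below. It uses only the standard `warnings` module; `DatasetWithKnownIssue` and its `known_issue` attribute are hypothetical stand-ins for illustration, not MOABB's actual dataset API, and the returned data is a placeholder.

```python
import warnings


class DatasetWithKnownIssue:
    """Hypothetical stand-in for a dataset class; not MOABB's actual API."""

    known_issue = "#216: targets appear only in the first third of the data"

    def get_data(self, subjects):
        # Warn on every load so users see the open issue before using labels.
        warnings.warn(
            f"This dataset has an open issue ({self.known_issue}); "
            "labels may be incomplete.",
            UserWarning,
            stacklevel=2,
        )
        # A real implementation would load and return the data here.
        return {subject: None for subject in subjects}


# Capture the warning to show that loading triggers it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    data = DatasetWithKnownIssue().get_data([0])

print(caught[0].message)
```

Using `UserWarning` rather than an exception keeps existing pipelines running while still surfacing the problem; a `DeprecationWarning` might fit better if the dataset ends up being deprecated.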

sylvchev moved this from Datasets to Maybe Bugs in Benchmarking paper Nov 21, 2022