Simplify and Better Document Getting Own Datasets into Braindecode #544

Open
robintibor opened this issue Sep 27, 2023 · 0 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request intermediate Intermediate Difficulty

Comments

Contributor

robintibor commented Sep 27, 2023

I think we should make it easy to understand how to get your own data into a Braindecode dataset (including cases where you don't want to use skorch but would still like to use a Braindecode dataset).
I see two basic scenarios:

  1. You have a very basic dataset with numpy arrays X and y, and it fits into memory.

At the moment we have https://braindecode.org/dev/auto_examples/datasets_io/plot_custom_dataset_example.html#sphx-glr-auto-examples-datasets-io-plot-custom-dataset-example-py This is already not bad. I suggest renaming the example so that the title mentions X and y numpy arrays, to make it clearer what it is about. I would also simplify the example: instead of loading some mne dataset and extracting X and y from it, just create fake X and y, which makes it much shorter and easier to understand, I would say. Especially as we are not doing any training in that example anyway...

We may also want to allow y to have a temporal dimension there, for segmentation-like tasks. This would require code changes, I guess.
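To make the segmentation case concrete: y would carry a time axis aligned with the signal, and windowing would have to cut X and y together. The NumPy-only sketch below just illustrates the intended shapes; `cut_windows` is a hypothetical helper, not an existing Braindecode function.

```python
import numpy as np


def cut_windows(X, y, window_size, stride):
    """Cut a recording X (n_channels, n_times) and its per-sample
    targets y (n_times,) into aligned, possibly overlapping windows."""
    n_times = X.shape[1]
    starts = np.arange(0, n_times - window_size + 1, stride)
    X_win = np.stack([X[:, s:s + window_size] for s in starts])
    y_win = np.stack([y[s:s + window_size] for s in starts])
    return X_win, y_win


rng = np.random.default_rng(0)
X = rng.standard_normal((4, 1000))   # one continuous fake recording
y = (np.arange(1000) // 250) % 2     # label switches every 250 samples

X_win, y_win = cut_windows(X, y, window_size=200, stride=100)
print(X_win.shape, y_win.shape)  # (9, 4, 200) (9, 200)
```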

Additionally, we may need to distinguish between the cases where you have a raw or a precut X and y, e.g. see #148

  2. You have some other format, and could go through mne.

Here we currently have https://braindecode.org/dev/auto_examples/datasets_io/plot_mne_dataset_example.html#sphx-glr-auto-examples-datasets-io-plot-mne-dataset-example-py

Maybe this could also be renamed to signal that this is Braindecode's general-purpose way of getting any data into a Braindecode dataset.

For the different types of datasets, we should ensure we have a simple API and show how to use it for:

  • single label per mne raw (corresponding to create_fixed_length_windows)
  • using event annotations (~create_windows_from_events)
  • using segmentation-like target channel (~create_windows_from_target_channels)

And we should probably cover both options: either one is willing to load all the data into memory, or one already has (or is willing to create) mne fif or other files on disk (so preload True or False).

I also had a Colab notebook showing how to get data into mne, but it might be better to link to an appropriate mne doc, if one exists?
https://colab.research.google.com/drive/1B-5K7dNyfyu-UIVFp3A1BvgQGkZmLWrg#scrollTo=YbFKRInJCGYw

@robintibor robintibor added documentation Improvements or additions to documentation enhancement New feature or request intermediate Intermediate Difficulty labels Sep 27, 2023