Weekly MOABB meeting
Divyesh Narayanan edited this page Jan 3, 2023
·
24 revisions
- Monday 2 January 2023:
- Discussed running benchmark on colab
- Discussed and merged the benchmark function PR
- Discussed PR #302 - Return raw with get_data
- Monday 12 December 2022:
- New papers in BCI and using MOABB in the wiki section
- GridSearchCV: discussion on how to tune models when 2 sessions are available. Existing behavior: the grid search runs on one session and the selected model predicts data from the same session. Proposition: as the whole dataset is available, run the grid search on all sessions to find the best parameters, then train with those parameters on one session and test on data from the same session (WithinSession) or from another session (CrossSession). Selecting the model on the whole data and then training on a subset of the data is quite similar to leaking information. This goes towards a more offline version of MOABB, where efforts could be made to bridge the gap with online BCI evaluation.
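One leak-free alternative to the proposition above is to tune only on the training session and score on the other one. A minimal sketch with scikit-learn's `GridSearchCV`, using synthetic data in place of MOABB sessions; `make_session`, the feature layout and the parameter grid are assumptions made for illustration, not MOABB code:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)


def make_session(n_trials=80, shift=0.0):
    """Synthetic two-class 'session': 4 features, class means 1.5 apart."""
    y = np.repeat([0, 1], n_trials // 2)
    X = rng.normal(size=(n_trials, 4)) + 1.5 * y[:, None] + shift
    return X, y


X_train, y_train = make_session()          # session used for tuning and training
X_test, y_test = make_session(shift=0.3)   # held-out session, slightly shifted

# Hyperparameters are selected with an inner 5-fold CV on the training
# session only, so the held-out session never influences model selection.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)

# With the default refit=True, the best model is retrained on the whole
# training session before being scored on the other session.
cross_session_score = search.score(X_test, y_test)
print(search.best_params_, round(cross_session_score, 3))
```

Running the grid search on both sessions instead would let the test session influence the choice of `C`, which is the leakage concern raised in the discussion.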
- Monday 5 December 2022
- Chance level evaluation
- Reporting results in wiki
- Monday 28 November 2022
- Igor Carrara: Pseudo Online Evaluation
- Monday 21 November 2022
- Link with braindecode
- Monday 14 November 2022
- Bruno Aristimunha Pinto: Deep learning and braindecode
- Monday 7 November 2022
- Monday 31 October 2022
- Monday 24 October 2022
- Monday 17 October 2022
- Monday 10 October 2022
- Monday 3 October 2022
- Wednesday 2nd June
- Open discussion:
- Use a data sanity check, see #184. Jan already generated all the plots for the P300 datasets, but they are too large (150 MB). Right now, it is a Python script that loads the data through MOABB and generates the plots. To share those plot files, we could use another GitHub repo or a Figshare/Zenodo repo. We could upload the butterfly plots, the joint maps and the min-max distributions. It is also possible to extract and share the information about all datasets in a CSV file.
- We could make a describe/info method for datasets that summarizes information for all subjects.
- There is a problem in dataset BNCI 2014-009: the EEG values are 10x larger than normal.
- When downloading datasets, it could be useful to show a progress bar. It is possible to use TQDM; there is an easy integration with Pooch, but this means adding another dependency. It is possible to avoid this by following something like MNE, see the external libs.
- There are two data competitions, one on passive BCI at the Neuroergonomics conference and one on transfer learning at NeurIPS. The latter relies directly on MOABB.
- Divyesh will set up an environment on a private server to run a MOABB script that generates evaluations and makes it possible to share the pipeline scores on all datasets, including statistical significance and average score.
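The scale problem noted above for BNCI 2014-009 is the kind of issue an automated sanity check could flag. A minimal sketch, assuming epochs are plain lists of samples in microvolts; the names `flag_scale_outliers` and `factor` are hypothetical and the values are made up. It compares each dataset's median peak-to-peak amplitude with the median over all datasets:

```python
from statistics import median


def peak_to_peak(epoch):
    """Min-max range of one epoch, given as a flat list of samples."""
    return max(epoch) - min(epoch)


def flag_scale_outliers(datasets, factor=5.0):
    """Return names of datasets whose typical amplitude is off by more than `factor`.

    `datasets` maps a dataset name to a list of epochs. The reference scale
    is the median of the per-dataset median peak-to-peak amplitudes.
    """
    scales = {
        name: median(peak_to_peak(epoch) for epoch in epochs)
        for name, epochs in datasets.items()
    }
    reference = median(scales.values())
    return sorted(
        name
        for name, scale in scales.items()
        if scale > factor * reference or scale < reference / factor
    )


# Made-up amplitudes: dataset_c is roughly ten times larger than the others.
datasets = {
    "dataset_a": [[-25.0, 0.0, 25.0], [-20.0, 5.0, 30.0]],
    "dataset_b": [[-30.0, 0.0, 30.0], [-28.0, 2.0, 32.0]],
    "dataset_c": [[-250.0, 0.0, 250.0], [-300.0, 0.0, 300.0]],
}
suspicious = flag_scale_outliers(datasets)
print(suspicious)
```

The same per-dataset scale values could be written to the CSV summary mentioned in the discussion.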
- Code update:
- MNE is now supported in evaluation, so pipelines can take MNE Epochs as input. See PR #192.
- All the unit tests have been updated for better coverage, but we still need to set up a coverage estimation.
- The dependency on WFDB is now removed (see PR #188).
- We now have independent code to download datasets that does not depend on MNE, as they indicated they do not plan to support this functionality. Our own code relies on Pooch and is now used for downloading all datasets. It can work over HTTP, HTTPS, FTP and SFTP, solving some issues when data were available on an FTP server.
- There is now advanced support for downloading data from Figshare, which makes it possible to get the file listing of a dataset and download only specific files (e.g. EEG files and not PDF).
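Once a file listing is available, selecting the files to download is a simple filter. A minimal sketch, assuming the listing is a list of dictionaries with `name` and `download_url` keys (a shape chosen for illustration); `select_files` and the example URLs are hypothetical and not part of MOABB:

```python
from pathlib import PurePosixPath


def select_files(listing, extensions=(".mat", ".edf")):
    """Keep only the entries whose file name ends with one of `extensions`."""
    wanted = tuple(ext.lower() for ext in extensions)
    return [
        entry
        for entry in listing
        if PurePosixPath(entry["name"]).suffix.lower() in wanted
    ]


# Hypothetical listing: two data files and one PDF that should be skipped.
listing = [
    {"name": "subject_01.mat", "download_url": "https://example.org/files/1"},
    {"name": "README.pdf", "download_url": "https://example.org/files/2"},
    {"name": "subject_02.MAT", "download_url": "https://example.org/files/3"},
]
eeg_files = select_files(listing)
print([entry["name"] for entry in eeg_files])
```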
- Issues:
- Thursday 15th April
- Open discussion:
- How to use poetry and install MOABB without `pip install -e`. How pre-commit and black work.
- The semantic versioning and the milestones that we could set for the next release
- Update wiki pages about the datasets (before generating them automatically)
- Code update:
- The version of MOABB is now available with `moabb.__version__` in a Python console
- Issues:
- Thursday 8th April
- Open discussion:
- How to `pip install -e moabb` with poetry? It should work directly when using `poetry install`, that is, it creates a file in site-packages that points moabb toward the local install path. We could run some tests to see if this local install works across environments.
- About `wfdb`: while it could be useful to access all PhysioNet datasets, there is a problem with silent errors during download. We have contacted Spiridon Nikolopoulos to ask if we could help him update the files available on Figshare. He agrees, so we could have access to individual .mat files for each subject. This will allow removing the wfdb dependency for the MAMEM dataset.
- We could add some checks to raise a clear error when the datasets are not downloaded correctly: if metadata is empty on this line, we could raise a meaningful error.
- Add support for different conditions, for example MI paradigm in normal and stressful conditions, or ERP with one virtual keyboard layout or another. A specific evaluation, based on different conditions, could be proposed by Jan.
- Add support to return the raw MNE signal, as well as epochs or numpy arrays. This could be useful for pipelines that need access to the complete raw and to specific epochs. Jan wants to propose a PR for this.
- It is also possible to return the raw numpy arrays, to speed up dataset loading by bypassing MNE preprocessing.
- Vlad is investigating DVC to store datasets in several places. There is a problem when using Google Drive to host a copy of the data, which requires lots of permissions to work. This is an important and large update, which requires changing the dataset class, and we need to ensure that the API is still valid.
- MOABB: flexibility or enforcing a common framework.
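The empty-metadata check suggested above could look like the following. This is a minimal sketch under stated assumptions: `EmptyDatasetError` and `ensure_metadata` are hypothetical names, and `metadata` is anything that supports `len()` (a list here, a table of trials in practice):

```python
class EmptyDatasetError(RuntimeError):
    """Hypothetical error raised when a dataset yields no trials."""


def ensure_metadata(metadata, dataset_name):
    """Return `metadata` unchanged, or raise a readable error if it is empty."""
    if metadata is None or len(metadata) == 0:
        raise EmptyDatasetError(
            f"No trials were loaded for dataset '{dataset_name}'. "
            "The download may have failed silently; "
            "delete the cached files and try again."
        )
    return metadata


# A non-empty table passes through; an empty one fails with a clear message.
trials = ensure_metadata([{"subject": 1, "session": "session_0"}], "demo")
try:
    ensure_metadata([], "demo")
except EmptyDatasetError as err:
    print(err)
```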
- Code update:
- The learning curves are now available, along with examples
- An error on the log has been corrected
- Issues:
- Thursday 1st April
- Open discussion:
- Dev branch: discussion is going on here. For now, we could stay with the feature branch workflow. To have less testing during PRs, we could trigger the heavy tests only for releases, using release candidates to test if everything is good. The documentation is not ready yet to be included in the release-triggered events, but it should be part of them as soon as the build process is good.
- Doc generation: wfdb silently fails and the doc fails to build.
- GitHub Actions failures are mainly related to GitHub internal errors and are not specific to our test workflow.
- Code update:
- MOABB is on PyPi!
- colorlog removal, to have minimal dependencies
- test all python branch
- adding Neiry P300 dataset and Lee MI dataset
- Issues:
- Thursday 25th March
- Open discussion:
- Host all data on distributed servers, using DVC. We need to identify the licence of each dataset.
- reduce computation: use fail-fast, make a dev branch with tests on only 3 OS and python 3.6
- move toward pytorch integration: use a wrapper to serve dataset, with a loader
- Code update:
- The PyPI version of MOABB is almost there! We could push the code and allow installing it with `pip install moabb`
- Any python version could work, for testing and for developing. The code has been tested against python-3.7. The versions between 3.6 and 3.9 are all tested in the CI.
- poetry allows developing directly on the current branch, like `pip install -e .` or `python setup.py develop`. The only requirement is to run `poetry install`.
- Issues:
- Thursday 18th March
- Open discussion:
- We need to drop the strict requirement on python 3.6; this will be proposed in a PR coming soon
- The documentation is not updated anymore and we should set up a GH action to do this automatically. There are errors right now when building the docs that need to be addressed, related to example rendering. A commit for this is proposed in this PR. We need to get in touch with Yannick to have the correct link to the documentation (www.neurotechx.com/moabb)
- Code update:
- The learning curve PR is almost ready to be merged. This will allow classifier pipelines to be tested on a small number of examples
- There are still some changes to be made to finish applying black to all files; this is done in PR #152.
- A similar PR is made for prettier, a tool to ensure that yaml and markdown are formatted consistently. All these checks are done in pre-commit.
- The poetry PR will facilitate MOABB installation (by simplifying the upload on PyPI) and makes it possible to separate dependencies for development tools.
- Issues:
- There is a problem with the Schirrmeister dataset: the events are not handled correctly.
- The code for adding Lee2019 should be rebased on the latest master branch. The steps to install dev tools will be explained on the `contributing.md` page.
- Thursday 11th March
- Open discussion:
- On PR #132, we discussed how to compare different classifiers when the number of samples differs across datasets. Should we ensure that the same number of trials is used for each dataset, or does it make sense to compare just a proportion of the total number of trials? One interesting alternative is to measure the EEG recording time used for training, and not the number of samples. While it could be computed simply for MI and SSVEP, this is much more of a challenge for P300 due to overlapping stimulations.
- A follow-up discussion on issue #146 regarding the use of `poetry` to handle dependencies and PyPI export. One major interest is that `poetry` takes care of the differences regarding dependencies for several platforms (linux, windows, osx). This is an important thing to ensure reproducibility. Also it is possible to handle separately dependencies for usage and for dev, as well as specific dependencies, like exotic libs or multiple choices.
- If we push MOABB on PyPI, using `poetry`, we will need to set up a guideline regarding version numbers. We will follow a semantic versioning scheme.
- We should discuss during the next meeting how to handle datasets, having a distributed storage and an automated check verifying that download URLs are up.
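A version-number guideline based on semantic versioning boils down to three rules: a breaking change bumps MAJOR, a backward-compatible feature bumps MINOR, and a bug fix bumps PATCH. A minimal sketch of those rules; `parse_version` and `bump` are hypothetical helpers and the version strings are made up, not actual MOABB releases:

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump(version, part):
    """Return the next version after a 'major', 'minor' or 'patch' change."""
    major, minor, patch = parse_version(version)
    if part == "major":   # breaking API change: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":   # backward-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if part == "patch":   # backward-compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump("0.4.1", "minor"))
# Comparing tuples of integers orders versions correctly, unlike strings.
print(parse_version("0.10.0") > parse_version("0.9.3"))
```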
- Code update:
- Following the pre-commit PR introducing black, we have now reformatted the whole codebase with black (PR #147)
- Thanks to Mohammad Mostafa Farzan, we have corrected issue #138 regarding h5py.
- Issues:
- We have updated issue #121 about updating the README page. The how to contribute section should mention that we use pre-commit with isort and black.
- The documentation is not up to date and we should have a setup that allows building it automatically.
- There are still configurations for the old CI (TravisCI) and a universal flag (indicating python 2 and 3 compatibility) in `setup.cfg` that should be removed.
- We should correct the unit tests to allow a migration to python 3.8, as the code works quite well with newer python versions even if they are not officially supported.
- Thursday 4th March
- Thursday 25th February
- Thursday 18th February
- Thursday 11th February
- Thursday 4th February
- Thursday 21st January
- Open discussion:
- Morgan Hough: asking for financial support from a company to cover cloud computing time for processing data. Also, computational power should be available soon to run evaluations.
- New techniques in ML: evaluation on EEG
- BCI society meeting on ML, check for recording available online
- Adding support to affective computing, and emotion recognition, as more and more databases are available such as DEAP or GAMEEMO datasets (with GAMEEMO paper)
- Code update:
- Additional columns PR #127: last review and merge
- Learning curve on various trial numbers: cross validation first, select a subset of trials (in percentage or in number of samples per class), repeat multiple times (permutation)
- open a PR on MNE for FTP download
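The learning-curve procedure described above (split first, then subsample the training trials per class, repeated over several permutations) can be sketched as follows. This is an illustration only, not the code of the learning curve PR; `subsample_per_class` and `learning_curve_subsets` are hypothetical names and the labels stand for the training fold of one cross-validation split:

```python
import random
from collections import defaultdict


def subsample_per_class(labels, fraction, rng):
    """Return sorted indices keeping `fraction` of the trials of each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    kept = []
    for indices in by_class.values():
        n_keep = max(1, round(fraction * len(indices)))  # at least one trial
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


def learning_curve_subsets(train_labels, fractions, n_perms, seed=0):
    """For each fraction, draw `n_perms` random class-balanced training subsets."""
    rng = random.Random(seed)
    return {
        fraction: [
            subsample_per_class(train_labels, fraction, rng)
            for _ in range(n_perms)
        ]
        for fraction in fractions
    }


# Training fold of one split: 8 trials per class.
train_labels = ["left"] * 8 + ["right"] * 8
subsets = learning_curve_subsets(train_labels, [0.25, 0.5, 1.0], n_perms=3)
for fraction, draws in subsets.items():
    print(fraction, [len(draw) for draw in draws])
```

Each subset would then be used to fit the pipeline, with the score always computed on the untouched test fold, so that only the amount of training data varies along the curve.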
- Issues
- Explain MOABB philosophy, regarding the idea to restrict the number of parameters to ease the comparison of algorithms. For example, the number of folds in k-fold cross-validation is set to 5 to ensure reproducible and accurate results.
- Divyesh: open a new discussion for cVEP paradigm, as there is data available
- How to start contributing: adding examples and running some pipelines to contribute on the leaderboard
- Thursday 14th January
- Blocking issues:
- update HDF5 requirement, https://docs.h5py.org/en/stable/whatsnew/3.0.html#breaking-changes-deprecations
- Pinning scikit-learn requirement because of PyRiemann, temporary fix, should be updated soon.
- Enhancements
- Adding SSVEP datasets: almost okay, 2 datasets ok. Need to update classes for MAMEM datasets (200+ electrodes)
- Additional information in results (PR #127): almost ok, need review
- Learning curve: evaluate accuracy while varying the number of trials. First draft.
- Transfer learning: discussions in issue.
- Open discussions:
- Google summer of code
- Sebastien: run ML/autoML competitions, interest in transfer learning, few-shot learning, cross-dataset, cross-hardware, cross-paradigm. Could be interesting to run a meta-learning challenge, in a RAMP-like fashion maybe.
- How to fairly compare different tasks, with different numbers of classes; research questions close to meta-learning
- Thursday 3rd December
- Thursday 10th December
- Open discussions:
- Making a leaderboard, publicly available on the wiki, as in Papers with Code. What process should be used to report scores? See The Ladder paper
- Enhancements
- The leaderboard page is created on the wiki
Some notes on previous discussions are available here: https://hackmd.io/adgvNhWOTbOUw2LwK1sWVg