WIP: added unified validation scheme #397
Conversation
Codecov Report

@@           Coverage Diff           @@
##           master     #397   +/-   ##
=======================================
  Coverage   82.63%   82.63%
=======================================
  Files          53       53
  Lines        3738     3738
=======================================
  Hits         3089     3089
  Misses        649      649
=======================================
Hi @martinwimpff, Thank you so much for your PR! Here is the rendered version: I am still revising it, but it is already pretty cool! I will try to add a figure to make it easier to differentiate the three schemes.
Great job @martinwimpff @bruAristimunha! To me the description is clear and I hope people will read it before doing ML for EEG. My comments are mainly typos and some reformulations.
A visualization of the CV schemes would be highly appreciated.
One more thing: this tutorial and plot_hyperparameter_tuning_with_scikit-learn.py
are now quite similar. Maybe we can add hyperparameter search
to the title of this one and just keep this tutorial? What do you think?
It is very good! I tried to give some small suggestions without changing the text. Feel free to accept the suggestions or not. I'll try to put together some images to illustrate the options.
Co-authored-by: Maciej Sliwowski <maciejo988@wp.pl>
Hi @martinwimpff, in the tutorial, I was wondering if we could add: It's a vectorized image, so it will possibly be bigger in the tutorial.
@sliwy: first of all, thanks a lot for your review!
I agree with that, there shouldn't be two tutorials. The new title could be something like: How to train, test and tune your model. But I am open to your suggestions ;) As you suggested, I added additional parts for options 2 and 3 that do not use GridSearch. @bruAristimunha: thanks for your suggestions too, I added them as well.
I would produce images with matplotlib like sklearn does to demo the cross validation generators
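The suggestion above can be sketched quickly. This is a minimal, hypothetical version of the visualization style used in sklearn's plot_cv_indices example, using synthetic sample indices (the fold count and sample count are illustrative, not taken from the tutorial):

```python
# Sketch: visualize blockwise K-fold splits the way the sklearn
# "plot_cv_indices" example does. Names and sizes here are illustrative.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs without a display
import matplotlib.pyplot as plt
from sklearn.model_selection import KFold

n_samples = 100
cv = KFold(n_splits=5, shuffle=False)  # blockwise: folds are contiguous

fig, ax = plt.subplots(figsize=(8, 2.5))
for fold, (train_idx, val_idx) in enumerate(cv.split(np.arange(n_samples))):
    # one horizontal row per fold: 0 = train samples, 1 = validation samples
    labels = np.zeros(n_samples)
    labels[val_idx] = 1
    ax.scatter(np.arange(n_samples), [fold] * n_samples,
               c=labels, marker="_", lw=10, cmap="coolwarm")
ax.set_xlabel("Sample index")
ax.set_ylabel("CV fold")
fig.tight_layout()
```

With shuffle=False each validation block is a contiguous chunk, which is exactly the blockwise property the tutorial wants to show.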
@martinwimpff thanks for the corrections!
Having the visualization done with matplotlib and sklearn would also be awesome. Can we find names for the CV options (they are easier to use outside the tutorial context)? My proposition:
@martinwimpff For the option 3 visualization I would add: It took me a while (a quite short while, so I just scanned the papers for the CV method) to prepare a summary of the evaluation methods used for the models currently available in braindecode. Maybe it can be useful for the discussion and for showing how evaluation is done in the papers. Feel free to include it in the tutorial if you think it may be useful. I feel it would be nice if we could express all the methods from this table in terms of our evaluation-tutorial terminology. What do you think?
Schirrmeister, R.T., Springenberg, J.T., Fiederer, L.D.J., Glasstetter, M., Eggensperger, K., Tangermann, M., Hutter, F., Burgard, W. and Ball, T. (2017). Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp., 38: 5391-5420. https://doi.org/10.1002/hbm.23730

Lawhern, V.J., Solon, A.J., Waytowich, N.R., Gordon, S.M., Hung, C.P., & Lance, B. (2018). EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces. Journal of Neural Engineering, 15.

Chambon, S., Galtier, M.N., Arnal, P.J., Wainrib, G., & Gramfort, A. (2018). A Deep Learning Architecture for Temporal Sleep Stage Classification Using Multivariate and Multimodal Time Series. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 26, 758-769.

Eldele, E., Chen, Z., Liu, C., Wu, M., Kwoh, C., Li, X., & Guan, C. (2021). An Attention-Based Deep Learning Approach for Sleep Stage Classification With Single-Channel EEG. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 29, 809-818.

Perslev, M., Darkner, S., Kempfner, L., Nikolic, M., Jennum, P.J., & Igel, C. (2021). U-Sleep: resilient high-frequency sleep staging. NPJ Digital Medicine, 4.
👍
Agree. I am not sure about the nested CV. What we do in option 3 is a k-fold blockwise CV (with a fixed hold-out set, the untouched test set). As far as I understand, this is what should be used in BCI research, as the test set should always come from an unseen session or subject. Furthermore, most datasets provide this two-fold split anyway.
I actually had the same thought while creating the image. Maybe we have to make the division between HP search and "normal" training with 3 splits clearer. For HP tuning, the test set should never be used. We can include the table that you created in something like an additional section that we call
To avoid confusing the community, I would aim to follow sklearn naming conventions and refer to the sklearn documentation as much as possible. Maintaining good documentation takes energy, so let's put our energy where we are unique.
Something like that, @agramfort? https://scikit-learn.org/stable/auto_examples/model_selection/plot_cv_indices.html
+2. I love the new title =)
I love the idea of a reporting subsection to briefly discuss the potential differences between the metrics reported in articles and those obtained when replicating them. I think you just went through this with deep4net, right @martinwimpff? I would love this table in my thesis, but not in the library: with more deep learning methods, it may become unfeasible to maintain. I wonder if we're putting too much information into a tutorial. I would suggest that we don't spend too much energy on definitions, as indicated by @agramfort, and close with a simple matplotlib view, like https://scikit-learn.org/stable/auto_examples/model_selection/plot_cv_indices.html. What do you think, @martinwimpff, @sliwy and @agramfort?
I think that this is a valid point and it might be better if we reorganized the content of this tutorial, also regarding the merge with the current hyperparameter tuning tutorial. So, as mentioned above, I would like to make the division between "normal" training and HP tuning clearer. Therefore I would propose the following structure:
As @agramfort mentioned, we should reference the sklearn library as often as possible (especially in sections 2 and 3). Furthermore, the explanations of the schemes should be minimal and rather focus on the specifics of BCI data (e.g. the test set should come from another session or subject, the train-val split should be blockwise; is there anything else?). Is this what you meant by too much information, @bruAristimunha?
Yes @martinwimpff
@bruAristimunha I agree, the table may be too much for the tutorial. The simple matplotlib plots look good to me! :) @martinwimpff I like the new structure of the tutorial. Just for point 2.3 I would mention that it can also be used without a hold-out test set.
I just committed a new version with the restructuring mentioned above.
Hello @martinwimpff and @sliwy,
Great job @bruAristimunha! I have two small suggestions for the visualizations:
Great job @martinwimpff @bruAristimunha! Thanks for that!
The only small thing is that the code for generating the visualization complicates the tutorial a little bit, but I don't have any idea how to fix that apart from moving those functions to some module. I am not a big fan of this option either.
To me, after the small changes in the visualization mentioned by @martinwimpff it is ready to be merged! :)
I have incorporated all the comments. For me, we can merge, @agramfort.
@bruAristimunha I almost forgot, do we finally remove the plot_hyperparameter_tuning_with_scikit-learn.py?
Can we handle this in another pull request?
@bruAristimunha sure! Once again, thanks for this great tutorial! 👏
from torch.utils.data import Subset
from sklearn.model_selection import train_test_split
from skorch.helper import predefined_split, SliceDataset

X_train = SliceDataset(train_set, idx=0)
y_train = np.array([y for y in SliceDataset(train_set, idx=1)])
train_indices, val_indices = train_test_split(
    X_train.indices_, test_size=0.2, shuffle=False
)
this is neat but a bit convoluted for an intro tutorial.
I would consider exposing indices on train_set, or something that can simply
be passed to train_test_split
One could use
train_test_split(np.arange(len(train_set)), test_size=0.2, shuffle=False)
instead, but X_train and y_train would still be necessary for cross_val_score, search.fit(), and the visualizations.
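The alternative suggested here can be sketched in isolation. The sample count below is a stand-in for len(train_set) (which is not available outside the tutorial), and np.arange is used to build the integer indices being split:

```python
# Sketch of the simpler split discussed above: split plain integer
# indices instead of a SliceDataset. n_samples stands in for len(train_set).
import numpy as np
from sklearn.model_selection import train_test_split

n_samples = 50
train_indices, val_indices = train_test_split(
    np.arange(n_samples), test_size=0.2, shuffle=False  # blockwise split
)
```

With shuffle=False, sklearn takes the last 20% of samples as the validation block, so the split stays blockwise as the tutorial requires.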
train_val_split = KFold(n_splits=5, shuffle=False)

######################################################################
# .. note::
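A splitter defined like this plugs directly into sklearn's evaluation utilities. Below is a self-contained sketch using a toy estimator and synthetic data as stand-ins for the tutorial's EEGClassifier and training set (those names and data are not reproduced here):

```python
# Sketch: feed the blockwise KFold splitter to cross_val_score.
# LogisticRegression and the synthetic data are illustrative stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.RandomState(0)
X = rng.randn(100, 4)
y = (X[:, 0] > 0).astype(int)  # a trivially learnable target

train_val_split = KFold(n_splits=5, shuffle=False)  # blockwise folds
scores = cross_val_score(LogisticRegression(), X, y, cv=train_val_split)
```

Passing the splitter object (rather than cv=5) is what lets the tutorial control shuffling, which is the point of the blockwise scheme.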
this is something that should appear at the top and maybe as a warning as it's not a detail.
maybe we can use copy.deepcopy(model) in every EEGClassifier init, so we won't modify the weights of the original model?
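The sharing problem behind this suggestion can be shown without braindecode or torch at all. The dict below is a deliberately simplified stand-in for a model's parameters, just to illustrate why reusing one object across fits mutates the original while deepcopy does not:

```python
# Minimal sketch of the weight-sharing issue deepcopy would avoid.
# The dict stands in for a model's parameters; it is purely illustrative.
import copy

model = {"weights": [0.0, 0.0]}

shared = model                   # reusing the same object: aliasing
shared["weights"][0] = 1.0       # "training run 1" mutates the original

fresh = copy.deepcopy(model)     # what the deepcopy suggestion does
fresh["weights"][1] = 2.0        # "training run 2" only touches the copy
```

After these steps the original still reflects only the first mutation, which is exactly the isolation the reviewer wants between CV fits.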
I prefer the warning option. If we use deepcopy, we would have to include a small note on that as well to avoid confusion.
arff I would use the word CV splitter
I included the comments from @agramfort. I did not change this part:

from torch.utils.data import Subset
from sklearn.model_selection import train_test_split
from skorch.helper import predefined_split, SliceDataset

X_train = SliceDataset(train_set, idx=0)
y_train = np.array([y for y in SliceDataset(train_set, idx=1)])
train_indices, val_indices = train_test_split(
    X_train.indices_, test_size=0.2, shuffle=False
)

I changed the final note to a warning and moved it to the start, as I personally prefer this option for the tutorial. I also like the
thanks @martinwimpff @bruAristimunha 🙏
A huge thank you to you, @martinwimpff, for building the entire tutorial! Welcome to the braindecode contributor group, and we look forward to your next contributions! Always feel welcome to contribute and interact with the library.
Also wanted to say a big thank you to @martinwimpff, @bruAristimunha and also @agramfort: the tutorial really looks incredible, a really valuable addition to the library. I had been a bit absent and feel super happy to see such great work now that I found the time to look haha :)
Added a unified validation scheme example as discussed in #378
This tutorial shares some similarities with #349
@bruAristimunha: let me know what you think about this first draft