K-fold cross validation for cropped decoding does not work #422

Closed
gigi-vis opened this issue Oct 14, 2022 · 21 comments

@gigi-vis

gigi-vis commented Oct 14, 2022

Hi all, I am trying to reproduce this example (#3: k-fold cross validation - https://braindecode.org/master/auto_examples/plot_how_train_test_and_tune.html#sphx-glr-auto-examples-plot-how-train-test-and-tune-py), but using cropped decoding for my model instead of the trialwise decoding used in the example. I am unable to make this example work for cropped decoding.

The error I receive: "ValueError: not enough values to unpack (expected 3, got 2)" which directs me to line 179 in classifier.py from the braindecode package.

I was able to reproduce the exact same error using the data from the tutorial (my own data is confidential). What I did to reproduce the error: follow the entire tutorial from this link (https://braindecode.org/master/auto_examples/plot_bcic_iv_2a_moabb_cropped.html) up until fitting the model. At that point I added the following steps from (https://braindecode.org/master/auto_examples/plot_how_train_test_and_tune.html): creating the X_train and y_train objects and the corresponding train and val subsets.

Rather than fitting the model with clf.fit(train_set, y=None, epochs=n_epochs), I used the CV approach:

from sklearn.model_selection import KFold, cross_val_score

train_val_split = KFold(n_splits=5, shuffle=False)
fit_params = {"epochs": n_epochs}
cv_results = cross_val_score(
    clf, X_train, y_train, scoring="accuracy", cv=train_val_split, fit_params=fit_params
)

I hope you can help me solve this issue, thank you in advance!

@martinwimpff
Collaborator

Hi @gigi-vis, can you provide some more information about the error? Line 179 (in the current version) does not unpack anything, so please show what line 179 does in your version of braindecode.

@gigi-vis
Author

Of course! I will attach a screenshot:

This is line 179 in classifier.py:

[Screenshot: line 179 of classifier.py]

This is the complete error message:

[Screenshot: the complete error traceback]

@martinwimpff
Collaborator

Thanks @gigi-vis. At the moment I do not see any obvious error. Two ideas:

  1. Try to train it without the CV, exactly as in the cropped tutorial, and see if that works.
  2. Debug the code with a debugger and use something like
     for placeholder in self.get_iterator(..):
     then look into the placeholder (probably a tuple) to find out which part is missing (X, y or i). See also the sketch below.

If that still does not solve the error, put your code in a Colab file and share it here.
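For what it's worth, a simpler (hedged) variant of idea 2 is to inspect one item of the windows dataset directly; this assumes the windows_dataset variable from the cropped tutorial, and that a braindecode windows dataset item normally contains three parts:

# inspect what a single item of the windows dataset contains; cropped
# decoding expects three parts per item: the signal X, the target y and
# the crop indices i
sample = windows_dataset[0]
print(len(sample))  # 2 would mean the index part is missing
for part in sample:
    print(type(part))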

@gigi-vis
Author

gigi-vis commented Oct 19, 2022

Hi @martinwimpff thanks for taking a look!

I have already tried the first option and that works just fine! (: I will try and find out what you mean by the second suggestion because I have not used a debugger before, but in the meantime here is the code in a colab file (identical to the code from your two tutorials because this gave me the same error as my own code):
https://colab.research.google.com/drive/1SPHyOzc2kU0yZF0SQx_CMMQyPCUCJtSC?usp=sharing

Your colleague suggested that it might have something to do with the slicing that is used for cropped decoding (https://gitter.im/braindecodechat/community).

@martinwimpff
Collaborator

I think I have found the bug:
If you use Cropped Decoding, you have to use the estimator like this:
clf.fit(train_set, y=None, epochs=n_epochs)
with train_set being a dataset. If you try
clf.fit(X_train, y=y_train, epochs=n_epochs)
you will get an error as well.

Unfortunately cross_val_score needs the data separated into X and y. So at the moment, Cropped decoding does not work with cross_val_score. @bruAristimunha any ideas why that is the case?

@gigi-vis
Author

gigi-vis commented Oct 19, 2022

@martinwimpff Thank you for your time!

Using what you did (clf.fit(train_set, y=None, epochs=n_epochs)) indeed works for me!

As I understood and have tried (and was shown in the example of the tutorial I posted in the first message, although that link does not seem to work anymore), cross_val_score did work for trialwise decoding, where the input was also split into an X and y (X_train, y_train). Am I correct to assume that the shape of these objects changes when using Cropped decoding?

Do you think manual k-fold cross validation could be an option (because I would really like to apply cross-validation for my project)?

@martinwimpff
Collaborator

Correct. I can't tell you exactly how cropped decoding works (as I only know the basic idea), but you can either ask Robin or read his paper.

Manual CV is of course an option, but you could also ask yourself if you really need cropped decoding or if basic trial-wise decoding is sufficient.

@gigi-vis
Author

@martinwimpff Ah I see, thank you. I have read the paper, and that is why I chose to use Cropped Decoding! Would it be possible to still get in touch with Robin? I am doing an internship at a hospital and need to train and evaluate the shallow CNN from the 2017 paper using data from patients at this hospital, of which unfortunately we only have a limited amount. This is why I thought using Cropped Decoding and cross-validation was the most efficient way to make use of the data that we have!

@martinwimpff
Collaborator

You can just ask @robintibor your questions yourself.
If overfitting is still an issue, you should also try out EEGNet (it uses fewer parameters) and/or add max_norm regularization to the ShallowNet (in particular to the last layer, as discussed here); see the sketch below.
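For illustration, here is a hedged sketch of one way to apply such a max-norm constraint with plain PyTorch (this is not a braindecode API; the layer name conv_classifier is an assumption and may differ between versions):

import torch

def apply_max_norm(layer: torch.nn.Module, max_norm: float = 0.5) -> None:
    # Clamp the L2 norm of each output unit's weights to at most max_norm.
    # Call this after every optimizer step.
    with torch.no_grad():
        w = layer.weight
        flat = w.view(w.size(0), -1)                # one row per output unit
        norms = flat.norm(p=2, dim=1, keepdim=True).clamp(min=1e-8)
        scale = (max_norm / norms).clamp(max=1.0)   # only shrink, never grow
        flat.mul_(scale)

# usage, e.g. on the shallow net's final layer (name assumed):
# apply_max_norm(model.conv_classifier, max_norm=0.5)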

@robintibor
Contributor

Hm yeah, I think I understand the basic problem: cropped decoding normally requires knowledge of the indices of each time window during evaluation... We may want to rethink our implementation of cropped decoding to be more flexible and also allow other use cases. This may still take some time though, so in the meantime I advise you to either
a) use manual cross validation, or
b) simulate cropped decoding by using trialwise decoding and adding a global average pooling layer at the end of your model (see the sketch below).
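For illustration, a hedged sketch of option b), roughly following the cropped tutorial (the constructor arguments and import paths may differ between braindecode versions; n_chans, n_classes and input_window_samples are assumed to be defined as in the tutorial):

from torch import nn
from braindecode.models import ShallowFBCSPNet
from braindecode.models.util import to_dense_prediction_model

# build a dense-prediction ("cropped-style") model as in the cropped tutorial
model = ShallowFBCSPNet(
    n_chans,
    n_classes,
    input_window_samples=input_window_samples,
    final_conv_length=30,
)
to_dense_prediction_model(model)

class GlobalAveragePool(nn.Module):
    # Average the dense per-crop outputs over the time axis,
    # assuming the model returns (batch, n_classes, n_time_outputs).
    def forward(self, x):
        return x.mean(dim=-1)

# the wrapped model returns one prediction per trial, so it can be trained
# trialwise and therefore be used with cross_val_score on (X_train, y_train)
trialwise_model = nn.Sequential(model, GlobalAveragePool())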

@gigi-vis
Author

@martinwimpff @robintibor Thank you both for your input!
I will use manual cross validation to estimate the performance of my model.
I have just one final question related to cropped decoding: is it possible to get a prediction per patient from the predictions per x seconds that are generated during cropped decoding?

@robintibor
Contributor

Yes, of course, this is the idea behind quite a lot of the code: to correctly summarize the predictions so that you get a per-recording prediction in the end. If you use cropped=True with the EEGClassifier and use the accuracy callback as in our cropped decoding example https://braindecode.org/stable/auto_examples/plot_bcic_iv_2a_moabb_cropped.html, then the reported accuracies should already be per-recording, I think.
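For reference, a hedged sketch of that cropped setup, closely following the linked tutorial (the import path of CroppedLoss and the exact arguments may differ between braindecode versions; model, train_set, valid_set, lr, weight_decay, batch_size, n_epochs and device are assumed from the tutorial):

import torch
from skorch.helper import predefined_split
from braindecode import EEGClassifier
from braindecode.training.losses import CroppedLoss

clf = EEGClassifier(
    model,
    cropped=True,                               # aggregate per-crop predictions per trial
    criterion=CroppedLoss,
    criterion__loss_function=torch.nn.functional.nll_loss,
    optimizer=torch.optim.AdamW,
    optimizer__lr=lr,
    optimizer__weight_decay=weight_decay,
    train_split=predefined_split(valid_set),
    iterator_train__shuffle=True,
    batch_size=batch_size,
    callbacks=["accuracy"],                     # reports per-trial accuracy
    device=device,
)
clf.fit(train_set, y=None, epochs=n_epochs)

To then go from per-trial predictions to one label per patient, you could aggregate clf.predict(...) over all trials of that patient, for example with a majority vote.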

@bruAristimunha added this to the 0.8 milestone Oct 26, 2022
@bruAristimunha
Collaborator

Hi @gigi-vis, @martinwimpff and @robintibor,

Thank you very much for the excellent clarification of the error! @gigi-vis, your description of the problem was very thorough. We discovered a braindecode option that is not working as it should, and it will be fixed in the next release.

More importantly, though: with the information provided by @martinwimpff and @robintibor, were you able to solve your issue? I understand that fixing the library issue has a lower priority than solving a problem that a user reported to us directly.

You would need to do some cross-validation by hand or modify the deep network to add an extra layer, and both options can be a little confusing at first, at least for me.

@gigi-vis
Author

@bruAristimunha Yes, I have been able to solve that error. What I did was apply manual cross-validation, where I split the dataset into different folds before turning them into windows datasets, to ensure that the different crops for each patient are kept within the same fold. This seems to work!
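For illustration, a hedged sketch of this kind of pre-windowing split (dataset, trial_start_offset_samples, input_window_samples and n_preds_per_input are assumed to be defined as in the cropped tutorial; the exact windowing arguments may differ for your data):

from sklearn.model_selection import KFold
from braindecode.datasets import BaseConcatDataset
from braindecode.preprocessing import create_windows_from_events

# split whole recordings into folds *before* windowing, so that all crops
# of one recording/patient end up in the same fold
recordings = dataset.datasets
kf = KFold(n_splits=5, shuffle=True, random_state=42)
folds = []
for train_idx, val_idx in kf.split(recordings):
    train_concat = BaseConcatDataset([recordings[i] for i in train_idx])
    val_concat = BaseConcatDataset([recordings[i] for i in val_idx])
    windows_kwargs = dict(
        trial_start_offset_samples=trial_start_offset_samples,
        trial_stop_offset_samples=0,
        window_size_samples=input_window_samples,
        window_stride_samples=n_preds_per_input,
        drop_last_window=False,
        preload=True,
    )
    train_windows = create_windows_from_events(train_concat, **windows_kwargs)
    val_windows = create_windows_from_events(val_concat, **windows_kwargs)
    folds.append((train_windows, val_windows))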
The only problem is that my train and test accuracy are 100% after the second or third round, so there appears to be an error there, but I don't know yet whether that is Braindecode related.
Thank you all very much for your help and time! @bruAristimunha @martinwimpff @robintibor

@bruAristimunha
Collaborator

bruAristimunha commented Oct 28, 2022

Nice! Glad we were able to help you =)

I don't know if this is the case, but I already had a similar problem, and I solved it by cloning after the definition. Something like this:

from sklearn.base import clone
clf = EEGClassifier(
  model,
  ...
)
clf.initialize()
clf = clone(clf)
clf.fit(train_set, y=None, epochs=n_epochs)

@martinwimpff
Collaborator

@gigi-vis I am almost sure that your problem is that you do not reset/reinitialize your model before each fold. Therefore the model is able to memorize the test data throughout the CV.

@gigi-vis
Author

gigi-vis commented Oct 28, 2022

Ah that makes perfect sense! Although I already use clf.initialize() before clf.fit(train_set, y = None, epochs = n_epochs) in each fold. Or is this not what you meant by reinitializing the model?

so something like this:

for fold in range(K_folds):
    train_subset = windows_datasets_train[fold]
    val_subset = windows_datasets_val[fold]
    clf = EEGClassifier(
        model,
        ...
    )
    clf.initialize()
    clf = clone(clf)
    clf.fit(train_subset, y=None, epochs=n_epochs)

    preds = clf.predict(val_subset)


.... etc. Does this not create a new model each fold?

@bruAristimunha
Collaborator

I particularly like to clone to make sure it's a new object, especially in situations like this where sometimes the modification happens inside the object and sometimes outside. Please let us know if this resolves your issue.

@bruAristimunha
Collaborator

Yes, create a new model for each fold!

@martinwimpff
Collaborator

I would recommend something like this:

from copy import deepcopy

model = ...
for fold in range(K_folds):
    train_subset = windows_datasets_train[fold]
    val_subset = windows_datasets_val[fold]
    clf = EEGClassifier(
        deepcopy(model),
        ...
    )
    clf.initialize()
    clf.fit(train_subset, y=None, epochs=n_epochs)
    preds = clf.predict(val_subset)

The problem with clf.initialize() is that it does not initialize the model's parameters. Therefore just make a deepcopy of the model and additionally use the initialize function to initialize your scheduler/optimizer/etc.

@gigi-vis
Author

gigi-vis commented Oct 28, 2022

Yes, this works and my accuracies are no longer 100% after two folds! :D

Thanks a lot guys!
