[WIP] Initial implementation for learning curve creation #132

Closed · wants to merge 17 commits

Conversation

@jsosulski (Collaborator) commented Jan 13, 2021

Once this PR is finalized, it closes #129.

For now, this PR should be seen as a proposal. I implemented it as a new evaluation class, but it could be merged into the regular WithinSessionEvaluation if desired.

Currently the PR is based only on master, but I would like PR #127 to be merged first so that I am not forced to parse the session string to find out which permutation / dataset size a result corresponds to.

Please extend this list with necessary tasks / edit current tasks.

Tasks:

  • Rough example of a possible implementation.
  • Adapt this once PR #127 (Additional columns that should be evaluated) is merged.
  • Example usage script to run a benchmark / create a learning curve plot.
  • Check if results.already_computed is compatible with this. (Again, I personally do not use any persistence features of moabb, so I would appreciate someone helping me out here.)

@jsosulski (Collaborator, author) commented

CI does not pass due to #133.
What is the best practice here? Pin sklearn temporarily to 0.23.2 for this PR to make CI happy?

@ErikBjare (Collaborator) commented Jan 14, 2021

@jsosulski Pinning sklearn would probably be the right call. (Edit: I did so in 009d6fa.)

@alexandrebarachant was active in some muse-lsl PR recently, but no word on when my fix pyRiemann/pyRiemann#93 will be merged.

@jsosulski (Collaborator, author) commented

The initial implementation works so far, @sylvchev. Notable change from our discussion:

The number of permutations n_perms can (and probably should) now be passed as an array. This way we can run, e.g., 50 permutations when the data subset is 5% of the originally drawn training folds, but only 5 permutations when the subset is 100% of the drawn training folds (see the instantiation sketch below the plot).

The current example script runs for about 1.5 minutes (comparable to the plot_within_session_p300 script, which also takes ~1.5 min) and uses the bnci_2014-009 dataset, which is much smaller than EPFLP300. The runtime can be reduced almost arbitrarily by using fewer permutations. The currently generated plot looks like this:

[figure: learning curve plot produced by the example script]
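
For illustration, a rough sketch of how such an evaluation could be instantiated (parameter names follow the snippets quoted in the review below; the import path, dataset, pipeline, and the 'value' key name are assumptions, not necessarily the PR's final API):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.pipeline import make_pipeline
from pyriemann.estimation import XdawnCovariances
from pyriemann.tangentspace import TangentSpace
from moabb.datasets import BNCI2014009
from moabb.paradigms import P300

# Hypothetical import path for the class proposed in this PR.
from moabb.evaluations import WithinSessionEvaluationIncreasingData

# Simple Riemannian-geometry pipeline for P300 classification.
pipelines = {
    "RG+LDA": make_pipeline(
        XdawnCovariances(nfilter=2, estimator="lwf"), TangentSpace(), LDA()
    )
}

# Many permutations for small data subsets, few for the full training folds;
# n_perms has to have the same length as the data_size values.
evaluation = WithinSessionEvaluationIncreasingData(
    paradigm=P300(),
    datasets=[BNCI2014009()],
    n_perms=np.array([50, 20, 5]),
    data_size=dict(policy="ratio", value=np.array([0.05, 0.5, 1.0])),
)
results = evaluation.process(pipelines)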

@jsosulski (Collaborator, author) commented

I pruned some more runtime from the example script and made the code conform to flake8.
So far I am not sure whether an abstract learning_curve class (or something similar) could be factored out easily, as the current setup uses a different cross-validation scheme (StratifiedShuffleSplit instead of the usual CV).

@sylvchev, we talked about an option to set dataset sizes using, e.g., the absolute number of samples per class. For example:

dataset_size = dict(policy='per_class', values=np.array([10, 20, 50, 200]))

Should these constraints also be applied to the test data? E.g. let's say I have a class imbalance of 1 Target : 5 Non-Target; should the classifier train on 1:1 but then be evaluated on 1:5?

Comment on lines +138 to +139
elapsed_time = time.time() - start_time
print(f"Elapsed time: {elapsed_time/60} minutes.")
Review comment (Member):

No need to measure the time in the pushed version; it is added automatically by Sphinx. You can see it at the bottom of the documentation pages.

Reply (Collaborator, author):

Oh right, thanks! By the way, how is the documentation generated? I do not see anything in GitHub Actions that looks like documentation generation. If I knew the command, I could build the docs on my machine for testing.

Comment on lines +35 to +37
import time

start_time = time.time()
Review comment (Member):

You could revert those changes; the elapsed time is indicated automatically by Sphinx.

Comment on lines +27 to +32
:param n_perms: Number of permutations to perform. If an array
is passed it has to be equal in size to the data_size array.
:param data_size: Contains the policy to pick the datasizes to
evaluate, as well as the actual values. The dict has the
key 'policy' with either 'ratio' or 'per_class', and the key
'value' with the actual values as an numpy array.
Review comment (Member):

You should use the MNE/scikit-learn formatting for the docstring; here is an example: https://github.com/NeuroTechX/moabb/blob/master/moabb/evaluations/base.py#L92
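
For illustration, the same two parameters could look like this in the numpydoc style used by MNE/scikit-learn (a sketch adapted from the reST version quoted above, not the PR's final docstring):

def __init__(self, n_perms=20, data_size=None, **kwargs):
    """Evaluate pipelines on increasing amounts of training data.

    Parameters
    ----------
    n_perms : int | np.ndarray
        Number of permutations to perform. If an array is passed,
        it has to be equal in size to the data_size values.
    data_size : dict
        Contains the policy used to pick the data sizes to evaluate,
        as well as the actual values. Keys: 'policy' (either 'ratio'
        or 'per_class') and 'value' (the values as a numpy array).
    """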

Comment on lines +24 to +25
def __init__(self, n_perms: Union[int, np.ndarray] = 20,
data_size: Optional[dict] = None, **kwargs):
Review comment (Member):

It is very nice to use Python type hints; this is a first in MOABB. It would be nice to gradually update the rest of the code base as well.

# TODO implement per class sampling. Not yet clear -> how to sample test set?


class WithinSessionEvaluationIncreasingData(BaseEvaluation):
Review comment (Member):

I think it is better to keep only a few evaluation classes and to dispatch depending on the parameters. Here, you could modify the evaluate function of WithinSessionEvaluation to something like this:

class WithinSessionEvaluation(BaseEvaluation):
    def __init__(self, n_perms: Union[int, np.ndarray, None] = None,
                 data_size: Optional[dict] = None, **kwargs):
        self.n_perms = n_perms
        self.data_size = data_size
        super().__init__(**kwargs)

    def _evaluate(self, dataset, pipelines):
        # original evaluate() code
        ...

    def _evaluate_increasing(self, dataset, pipelines):
        # your evaluate() code
        ...

    def evaluate(self, dataset, pipelines):
        # dispatch: learning-curve path only when n_perms is given
        if self.n_perms is not None:
            return self._evaluate_increasing(dataset, pipelines)
        return self._evaluate(dataset, pipelines)

Comment on lines +34 to +35
from tdlda import Vectorizer as JumpingMeansVectorizer
from tdlda import TimeDecoupledLda as TDLDA
Review comment (Member):

The example is very nice, but examples (especially those introducing a feature) should be reproducible without downloading any external code (other than MOABB and its requirements.txt).

That being said, I found it very interesting to have advanced examples relying on external open source code, like you did.

Could you make 2 examples from your code?

  • one simple example with just RG (no JM + LDA/TD-LDA)
  • one advanced example with the Jumping Means vectorizer and Time-Decoupled LDA, where you could provide further explanation of your method, point to your papers, and indicate how to install the code.

Reply (Collaborator, author):

I agree, and I think having two examples is a very good idea.
I just realized that pyRiemann is an external dependency as well, as it is only needed for the examples. Maybe the requirements could be split into requirements_core and requirements_extra, or something similar, to indicate what you need to install when you just want to use moabb versus running the examples (see the sketch below).
I also have some additional points that we could discuss in the moabb office hours next week. (Thursday evening unfortunately does not work for me this week. Most weeks, actually.)
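
One conventional way to express such a split is a setuptools "extra", so that example-only dependencies are installed on demand; the package lists below are illustrative assumptions, not MOABB's actual dependency declaration:

# setup.py (sketch)
from setuptools import find_packages, setup

setup(
    name="moabb",
    packages=find_packages(),
    # core dependencies needed to use moabb itself (illustrative list)
    install_requires=["numpy", "scipy", "pandas", "scikit-learn", "mne"],
    # extra dependencies only needed to run the examples:
    #   pip install moabb[examples]
    extras_require={"examples": ["pyriemann", "matplotlib", "seaborn"]},
)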

@sylvchev (Member) commented

Kudos for the example, this is very nice! Thank you so much for keeping the computation low.
As indicated in the code review, I think it is important to keep examples self-contained with MOABB code (or modules in requirements), but I also think we should have advanced examples with external code, like yours. So if you could make two different examples, that would be perfect.

Should these constraints also be applied to the test data? E.g. let's say I have a class imbalance of 1 Target : 5 Non-Target; should the classifier train on 1:1 but then be evaluated on 1:5?

For MI/SSVEP, you could train on a balanced set (say one sample for each class, thus balanced) and test on the remaining samples (thus not balanced). For ERP, this is less obvious, as oddball paradigms are by definition imbalanced. An ideal solution could be to use the indicated number of samples for the target class and a stratified split that keeps the imbalanced ratio for non-target.
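
As an illustration of that last idea, here is a minimal sketch (not MOABB code) using scikit-learn's StratifiedShuffleSplit: the held-out test set keeps the original class ratio, while the training epochs are subsampled to a fixed number per class. The 1:5 imbalance, feature shape, and n_per_class value are made-up numbers for illustration only.

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(42)

# Illustrative labels: 1 = Target, 0 = Non-Target, with a 1:5 imbalance.
y = np.array([1] * 100 + [0] * 500)
X = rng.normal(size=(len(y), 32))  # dummy features

# Stratified split: the test set keeps the original 1:5 ratio.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(sss.split(X, y))

# 'per_class' policy: subsample a fixed number of training epochs per class.
n_per_class = 20
train_subset = np.concatenate([
    rng.choice(train_idx[y[train_idx] == label], size=n_per_class, replace=False)
    for label in np.unique(y)
])

print("Train class counts:", np.bincount(y[train_subset]))  # balanced 20/20
print("Test class counts: ", np.bincount(y[test_idx]))      # keeps ~1:5 ratio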

@sylvchev (Member) commented

I'm wondering why the code merged from PR #127 is marked as new, as that PR has been merged into master.

@jsosulski (Collaborator, author) commented

Thanks for the review, @sylvchev! I addressed some of your comments directly in the review.
I am also wondering why #127 is shown as new. I can either squash-merge once this PR is ready, or alternatively close this PR, preserving it for the reviews/discussion, and open a new clean PR rebased on the current master.

@sylvchev (Member) commented

I think we could safely squash-merge this PR directly into master; the parts that were modified for additional_columns have not changed since.

@jsosulski (Collaborator, author) commented

After merging the black-reformatted master into my ongoing PR, I noticed my branch was suddenly 93 commits ahead of master. To prevent confusion (and work on my part :-) ) I will close this PR, as I have opened a new one, #155.

@jsosulski jsosulski closed this Mar 17, 2021
Merging this pull request may close: Allow creation of learning curves

4 participants