[WIP] Initial implementation for learning curve creation #132

Closed · wants to merge 17 commits

Conversation

@jsosulski (Collaborator) commented Jan 13, 2021

Once this PR is finalized, it closes #129.

For now, this PR should be seen as a proposal. I implemented it as a new evaluation class, but it could be merged into the regular WithinSessionEvaluation if desired.

Currently the PR is based only on master, but I would like PR #127 to be merged first so that I am not forced to parse the session string to find out which permutation / dataset size a result corresponds to.

Please extend this list with necessary tasks / edit current tasks.

Tasks:

  • Rough example of a possible implementation.
  • Adapt this once PR #127 (Additional columns that should be evaluated) is merged.
  • Example usage script to run a benchmark / create a learning curve plot.
  • Check if results.already_computed is compatible with this. (Again, I personally do not use any persistence features of moabb, so I would appreciate someone helping me out here.)

@jsosulski (Collaborator, author) commented

CI does not pass due to #133.
What is the best practice here? Pin sklearn temporarily to 0.23.2 for this PR to make CI happy?

@ErikBjare (Collaborator) commented Jan 14, 2021

@jsosulski Pinning sklearn would probably be the right call. (Edit: I did so in 009d6fa.)

@alexandrebarachant was active in some muse-lsl PR recently, but no word on when my fix pyRiemann/pyRiemann#93 will be merged.

@jsosulski (Collaborator, author) commented

The initial implementation works so far, @sylvchev. Notable change from our discussion:

The number of permutations n_perms can (and probably should) now be passed as an array. This way we can run, e.g., 50 permutations when the data subset is 5% of the originally drawn training folds, but only 5 permutations when the subset is 100% of the drawn training folds (see the instantiation sketch below the plot).

The current example script runs for about 1.5 minutes (comparable to the plot_within_session_p300 script, which also takes ~1.5 min) and uses the bnci_2014-009 dataset, which is much smaller than EPFLP300. The runtime can be reduced almost arbitrarily by using fewer permutations. The currently generated plot looks like this:

[figure: learning curve plot produced by the example script]
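
For illustration, a rough sketch of how such an evaluation could be instantiated (parameter names follow the snippets quoted in the review below; the import path, dataset, pipeline, and the 'value' key name are assumptions, not necessarily the PR's final API):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.pipeline import make_pipeline
from pyriemann.estimation import XdawnCovariances
from pyriemann.tangentspace import TangentSpace
from moabb.datasets import BNCI2014009
from moabb.paradigms import P300

# Hypothetical import path for the class proposed in this PR.
from moabb.evaluations import WithinSessionEvaluationIncreasingData

# Simple Riemannian-geometry pipeline for P300 classification.
pipelines = {
    "RG+LDA": make_pipeline(
        XdawnCovariances(nfilter=2, estimator="lwf"), TangentSpace(), LDA()
    )
}

# Many permutations for small data subsets, few for the full training folds;
# n_perms has to have the same length as the data_size values.
evaluation = WithinSessionEvaluationIncreasingData(
    paradigm=P300(),
    datasets=[BNCI2014009()],
    n_perms=np.array([50, 20, 5]),
    data_size=dict(policy="ratio", value=np.array([0.05, 0.5, 1.0])),
)
results = evaluation.process(pipelines)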

@jsosulski (Collaborator, author) commented

I pruned some more runtime from the example script and made the code conform to flake8.
So far I am not sure whether an abstract learning_curve class (or something similar) could be factored out easily, as the current setup uses a different cross-validation scheme (StratifiedShuffleSplit instead of the usual CV).

@sylvchev, we talked about an option to set dataset sizes using, e.g., the absolute number of samples per class. For example:

dataset_size = dict(policy='per_class', values=np.array([10, 20, 50, 200]))

Should these constraints also be applied to the test data? E.g. let's say I have a class imbalance of 1 Target : 5 Non-Target; should the classifier train on 1:1 but then be evaluated on 1:5?

Comment on lines +138 to +139
elapsed_time = time.time() - start_time
print(f"Elapsed time: {elapsed_time/60} minutes.")
Review comment (Member):

No need to measure the time in the pushed version; it is added automatically by Sphinx. You can see it at the bottom of the documentation pages.

Reply (Collaborator, author):

Oh right, thanks! By the way, how is the documentation generated? I do not see anything in GitHub Actions that looks like documentation generation. If I knew the command, I could build the docs on my machine for testing.

Comment on lines +35 to +37
import time

start_time = time.time()
Review comment (Member):

You could revert those changes; the elapsed time is indicated automatically by Sphinx.

Comment on lines +27 to +32
:param n_perms: Number of permutations to perform. If an array
is passed it has to be equal in size to the data_size array.
:param data_size: Contains the policy to pick the datasizes to
evaluate, as well as the actual values. The dict has the
key 'policy' with either 'ratio' or 'per_class', and the key
'value' with the actual values as an numpy array.
Review comment (Member):

You should use the MNE/scikit-learn formatting for the docstring; here is an example: https://github.com/NeuroTechX/moabb/blob/master/moabb/evaluations/base.py#L92
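
For illustration, the same two parameters could look like this in the numpydoc style used by MNE/scikit-learn (a sketch adapted from the reST version quoted above, not the PR's final docstring):

def __init__(self, n_perms=20, data_size=None, **kwargs):
    """Evaluate pipelines on increasing amounts of training data.

    Parameters
    ----------
    n_perms : int | np.ndarray
        Number of permutations to perform. If an array is passed,
        it has to be equal in size to the data_size values.
    data_size : dict
        Contains the policy used to pick the data sizes to evaluate,
        as well as the actual values. Keys: 'policy' (either 'ratio'
        or 'per_class') and 'value' (the values as a numpy array).
    """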

Comment on lines +24 to +25
def __init__(self, n_perms: Union[int, np.ndarray] = 20,
data_size: Optional[dict] = None, **kwargs):
Review comment (Member):

It is very nice to use Python type hints; this is a first in MOABB. It would be nice to gradually update the rest of the code base as well.

# TODO implement per class sampling. Not yet clear -> how to sample test set?


class WithinSessionEvaluationIncreasingData(BaseEvaluation):
Review comment (Member):

I think it is better to keep only a few evaluation classes and to dispatch depending on the parameters. Here, you could modify the evaluate function of WithinSessionEvaluation to something like this:

class WithinSessionEvaluation(BaseEvaluation):
    def __init__(self, n_perms: Union[int, np.ndarray, None] = None,
                 data_size: Optional[dict] = None, **kwargs):
        self.n_perms = n_perms
        self.data_size = data_size
        super().__init__(**kwargs)

    def _evaluate(self, dataset, pipelines):
        # original evaluate() code
        ...

    def _evaluate_increasing(self, dataset, pipelines):
        # your evaluate() code
        ...

    def evaluate(self, dataset, pipelines):
        # dispatch: learning-curve path only when n_perms is given
        if self.n_perms is not None:
            return self._evaluate_increasing(dataset, pipelines)
        return self._evaluate(dataset, pipelines)

Comment on lines +34 to +35
from tdlda import Vectorizer as JumpingMeansVectorizer
from tdlda import TimeDecoupledLda as TDLDA
Review comment (Member):

The example is very nice, but examples (especially those introducing a feature) should be reproducible without downloading any external code (other than MOABB and its requirements.txt).

That being said, I found it very interesting to have advanced examples relying on external open source code, like you did.

Could you make 2 examples from your code?

  • one simple example with just RG (no JM + LDA/TD-LDA)
  • one advanced example with the Jumping Means vectorizer and Time-Decoupled LDA, where you could provide further explanation of your method, point to your papers, and indicate how to install the code.

Reply (Collaborator, author):

I agree, and I think having two examples is a very good idea.
I just realized that pyRiemann is an external dependency as well, as it is only needed for the examples. Maybe the requirements could be split into requirements_core and requirements_extra, or something similar, to indicate what you need to install when you just want to use moabb versus running the examples (see the sketch below).
I also have some additional points that we could discuss in the moabb office hours next week. (Thursday evening unfortunately does not work for me this week. Most weeks, actually.)
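
One conventional way to express such a split is a setuptools "extra", so that example-only dependencies are installed on demand; the package lists below are illustrative assumptions, not MOABB's actual dependency declaration:

# setup.py (sketch)
from setuptools import find_packages, setup

setup(
    name="moabb",
    packages=find_packages(),
    # core dependencies needed to use moabb itself (illustrative list)
    install_requires=["numpy", "scipy", "pandas", "scikit-learn", "mne"],
    # extra dependencies only needed to run the examples:
    #   pip install moabb[examples]
    extras_require={"examples": ["pyriemann", "matplotlib", "seaborn"]},
)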

@sylvchev (Member) commented

Kudos for the example, this is very nice! Thank you so much for keeping the computation low.
As indicated in the code review, I think it is important to keep examples self-contained with MOABB code (or modules in requirements), but I also think we should have advanced examples with external code, like yours. So if you could make two different examples, that would be perfect.

Should these constraints also be applied to the test data? E.g. let's say I have a class imbalance of 1 Target : 5 Non-Target; should the classifier train on 1:1 but then be evaluated on 1:5?

For MI/SSVEP, you could train on a balanced set (say one sample for each class, thus balanced) and test on the remaining samples (thus not balanced). For ERP, this is less obvious, as oddball paradigms are by definition imbalanced. An ideal solution could be to use the indicated number of samples for the target class and a stratified split that keeps the imbalanced ratio for non-target.
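
As an illustration of that last idea, here is a minimal sketch (not MOABB code) using scikit-learn's StratifiedShuffleSplit: the held-out test set keeps the original class ratio, while the training epochs are subsampled to a fixed number per class. The 1:5 imbalance, feature shape, and n_per_class value are made-up numbers for illustration only.

import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(42)

# Illustrative labels: 1 = Target, 0 = Non-Target, with a 1:5 imbalance.
y = np.array([1] * 100 + [0] * 500)
X = rng.normal(size=(len(y), 32))  # dummy features

# Stratified split: the test set keeps the original 1:5 ratio.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(sss.split(X, y))

# 'per_class' policy: subsample a fixed number of training epochs per class.
n_per_class = 20
train_subset = np.concatenate([
    rng.choice(train_idx[y[train_idx] == label], size=n_per_class, replace=False)
    for label in np.unique(y)
])

print("Train class counts:", np.bincount(y[train_subset]))  # balanced 20/20
print("Test class counts: ", np.bincount(y[test_idx]))      # keeps ~1:5 ratio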

@sylvchev (Member) commented

I'm wondering why the code merged from PR #127 is marked as new, as that PR has been merged into master.

@jsosulski (Collaborator, author) commented

Thanks for the review, @sylvchev! I addressed some of your comments directly in the review.
I am also wondering why #127 is shown as new. I can either squash-merge once this PR is ready, or alternatively close this PR, preserving it for the reviews/discussion, and open a new clean PR rebased on the current master.

@sylvchev (Member) commented

I think we could safely squash-merge this PR directly into master; the parts that were modified for additional_columns have not changed since.

@jsosulski (Collaborator, author) commented

After merging the black-reformatted master into my ongoing PR, I noticed my branch was suddenly 93 commits ahead of master. To prevent confusion (and work on my part :-) ) I will close this PR, as I have opened a new one, #155.

@jsosulski jsosulski closed this Mar 17, 2021
Merging this pull request may close: Allow creation of learning curves

4 participants