Updated docs and tutorials with stats and plotting

added preprint to the readme

Vinay Jayaram committed Jun 4, 2018
1 parent a219e40 commit 74262f7
Showing 11 changed files with 213 additions and 13 deletions.
19 changes: 18 additions & 1 deletion README.md
@@ -33,6 +33,7 @@ This document (the README file) is a hub to give you some information about the
* [Documentation](#docs)
* [Architecture and main concepts](#architecture)

We also have a recent arXiv preprint that you can look at [here][link_arxiv]!

## What are we doing?

@@ -149,7 +150,10 @@ you can submit new dataset by filling this [form](https://docs.google.com/forms/

## <a name="architecture"></a> Architecture and main concepts:

there is 4 main concepts in the MOABB: the datasets, the paradigm, the evaluation, and the pipelines.
<p align="center">
<img alt="banner" src="/images/architecture.png" width="400">
</p>
There are four main concepts in MOABB: the datasets, the paradigm, the evaluation, and the pipelines. In addition, we offer statistical and visualization utilities to simplify the workflow.
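
As a rough end-to-end sketch of how these pieces fit together (borrowing the dataset, paradigm, evaluation, and pipeline used in the bundled examples):

```
from moabb.datasets import BNCI2014001
from moabb.paradigms import LeftRightImagery
from moabb.evaluations import CrossSessionEvaluation
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.pipeline import make_pipeline
from mne.decoding import CSP

# a paradigm defines the question being asked, datasets supply the data
paradigm = LeftRightImagery()
datasets = [BNCI2014001()]

# pipelines are plain sklearn objects, keyed by a display name
pipelines = {'CSP + LDA': make_pipeline(CSP(n_components=8), LDA())}

# the evaluation ties everything together and returns a results DataFrame
evaluation = CrossSessionEvaluation(paradigm=paradigm, datasets=datasets,
                                    suffix='examples', overwrite=False)
results = evaluation.process(pipelines)
```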

### Datasets

@@ -177,6 +181,18 @@ across-subject accuracy, or other transfer learning settings.
A pipeline defines all the steps required by an algorithm to obtain predictions. Pipelines are typically a chain of sklearn-compatible transformers ending with an sklearn-compatible estimator.
See [Pipelines](http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) for more info.
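
For instance, a Riemannian geometry pipeline chaining covariance estimation, tangent space mapping, and a logistic regression could be declared as follows (a minimal sketch, mirroring the pipelines used in the statistics tutorial):

```
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace

# transformers first, one estimator last, keyed by a display name
pipelines = {}
pipelines['RG + LR'] = make_pipeline(Covariances(),
                                     TangentSpace(),
                                     LogisticRegression())
```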

### Statistics and visualization

Once an evaluation has been run, the raw results are returned as a DataFrame. This can be further processed via the following commands to generate basic visualizations and statistical comparisons:

```
from moabb.analysis import analyze
results = evaluation.process(pipeline_dict)
analyze(results)
```
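
If you prefer to build the figures yourself, the plotting and meta-analysis helpers can also be called directly on the results DataFrame; a minimal sketch, mirroring the statistics tutorial added in this commit:

```
import matplotlib.pyplot as plt
import moabb.analysis.plotting as moabb_plt
from moabb.analysis.meta_analysis import (compute_dataset_statistics,
                                          find_significant_differences)

# one point per subject, for every dataset and pipeline
moabb_plt.score_plot(results)
plt.show()

# meta-analysis statistics, then a pairwise significance summary
stats = compute_dataset_statistics(results)
P, T = find_significant_differences(stats)
moabb_plt.summary_plot(P, T)
plt.show()
```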

## Generate the documentation

To generate the documentation :
@@ -191,3 +207,4 @@ make html
[link_neurotechx]: http://neurotechx.com/
[link_neurotechx_signup]: https://docs.google.com/forms/d/e/1FAIpQLSfZyzhVdOLU8_oQ4NylHL8EFoKLIVmryGXA4u7HDsZpkTryvg/viewform
[link_moabb_docs]: http://moabb.neurotechx.com/docs/index.html
[link_arxiv]: https://arxiv.org/abs/1805.06427
Binary file removed architecture.png
36 changes: 36 additions & 0 deletions docs/source/analysis.rst
@@ -0,0 +1,36 @@
=========
Analysis
=========

.. automodule:: moabb.analysis

.. currentmodule:: moabb.analysis

------------
Plotting
------------

.. autosummary::
:toctree: generated/
:template: class.rst


plotting.score_plot
plotting.paired_plot
plotting.summary_plot
plotting.meta_analysis_plot

------------
Statistics
------------

.. autosummary::
:toctree: generated/
:template: class.rst

meta_analysis.find_significant_differences
meta_analysis.compute_dataset_statistics
meta_analysis.combine_effects
meta_analysis.combine_pvalues
meta_analysis.collapse_session_scores

1 change: 1 addition & 0 deletions docs/source/api.rst
@@ -2,3 +2,4 @@
.. include:: evaluations.rst
.. include:: paradigms.rst
.. include:: pipelines.rst
.. include:: analysis.rst
1 change: 0 additions & 1 deletion docs/source/datasets.rst
@@ -23,7 +23,6 @@ Motor Imagery Datasets
Cho2017
MunichMI
Ofner2017
OpenvibeMI
PhysionetMI
Shin2017A
Shin2017B
3 changes: 2 additions & 1 deletion docs/source/pipelines.rst
@@ -15,7 +15,7 @@ Pipelines
:template: class.rst

features.LogVariance
filter_bank.FilterBank
features.FM

------------
Base & Utils
@@ -26,3 +26,4 @@ Base & Utils
:template: class.rst

utils.create_pipeline_from_config
utils.FilterBank
2 changes: 1 addition & 1 deletion examples/plot_cross_session_motor_imagery.py
@@ -77,7 +77,7 @@

paradigm = LeftRightImagery()
datasets = [BNCI2014001()]
overwrite = False # set to True if we want to overwrite cached results
overwrite = True # set to True if we want to overwrite cached results
evaluation = CrossSessionEvaluation(paradigm=paradigm, datasets=datasets,
suffix='examples', overwrite=overwrite)

14 changes: 8 additions & 6 deletions examples/plot_filterbank_csp_vs_csp.py
@@ -64,7 +64,7 @@
# from 8 to 35 Hz.

datasets = [BNCI2014001()]
overwrite = False # set to True if we want to overwrite cached results
overwrite = True # set to True if we want to overwrite cached results

# broadband filters
fmin = 8
@@ -86,15 +86,17 @@

results = pd.concat([results, results_fb])



##############################################################################
# Plot Results
# ----------------
#
# Here we plot the results. We the first plot is a pointplot with the average
# performance of each pipeline across session and subjects.
# The second plot is a paired scatter plot. Each point representing the score
# of a single session. An algorithm will outperforms another is most of the
# points are in its quadrant.
# Here we plot the results using standard plotting methods. The first plot is a
# pointplot with the average performance of each pipeline across sessions and
# subjects. The second plot is a paired scatter plot: each point represents the
# score of a single session. An algorithm outperforms another if most of the
# points are in its quadrant.

fig, axes = plt.subplots(1, 2, figsize=[8, 4], sharey=True)

Binary file added images/architecture.png
9 changes: 6 additions & 3 deletions moabb/analysis/meta_analysis.py
@@ -192,9 +192,12 @@ def combine_pvalues(p, nsubs):
return meta-analysis significance
'''
W = np.sqrt(nsubs)
out = stats.combine_pvalues(np.array(p), weights=W, method='stouffer')[1]
return out
if len(p) == 1:
return p.item()
else:
W = np.sqrt(nsubs)
out = stats.combine_pvalues(np.array(p), weights=W, method='stouffer')[1]
return out


def find_significant_differences(df, perm_cutoff=20):
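
For context, the change above short-circuits the single p-value case, where a weighted combination is unnecessary; otherwise the behaviour is unchanged. A rough sketch of the underlying Stouffer combination, with hypothetical per-dataset p-values and subject counts:

```
import numpy as np
from scipy import stats

p = np.array([0.04, 0.20, 0.01])   # hypothetical per-dataset p-values
nsubs = np.array([9, 12, 52])      # hypothetical subject counts per dataset

# Stouffer's method, weighting each dataset by sqrt(number of subjects);
# index [1] picks the combined p-value from the (statistic, p-value) tuple
combined = stats.combine_pvalues(p, weights=np.sqrt(nsubs), method='stouffer')[1]
print(combined)
```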
141 changes: 141 additions & 0 deletions tutorials/plot_statistical_analysis.py
@@ -0,0 +1,141 @@
"""=======================
Statistical analysis
=======================
The MOABB codebase comes with convenience plotting utilities and some
statistical testing. This tutorial focuses on what exactly these are and how
they can be used.
"""
# Authors: Vinay Jayaram <vinayjayaram13@gmail.com>
#
# License: BSD (3-clause)

import moabb
import matplotlib.pyplot as plt
import moabb.analysis.plotting as moabb_plt
from moabb.analysis.meta_analysis import find_significant_differences, compute_dataset_statistics #flake8: noqa

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

from mne.decoding import CSP

from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace

from moabb.datasets import BNCI2014001
from moabb.paradigms import LeftRightImagery
from moabb.evaluations import CrossSessionEvaluation

moabb.set_log_level('info')

print(__doc__)

###############################################################################
# Results Generation
# ---------------------
#
# First we need to set up a paradigm, dataset list, and some pipelines to
# test. This is explored more in the examples -- we choose a left vs right
# imagery paradigm with a single bandpass. There is only one dataset here but
# any number can be added without changing this workflow.
#
# Create pipelines
# ----------------
#
# Pipelines must be a dict of sklearn pipeline objects.
#
# The CSP implementation from MNE is used. We selected 8 CSP components, as
# usually done in the literature.
#
# The Riemannian geometry pipeline consists of covariance estimation, tangent
# space mapping and finally a logistic regression for the classification.

pipelines = {}

pipelines['CSP + LDA'] = make_pipeline(CSP(n_components=8),
LDA())

pipelines['RG + LR'] = make_pipeline(Covariances(),
TangentSpace(),
LogisticRegression())

pipelines['CSP + LR'] = make_pipeline(CSP(n_components=8),
LogisticRegression())

pipelines['RG + LDA'] = make_pipeline(Covariances(),
TangentSpace(),
LDA())

##############################################################################
# Evaluation
# ----------
#
# We define the paradigm (LeftRightImagery) and the dataset (BNCI2014001).
# The evaluation will return a dataframe containing a single AUC score for
# each subject / session of the dataset, and for each pipeline.
#
# Results are saved into the database, so that if you add a new pipeline, the
# evaluation will not run again unless a parameter has changed. Results can
# be overwritten if necessary.

paradigm = LeftRightImagery()
datasets = [BNCI2014001()]
overwrite = True # set to True if we want to overwrite cached results
evaluation = CrossSessionEvaluation(paradigm=paradigm, datasets=datasets,
suffix='examples', overwrite=overwrite)

results = evaluation.process(pipelines)


##############################################################################
# MOABB plotting
# ----------------
#
# Here we plot the results using some of the convenience methods within the
# toolkit. The score_plot visualizes all the data with one score per subject
# for every dataset and pipeline.

fig = moabb_plt.score_plot(results)
plt.show()

###############################################################################
# For a comparison of two algorithms, there is the paired_plot, which plots the
# performance of one versus the performance of the other over all chosen
# datasets. Note that there is only one score per subject, regardless of the
# number of sessions.

fig = moabb_plt.paired_plot(results, 'CSP + LDA', 'RG + LDA')
plt.show()


###############################################################################
# Statistical testing and further plots
# ----------------------------------------
#
# If the statistical significance of the results is of interest, the method
# compute_dataset_statistics generates the statistics needed for a
# meta-analysis style plot. For an overview of how all algorithms perform in
# comparison with each other, the method find_significant_differences and the
# summary_plot can be used.


stats = compute_dataset_statistics(results)
P, T = find_significant_differences(stats)

################################################################################
# The meta-analysis style plot shows the standardized mean difference within
# each tested dataset for the two algorithms in question, in addition to a
# meta-effect and significances both per-dataset and overall.
fig = moabb_plt.meta_analysis_plot(stats, 'CSP + LDA', 'RG + LDA')
plt.show()


################################################################################
# The summary plot shows the effect and significance related to the hypothesis
# that the algorithm on the y-axis significantly outperformed the algorithm on
# the x-axis over all datasets.
moabb_plt.summary_plot(P, T)
plt.show()
