Add example to compare classifiers #171
Conversation
The decision functions are a bit weird, no? For example, I am surprised that the kNN decision function is not more irregular.
Thanks, looks good! Will include it. @qbarthelemy
On closer inspection, this will probably be very confusing, since we can only display two dimensions of the three-dimensional space. It could be alleviated with a 3D plot, but that plot couldn't show decision boundaries properly. I'll look into another way of including an example.
Good catch! Digging into the code, I have discovered that the current version of KNN does not implement
Ok, but do not merge this branch into yours. Wait for the merge, then rebase your branch on the latest master.
The previous example plots the decision boundary for the horizontal 2D plane going through the mean value of the third coordinate.
for il, ll in enumerate(self.classes_):
    prob[m, il] = np.sum(
        probas[m, neighbors_classes[m, 0:self.n_neighbors] == ll]
    )
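To see what this loop computes, here is a hypothetical, self-contained sketch; the toy values for `probas` and `neighbors_classes` are made up for illustration:

```python
import numpy as np

n_neighbors = 3
classes_ = np.array([0, 1])

# Hypothetical toy data for a single test point m: the distance-derived
# weights of its neighbors, and those neighbors' class labels.
probas = np.array([[0.5, 0.3, 0.2]])        # shape (n_test, n_neighbors)
neighbors_classes = np.array([[0, 1, 0]])   # shape (n_test, n_neighbors)

prob = np.zeros((1, len(classes_)))
m = 0
for il, ll in enumerate(classes_):
    # sum the weights of the neighbors belonging to class ll
    prob[m, il] = np.sum(
        probas[m, neighbors_classes[m, 0:n_neighbors] == ll]
    )
# class 0 gets the weights 0.5 + 0.2, class 1 gets 0.3
```

So each class probability is just the total weight of the neighbors carrying that label.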
Can you see how it differs from how sklearn does it?
Yes, I had checked: the methods are very similar.
The difference is how probabilities are defined from distances:
pyRiemann uses a softmax of negative squared distances (this formula is derived from the Riemannian Gaussian distribution, as explained in #100),
whereas sklearn uses the reciprocal of distances and then divides by their sum (so that the probabilities sum to 1); but I don't know the origin of this formula.
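The two weighting schemes can be sketched as follows; the helper names are mine for illustration, not the actual pyRiemann or sklearn functions:

```python
import numpy as np

def softmax_neg_sq(dists):
    """pyRiemann-style weights: softmax of negative squared distances
    (derived from a Gaussian distribution over distances)."""
    logits = -dists ** 2
    logits -= logits.max()          # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def reciprocal(dists):
    """sklearn-style weights="distance": inverse distances,
    normalized so the weights sum to 1."""
    w = 1.0 / dists
    return w / w.sum()

dists = np.array([0.5, 1.0, 2.0])   # toy distances to 3 neighbors
p_soft = softmax_neg_sq(dists)
p_recip = reciprocal(dists)
```

Both yield normalized weights that decrease with distance; they differ in how fast the weight decays.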
Before committing to a choice that differs from sklearn, I would run a tiny benchmark on Gaussian data following the LDA model, to see which approach leads to the best-calibrated probabilities.
> I would run a tiny benchmark

Nice idea! But who is "I"? You or me? ;-)
Joking aside, I don't see how to do this benchmark, because the inputs of sklearn are multivariate vectors (generated by a mixture of multivariate Gaussian distributions), while the inputs of pyRiemann are covariance matrices. So the results can't be compared.
I would do it in sklearn using Euclidean data. Basically, replace predict_proba in sklearn and see what works best. Then copy this into pyRiemann.
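A minimal sketch of such a benchmark on Euclidean data, assuming sklearn's `KNeighborsClassifier` with a callable `weights` and using the Brier score as a calibration proxy; `softmax_neg_sq` is an illustrative name, not an existing sklearn option:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def softmax_neg_sq(dists):
    """Candidate weighting: softmax of negative squared distances."""
    logits = -dists ** 2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

# Gaussian blobs: roughly the LDA-type generative model mentioned above
X, y = make_blobs(n_samples=500, centers=2, cluster_std=2.0, random_state=42)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=42)

results = {}
for name, weights in [("distance", "distance"), ("max_lik", softmax_neg_sq)]:
    clf = KNeighborsClassifier(n_neighbors=5, weights=weights).fit(Xtr, ytr)
    proba = clf.predict_proba(Xte)
    # Brier score: lower means better-calibrated probabilities
    results[name] = brier_score_loss(yte, proba[:, 1])
print(results)
```

The callable `weights` receives the distance matrix returned by the neighbor search, so the custom weighting plugs in without touching sklearn internals.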
I plotted log-probabilities of kNN applied to bivariate Gaussian distributions:
- weights="max_lik": computing the softmax of negative squared distances, equivalent to a Euclidean Gaussian model (implementation added in this branch);
- weights="distance": computing the reciprocal of distances, equivalent to a power-law model (classical sklearn implementation).
Results are really close, but I think the new option is better, because:
- it is derived from a Gaussian prior, which is more coherent than a power-law prior;
- it naturally handles the case where we attempt to classify a point at zero distance from one or more training points, whereas the reciprocal computation has a singularity at 0.
Moreover, this new option could be added to sklearn.
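A quick sketch of the zero-distance case just mentioned, with toy distances and hypothetical variable names:

```python
import numpy as np

dists = np.array([0.0, 1.0, 2.0])   # query coincides with a training point

# Softmax of negative squared distances stays finite and well-defined:
logits = -dists ** 2
w = np.exp(logits - logits.max())
p_softmax = w / w.sum()             # largest weight at d = 0, no singularity

# Naive reciprocal weighting hits a singularity at d = 0:
with np.errstate(divide="ignore"):
    w_inv = 1.0 / dists             # infinite weight for the first entry
```

The softmax formulation assigns its maximum (finite) weight to the coincident point, while the reciprocal produces an infinite weight that needs special-casing.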
@agramfort , can we merge the branch?
I had a look into that, but I did not find a reason. Are the covariances the same, or is it only the resulting classification?
Good catch for the missing RandomState in
Thanks @qbarthelemy! Sorry, it had slipped through the cracks.
LGTM
This PR adds an example comparing several Riemannian classifiers on low-dimensional synthetic datasets, adapted to SPD matrices from https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html
@gabelstein