FIX Restore support for n_samples == n_features in MinCovDet. #30483
virchan left a comment:
So, if n_samples == n_features (and assuming support_fraction=None), then n_support = n_samples + 1.
This causes an out-of-range index, because we must always have n_support <= n_samples.
@anntzer's solution, taking the smaller of n_samples and the originally computed n_support, makes sense to me.
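The arithmetic behind this comment can be checked with a small sketch. It assumes the default support size is ceil((n_samples + n_features + 1) / 2) when support_fraction is None; the helper names `default_n_support` and `clipped_n_support` are hypothetical and are not scikit-learn functions.

```python
import math


def default_n_support(n_samples, n_features):
    # Assumed default when support_fraction is None:
    # ceil((n_samples + n_features + 1) / 2).
    return math.ceil(0.5 * (n_samples + n_features + 1))


def clipped_n_support(n_samples, n_features):
    # The proposed fix: never select more observations than exist.
    return min(default_n_support(n_samples, n_features), n_samples)


n = 20
unclipped = default_n_support(n, n)  # n + 1, one more than the sample count
clipped = clipped_n_support(n, n)    # n
print(unclipped, clipped)
```

For non-degenerate shapes such as 500 samples and 1 feature, the clip is a no-op, so only the n_samples == n_features case changes.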
    launch_mcd_on_dataset(500, 1, 100, 0.02, 0.02, 350, global_random_seed)

    # n_samples == n_features
    launch_mcd_on_dataset(20, 20, 0, 0.1, 0.1, 50, global_random_seed)
I think the CI test is failing because tol_support = 50 in the launch_mcd_on_dataset function. We should always have tol_support <= n_samples here.
virchan left a comment:
I think a changelog is needed here @anntzer. You can refer to the following link for more details:
https://github.com/scikit-learn/scikit-learn/blob/main/doc/whats_new/upcoming_changes/README.md
fc772c7 broke support for the (degenerate) `n_samples == n_features` (and support_fraction unset) case in MinCovDet because this led to `n_support = n_samples + 1`, which was implicitly clipped to `n_support = n_samples` previously (at `np.argsort(dist)[:n_support]`) but not anymore now (and crashes `np.argpartition(dist, n_support - 1)`). To fix this, explicitly clip `n_support`.
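The difference between the two selection strategies can be reproduced in isolation with NumPy. This is a minimal sketch, not the scikit-learn code: the `dist` array here is random placeholder data, and only the slicing and `np.argpartition` behaviour are being illustrated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20
dist = rng.random(n_samples)  # placeholder distances, one per sample
n_support = n_samples + 1     # the degenerate value described above

# Old approach: slicing past the end of the array clips silently.
old_idx = np.argsort(dist)[:n_support]

# New approach without clipping: kth = n_support - 1 is out of bounds.
try:
    np.argpartition(dist, n_support - 1)
    raised = False
except ValueError:
    raised = True

# With an explicit clip, argpartition accepts the index again.
n_support = min(n_support, n_samples)
new_idx = np.argpartition(dist, n_support - 1)[:n_support]
```

With `n_support` clipped to `n_samples`, both strategies select every index, which matches the behaviour before the regression.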
done
virchan left a comment:
LGTM! Thanks @anntzer!
@agramfort, @ogrisel, just a friendly ping—would you like to take a look?
Fixes #30625
#29835 broke support for the (degenerate) `n_samples == n_features` (and support_fraction unset) case in MinCovDet because this led to `n_support = n_samples + 1`, which was implicitly clipped to `n_support = n_samples` previously (at `np.argsort(dist)[:n_support]`) but not anymore now (and crashes `np.argpartition(dist, n_support - 1)`). To fix this, explicitly clip `n_support`.

Noticed at pyRiemann/pyRiemann#335.