[ENH] Add LassoCV to nilearn.decoding.DecoderRegressor #3781
Conversation
👋 @michellewang Thanks for creating a PR! Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a GitHub draft. Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.
For new features:
For bug fixes:
We will review it as quickly as possible; feel free to ping us with questions if needed.
Codecov Report
@@            Coverage Diff            @@
##              main    #3781    +/-  ##
========================================
  Coverage    91.74%   91.74%
========================================
  Files          134      134
  Lines        15733    15737      +4
  Branches      3273     3275      +2
========================================
+ Hits         14434    14438      +4
  Misses         757      757
  Partials       542      542
Flags with carried forward coverage won't be shown.
Thx, looks good so far.
Maybe we could showcase it in an example?
nilearn/decoding/decoder.py
Outdated
# the default is to generate 100 alphas based on the data
# (alpha values can also be set with the 'alphas' parameter, in which
# case 'n_alphas' is ignored)
param_grid["n_alphas"] = [100]
100 is a lot, and a Lasso fit is expensive. I'd change the default to 30, at most.
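Concretely, the suggested change to the snippet above would look like this (a sketch; the value actually merged may differ):

```python
param_grid["n_alphas"] = [30]  # fewer candidate alphas, so fewer Lasso fits per fold
```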
Thanks a lot @michellewang! LGTM apart from reducing the number of fits as @bthirion requested.
Yes, I think that's the best thing to do!
That's a good point, but let's deal with it in another PR.
@michellewang LGTM; other than addressing @bthirion's comment, is there anything else left to do on this one?
@ymzayek yes, I am trying to think if it would make sense to add something to an example. This one is about cross-validation, but it does classification (not regression), and there is a section showing a …
LGTM. Just wondering: is there any example that works better when Lasso is used instead of Ridge?
I think we only showcase regression using …
I simply suggest trying once to replace the SVR with a LassoCV to see the difference in accuracy and computation time. If it is not enough or too prohibitive, let's keep the example as-is. Otherwise we can switch here.
Ok, I modified the example to show both the SVR and Lasso results. Let me know if this is what you had in mind.
@michellewang thanks, the example looks good! Can you run it with and without lasso and let us know the difference in computation time? If it's big, I think we should stick to showcasing one of the 2 regressors, and if not we can keep it like this. Also, can you still add lasso to the list here: https://nilearn.github.io/stable/decoding/estimator_choice.html#different-linear-models
@ymzayek on my laptop it takes 1min 8s without lasso and 1min 30s with Lasso, so adding the Lasso added about 22 seconds. Let me know if that's okay or if we should only keep one of them. The Lasso has higher CV error than the SVR (MAE is ~12 instead of ~10).
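For anyone who wants to reproduce this kind of timing comparison, here is a rough sketch with synthetic data; the real example uses a fetched dataset, so absolute times will differ, and the "lasso" option assumes this PR's change is in place:

```python
# Rough sketch for comparing decoder fit times; synthetic data only.
import time
import numpy as np
import nibabel as nib
from nilearn.decoding import DecoderRegressor

rng = np.random.RandomState(42)
n_samples = 50
# Fake 4D fMRI-like data: one small 3D volume per sample, plus a full mask.
imgs = nib.Nifti1Image(rng.randn(8, 8, 8, n_samples), affine=np.eye(4))
mask = nib.Nifti1Image(np.ones((8, 8, 8), dtype="int8"), affine=np.eye(4))
y = rng.randn(n_samples)

for estimator in ("svr", "lasso"):  # "lasso" is the option added in this PR
    start = time.perf_counter()
    DecoderRegressor(estimator=estimator, mask=mask, cv=3).fit(imgs, y)
    print(f"{estimator}: {time.perf_counter() - start:.1f}s")
```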
I would say to keep the example as the original (without lasso). However, maybe for users this is useful to see, so I will defer to what @bthirion and others think.
If it performs worse and takes more time, we should certainly not include it. Sorry for the useless request.
I reverted the example, no worries @bthirion.
Thx @michellewang! I think this one is good to go. Maybe another review from someone who has already had a look, @bthirion or @jeromedockes?
LGTM, thanks a lot @michellewang!
Thx, merging!
Closes #3769.

Changes proposed in this pull request:

- `'lasso'` and `'lasso_regressor'` as supported estimators (`sklearn.linear_model.LassoCV()`)
- `n_alphas=100` instead of directly specifying a list of `alphas`. This is the default for `LassoCV` and makes it so that the alphas depend on the data values, so I think this is more desirable than fixing `alphas` to always be the same, but let me know if you have other thoughts.

I noticed that even though we have `Decoder` for classification and `DecoderRegressor` for regression, there is no check for whether the user provides an appropriate estimator (as far as I can tell). As a result, it is technically possible to pass something like `'svc'` to `DecoderRegressor`, which doesn't make sense since (for example) `DecoderRegressor` uses `r2` scoring.
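To make that last point concrete, here is a hypothetical sketch of the kind of check that could be added; the names and the set of accepted strings below are illustrative, not nilearn's actual API or part of this PR:

```python
# Hypothetical sketch of an estimator check for DecoderRegressor.
# The accepted set below is illustrative, not nilearn's real list.
SUPPORTED_REGRESSOR_STRINGS = {"svr", "ridge_regressor", "lasso", "lasso_regressor"}

def check_regression_estimator(estimator):
    """Raise early if a classifier string (e.g. 'svc') is passed."""
    if isinstance(estimator, str) and estimator not in SUPPORTED_REGRESSOR_STRINGS:
        raise ValueError(
            f"'{estimator}' is not a supported regression estimator; "
            f"expected one of {sorted(SUPPORTED_REGRESSOR_STRINGS)}."
        )

check_regression_estimator("lasso")  # passes silently
# check_regression_estimator("svc") would raise ValueError
```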