Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Use LogisticRegressionCV instead of LogisticRegression to improve Decoder performance #3736

Merged
merged 10 commits into from Jun 7, 2023

Conversation

michellewang
Copy link
Contributor

Closes #2200.

Changes proposed in this pull request:

  • Replace LogisticRegression with LogisticRegressionCV in nilearn.decoding.decoder
    • Note that the param_grid should now contain Cs instead of C, meaning that any code that explicitly provides a param_grid = {'C': [...]} would now break. Does this mean this needs a deprecation cycle? If so how is this usually done?
      • Another option could be to change C to Cs when processing the param_grid (with maybe a warning)
    • I used LogisticRegressionCV for both "logistic_l1" and "logistic_l2". sklearn also has a LassoCV class, but its hyperparameters are different (alphas instead of Cs), so it seems like it would be better to use LogisticRegressionCV for both the L1 and L2 regularization cases (to be consistent)
  • Fix incorrect links to sklearn documentation in Decoder docstring (should be LinearSVC, not SVC)

@github-actions
Copy link
Contributor

github-actions bot commented May 9, 2023

👋 @michellewang Thanks for creating a PR!

Until this PR is ready for review, you can include the [WIP] tag in its title, or leave it as a github draft.

Please make sure it is compliant with our contributing guidelines. In particular, be sure it checks the boxes listed below.

  • PR has an interpretable title.
  • PR links to Github issue with mention Closes #XXXX (see our documentation on PR structure)
  • Code is PEP8-compliant (see our documentation on coding style)
  • Changelog or what's new entry in doc/changes/latest.rst (see our documentation on PR structure)

For new features:

  • There is at least one unit test per new function / class (see our documentation on testing)
  • The new feature is demoed in at least one relevant example.

For bug fixes:

  • There is at least one test that would fail under the original bug conditions.

We will review it as quick as possible, feel free to ping us with questions if needed.

@michellewang michellewang marked this pull request as draft May 9, 2023 13:47
@codecov
Copy link

codecov bot commented May 9, 2023

Codecov Report

Merging #3736 (1293e48) into main (6085633) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #3736      +/-   ##
==========================================
+ Coverage   91.57%   91.58%   +0.01%     
==========================================
  Files         139      139              
  Lines       16572    16593      +21     
  Branches     3236     3243       +7     
==========================================
+ Hits        15175    15196      +21     
  Misses       1394     1394              
  Partials        3        3              
Impacted Files Coverage Δ
nilearn/_utils/docs.py 93.65% <ø> (ø)
nilearn/decoding/decoder.py 99.37% <100.00%> (+0.04%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@bthirion
Copy link
Member

bthirion commented May 9, 2023

Indeed, if we break the API we need to go through deprecation cycles.

@michellewang
Copy link
Contributor Author

I have added a private helper function _replace_param_grid_key to replace "C" by "Cs" in the param_grid. I also added a DeprecationWarning though I'm not sure if we want to deprecate the "C" parameter in a future version. I have also not specified a version number because I am not sure how long deprecation cycles are typically.

As a side note, I noticed that DeprecationWarnings are not shown in Jupyter notebooks (it looks like they are only shown in the __main__ module). I assume that this is known and just wanted to check that this isn't considered an issue.

@michellewang michellewang marked this pull request as ready for review May 11, 2023 14:41
@michellewang michellewang changed the title [WIP] [ENH] Use LogisticRegressionCV instead of LogisticRegression to improve Decoder performance [ENH] Use LogisticRegressionCV instead of LogisticRegression to improve Decoder performance May 11, 2023
Copy link
Member

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx !

@jeromedockes
Copy link
Member

It will probably not affect performance with the current solver, because according to the documentation there is no warm-start when using liblinear -- but that would have to be checked. also it uses a fixed grid of values for C, whereas the LassoCV has a better way of choosing values for alpha. so in a future PR it might still be worth adding the LassoCV as a possible estimator

@bthirion
Copy link
Member

Will merge tomorrow unless I hear otherwise. Thx !

Copy link
Member

@ymzayek ymzayek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks @michellewang

@ymzayek ymzayek merged commit db5a727 into nilearn:main Jun 7, 2023
29 checks passed
@michellewang michellewang deleted the decoding/logisticregressioncv branch June 16, 2023 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve performance of the MetaEstimator Decoder object
4 participants