MNT replace log_logistic by logaddexp #27544

Merged
merged 8 commits into from Oct 10, 2023

Conversation

lorentzenchr
Member

Reference Issues/PRs

No

What does this implement/fix? Explain your changes.

This PR replaces log_logistic(x) with -np.logaddexp(0, -x), which is both more precise and faster.
The whole Cython module _logistic_sigmoid is therefore removed.

Any other comments?

import numpy as np
from sklearn.utils.extmath import log_logistic
import mpmath as mp

mp.dps = 100

# First about precision
log_logistic(np.array([36.7368005]))  # -2.22044605e-16
log_logistic(np.array([36.7368006]))  # 0
-np.logaddexp(0, -36.7368005)         # -1.1102231019822803e-16
-np.logaddexp(0, -36.7368006)         # -1.1102229909599743e-16
-mp.log1p(mp.exp(-36.7368005))        # mpf('-1.1102231019822803e-16')
-mp.log1p(mp.exp(-36.7368006))        # mpf('-1.1102229909599743e-16')

# Then about speed
x = np.linspace(-30, 30, 1000)
%timeit log_logistic(x)         # 73.4 µs ± 3.24 µs per loop
%timeit (-np.logaddexp(0, -x))  # 19.7 µs ± 539 ns per loop
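The replacement rests on the identity log(1 / (1 + exp(-x))) = -log(exp(0) + exp(-x)) = -np.logaddexp(0, -x). The following is a minimal sketch of that identity using only NumPy; the helper names `log_logistic_via_logaddexp` and `log_logistic_naive` are made up for illustration and are not part of scikit-learn. It also shows where the textbook formula breaks down at the extremes.

```python
import numpy as np


def log_logistic_via_logaddexp(x):
    """Hypothetical helper: log(1 / (1 + exp(-x))) written as -logaddexp(0, -x)."""
    return -np.logaddexp(0, -np.asarray(x, dtype=float))


def log_logistic_naive(x):
    """Hypothetical helper: the textbook formula, evaluated literally."""
    x = np.asarray(x, dtype=float)
    return np.log(1.0 / (1.0 + np.exp(-x)))


x = np.array([-800.0, -5.0, 0.0, 5.0, 40.0])
stable = log_logistic_via_logaddexp(x)

# The naive formula overflows in exp() for very negative x and rounds
# 1 + exp(-x) to exactly 1 for large positive x, so silence its warnings.
with np.errstate(over="ignore", divide="ignore"):
    naive = log_logistic_naive(x)

print(stable)
print(naive)
```

For moderate inputs the two agree; at x = -800 the naive version returns -inf while the logaddexp form returns -800, and at x = 40 the naive version returns exactly 0 while the logaddexp form keeps a small negative value.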

@github-actions

github-actions bot commented Oct 6, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: efd77f3. Link to the linter CI: here

@lorentzenchr
Member Author

I set the label "no changelog needed" as it seems off to add an efficiency entry in whatsnew to the neural net module. One might get the wrong impression.

@jjerphan jjerphan added the good first PR to review Simple atomic PR to review label Oct 8, 2023
@jjerphan jjerphan removed the good first PR to review Simple atomic PR to review label Oct 8, 2023
See the blog post describing this implementation:
http://fa.bianp.net/blog/2013/numerical-optimizers-for-logistic-regression/
"""
is_1d = X.ndim == 1
Member

This function was still "kind of" public. It can call np.logaddexp internally now.
Can we deprecate it, ask users to switch to the NumPy function, and then remove it in 2 versions?

Member Author

But it is not listed in our API docs. So it is not public.
Do you know of any user/3rd party using that function from us?

Member Author

https://github.com/search?q=log_logistic+language%3APython&type=code&l=Python&p=1 gives 904 hits, most of which are forks of scikit-learn. For me, that's strong evidence that we can remove it. I can add a changelog entry that says which function to use as a replacement.

Member

> But it is not listed in our API docs.

Yes, we have some legacy and bad practices that should be fixed. We should have clear public, developer, and private tools.

> So it is not public.

This is tricky because the public/private convention would be the presence or absence of a leading underscore (_).

> Do you know of any user/3rd party using that function from us?

Apparently some do: https://github.com/search?q=from+sklearn.utils.extmath+import+log_logistic&type=code

Member

For instance, this is one third-party library: https://github.com/VowpalWabbit/vowpal_wabbit with 8.3k stars and 2k forks. Removing without notice will make a lot of people angry at once :)

I am still advocating for a deprecation. I know that this is frustrating for a tool that should never have been public, but this is safer.

Member Author

About vowpalwabbit, their only use is np.exp(log_logistic(...)) which is just crazy.
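To spell out why that pattern is roundabout: exponentiating the log of the logistic function just gives back the logistic (sigmoid) function itself, which SciPy exposes directly as scipy.special.expit. The sketch below uses only NumPy and stands in -np.logaddexp(0, -x) for log_logistic; it is an illustration of the identity, not the third-party code in question.

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 7)

# exp(log(sigmoid(x))): the round trip through the log domain.
roundtrip = np.exp(-np.logaddexp(0, -x))

# The sigmoid written out directly.
direct = 1.0 / (1.0 + np.exp(-x))

max_abs_diff = np.max(np.abs(roundtrip - direct))
print(max_abs_diff)
```

Both arrays hold the same probabilities up to floating-point rounding, so the detour through the log domain buys nothing in that use.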

Member

@glemaitre glemaitre Oct 9, 2023

> Keep it with wrapping logaddexp and deprecate it?

Yes.
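A minimal sketch of what "wrap logaddexp and deprecate" could look like, using only the standard library's warnings module. The name `log_logistic_compat`, the message text, and the choice of FutureWarning are assumptions for illustration; this is not the code merged in the PR, and scikit-learn has its own deprecation helpers.

```python
import warnings

import numpy as np


def log_logistic_compat(X):
    """Hypothetical shim: warn, then delegate to np.logaddexp."""
    warnings.warn(
        "log_logistic is deprecated; use -np.logaddexp(0, -x) instead.",
        FutureWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return -np.logaddexp(0, -np.asarray(X, dtype=float))


# Capture the warning so the example shows both effects of a call.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    values = log_logistic_compat([0.0, 2.0])

print(values)
print(caught[0].category.__name__)
```

Existing callers keep getting correct values while being told what to switch to, which is the trade-off being argued for in this thread.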

Member Author

Not happy but will do.

Member Author

As you can see, that's not a 5-minute thing. It cost me half an hour.

Member

But that prevented potential breakages which my hasty approving review definitely had not envisioned. 🤦

Thank you for the efforts, @lorentzenchr.

@glemaitre glemaitre self-requested a review October 10, 2023 07:29
Member

@glemaitre glemaitre left a comment

Thanks @lorentzenchr. Merging.

@glemaitre glemaitre enabled auto-merge (squash) October 10, 2023 07:33
@glemaitre glemaitre merged commit 3076f29 into scikit-learn:main Oct 10, 2023
25 checks passed
@lorentzenchr lorentzenchr deleted the rm_log_logistic branch October 10, 2023 18:18
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2023
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
3 participants