LogisticRegression convert to float64 #8769

Closed
GaelVaroquaux opened this issue Apr 20, 2017 · 14 comments · Fixed by #8835
Labels: Enhancement, help wanted, Moderate (anything that requires some knowledge of conventions and best practices)

Comments

GaelVaroquaux (Member) commented Apr 20, 2017
Looking at the code of LogisticRegression, I have just noticed that it automatically converts the data to float64. I would expect at least the SAG, SAGA, newton-cg and lbfgs solvers to be able to work with float32.

Also, in a similar line of thought, the code converts to C-ordered data. How useful / necessary is this for solvers other than liblinear?

I am asking these questions because it seems that we could reduce the memory footprint of LogisticRegression.
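
For a concrete illustration of the current behaviour, here is a minimal sketch (the dataset shape and solver choice below are arbitrary):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # float32 input is currently upcast to float64 before fitting
    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X = X.astype(np.float32)

    clf = LogisticRegression(solver='sag', max_iter=1000).fit(X, y)
    # prints float64 today; staying in float32 would roughly halve the
    # memory needed for the internal copy of the data
    print(clf.coef_.dtype)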

Ping @arthurmensch for the SAG part.

GaelVaroquaux added the Enhancement, Moderate and Need Contributor labels on Apr 20, 2017
massich (Contributor) commented Apr 20, 2017

I can work on this

arthurmensch (Contributor) commented Apr 20, 2017 via email

mblondel (Member) commented:
Maybe not for L-BFGS:
scipy/scipy#4873
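
For context, a small sketch of the limitation tracked in that scipy issue (the toy objective below is only illustrative):

    import numpy as np
    from scipy.optimize import fmin_l_bfgs_b

    # The underlying L-BFGS-B routine works in double precision, so a float32
    # starting point is upcast and the solution comes back as float64.
    def f_and_grad(w):
        return float(np.sum(w ** 2)), 2.0 * w

    w0 = np.zeros(5, dtype=np.float32)
    w_opt, f_min, info = fmin_l_bfgs_b(f_and_grad, w0)
    print(w_opt.dtype)  # float64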

arthurmensch (Contributor) commented Jun 7, 2017 via email

massich (Contributor) commented Jun 7, 2017

Just to keep track of the effort:

massich (Contributor) commented Jun 7, 2017

@arthurmensch, if you take care of the fused types in sag.pyx (see PR #9020), make sure to talk to @Henley13.

ogrisel (Member) commented Jun 7, 2017

saga and sag should indeed be addressed in the same PR.

massich (Contributor) commented Jun 9, 2017

@TomDLT suggested adding float32 / float64 support for _preprocess_data here.

vene (Member) commented Jun 11, 2017

Can we add isotonic regression to that list? I'll be happy to take it.

massich (Contributor) commented Jun 11, 2017 via email

massich (Contributor) commented Jun 14, 2017

@ogrisel suggested in #9033 to use [np.float64, np.float32] as the default policy, as done here for logistic regression:

if self.solver == 'lbfgs':
    # scipy lbfgs does not support float32 yet:
    # https://github.com/scipy/scipy/issues/4873
    _dtype = np.float64
else:
    # all other solvers work at both float precision levels
    _dtype = [np.float64, np.float32]

He also suggested adding a test to ensure that the dtype of the predicted output remains consistent, i.e.:

    assert_equal(clf_32.predict(X_32).dtype, X_32.dtype)
    assert_equal(clf_64.predict(X_64).dtype, X_64.dtype)
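
For reference, a small sketch of how the [np.float64, np.float32] policy behaves when passed to check_array (the arrays below are arbitrary):

    import numpy as np
    from sklearn.utils import check_array

    X_32 = np.ones((3, 2), dtype=np.float32)

    # With a dtype list, the input dtype is kept if it is already in the list;
    # otherwise the data is converted to the first entry of the list.
    print(check_array(X_32, dtype=[np.float64, np.float32]).dtype)  # float32 kept
    print(check_array(X_32, dtype=np.float64).dtype)                # upcast to float64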

cc: @Henley13, @raghavrv, @ncordier, @vene

raghavrv (Member) commented:
@ogrisel advised me that

assert clf_32.predict(X_32).dtype == X_32.dtype
...

is better, as it leads to more informative error messages in pytest, to which we will be switching soon.

jnothman (Member) commented Jun 14, 2017 via email

GaelVaroquaux (Member, Author) commented:
Fixed by #13273
