
LogisticRegression convert to float64 #8769

Closed
GaelVaroquaux opened this Issue Apr 20, 2017 · 14 comments

9 participants
@GaelVaroquaux
Member

GaelVaroquaux commented Apr 20, 2017

Looking at the code of LogisticRegression, I have just noticed that it automatically converts the data to float64. I would expect at least the SAG, SAGA, newton-cg, and lbfgs solvers to be able to work with float32.

Also, along a similar line of thought, the code converts the data to C order. How useful or necessary is this for solvers other than liblinear?

I am asking these questions because it seems that we could reduce the memory footprint of LogisticRegression.

Ping @arthurmensch for the SAG part.
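The memory-footprint concern can be illustrated with a plain NumPy sketch (the array sizes below are illustrative, not taken from any benchmark in this thread): upcasting a float32 design matrix to float64 doubles its size in memory.

```python
import numpy as np

# Illustrative design matrix: 100k samples, 100 features, stored as float32.
n_samples, n_features = 100_000, 100
X32 = np.ones((n_samples, n_features), dtype=np.float32)

# What the input validation did at the time: unconditionally upcast to float64.
X64 = X32.astype(np.float64)

print(X32.nbytes // 2**20, "MiB")  # 38 MiB
print(X64.nbytes // 2**20, "MiB")  # 76 MiB
```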

@massich

Contributor

massich commented Apr 20, 2017

I can work on this

@arthurmensch

Contributor

arthurmensch commented Apr 20, 2017

@mblondel

Member

mblondel commented Apr 25, 2017

Maybe not for L-BFGS:
scipy/scipy#4873

@arthurmensch

Contributor

arthurmensch commented Jun 7, 2017

@massich

Contributor

massich commented Jun 7, 2017

Just to keep track of the effort:

@massich

Contributor

massich commented Jun 7, 2017

@arthurmensch if you take care of fused types in sag.pyx (see PR #9020), make sure to talk to @Henley13

@ogrisel

Member

ogrisel commented Jun 7, 2017

saga and sag should indeed be addressed in the same PR.

@massich

Contributor

massich commented Jun 9, 2017

@TomDLT suggested adding float32/float64 support for _preprocess_data here

@vene

Member

vene commented Jun 11, 2017

Can we add isotonic regression to that list? I'll be happy to take it.

@massich

Contributor

massich commented Jun 11, 2017

@massich

Contributor

massich commented Jun 14, 2017

@ogrisel suggested in #9033 using [np.float64, np.float32] as the default dtype policy, as here for logistic regression:

if self.solver == 'lbfgs':
    # scipy lbfgs does not support float32 yet:
    # https://github.com/scipy/scipy/issues/4873
    _dtype = np.float64
else:
    # all other solvers work at both float precision levels
    _dtype = [np.float64, np.float32]

As well as adding a test to ensure that the dtype of the predicted output remains consistent, i.e.:

    assert_equal(clf_32.predict(X_32).dtype, X_32.dtype)
    assert_equal(clf_64.predict(X_64).dtype, X_64.dtype)

cc: @Henley13, @raghavrv, @ncordier, @vene

@raghavrv

Member

raghavrv commented Jun 14, 2017

@ogrisel advised me that

assert clf_32.predict(X_32).dtype == X_32.dtype
...

is better, as it leads to more informative error messages in pytest, to which we will be switching soon.
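The difference is purely in failure reporting: pytest rewrites bare `assert` statements so a failure prints both operands, whereas `assert_equal` only raises with its own message. A minimal sketch of the recommended form, using a toy array in place of a fitted classifier's predictions:

```python
import numpy as np

X_32 = np.ones(10, dtype=np.float32)
pred = X_32 * 2  # stand-in for clf_32.predict(X_32)

# Under pytest, a failing bare assert would show both dtypes in the report.
assert pred.dtype == X_32.dtype
```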

@jnothman

Member

jnothman commented Jun 14, 2017

@GaelVaroquaux

Member Author

GaelVaroquaux commented Feb 27, 2019

Fixed by #13273

Sprint Paris 2019 automation moved this from In progress to Done Feb 27, 2019
