[MRG+2] LogisticRegression convert to float64 (newton-cg) #8835

Merged
merged 26 commits into scikit-learn:master on Jun 7, 2017

Conversation

8 participants
@massich
Contributor

massich commented May 5, 2017

Reference Issue

Fixes #8769

What does this implement/fix? Explain your changes.

Prevents logistic regression from aggressively casting the data to np.float64 when np.float32 is supplied.

Any other comments?

(only for the newton-cg case)
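
For illustration (not part of the PR), a minimal sketch of the behavior this change targets, using made-up toy data:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy float32 data; before this PR, newton-cg upcasts everything to float64.
rng = np.random.RandomState(0)
X = rng.rand(100, 5).astype(np.float32)
y = (rng.rand(100) > 0.5).astype(int)

clf = LogisticRegression(solver='newton-cg')
clf.fit(X, y)
print(clf.coef_.dtype)  # expected float32 after this PR, float64 before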

@massich
Contributor

massich commented May 5, 2017

@GaelVaroquaux Actually fixing self.coef_ was straightforward. Where do you wanna go from here?

@massich massich changed the title from Is/8769 to [WIP] LogisticRegression convert to float64 May 5, 2017

sklearn/linear_model/logistic.py
@@ -1281,9 +1287,9 @@ def fit(self, X, y, sample_weight=None):
self.n_iter_ = np.asarray(n_iter_, dtype=np.int32)[:, 0]
if self.multi_class == 'multinomial':
- self.coef_ = fold_coefs_[0][0]
+ self.coef_ = fold_coefs_[0][0].astype(np.float32)

@massich

massich May 5, 2017

Contributor

my bad it should be _dtype

@glemaitre
Contributor

glemaitre commented May 5, 2017

You can execute the PEP8 check locally:

bash ./build_tools/travis/flake8_diff.sh

That should be useful in the future.

@massich massich changed the title from [WIP] LogisticRegression convert to float64 to [MRG] LogisticRegression convert to float64 May 19, 2017

sklearn/linear_model/logistic.py
+ _dtype = np.float64
+ if self.solver in ['newton-cg'] \
+ and isinstance(X, np.ndarray) and X.dtype in [np.float32]:
+ _dtype = np.float32

@GaelVaroquaux

GaelVaroquaux May 29, 2017

Member

check_X_y can take a list of acceptable dtypes as a dtype argument. I think that using this feature would be a better way of writing this code. The code would be something like:

if self.solver in ['newton-cg']:
    _dtype = [np.float64, np.float32]
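
As a rough sketch of this suggestion (toy data, not from the PR), passing a dtype list to check_X_y preserves float32 input and converts anything else to the first entry of the list:

import numpy as np
from sklearn.utils import check_X_y

X32 = np.ones((5, 2), dtype=np.float32)
y = np.array([0, 1, 0, 1, 0])

# A dtype list means: keep X as-is if its dtype is in the list,
# otherwise convert to the first entry (np.float64 here).
X_checked, y_checked = check_X_y(X32, y, dtype=[np.float64, np.float32])
print(X_checked.dtype)  # float32 is preserved
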
sklearn/linear_model/logistic.py
else:
- self.coef_ = np.asarray(fold_coefs_)
+ self.coef_ = np.asarray(fold_coefs_, dtype=_dtype)

@GaelVaroquaux

GaelVaroquaux May 29, 2017

Member

Is the conversion necessary here? In other words, if we get the code right, doesn't coef_ get returned in the right dtype?

@GaelVaroquaux
Member

GaelVaroquaux commented May 29, 2017

I suspect that the problem isn't really solved: if you look a bit further in the code, you will see that inside 'logistic_regression_path', check_X_y is called again with the np.float64 dtype. And there might be other instances of this problem.

@massich massich changed the title from [MRG] LogisticRegression convert to float64 to [WIP] LogisticRegression convert to float64 May 30, 2017

@massich

Indeed, logistic_regression_path has a check_array call with np.float64 as dtype. However, logistic_regression_path is called with check_input=False, so X.dtype remains np.float32. (see here)

Still, w0 starts as an empty list and ends up being np.float64. (see here)

@GaelVaroquaux
Member

GaelVaroquaux commented May 30, 2017

@raghavrv

Thanks for the PR!

@@ -1203,7 +1205,12 @@ def fit(self, X, y, sample_weight=None):
raise ValueError("Tolerance for stopping criteria must be "
"positive; got (tol=%r)" % self.tol)
- X, y = check_X_y(X, y, accept_sparse='csr', dtype=np.float64,
+ if self.solver in ['newton-cg']:
+ _dtype = [np.float64, np.float32]

@raghavrv

raghavrv Jun 2, 2017

Member

Sorry if I am missing something, but why?

@massich

massich Jun 2, 2017

Contributor

The idea is that previously check_X_y was converting X and y into np.float64. This is fine if the user passes a list as X, but if a user willingly passes np.float32, converting it to np.float64 penalizes them in memory and speed.

Therefore, we are trying to keep the data in np.float32 if the user provides the data in that type.
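
A rough illustration of the memory side of this argument (toy array, not from the PR): float32 needs half the memory of float64 for the same shape.

import numpy as np

X32 = np.zeros((1000000, 10), dtype=np.float32)
X64 = X32.astype(np.float64)
# float32 uses half the memory of float64 for the same array shape.
print(X32.nbytes // 2**20, "MiB")  # ~38 MiB
print(X64.nbytes // 2**20, "MiB")  # ~76 MiB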

@GaelVaroquaux

GaelVaroquaux Jun 2, 2017

Member

The fact that @raghavrv asks a question tells us that a short comment explaining the logic would probably be useful here.

@massich

massich Jun 2, 2017

Contributor

I think that @raghavrv was more concerned about the fact that we were passing a list rather than forcing one dtype or the other. Once we checked that check_X_y was taking care of it, he was OK with it.

@raghavrv any comments?

+
+ for solver in ['newton-cg']:
+ for multi_class in ['ovr', 'multinomial']:
+

@raghavrv

raghavrv Jun 2, 2017

Member

Can you remove this newline?

+
+def test_dtype_missmatch_to_profile():
+ # Test that np.float32 input data is not cast to np.float64 when possible
+

@raghavrv

raghavrv Jun 2, 2017

Member

and this newline too

sklearn/utils/class_weight.py
@@ -41,12 +41,17 @@ def compute_class_weight(class_weight, classes, y):
# Import error caused by circular imports.
from ..preprocessing import LabelEncoder
+ if y.dtype == np.float32:
+ _dtype = np.float32

@raghavrv

raghavrv Jun 2, 2017

Member

Why not _dtype = y.dtype...

Is it so you can have y.dtype be int and weight be float?

+
+ # Check accuracy consistency
+ lr_64 = LogisticRegression(solver=solver, multi_class=multi_class)
+ lr_64.fit(X, Y1)

@raghavrv

raghavrv Jun 2, 2017

Member

Can you ensure (maybe using astype?) that X and Y1 are float64 before this test? (If it is changed in the future, this test will still pass.)

+def test_dtype_match():
+ # Test that np.float32 input data is not cast to np.float64 when possible
+
+ X_ = np.array(X).astype(np.float32)

@raghavrv

raghavrv Jun 2, 2017

Member

X_32 = ... astype(np.float32)
X_64 = ... astype(np.float64)

+ assert_almost_equal(lr_32.coef_, lr_64.coef_.astype(np.float32))
+
+
+def test_dtype_missmatch_to_profile():

@raghavrv

raghavrv Jun 2, 2017

Member

This test can be removed...

@@ -608,10 +610,10 @@ def logistic_regression_path(X, y, pos_class=None, Cs=10, fit_intercept=True,
# and check length
# Otherwise set them to 1 for all examples
if sample_weight is not None:
- sample_weight = np.array(sample_weight, dtype=np.float64, order='C')
+ sample_weight = np.array(sample_weight, dtype=X.dtype, order='C')

@raghavrv

raghavrv Jun 2, 2017

Member

Should it be y.dtype?

cc: @agramfort

@massich

massich Jun 2, 2017

Contributor

We were discussing with @glemaitre (and @GaelVaroquaux) forcing X.dtype and y.dtype to be the same.

@GaelVaroquaux

GaelVaroquaux Jun 2, 2017

Member

Yes, I think the idea should be that the dtype of X conditions the dtype of the computation.

We should write an RFC about this, and include it in the docs.

@massich

massich Jun 2, 2017

Contributor

see #8976

massich added some commits Jun 2, 2017

sklearn/linear_model/logistic.py
loss, p, w = _multinomial_loss(w, X, Y, alpha, sample_weight)
sample_weight = sample_weight[:, np.newaxis]
diff = sample_weight * (p - Y)
+ diff = diff.astype(X.dtype)

@GaelVaroquaux

GaelVaroquaux Jun 2, 2017

Member

This line introduces a memory copy. I think that we should do it only if X.dtype != diff.dtype.

@raghavrv

raghavrv Jun 2, 2017

Member

Yeah, we should set copy=False to avoid copying if it is already the same dtype...
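
A small sketch of the copy=False behaviour being suggested (toy array, not from the PR): ndarray.astype only copies when a conversion is actually needed.

import numpy as np

diff = np.ones(5, dtype=np.float32)

same = diff.astype(np.float32, copy=False)   # dtypes match: no copy is made
other = diff.astype(np.float64, copy=False)  # dtypes differ: a new array

print(same is diff)   # True
print(other is diff)  # False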

sklearn/utils/class_weight.py
if set(y) - set(classes):
raise ValueError("classes should include all valid labels that can "
"be in y")
if class_weight is None or len(class_weight) == 0:
# uniform class weights
- weight = np.ones(classes.shape[0], dtype=np.float64, order='C')
+ weight = np.ones(classes.shape[0], dtype=_dtype, order='C')

@GaelVaroquaux

GaelVaroquaux Jun 2, 2017

Member

Hmm, isn't there a risk here of casting to integer dtypes, which could later create numerical errors?
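
A toy illustration of this concern (made-up values, not from the PR): if the weight array inherits an integer dtype, fractional class weights are silently truncated.

import numpy as np

classes = np.array([0, 1])
weight = np.ones(classes.shape[0], dtype=np.int64)  # dtype inherited from an integer y
weight[0] = 0.5   # silently truncated to 0 by the integer dtype
print(weight)     # [0 1] -- the fractional weight is lost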

@massich

massich Jun 2, 2017

Contributor

Actually, class_weight doesn't need to be modified. I thought I had fixed it.

@glemaitre

glemaitre Jun 2, 2017

Contributor

I think that I disagree. You will have an operation between sample_weight and class_weight, for instance there. Because you make the multiplication in-place, I agree that sample_weight will end up with the expected type. However, class_weight should not always be cast to np.float64 when sample_weight is np.float32; that would imply a conversion when the arrays are multiplied.

What I mean is something like this:

x = np.random.random((1000000, 1)).astype(np.float32)
y = np.random.random((1000000, 1)).astype(np.float32)
%timeit x * y
807 µs ± 17.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
x = np.random.random((1000000, 1)).astype(np.float32)
y = np.random.random((1000000, 1)).astype(np.float64)
%timeit x * y
1.83 ms ± 8.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Does it make sense?

@@ -337,10 +337,12 @@ def _multinomial_loss_grad(w, X, Y, alpha, sample_weight):
n_classes = Y.shape[1]
n_features = X.shape[1]
fit_intercept = (w.size == n_classes * (n_features + 1))
- grad = np.zeros((n_classes, n_features + bool(fit_intercept)))
+ grad = np.zeros((n_classes, n_features + bool(fit_intercept)),
+ dtype=X.dtype)
loss, p, w = _multinomial_loss(w, X, Y, alpha, sample_weight)

@agramfort

agramfort Jun 6, 2017

Member

Doesn't _multinomial_loss return float32 if the dtype is float32 for X and y? That would avoid the diff = diff.astype(X.dtype, copy=False) below.

@massich

massich Jun 6, 2017

Contributor

Actually, _multinomial_loss does keep all the types as float32. The problem is that Y in diff = sample_weight * (p - Y) (here) is of type int64, and therefore diff becomes float64.

Y is set (here) as the second parameter of args, based on target, which is in turn set from y_bin or Y_multi. The former is fine, while the latter is determined by the transform() of LabelBinarizer or LabelEncoder. (here, here)

We could convert target to X.dtype as in the saga case (here), in the following manner:

target = target.astype(X.dtype)
args = (X, target, 1. / C, sample_weight)

or we could propagate the change inside fit_transform.

Any thoughts @agramfort?

cc: @Henley13, @GaelVaroquaux, @raghavrv

@agramfort

agramfort Jun 6, 2017

Member

I would convert Y to the dtype at the top.

@TomDLT

TomDLT Jun 6, 2017

Member

We could convert target to X.dtype as in the saga case (here), in the following manner:

target = target.astype(X.dtype)
args = (X, target, 1. / C, sample_weight)

+1

@massich

massich Jun 6, 2017

Contributor

@agramfort
y is already in the correct form.
Are you proposing:

Y_multi = le.fit_transform(y).astype(X.dtype, copy=False)


@agramfort

agramfort Jun 6, 2017

Member

yes.

+ X_64 = np.array(X).astype(np.float64)
+ y_64 = np.array(Y1).astype(np.float64)
+
+ for solver in ['newton-cg']:

@agramfort

agramfort Jun 6, 2017

Member

Not all solvers are safe with a cast to float64? If so, is it documented somewhere?

@massich

massich Jun 6, 2017

Contributor

Not all the solvers are safe with a cast to float32. The idea is to have X_64 and X_32, add an unsafe solver, break the test, and track the failure.

@agramfort

agramfort Jun 6, 2017

Member

Do one solver at a time, but I think it's doable for all solvers.

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 6, 2017

@TomDLT
Member

TomDLT commented Jun 6, 2017

You should add a test with a sparse matrix X

@Henley13
Contributor

Henley13 commented Jun 6, 2017

Hi @TomDLT
By sparse, do you mean something like that?

X_sparse_32 = sp.csr_matrix(X, dtype=np.float32)
y_sparse_32 = sp.csr_matrix(Y1, dtype=np.float32)

    for solver in ['newton-cg']:
        for multi_class in ['ovr', 'multinomial']:

            # Check type consistency
            lr_32 = LogisticRegression(solver=solver, multi_class=multi_class)
            lr_32.fit(X_sparse_32, y_sparse_32)
            assert_equal(lr_32.coef_.dtype, X_sparse_32.dtype)
@TomDLT
Member

TomDLT commented Jun 6, 2017

By sparse, do you mean something like that?

yep

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 6, 2017

@agramfort
Member

agramfort commented Jun 6, 2017

massich and others added some commits Jun 6, 2017

@massich
Contributor

massich commented Jun 6, 2017

Are we missing anything?

@massich massich changed the title from [WIP] LogisticRegression convert to float64 to [MRG] LogisticRegression convert to float64 Jun 6, 2017

+
+ # Check accuracy consistency
+ lr_64 = LogisticRegression(solver=solver, multi_class=multi_class)
+ lr_64.fit(X_64, y_64)

@TomDLT

TomDLT Jun 6, 2017

Member

Please add:

assert_equal(lr_64.coef_.dtype, X_64.dtype)

otherwise this test passes when we convert everything to 32 bits.
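
Putting the review comments together, a hedged sketch of what the final test could look like; the small X / Y1 arrays below are stand-ins for the module-level fixtures of the test file:

import numpy as np
from numpy.testing import assert_allclose
from sklearn.linear_model import LogisticRegression

# Stand-ins for the shared test fixtures; the real test module defines them.
X = [[-1, 0], [0, 1], [1, 1]]
Y1 = [0, 1, 1]

X_32 = np.array(X).astype(np.float32)
y_32 = np.array(Y1).astype(np.float32)
X_64 = np.array(X).astype(np.float64)
y_64 = np.array(Y1).astype(np.float64)

for solver in ['newton-cg']:
    for multi_class in ['ovr', 'multinomial']:
        # Check type consistency: coef_ keeps the dtype of X in both cases.
        lr_32 = LogisticRegression(solver=solver, multi_class=multi_class)
        lr_32.fit(X_32, y_32)
        assert lr_32.coef_.dtype == X_32.dtype

        lr_64 = LogisticRegression(solver=solver, multi_class=multi_class)
        lr_64.fit(X_64, y_64)
        assert lr_64.coef_.dtype == X_64.dtype

        # Check accuracy consistency: 32- and 64-bit fits agree
        # up to roughly float32 precision.
        assert_allclose(lr_64.coef_.astype(np.float32), lr_32.coef_,
                        rtol=1e-4, atol=1e-6)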

@agramfort
Member

agramfort commented Jun 6, 2017

+1 for MRG if travis is happy

@agramfort agramfort changed the title from [MRG] LogisticRegression convert to float64 to [MRG+1] LogisticRegression convert to float64 Jun 6, 2017

@TomDLT TomDLT changed the title from [MRG+1] LogisticRegression convert to float64 to [MRG+2] LogisticRegression convert to float64 Jun 6, 2017

@massich massich changed the title from [MRG+2] LogisticRegression convert to float64 to [MRG+2] LogisticRegression convert to float64 (newton-cg) Jun 6, 2017

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 6, 2017

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 6, 2017

Does anybody have an idea what's wrong with AppVeyor? Cc @ogrisel @lesteve

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 6, 2017

Before we merge, this warrants a whats_new entry.

@agramfort
Member

agramfort commented Jun 6, 2017

appveyor is not happy :(

@jnothman
Member

jnothman commented Jun 7, 2017

@jnothman
Member

jnothman commented Jun 7, 2017

@agramfort
Member

agramfort commented Jun 7, 2017

all green merging

@agramfort agramfort merged commit 39a4658 into scikit-learn:master Jun 7, 2017

3 of 5 checks passed

codecov/patch: No report found to compare against
codecov/project: No report found to compare against
ci/circleci: Your tests passed on CircleCI!
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
@GaelVaroquaux
Member

GaelVaroquaux commented Jun 7, 2017

A whats_new entry would have been good :)

@agramfort
Member

agramfort commented Jun 7, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)
* Add a test to ensure not changing the input's data type

Test that np.float32 input data is not cast to np.float64 when using LR + newton-cg

* [WIP] Force X to remain float32. (self.coef_ remains float64 even if X is not)

* [WIP] ensure self.coef_ same type as X

* keep the np.float32 when multi_class='multinomial'

* Avoid hardcoded type for multinomial

* pass flake8

* Ensure that the results in 32bits are the same as in 64

* Address Gael's comments for multi_class=='ovr'

* Add multi_class=='multinominal' to test

* Add support for multi_class=='multinominal'

* prefer float64 to float32

* Force X and y to have the same type

* Revert "Add support for multi_class=='multinominal'"

This reverts commit 4ac33e8.

* remvert more stuff

* clean up some commmented code

* allow class_weight to take advantage of float32

* Add a test where X.dtype is different of y.dtype

* Address @raghavrv comments

* address the rest of @raghavrv's comments

* Revert class_weight

* Avoid copying if dtype matches

* Address alex comment to the cast from inside _multinomial_loss_grad

* address alex comment

* add sparsity test

* Addressed Tom comment of checking that we keep the 64 aswell

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

dmohns added a commit to dmohns/scikit-learn that referenced this pull request Aug 7, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

NelleV added a commit to NelleV/scikit-learn that referenced this pull request Aug 11, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

AishwaryaRK added a commit to AishwaryaRK/scikit-learn that referenced this pull request Aug 29, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)

jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017

[MRG+2] LogisticRegression convert to float64 (newton-cg) (#8835)