Automatically force_all_finite=False in non-initial warm_start runs #10600

jnothman · 2018-02-07T12:48:33Z

For estimators that have a warm_start option, we generally expect each call to fit to receive the same data as the first. I think we can save a little on runtime by using check_array with force_all_finite=False in the case that warm_start=True and the model has already been fitted.
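To make the proposal concrete, here is a minimal sketch of the idea using plain NumPy. It is not scikit-learn code: `validate` is a hypothetical stand-in for check_array, its `check_finite` flag plays the role of force_all_finite, and `ToyWarmStartEstimator` with its `n_fits_` and `finite_checked_` attributes is invented purely for illustration.

```python
import numpy as np


def validate(X, check_finite=True):
    """Hypothetical stand-in for check_array; check_finite mimics force_all_finite."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("Expected a 2D array")
    if check_finite and not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity")
    return X


class ToyWarmStartEstimator:
    """Hypothetical estimator; only the validation logic matters here."""

    def __init__(self, warm_start=False):
        self.warm_start = warm_start

    def fit(self, X):
        already_fitted = hasattr(self, "n_fits_")
        # Skip the full finiteness scan only when warm-starting a model
        # that has already been fitted once.
        skip_scan = self.warm_start and already_fitted
        X = validate(X, check_finite=not skip_scan)
        if self.warm_start and already_fitted:
            self.n_fits_ += 1
        else:
            self.n_fits_ = 1
        # Recorded only so the behaviour can be inspected from outside.
        self.finite_checked_ = not skip_scan
        return self
```

With `warm_start=True`, the first `fit` still scans the data and later calls do not; with `warm_start=False`, every call scans. The saving rests entirely on the assumption that later calls receive the same data as the first.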


cr458 · 2018-02-08T08:55:00Z

Can I take this?

jnothman · 2018-02-08T09:18:28Z

I'd be interested in seeing if other core devs think we should do this; there may not be consensus. But with that caveat, you could certainly propose a PR. You could test it by sneaking an np.inf into the data before a second fit.
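The suggested test could look roughly like the helper below. This is only a sketch: `second_fit_accepts_inf` and `StrictEveryFit` are hypothetical names, and the stand-in estimator simply rescans the data on every call so that the helper has something to run against.

```python
import numpy as np


class StrictEveryFit:
    """Hypothetical stand-in that rescans for non-finite values on every fit."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        if not np.isfinite(X).all():
            raise ValueError("Input contains NaN or infinity")
        self.fitted_ = True
        return self


def second_fit_accepts_inf(estimator, X):
    """Return True if a second fit tolerates an injected np.inf."""
    X = np.array(X, dtype=float)  # copy so the caller's data is untouched
    estimator.fit(X)              # first fit on clean data must succeed
    X[0, 0] = np.inf              # sneak a non-finite value into the data
    try:
        estimator.fit(X)
    except ValueError:
        return False
    return True
```

An estimator that validates on every call, like the stand-in, makes the helper return `False`. An estimator implementing the proposal with warm_start=True would be expected to return `True`, provided its solver does not itself fail on the infinite value.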

mohamed-ali · 2018-02-26T10:12:25Z

@jnothman, what's the status on this? I can work on it, if @cr458 is not available.

jnothman · 2018-02-26T20:42:36Z

We've still not heard whether other core devs have an opinion, so you can offer a PR, but with no assurance it will ever get merged.


Geoffrey-Paris · 2021-12-28T20:02:44Z

Hello @jnothman

I worked on this issue and submitted a PR for it.
I know this issue is quite old and needs approval from other devs, but here's what I've done so far. I tried to make the change as generic as possible by updating the _validate_data method of the BaseEstimator class, but there are still some other calls to check_array outside the _validate_data method.

amueller · 2022-01-14T20:42:24Z

I think warm_start has inconsistent semantics across modules. In linear models, you can warm-start with different datasets and potentially get a speed benefit but essentially an equivalent solution, while for trees, you have to warm-start on the same dataset. So I don't think we can actually do this consistently, and we should probably clarify our warm-start semantics first.
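The linear-model case can be illustrated with a toy example. The sketch below uses a hand-written gradient descent on least squares, not any scikit-learn solver, and every name in it is hypothetical. It fits on one dataset, then fits a different dataset starting from the earlier coefficients.

```python
import numpy as np


def gd_least_squares(X, y, coef_init, lr=0.1, tol=1e-8, max_iter=10_000):
    """Plain gradient descent on mean squared error; returns (coef, n_iter)."""
    coef = np.array(coef_init, dtype=float)  # copy the starting point
    for n_iter in range(1, max_iter + 1):
        grad = X.T @ (X @ coef - y) / len(y)
        coef -= lr * grad
        if np.linalg.norm(grad) < tol:
            break
    return coef, n_iter


rng = np.random.default_rng(0)
true_coef = np.array([2.0, -1.0])

# Two different datasets drawn from the same underlying model.
X_a = rng.normal(size=(200, 2))
y_a = X_a @ true_coef + 0.1 * rng.normal(size=200)
X_b = rng.normal(size=(200, 2))
y_b = X_b @ true_coef + 0.1 * rng.normal(size=200)

coef_a, _ = gd_least_squares(X_a, y_a, np.zeros(2))

# Fit dataset B from scratch, then again starting from dataset A's solution.
cold, n_cold = gd_least_squares(X_b, y_b, np.zeros(2))
warm, n_warm = gd_least_squares(X_b, y_b, coef_a)

print(n_cold, n_warm, np.allclose(cold, warm, atol=1e-6))
```

Because the problem is convex, both runs on dataset B converge to the same coefficients, and the warm-started run needs fewer iterations. Since the second dataset here is genuinely new, skipping its finiteness check would mean it is never validated at all, which is the inconsistency raised above.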

jnothman added the Easy (Well-defined and straightforward way to resolve), Enhancement, and help wanted labels Feb 7, 2018

Geoffrey-Paris added a commit to Geoffrey-Paris/scikit-learn that referenced this issue Dec 28, 2021

scikit-learn#10600 force_all_finite=False on warmed-up forest estimators

28e6aed

Geoffrey-Paris added a commit to Geoffrey-Paris/scikit-learn that referenced this issue Dec 28, 2021

ENH doesnt check if finite in non-initial warm_start runs for BaseEst…

2ee0040

…imator and GradientBoosting scikit-learn#10600

Geoffrey-Paris added a commit to Geoffrey-Paris/scikit-learn that referenced this issue Dec 28, 2021

ENH Applied scikit-learn#10600 for coordinate descent

9612414

Geoffrey-Paris mentioned this issue Dec 28, 2021

ENH Automatically force_all_finite=False in non-initial warm_start runs for GradientBoosting #22089

Closed

cmarmo removed the help wanted label Jan 4, 2022

cmarmo removed the Easy (Well-defined and straightforward way to resolve) label Jan 16, 2022
