Automatically force_all_finite=False in non-initial warm_start runs #10600

Open
jnothman opened this issue Feb 7, 2018 · 6 comments

Comments

@jnothman
Member

jnothman commented Feb 7, 2018

For estimators that have a warm_start option, we generally expect each call to fit to receive the same data as the first call. I think we can save a little runtime by calling check_array with force_all_finite=False when warm_start=True and the model has already been fitted.
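
A minimal sketch of the proposal, written in plain Python so that it runs without scikit-learn installed. `ToyEstimator` and `validate` are hypothetical stand-ins for a warm-startable estimator and for `check_array`; they are assumptions made for illustration, not the library's code. The only point is that the finiteness scan is skipped on non-initial warm-start fits.

```python
import math


def validate(X, force_all_finite=True):
    """Toy stand-in for check_array: optionally reject NaN and inf."""
    if force_all_finite:
        for row in X:
            for value in row:
                if not math.isfinite(value):
                    raise ValueError("Input contains NaN or infinity.")
    return X


class ToyEstimator:
    """Hypothetical estimator; not a scikit-learn class."""

    def __init__(self, warm_start=False):
        self.warm_start = warm_start
        self.n_fits = 0            # number of completed fit calls
        self.n_finite_checks = 0   # how often the full scan ran

    def fit(self, X):
        already_fitted = self.n_fits > 0
        # Skip the scan only on non-initial warm-start runs.
        check_finite = not (self.warm_start and already_fitted)
        validate(X, force_all_finite=check_finite)
        if check_finite:
            self.n_finite_checks += 1
        self.n_fits += 1
        return self


X = [[0.0, 1.0], [2.0, 3.0]]
est = ToyEstimator(warm_start=True)
est.fit(X).fit(X).fit(X)
print(est.n_fits, est.n_finite_checks)  # prints: 3 1
```

With `warm_start=False` the same three calls would run the scan three times, so the saving is one full pass over the data per repeated fit.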

@jnothman added the Easy (Well-defined and straightforward way to resolve), Enhancement, and help wanted labels on Feb 7, 2018
@cr458

cr458 commented Feb 8, 2018

Can I take this?

@jnothman
Member Author

jnothman commented Feb 8, 2018 via email

@mohamed-ali
Contributor

@jnothman, what's the status on this? I can work on it, if @cr458 is not available.

@jnothman
Member Author

jnothman commented Feb 26, 2018 via email

@Geoffrey-Paris
Contributor

Hello @jnothman

I worked on this issue and submitted a PR for it.
I know this issue is quite old and the change needs approval from other devs, but here's what I've done so far. I tried to make the change as generic as possible by updating the _validate_data method of the BaseEstimator class, but there are still other calls to check_array outside _validate_data.

@cmarmo removed the help wanted label on Jan 4, 2022
@amueller
Member

I think warm_start has inconsistent semantics across modules. In linear models, you can warm-start with a different dataset and potentially get a speed benefit while reaching an essentially equivalent solution, whereas for trees you have to warm-start on the same dataset. So I don't think we can do this consistently; we should probably clarify our warm-start semantics first.
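
To make the concern concrete, here is a small plain-Python sketch. `has_non_finite` and `fit_with_skipped_check` are hypothetical helpers, not scikit-learn functions. It assumes an estimator that skips the finiteness check on non-initial warm-start fits, and shows that a second, different dataset containing NaN would then be accepted silently.

```python
import math


def has_non_finite(X):
    """Return True if any value in the nested list X is NaN or infinite."""
    return any(not math.isfinite(v) for row in X for v in row)


def fit_with_skipped_check(state, X, warm_start=True):
    """Hypothetical fit step: validate only on cold or first fits."""
    skip_check = warm_start and state["fitted"]
    if not skip_check and has_non_finite(X):
        raise ValueError("Input contains NaN or infinity.")
    state["fitted"] = True
    state["last_X"] = X
    return state


state = {"fitted": False, "last_X": None}
X_first = [[0.0, 1.0], [2.0, 3.0]]
X_second = [[0.0, float("nan")], [2.0, 3.0]]  # different data, contains NaN

fit_with_skipped_check(state, X_first)   # validated on the first fit
fit_with_skipped_check(state, X_second)  # check skipped: NaN slips through
print(has_non_finite(state["last_X"]))   # prints: True
```

For an estimator that only ever sees the same data on later fits, the skipped check loses nothing; the risk arises where warm-starting on new data is a supported use.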

@cmarmo removed the Easy (Well-defined and straightforward way to resolve) label on Jan 16, 2022