[MRG] Support for infinite values in GBDTs #14406

Merged
merged 3 commits into from Jul 19, 2019

@NicolasHug
Contributor

commented Jul 18, 2019

ping @ogrisel @adrinjalali

I think we need this merged before the missing values support :)

# This is not strictly True, but it's needed since
# force_all_finite=False means accept both nans and infinite values.
# Without the tag, common checks would fail.
# This comment must be removed once we merge PR 13911

@adrinjalali

adrinjalali Jul 18, 2019

Member

Maybe add a "TODO", we sometimes go through them and it'll be easier to find it then. But if you're gonna fix it yourself, then no big deal.

@adrinjalali

Member

commented Jul 18, 2019

ping when tests pass?

@NicolasHug

Contributor Author

commented Jul 18, 2019

ping @adrinjalali They pass ^^ it's a docker issue

@ogrisel
Member

left a comment

LGTM. Just a quick comment to make the atol in a test easier to understand, but no big deal. Feel free to merge without addressing it if you don't like my suggestion :)


gbdt = HistGradientBoostingRegressor(min_samples_leaf=1)
gbdt.fit(X, y)
np.testing.assert_allclose(gbdt.predict(X), y, atol=1e-4)

@ogrisel

ogrisel Jul 19, 2019

Member

Why such a high value for atol? Maybe max_iter is too small for the default value of the learning rate? Maybe you could set the learning rate to 1.0, and a single split in a single tree (max_iter=1, max_leaf_nodes=2) would be enough to perfectly fit the data?
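
The arithmetic behind this question can be sketched in plain Python. This is only a simplified model of least-squares boosting, not scikit-learn's implementation: it assumes every tree puts samples that share a target value in the same leaf, and the four-sample dataset and the helper `max_abs_error` are hypothetical, since the test data is not shown in this thread. The values `learning_rate=0.1` and `max_iter=100` are the estimator's documented defaults.

```python
import math

# Hypothetical toy data (not taken from the PR): one feature with infinite
# values at both ends and a target that flips between x = 0 and x = 1.
X = [-math.inf, 0.0, 1.0, math.inf]
y = [0.0, 0.0, 1.0, 1.0]

# A single split at a finite threshold routes infinite values like any other
# value: -inf goes left, +inf goes right.
threshold = 0.5
left = [t for x, t in zip(X, y) if x <= threshold]
right = [t for x, t in zip(X, y) if x > threshold]


def max_abs_error(targets, learning_rate, max_iter):
    """Largest absolute error after boosting with squared-error loss.

    Simplifying assumption: each tree isolates every distinct target value,
    so a leaf's value is exactly the current residual of its samples.
    """
    baseline = sum(targets) / len(targets)  # initial prediction is the mean
    preds = [baseline] * len(targets)
    for _ in range(max_iter):
        # Each iteration removes a fraction `learning_rate` of the residual.
        preds = [p + learning_rate * (t - p) for p, t in zip(preds, targets)]
    return max(abs(t - p) for p, t in zip(preds, targets))


# Defaults: the residual shrinks by a factor 0.9 per iteration, so it never
# reaches zero (about 0.5 * 0.9**100, i.e. roughly 1.3e-5).
default_error = max_abs_error(y, learning_rate=0.1, max_iter=100)

# Suggested setting: one tree with learning_rate=1.0 removes the residual.
single_tree_error = max_abs_error(y, learning_rate=1.0, max_iter=1)

print(left, right)
print(default_error, single_tree_error)
```

Under these assumptions the default settings leave an error that is small but well above `assert_allclose`'s default tolerance for targets equal to zero, which would explain the loose `atol`, while the single-tree setting leaves none.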

@ogrisel

Member

commented Jul 19, 2019

I launched a rebuild of azure and circle as the failures did not look related to this PR.

@ogrisel

Member

commented Jul 19, 2019

The tests pass. Let's merge, we can always improve the test later :)

@ogrisel ogrisel merged commit dd78658 into scikit-learn:master Jul 19, 2019

15 of 17 checks passed

LGTM analysis: C/C++ No code changes detected
LGTM analysis: JavaScript No code changes detected
LGTM analysis: Python No new or fixed alerts
ci/circleci: deploy Your tests passed on CircleCI!
ci/circleci: doc Your tests passed on CircleCI!
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
ci/circleci: lint Your tests passed on CircleCI!
codecov/patch 100% of diff hit (target 96.87%)
codecov/project Absolute coverage decreased by -<.01% but relative coverage increased by +3.12% compared to 5d95b7a
scikit-learn.scikit-learn Build #20190719.35 succeeded
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
scikit-learn.scikit-learn (Linux py35_ubuntu_atlas) Linux py35_ubuntu_atlas succeeded
scikit-learn.scikit-learn (Linux pylatest_conda_mkl_pandas) Linux pylatest_conda_mkl_pandas succeeded
scikit-learn.scikit-learn (Linux32 py35_ubuntu_atlas_32bit) Linux32 py35_ubuntu_atlas_32bit succeeded
scikit-learn.scikit-learn (Windows py35_pip_openblas_32bit) Windows py35_pip_openblas_32bit succeeded
scikit-learn.scikit-learn (Windows py37_conda_mkl) Windows py37_conda_mkl succeeded
scikit-learn.scikit-learn (macOS pylatest_conda_mkl) macOS pylatest_conda_mkl succeeded