Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] iforest chunks for score_samples #13283

Merged
merged 6 commits into from Mar 18, 2019

Conversation

@ngoix
Copy link
Contributor

commented Feb 26, 2019

fixes #12040

is currently based on top of #13260 (so do not review yet)
needs to be rebased on #13251

@agramfort

This comment has been minimized.

Copy link
Member

commented Mar 2, 2019

@ngoix please put the PR title to MRG when ready on your side

ENH chunk data - iforest score_samples
tests

fix flake8

fix ci

@ngoix ngoix force-pushed the ngoix:iforest_chunk branch from 3aac238 to 40a5a4b Mar 2, 2019

@@ -325,3 +326,26 @@ def test_behaviour_param():
clf2 = IsolationForest(behaviour='new', contamination='auto').fit(X_train)
assert_array_equal(clf1.decision_function([[2., 2.]]),
clf2.decision_function([[2., 2.]]))


# mock get_chunk_n_rows to actually test more than one chunk (here one

This comment has been minimized.

Copy link
@ngoix

ngoix Mar 2, 2019

Author Contributor

any idea how to merge these 2 tests ? they are the same, just with a different mocked chunk size

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 6, 2019

Member

Use pytest's monkeypatch fixture rather than unittest.mock alone?

But for this to be a reasonable test, you also need to test that the Mock has been called (or does patch do that?)

@ngoix ngoix changed the title [WIP] iforest chunks for score_samples [MRG] iforest chunks for score_samples Mar 2, 2019

@ngoix

This comment has been minimized.

Copy link
Contributor Author

commented Mar 4, 2019

@agramfort this is ready for reviews

@agramfort
Copy link
Member

left a comment

Update what’s new needed

@jnothman
Copy link
Member

left a comment

Mostly looking good

" be removed in 0.22.", DeprecationWarning)
return self._threshold_

def _compute_chunked_score_samples(self, X, working_memory=None):

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 6, 2019

Member

If not for testing, why do we need the working_melody parameter here?

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 17, 2019

Member

Why do we need the working_memory parameter here?

This comment has been minimized.

Copy link
@ngoix

ngoix Mar 18, 2019

Author Contributor

it is not useful indeed, thanks

@@ -325,3 +326,26 @@ def test_behaviour_param():
clf2 = IsolationForest(behaviour='new', contamination='auto').fit(X_train)
assert_array_equal(clf1.decision_function([[2., 2.]]),
clf2.decision_function([[2., 2.]]))


# mock get_chunk_n_rows to actually test more than one chunk (here one

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 6, 2019

Member

Use pytest's monkeypatch fixture rather than unittest.mock alone?

But for this to be a reasonable test, you also need to test that the Mock has been called (or does patch do that?)

@jnothman

This comment has been minimized.

Copy link
Member

commented Mar 11, 2019

Tests failing

ngoix added 3 commits Mar 12, 2019
@jnothman
Copy link
Member

left a comment

Otherwise lgtm

" be removed in 0.22.", DeprecationWarning)
return self._threshold_

def _compute_chunked_score_samples(self, X, working_memory=None):

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 17, 2019

Member

Why do we need the working_memory parameter here?

@@ -144,6 +144,9 @@ Support for Python 3.4 and below has been officially dropped.
by avoiding keeping in memory each tree prediction. :issue:`13260` by
`Nicolas Goix`_.

- |Efficiency| :class:`ensemble.IsolationForest` now uses chunks of data at
prediction step. :issue:`13283` by `Nicolas Goix`_.

This comment has been minimized.

Copy link
@jnothman

jnothman Mar 17, 2019

Member

Perhaps say something like "capping the memory usage"

@jnothman jnothman merged commit c22b871 into scikit-learn:master Mar 18, 2019

16 checks passed

LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python 1 fixed alert
Details
ci/circleci: deploy Your tests passed on CircleCI!
Details
ci/circleci: doc Your tests passed on CircleCI!
Details
ci/circleci: doc-min-dependencies Your tests passed on CircleCI!
Details
ci/circleci: lint Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 92.37%)
Details
codecov/project 92.38% (+0.01%) compared to 12705bb
Details
scikit-learn.scikit-learn Build #20190318.8 succeeded
Details
scikit-learn.scikit-learn (Linux py35_conda_openblas) Linux py35_conda_openblas succeeded
Details
scikit-learn.scikit-learn (Linux py35_np_atlas) Linux py35_np_atlas succeeded
Details
scikit-learn.scikit-learn (Linux pylatest_conda) Linux pylatest_conda succeeded
Details
scikit-learn.scikit-learn (Windows py35_32) Windows py35_32 succeeded
Details
scikit-learn.scikit-learn (Windows py37_64) Windows py37_64 succeeded
Details
scikit-learn.scikit-learn (macOS pylatest_conda) macOS pylatest_conda succeeded
Details
@jnothman

This comment has been minimized.

Copy link
Member

commented Mar 18, 2019

THanks @ngoix!

xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
xhlulu added a commit to xhlulu/scikit-learn that referenced this pull request Apr 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.