
[MRG+2] Fix #6420 Cloning decision tree estimators breaks criterion objects #7680

Merged
merged 1 commit into scikit-learn:master from olologin:6420_dt_criterion on Oct 19, 2016

Conversation

olologin commented Oct 16, 2016

Reference Issue

Fixes #6420

What does this implement/fix? Explain your changes.

This fix changes the __reduce__ methods of the criterion objects, which are used for pickling/cloning.

There was a memory-management problem: the function sizet_ptr_to_ndarray from tree/_utils.pyx returned an np.array object that did not own its underlying memory. In some cases we call __reduce__ on a criterion object; __reduce__ returns the info needed for pickling/cloning, including the n_classes array (for ClassificationCriterion), and then the Criterion object gets deallocated. If we read the info returned by __reduce__ after that, n_classes is filled with garbage:

from sklearn.tree import tree
import copy
import numpy as np

gini = tree._criterion.ClassificationCriterion(n_outputs=1, n_classes=np.array([2]))
gini.__reduce__()
copy.deepcopy(gini).__reduce__()
    Out[1]: (sklearn.tree._criterion.ClassificationCriterion, (1, array([4314681712])), {})

I also added __reduce__ methods to the classes which didn't have them, because otherwise the class of an object changes to a more general one after unpickling/cloning. For example, after cloning a Gini criterion we would get a ClassificationCriterion:

from sklearn.tree import tree
import copy
import numpy as np

gini = tree._criterion.Gini(n_outputs=1, n_classes=np.array([2]))
gini_copy = copy.deepcopy(gini)
gini_copy.__class__
    Out[1]: sklearn.tree._criterion.ClassificationCriterion

Any other comments?

Make suggestions

olologin commented Oct 16, 2016

@amueller, @jnothman

Hmm, travis-ci fails with:

ValueError: sklearn.tree._criterion.Criterion has the wrong size, try recompiling. Expected 160, got 152
ImportError: cannot import name _tree

I saw this error on my local machine (Python 2.7, Ubuntu 16.04), but it disappeared after make clean. Can we do something similar on the server? It looks like a cache-related problem to me.

nelson-liu commented Oct 17, 2016

@olologin , yup that's an issue with the travis cache. maybe time to look into #7094 again?

@olologin olologin changed the title [WIP] Fix #6420 Cloning decision tree estimators breaks criterion objects [MRG] Fix #6420 Cloning decision tree estimators breaks criterion objects Oct 17, 2016

raghavrv commented Oct 17, 2016

Wait, I can reset the cache for you. Thanks for working on this; I'll review in a while.

@amueller amueller added this to the 0.18.1 milestone Oct 17, 2016

amueller commented Oct 17, 2016

Maybe @jmschrei also wants to check it out, or @glouppe ?

amueller commented Oct 17, 2016

please add a test!

         free(self.n_classes)

     def __reduce__(self):
         return (ClassificationCriterion,
                 (self.n_outputs,
-                 sizet_ptr_to_ndarray(self.n_classes, self.n_outputs)),
+                 sizet_ptr_to_ndarray(self.n_classes, self.n_outputs).copy()),

raghavrv Oct 17, 2016

Why do you make a copy?

olologin Oct 18, 2016

Because the numpy array returned by

sizet_ptr_to_ndarray(self.n_classes, self.n_outputs)

doesn't own its underlying memory (I've added a comment in sizet_ptr_to_ndarray); it just holds a pointer to the values stored in the C array self.n_classes. If you get this tuple from __reduce__ and the object it came from is then deallocated, you can't safely read the tuple anymore (you can, actually, but self.n_classes has been freed, so the returned numpy array shows garbage, because it still dereferences the C pointer it was given). The copy method makes a deep copy, and the resulting array owns its memory.

I could alternatively create arr = np.zeros(self.n_outputs) and copy all the values into it in a loop. But the sizet_ptr_to_ndarray().copy() approach already existed in _tree.pyx before my change, so I thought it would be more consistent to use it here too.
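
The ownership distinction described above can be demonstrated in pure NumPy (a minimal sketch, not the actual Cython code): np.frombuffer builds an array that merely views an external buffer, while .copy() produces one that owns its data and is unaffected by what later happens to the buffer.

```python
import numpy as np

# A mutable bytearray stands in for the C array self.n_classes.
buf = bytearray(np.array([2, 3, 4], dtype=np.intp).tobytes())

view = np.frombuffer(buf, dtype=np.intp)  # wraps buf, no copy made
owned = view.copy()                       # allocates fresh memory

print(view.flags["OWNDATA"], owned.flags["OWNDATA"])  # False True

# Mutating the buffer (standing in for the freed-and-reused C array)
# shows through the view but not through the copy.
buf[0] = 9
print(view[0], owned[0])  # 9 2
```

The garbage values in the bug report are exactly what `view` shows once the buffer it wraps is gone.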

@@ -596,6 +602,12 @@ cdef class Gini(ClassificationCriterion):
            = 1 - \sum_{k=0}^{K-1} count_k ** 2
     """

+    def __reduce__(self):

jnothman Oct 18, 2016

Why did we not need these before? Is it not enough to use type(self) as the first member of the tuple in {Classification,Regression}Criterion?

olologin Oct 18, 2016

Hmm, you're right, it's enough to use type(self) in {Classification,Regression}Criterion.
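
The type(self) idiom suggested here can be sketched in plain Python (hypothetical Criterion/Gini classes, not the real Cython ones): because __reduce__ reports the dynamic type, every subclass reconstructs as itself without needing its own __reduce__.

```python
import copy
import pickle

class Criterion:
    def __init__(self, n_outputs):
        self.n_outputs = n_outputs

    def __reduce__(self):
        # type(self), not a hard-coded Criterion, so subclasses
        # round-trip through pickle/copy as their own class.
        return (type(self), (self.n_outputs,))

class Gini(Criterion):
    pass

g = Gini(1)
print(type(pickle.loads(pickle.dumps(g))).__name__)  # Gini
print(type(copy.deepcopy(g)).__name__)               # Gini
```

Hard-coding the base class in the returned tuple is precisely what collapsed Gini into ClassificationCriterion in the original report.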

@@ -62,7 +62,10 @@ cdef inline UINT32_t our_rand_r(UINT32_t* seed) nogil:


 cdef inline np.ndarray sizet_ptr_to_ndarray(SIZE_t* data, SIZE_t size):
-    """Encapsulate data into a 1D numpy array of intp's."""
+    """Encapsulate data into a 1D numpy array of intp's.
+
+    You should ensure that the provided data pointer is not freed while the returned array
jnothman Oct 18, 2016

It seems like everywhere this is used it is now used with a copy. I propose we just do the copying here. WDYT? @raghavrv?

jnothman commented Oct 18, 2016

Thanks for both your great investigation and patch!

jnothman left a comment

I'm not sure if we want a separate non-regression test for clone... And apart from these nitpicks and making sizet_ptr_to_ndarray do the copying, this LGTM

@@ -1609,3 +1612,22 @@ def test_mae():
    dt_mae.fit([[3],[5],[3],[8],[5]],[6,7,3,4,3], [0.6,0.3,0.1,1.0,0.3])
    assert_array_equal(dt_mae.tree_.impurity, [7.0/2.3, 3.0/0.7, 4.0/1.6])
    assert_array_equal(dt_mae.tree_.value.flat, [4.0, 6.0, 4.0])

def test_criterion_copy():

jnothman Oct 18, 2016

PEP8: extra blank line, please

    n_classes = np.arange(3)
    for _, typename in CRITERIA_CLF.items():
        criteria = typename(n_outputs, n_classes)
        typename_, (n_outputs_, n_classes_), _ = copy(criteria).__reduce__()

jnothman Oct 18, 2016

Please check deepcopy too, just to be sure

typename_, (n_outputs_, n_samples_), _ = copy(criteria).__reduce__()
assert_equal(typename, typename_)
assert_equal(n_outputs, n_outputs_)
assert_equal(n_samples, n_samples_)

jnothman Oct 18, 2016

pedantic: should have newline at end of file

@olologin olologin force-pushed the olologin:6420_dt_criterion branch 4 times, most recently from 0b7992c to 04a5437 Oct 18, 2016

@@ -65,7 +65,7 @@ cdef inline np.ndarray sizet_ptr_to_ndarray(SIZE_t* data, SIZE_t size):
     """Encapsulate data into a 1D numpy array of intp's."""
     cdef np.npy_intp shape[1]
     shape[0] = <np.npy_intp> size
-    return np.PyArray_SimpleNewFromData(1, shape, np.NPY_INTP, data)
+    return np.PyArray_SimpleNewFromData(1, shape, np.NPY_INTP, data).copy()

raghavrv Oct 19, 2016

@jnothman Thanks for suggesting this change. Indeed this is better.

@olologin Could you also remove the copy from this line, and update the comment there to note that the function now returns a copy and does not refer to the passed pointer...

Additionally, it would be nice to change the docstring of this helper to note that we return a copy of the numpy array, rather than simply 'Encapsulate data' as it currently says...

olologin Oct 19, 2016

Done.

    for copy_func in [copy.copy, copy.deepcopy]:
        for _, typename in CRITERIA_CLF.items():
            criteria = typename(n_outputs, n_classes)
            result = copy_func(criteria).__reduce__()

raghavrv Oct 19, 2016

Rather could you pickle, load all the Criteria?

Currently this fails in master...

import pickle
from sklearn.tree._criterion import FriedmanMSE

fmse = FriedmanMSE(1, 10)
pickle.loads(pickle.dumps(fmse))

olologin Oct 19, 2016

I've added pickling to this list.
Pickle and copy can behave differently in general, but for the criteria in particular they use the same __reduce__ method.
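
That pickle and copy converge on the same hook can be checked directly (a toy class, not the sklearn criteria): when no __copy__/__deepcopy__ methods are defined, copy.copy and copy.deepcopy fall back to the __reduce_ex__/__reduce__ protocol, just as pickle does.

```python
import copy
import pickle

calls = []

class Tracked:
    def __init__(self, x=0):
        self.x = x

    def __reduce__(self):
        calls.append("reduce")  # record each time the hook fires
        return (Tracked, (self.x,))

t = Tracked(5)
pickle.dumps(t)
copy.copy(t)
copy.deepcopy(t)
print(calls)  # the hook fired once per operation: three entries
```

This is why fixing __reduce__ repairs cloning and pickling in one stroke.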

from sklearn import datasets

from sklearn.utils import compute_sample_weight

 CLF_CRITERIONS = ("gini", "entropy")
-REG_CRITERIONS = ("mse", "mae")
+REG_CRITERIONS = ("mse", "mae", "friedman_mse")

raghavrv Oct 19, 2016

Oh boy wasn't this tested at all ;(

jnothman commented Oct 19, 2016

LGTM, thanks!

@jnothman jnothman changed the title [MRG] Fix #6420 Cloning decision tree estimators breaks criterion objects [MRG+1] Fix #6420 Cloning decision tree estimators breaks criterion objects Oct 19, 2016

@olologin olologin force-pushed the olologin:6420_dt_criterion branch 2 times, most recently from ac24db4 to bc3c530 Oct 19, 2016

olologin commented Oct 19, 2016

Build on one of travis-ci configurations fails with:

build_tools/travis/test_script.sh: line 23: 4113 Segmentation fault (core dumped) nosetests -s --with-timer --timer-top-n 20 sklearn

I wonder if it's related to my changes. I didn't see this error on travis-ci after the first commits; it appeared only when I changed the new-array creation to .copy(). Is it a coincidence? If not, I think I should rewrite sizet_ptr_to_ndarray so that it just copies all values from the C array into a freshly created ndarray, avoiding the fancy PyArray_SimpleNewFromData machinery.

jnothman commented Oct 19, 2016

Yes, I was a little uncertain about the .copy approach. Perhaps it's not so reliable, despite my LGTM. Use np.frombuffer?

raghavrv left a comment

I've cleared the cache for this PR and restarted the travis build. I don't think the current changes cause the failure.

@@ -542,13 +542,12 @@ cdef class Tree:
reaching node i.
"""
     # Wrap for outside world.
-    # WARNING: these reference the current `nodes` and `value` buffers, which
+    # WARNING: these copy the current `nodes` and `value` buffers, which

raghavrv Oct 19, 2016

No they still reference. Only the n_classes is copied. This comment should be left intact :)

jnothman commented Oct 19, 2016

I don't think the current changes cause the failure.

I suspect they do :)

raghavrv commented Oct 19, 2016

I pulled this PR and pushed it to my fork; Travis passes smoothly there. The restart-build option in Travis doesn't clear the cache before retrying, which is why this PR still fails after I cleared its cache :)

I think @olologin needs to force push to rebuild this PR without any cache side effects...

jnothman commented Oct 19, 2016

Oh, okay. You're probably right.


olologin commented Oct 19, 2016

@raghavrv mystical behaviour of Travis :), it fails with the same problems here again.

@olologin olologin force-pushed the olologin:6420_dt_criterion branch from cdb6133 to a895cd4 Oct 19, 2016

raghavrv commented Oct 19, 2016

Argh it now reuses the old travis cache. Apologies for putting you through this... :/ As a final attempt, could you squash all commits into one and force push? (I've cleared your cache again)

nelson-liu commented Oct 19, 2016

What usually worked for me is to push trivial commits to the files that weren't being recythonized (a comment will do), let them pass and get cached, then revert those changes.

@olologin olologin force-pushed the olologin:6420_dt_criterion branch from a895cd4 to da5a85b Oct 19, 2016

raghavrv commented Oct 19, 2016

I was looking for an automated way to handle all such future failures and came up with this:

curl -Ls https://goo.gl/jhGwkV | git apply -v --index; git commit -m "NOMERGE recythonize all";

nelson-liu commented Oct 19, 2016

I think you also need to recythonize _splitter, no?

@olologin olologin force-pushed the olologin:6420_dt_criterion branch from b41f9e3 to a5fe72e Oct 19, 2016

olologin commented Oct 19, 2016

@nelson-liu, I didn't change that file, so I don't think it needs recompilation. I'll try the approach proposed by @raghavrv now.

nelson-liu commented Oct 19, 2016

Hmm, the errors look like you should have to, but I just always recythonized all the tree files whenever I got the error, for good measure.

@@ -56,6 +56,10 @@ Bug fixes
<https://github.com/scikit-learn/scikit-learn/pull/6178>`_) by `Bertrand
Thirion`_

- Tree splitting criterion classes cloning/pickling are now memory safe

nelson-liu Oct 19, 2016

classes -> classes'?

olologin Oct 19, 2016

You're right, thanks!

@olologin olologin force-pushed the olologin:6420_dt_criterion branch from a5fe72e to 98eaad3 Oct 19, 2016

raghavrv commented Oct 19, 2016

Finally travis passes :D After #7708, one should be able to add [ci recythonize] and force a recythonizing of all the cython sources. That should save all of us some trouble in the future...

@raghavrv raghavrv changed the title [MRG+1] Fix #6420 Cloning decision tree estimators breaks criterion objects [MRG+2] Fix #6420 Cloning decision tree estimators breaks criterion objects Oct 19, 2016

raghavrv commented Oct 19, 2016

@jnothman please merge if happy with the changes...

jnothman commented Oct 19, 2016

Thanks again, @olologin

@jnothman jnothman merged commit 74e4c42 into scikit-learn:master Oct 19, 2016

3 checks passed: ci/circleci (tests passed on CircleCI), continuous-integration/appveyor/pr (AppVeyor build succeeded), continuous-integration/travis-ci/pr (Travis CI build passed)

amueller added a commit to amueller/scikit-learn that referenced this pull request Oct 25, 2016

amueller added a commit to amueller/scikit-learn that referenced this pull request Nov 9, 2016

afiodorov added a commit to unravelin/scikit-learn that referenced this pull request Apr 25, 2017

Sundrique added a commit to Sundrique/scikit-learn that referenced this pull request Jun 14, 2017

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017

Merge tag '0.18.1' into releases
* tag '0.18.1': (144 commits)
  skip tree-test on 32bit
  do the warning test as we do it in other places.
  Replase assert_equal by assert_almost_equal in cosine test
  version bump 0.18.1
  fix merge conflict mess in whatsnew
  add the python2.6 warning to 0.18.1
  fix learning_curve test that I messed up in cherry-picking the "reentrant cv" PR.
  sync whatsnew with master
  [MRG] TST Ensure __dict__ is unmodified by predict, transform, etc (scikit-learn#7553)
  FIX scikit-learn#6420: Cloning decision tree estimators breaks criterion objects (scikit-learn#7680)
  Add whats new entry for scikit-learn#6282 (scikit-learn#7629)
  [MGR + 2] fix selectFdr bug (scikit-learn#7490)
  fixed whatsnew cherry-pick mess (somewhat)
  [MRG + 2] FIX LogisticRegressionCV to correctly handle string labels (scikit-learn#5874)
  [MRG + 2] Fixed parameter setting in SelectFromModel (scikit-learn#7764)
  [MRG+2] DOC adding separate `fit()` methods (and docstrings) for DecisionTreeClassifier and DecisionTreeRegressor (scikit-learn#7824)
  Fix docstring typo (scikit-learn#7844) n_features --> n_components
  [MRG + 1] DOC adding :user: role to whats_new (scikit-learn#7818)
  [MRG+1] label binarizer not used consistently in CalibratedClassifierCV (scikit-learn#7799)
  DOC : fix docstring of AIC/BIC in GMM
  ...

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017

Merge branch 'releases' into dfsg
* releases: (144 commits)

 Conflicts:
	sklearn/externals/joblib/__init__.py
	sklearn/externals/joblib/_parallel_backends.py
	sklearn/externals/joblib/testing.py

yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Jul 27, 2017

Merge branch 'dfsg' into debian
* dfsg: (144 commits)

paulha added a commit to paulha/scikit-learn that referenced this pull request Aug 19, 2017

maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017

MLopez-Ibanez pushed a commit to MLopez-Ibanez/scikit-learn that referenced this pull request Feb 9, 2019

MLopez-Ibanez pushed a commit to MLopez-Ibanez/scikit-learn that referenced this pull request Feb 9, 2019
