
BaseEstimator.get_params and clone are not thread-safe #2755

Closed
ogrisel opened this issue Jan 15, 2014 · 20 comments

@ogrisel
Member

ogrisel commented Jan 15, 2014

The handling of deprecated constructor parameters leverages the Python warnings machinery, which is not thread-safe and could therefore cause hard-to-diagnose bugs in concurrent code.

There was a tentative lock-based workaround in #2729. A better solution however would avoid executing the warning machinery at all in get_params by leveraging declarative deprecation introspection possibly using a class decorator.
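To illustrate the declarative idea, here is a minimal sketch of what such a class decorator might look like. Everything here is hypothetical (the decorator name, the `_deprecated_params` attribute, and the toy estimator are illustrations, not scikit-learn API): the point is that `get_params` can consult a plain class attribute instead of pushing and popping warning filters.

```python
def deprecated_params(**renames):
    # Hypothetical class decorator: record deprecated constructor
    # parameters declaratively (old_name -> new_name) so introspection
    # can use plain attribute lookup instead of triggering the
    # (non-thread-safe) warnings machinery.
    def decorate(cls):
        cls._deprecated_params = dict(renames)
        return cls
    return decorate

@deprecated_params(n_iter='max_iter')
class MyEstimator:
    def __init__(self, max_iter=100):
        self.max_iter = max_iter

    def get_params(self, deep=False):
        # No warning filters are pushed or popped here; deprecated
        # names are simply skipped via the declarative mapping.
        skip = getattr(type(self), '_deprecated_params', {})
        return {k: v for k, v in vars(self).items() if k not in skip}
```

With this convention, emitting the actual `DeprecationWarning` can be deferred to `__init__` (a single-threaded code path at construction time), while `get_params` stays side-effect-free.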

I will try to issue a PR for this next week.

@ghost ghost assigned ogrisel Jan 15, 2014
@GaelVaroquaux
Member

What's the status on this?

@ogrisel
Member Author

ogrisel commented Feb 26, 2014

It is still not addressed, although a solution based on a class decorator to manage deprecated constructor params declaratively should be possible.

@stuarteberg

So sorry for the spam, but I just want to +1 this issue. My app calls sklearn (specifically, sklearn.ensemble.RandomForestRegressor.fit()) from multiple threads, and one of my users is running into this snag. It's a shame the Python warnings module isn't thread-safe.

@jcrist

jcrist commented Feb 17, 2017

This just bit me as well when parallelizing scikit-learn with dask. I was able to get around it by not explicitly calling clone in each thread, but there's still the possibility of estimators calling get_params internally (which may lead to the same issue).
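The workaround described here can be sketched as follows: perform all cloning sequentially in the main thread, so the non-thread-safe introspection never runs concurrently, and hand each worker a ready-made clone. The `TinyEstimator` class and `clone_estimator` helper below are stdlib stand-ins for a scikit-learn estimator and `sklearn.base.clone`, used only to keep the sketch self-contained.

```python
import copy
from multiprocessing.pool import ThreadPool

class TinyEstimator:
    # Stand-in for a scikit-learn estimator (hypothetical).
    def __init__(self, alpha=1.0):
        self.alpha = alpha

def clone_estimator(est):
    # Stand-in for sklearn.base.clone: a fresh, unfitted copy.
    return copy.deepcopy(est)

base = TinyEstimator(alpha=0.5)

# Clone sequentially in the main thread, so only one thread at a time
# ever touches the clone/get_params code path.
clones = [clone_estimator(base) for _ in range(8)]

def work(est):
    # Each worker receives its own pre-made clone and never calls
    # clone()/get_params() itself.
    return est.alpha * 2

with ThreadPool(4) as pool:
    results = pool.map(work, clones)
```

As noted above, this only helps if the worker body itself avoids further `clone`/`get_params` calls; estimators that clone internally can still hit the race.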

@jnothman
Member

Yes. And I'm not altogether sure how a neat implementation is possible with the class decoration suggested by @ogrisel.

@ogrisel
Member Author

ogrisel commented Feb 21, 2017

I am sure I had a clear plan in my head at the time but I forgot...

BTW could it be the case that this is actually the cause of #8410?

@stuarteberg

BTW could it be the case that this is actually the cause of #8410?

I don't think so. That issue complains about incorrect results, whereas the issue here is that the warnings module isn't thread-safe, leading to tracebacks such as this:

  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\ensemble\forest.py", line 302, in fit
    for i in range(n_jobs))
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\externals\joblib\parallel.py", line 517, in __call__
    self.dispatch(function, args, kwargs)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\externals\joblib\parallel.py", line 312, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\externals\joblib\parallel.py", line 136, in __init__
    self.results = func(*args, **kwargs)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\ensemble\forest.py", line 82, in _parallel_build_trees
    tree = forest._make_estimator(append=False)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\ensemble\base.py", line 57, in _make_estimator
    estimator = clone(self.base_estimator)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\base.py", line 49, in clone
    params_set = new_object.get_params(deep=False)
  File "C:\Program Files\ilastik-1.1.6\python\lib\site-packages\sklearn\base.py", line 218, in get_params
    warnings.filters.pop(0)
IndexError: pop from empty list

@lesteve
Member

lesteve commented Feb 22, 2017

@stuarteberg it would be great if you can post a stand-alone snippet reproducing the problem leading to your traceback.

@stuarteberg

it would be great if you can post a stand-alone snippet reproducing the problem leading to your traceback.

For reasons I don't understand, the only users of my software to have seen this are Windows users. I don't have a Windows machine, so it's not easy for me to generate a reduced test case.

IIUC, the problem should be reproducible by simply calling RandomForestRegressor(n_estimators=10).fit() on a Windows machine.

@wiseman

wiseman commented Apr 27, 2017

I run into this issue about half the time I run a particular script:

Traceback (most recent call last):
  File "predict.py", line 174, in <module>
    lemapp.App().run()
  File "/anaconda/lib/python3.5/site-packages/lemapp.py", line 131, in run
    self.main(argv)
  File "predict.py", line 163, in main
    score = k_fold_eval(10, training_data)
  File "predict.py", line 155, in k_fold_eval
    scores = pmap(foo, kf.split(data))
  File "predict.py", line 35, in pmap
    return pool.map(fn, args)
  File "/anaconda/lib/python3.5/multiprocessing/pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/anaconda/lib/python3.5/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/anaconda/lib/python3.5/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/anaconda/lib/python3.5/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "predict.py", line 149, in foo
    vectorizer, forest = train(data.iloc[train_indices])
  File "predict.py", line 110, in train
    forest = forest.fit(train_data_features, training_data['sentiment'])
  File "/anaconda/lib/python3.5/site-packages/sklearn/ensemble/forest.py", line 314, in fit
    random_state=random_state)
  File "/anaconda/lib/python3.5/site-packages/sklearn/ensemble/base.py", line 119, in _make_estimator
    estimator = clone(self.base_estimator_)
  File "/anaconda/lib/python3.5/site-packages/sklearn/base.py", line 71, in clone
    params_set = new_object.get_params(deep=False)
  File "/anaconda/lib/python3.5/site-packages/sklearn/base.py", line 248, in get_params
    warnings.filters.pop(0)
IndexError: pop from empty list

@jcrist

jcrist commented Apr 27, 2017

it would be great if you can post a stand-alone snippet reproducing the problem leading to your traceback.

This reproduces it on (most) calls:

from multiprocessing.pool import ThreadPool
from sklearn.linear_model import LogisticRegression

def f(x):
    for i in range(10):
        x.get_params()

lr = LogisticRegression()
p = ThreadPool(4)
p.map(f, [lr] * 1000)

@lesteve
Member

lesteve commented Apr 28, 2017

Thanks for the stand-alone snippet @jcrist. I can reproduce the problem indeed.

@jnothman
Member

Is it worth using locks as a fix for now?
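A lock-based stopgap might look like the sketch below: a module-level lock serializing every call that touches the shared warnings state. The `safe_get_params` helper and the `Stub` estimator are hypothetical illustrations (not a proposed scikit-learn API); the stub stands in for any object exposing a get_params-like method.

```python
import threading
from multiprocessing.pool import ThreadPool

_params_lock = threading.Lock()  # one global lock (hypothetical helper)

def safe_get_params(estimator, deep=False):
    # Serialize calls so only one thread at a time runs the
    # warnings-touching introspection inside get_params.
    with _params_lock:
        return estimator.get_params(deep=deep)

class Stub:
    # Stand-in estimator exposing a get_params-like method.
    def get_params(self, deep=False):
        return {'C': 1.0}

stub = Stub()
with ThreadPool(4) as pool:
    out = pool.map(lambda _: safe_get_params(stub), range(100))
```

The trade-off is that every concurrent clone/get_params call now contends on one lock, and code that calls get_params directly (bypassing the wrapper) is still unprotected.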

@GiuseppeLaurenza

Hi,
I started using Python 3 and scikit-learn yesterday and ran into this problem. Is there any workaround?

@jnothman
Member

jnothman commented Aug 16, 2017

FWIW, even without this pop, as long as we use catch_warnings, things will not be thread-safe. I run the script below, and may get output such as that beneath it.

import time
import io
import warnings
import pprint
import difflib
import sys
from joblib import Parallel, delayed


def do_stuff():
    with warnings.catch_warnings(record=True):
        warnings.simplefilter('always', DeprecationWarning)
        warnings.warn('blah', DeprecationWarning)
        time.sleep(0.01)


def diag(file):
    pprint.pprint({k: v for k, v in warnings.__dict__.items()
                   if not k.startswith('__') or not k.endswith('__')},
                  stream=file)

before = io.StringIO()
diag(before)
Parallel(n_jobs=-1, backend='threading')(delayed(do_stuff)()
                                         for i in range(10000))
after = io.StringIO()
diag(after)
before.seek(0)
after.seek(0)
sys.stdout.writelines(difflib.unified_diff(before.readlines(),
                                           after.readlines()))
assert before.getvalue() == after.getvalue()

Output:

--- 
+++ 
@@ -12,7 +12,8 @@
  '_setoption': <function _setoption at 0x1020edd08>,
  'catch_warnings': <class 'warnings.catch_warnings'>,
  'defaultaction': 'default',
- 'filters': [('ignore',
+ 'filters': [('always', None, <class 'DeprecationWarning'>, None, 0),
+             ('ignore',
               re.compile('numpy.ndarray size changed', re.IGNORECASE),
               <class 'Warning'>,
               re.compile(''),
@@ -41,7 +42,7 @@
  'formatwarning': <function formatwarning at 0x1020ed9d8>,
  'onceregistry': {},
  'resetwarnings': <function resetwarnings at 0x1020edbf8>,
- 'showwarning': <function showwarning at 0x1020ed730>,
+ 'showwarning': <function catch_warnings.__enter__.<locals>.showwarning at 0x1084916a8>,
  'simplefilter': <function simplefilter at 0x1020edae8>,
  'sys': <module 'sys' (built-in)>,
  'warn': <built-in function warn>,

The first difference (in filters) is an effect of adding a filter within the catch_warnings context and exiting the catch_warnings contexts out of order. The second difference (in showwarning) is an effect of exiting out of order, regardless of whether or not we add a filter.

@jnothman
Member

What are the risks of locking (@GaelVaroquaux)?

@amueller
Member

See my comments here. This code is not used in scikit-learn. I think we should get rid of it. If we want to keep it, we can find ways to detect deprecations that don't rely on warnings.
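One way to detect deprecations without relying on warnings, sketched below under an entirely hypothetical convention: give a deprecated constructor parameter a sentinel default value, then discover deprecated names via `inspect.signature` at inspection time rather than by catching a `DeprecationWarning` at call time. The `DEPRECATED` sentinel, `deprecated_param_names` helper, and `Example` class are all illustrations.

```python
import inspect

# Hypothetical sentinel default marking a deprecated parameter.
DEPRECATED = 'deprecated'

def deprecated_param_names(cls):
    # Inspect the constructor signature: any parameter whose default
    # is the sentinel is considered deprecated. No warning filters are
    # installed or removed, so this is safe to call from any thread.
    sig = inspect.signature(cls.__init__)
    return [name for name, p in sig.parameters.items()
            if p.default == DEPRECATED]

class Example:
    def __init__(self, max_iter=100, n_iter=DEPRECATED):
        self.max_iter = max_iter

names = deprecated_param_names(Example)
```

Here `names` would be `['n_iter']`, recovered purely from the signature.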

@jnothman
Member

jnothman commented Aug 16, 2017 via email

@amueller
Member

amueller commented Sep 8, 2017

ohhh yeah, 3.5 year old bug!

@jnothman
Member

jnothman commented Sep 9, 2017 via email

Development

No branches or pull requests

9 participants