Error with n_jobs=-1 and n_jobs >1 #560

Closed
LakshMatai opened this issue Mar 30, 2019 · 9 comments
Comments

@LakshMatai LakshMatai commented Mar 30, 2019

Description

In Python 2 the error is "Exception has occurred: OSError, [Errno 32] Broken pipe", and in Python 3 the error is "semaphore or lock released too many times".

Steps/Code to Reproduce

classifier = BalancedBaggingClassifier(base_estimator=base_estimator,
    n_estimators=NUM_ESTIMATORS, max_samples=MAX_SAMPLES,
    max_features=MAX_FEATURES,
    bootstrap=True, bootstrap_features=False,
    oob_score=False, warm_start=False,
    sampling_strategy='majority',  # sampling_strategy=0.8,
    replacement=True, n_jobs=-1,
    random_state=RANDOM_SEED, verbose=VERBOSE)

classifier.fit(x,y)

Expected Results

It should just fit the classifier without any error

Actual Results

Python 2

Exception has occurred: OSError
[Errno 32] Broken pipe
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/semaphore_tracker.py", line 134, in _send
nbytes = os.write(self._fd, msg)
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/semaphore_tracker.py", line 112, in _check_alive
self._send('PROBE', '')
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/semaphore_tracker.py", line 66, in ensure_running
if self._check_alive():
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/semaphore_tracker.py", line 120, in register
self.ensure_running()
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/synchronize.py", line 90, in __init__
semaphore_tracker.register(self._semlock.name)
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/synchronize.py", line 174, in __init__
super(Lock, self).__init__(SEMAPHORE, 1, 1)
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/context.py", line 225, in Lock
return Lock()
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/backend/queues.py", line 46, in __init__
self._rlock = ctx.Lock()
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py", line 286, in __init__
super(_SafeQueue, self).__init__(max_size, reducers=reducers, ctx=ctx)
File "/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py", line 940, in _setup_queues
reducers=job_reducers, ctx=self._context)

Python 3

Exception has occurred: ValueError
semaphore or lock released too many times
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/externals/loky/backend/synchronize.py", line 107, in __exit__
return self._semlock.release()
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py", line 1017, in _ensure_executor_running
self._start_queue_management_thread()
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py", line 1042, in submit
self._ensure_executor_running()
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/externals/loky/reusable_executor.py", line 151, in submit
fn, *args, **kwargs)
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 510, in apply_async
future = self._workers.submit(SafeFunction(func))
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/externals/joblib/parallel.py", line 917, in __call__
if self.dispatch_one_batch(iterator):
File "/Users/apple/anaconda3/lib/python3.7/site-packages/sklearn/ensemble/bagging.py", line 378, in _fit
for i in range(n_jobs))
File "/Users/apple/anaconda3/lib/python3.7/site-packages/imblearn/ensemble/_bagging.py", line 245, in fit
return self._fit(X, y, self.max_samples, sample_weight=None)
File "/Users/apple/Documents/CAIS/_Laksh/lily-x/PAWS/iware/iware.py", line 445, in train_iware
classifier.fit(train_x_filter, train_y_filter)

Versions

Python 3.7.1 (default, Dec 14 2018, 13:28:58)
[Clang 4.0.1 (tags/RELEASE_401/final)]
NumPy 1.15.4
SciPy 1.1.0
Scikit-Learn 0.20.1
Imbalanced-Learn 0.5.0.dev0
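
A version list like the one above can be gathered in one step, which makes reports from different machines easier to compare. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name `collect_versions` and its default package list are assumptions made for illustration, not part of imbalanced-learn or scikit-learn.

```python
import platform
import sys
from importlib import metadata


def collect_versions(packages=("numpy", "scipy", "scikit-learn", "imbalanced-learn")):
    """Return a dict with the platform, the Python version, and package versions.

    `packages` holds distribution names as used by pip (hypothetical default list).
    """
    report = {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of failing the whole report.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Where scikit-learn is installed, `sklearn.show_versions()` (used later in this thread) gives a more detailed report, including BLAS information.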

Member

@chkoar chkoar commented Mar 30, 2019

I suppose that we can't reproduce it without a minimal reproducible example.

For instance the following code works in my virtual machine.

import numpy as np
from imblearn.ensemble import BalancedBaggingClassifier

np.random.seed(0)

X = np.random.randn(100, 10)
y = np.random.choice([0, 1], size=100, p=[0.9, 0.1])
classifier = BalancedBaggingClassifier(n_jobs=-1)

classifier.fit(X, y)
classifier.score(X, y)

Windows-8.1-6.3.9600-SP0
Python 3.6.3
NumPy 1.15.4
SciPy 1.2.1
Scikit-Learn 0.20.2
Imbalanced-Learn 0.5.0.dev0

Author

@LakshMatai LakshMatai commented Mar 30, 2019

Thanks for looking into this, Chris.

I guess it's working fine on Windows. My machine is a MacBook Pro 2016 model.

Here's my code snippet:

from sklearn import tree
from imblearn.ensemble import BalancedBaggingClassifier
def get_classifier(base_estimator):
    return BalancedBaggingClassifier(base_estimator=base_estimator,n_jobs=-1)



base_estimator = tree.DecisionTreeClassifier(random_state=RANDOM_SEED)
classifier = get_classifier(base_estimator)

x = np.random.rand(100, 3)
y = np.random.randint(2, size=100)

classifier.fit(x, y)

Member

@chkoar chkoar commented Mar 30, 2019

@LakshMatai might be irrelevant but please include your constants in the above code snippet in order to be self-contained.

Author

@LakshMatai LakshMatai commented Mar 30, 2019

@chkoar I was able to reproduce the error with the updated code too, and it works fine if I use n_jobs=1.
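
Since the serial fit works, one stopgap is to attempt the parallel fit and retry with a single job if it raises. The sketch below is only an illustration of that idea, not a fix for the underlying problem: `fit_with_serial_fallback` and its `make_estimator` factory argument are hypothetical names, and the two exception types it catches are simply the ones reported in this issue.

```python
def fit_with_serial_fallback(make_estimator, X, y, n_jobs=-1):
    """Fit with the requested n_jobs; retry with n_jobs=1 if the fit raises.

    make_estimator is a hypothetical factory: it takes n_jobs and returns an
    unfitted estimator whose fit(X, y) returns the fitted estimator.
    """
    try:
        return make_estimator(n_jobs).fit(X, y)
    except (OSError, ValueError) as exc:
        # OSError (Python 2) and ValueError (Python 3) are the errors reported
        # in this issue. ValueError can also mean invalid input data; in that
        # case the serial retry below raises again, so nothing is hidden.
        print(f"parallel fit failed ({exc!r}); retrying with n_jobs=1")
        return make_estimator(1).fit(X, y)
```

With the classifier from this thread, the factory could be `lambda n: BalancedBaggingClassifier(base_estimator=base_estimator, n_jobs=n)`. The retry costs speed on machines where the parallel path fails, so it is best treated as a temporary workaround while the root cause is investigated.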

Member

@chkoar chkoar commented Mar 30, 2019

@LakshMatai interesting. As I can see from the tests we test against n_jobs>1. The CI never reported a problem, but we do not test against macOS. @glemaitre any thoughts on this?

Contributor

@hayesall hayesall commented Mar 30, 2019

I'm working on macOS High Sierra (10.13.6) and could not reproduce this with imbalanced-learn==0.4.3 and n_jobs=-1

System Information and Python Versions

Processor: 3.4 GHz Intel Core i5
Memory: 16 GB 2400 MHz DDR4

>>> import sklearn; sklearn.show_versions()

System:
    python: 3.7.2 (default, Dec 29 2018, 00:00:04)  [Clang 4.0.1 (tags/RELEASE_401/final)]
executable: /Users/hayesall/anaconda3/envs/DS/bin/python
   machine: Darwin-17.7.0-x86_64-i386-64bit

BLAS:
    macros: NO_ATLAS_INFO=3, HAVE_CBLAS=None
  lib_dirs: 
cblas_libs: cblas

Python deps:
       pip: 18.1
setuptools: 40.6.3
   sklearn: 0.20.2
     numpy: 1.16.0
     scipy: 1.2.0
    Cython: None
    pandas: 0.23.4

Code

import numpy as np
from sklearn import tree
from imblearn.ensemble import BalancedBaggingClassifier


def get_classifier(base_estimator):
    return BalancedBaggingClassifier(
        base_estimator=base_estimator, n_jobs=-1, verbose=3
    )


base_estimator = tree.DecisionTreeClassifier(random_state=42)
classifier = get_classifier(base_estimator)

x = np.random.rand(100, 3)
y = np.random.randint(2, size=100)

classifier.fit(x, y)

Output

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
Building estimator 1 of 3 for this parallel run (total 10)...
Building estimator 1 of 3 for this parallel run (total 10)...
Building estimator 2 of 3 for this parallel run (total 10)...
Building estimator 2 of 3 for this parallel run (total 10)...
Building estimator 3 of 3 for this parallel run (total 10)...
Building estimator 1 of 2 for this parallel run (total 10)...
Building estimator 3 of 3 for this parallel run (total 10)...
Building estimator 2 of 2 for this parallel run (total 10)...
Building estimator 1 of 2 for this parallel run (total 10)...
Building estimator 2 of 2 for this parallel run (total 10)...
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    1.1s finished

I ran the same example with imbalanced-learn==0.5.0.dev0, and didn't have an issue there either.

Output

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
Building estimator 1 of 3 for this parallel run (total 10)...
Building estimator 1 of 3 for this parallel run (total 10)...
Building estimator 1 of 2 for this parallel run (total 10)...
Building estimator 2 of 3 for this parallel run (total 10)...
Building estimator 2 of 2 for this parallel run (total 10)...
Building estimator 2 of 3 for this parallel run (total 10)...
Building estimator 3 of 3 for this parallel run (total 10)...
Building estimator 3 of 3 for this parallel run (total 10)...
Building estimator 1 of 2 for this parallel run (total 10)...
Building estimator 2 of 2 for this parallel run (total 10)...
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    1.0s finished

I'm not familiar with how jobs are spawned by sklearn, but I don't think it explicitly requires joblib (I don't have joblib installed in this environment).

Maybe check your Xcode version? I've seen different Xcode versions cause issues like this occasionally.

Author

@LakshMatai LakshMatai commented Mar 30, 2019

My Xcode version is Version 10.1 (10B61).
I'll check it on a different machine with the same config. It could be because of Xcode or the OS, because Python 2 reports a broken pipe and Python 3 reports "semaphore or lock released too many times". So it has something to do with how the OS handles concurrent threads.

Contributor

@hayesall hayesall commented Mar 30, 2019

I've been digging around for a while and this has me a bit stumped.

  • There is a comment at the top of some of the sklearn.externals.joblib files stating that these may occasionally fail on macOS for values >1, but that doesn't explain why it works for me.
  • There was some fairly recent discussion on pyinstaller/pyinstaller#2322 (comment) that was trying to debug a similar problem, and some more recent comments there discuss it in relation to sklearn==0.20.2 as well.

Maybe try pip install --upgrade scikit-learn, since your version above is 0.20.1.

If that still doesn't work, my next strategy would be to see if there could be an issue with multiprocessing, maybe try some of the online multiprocessing examples?
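
A check along these lines could be as small as the sketch below. It uses only the standard library's `multiprocessing` module, so it exercises Python's own process and lock machinery rather than joblib's loky backend; a pass here does not rule out a loky-specific problem, but a failure would point at the interpreter or OS setup. The function name `run_pool_check` is a hypothetical helper chosen for this example.

```python
import multiprocessing as mp


def run_pool_check(n_workers=2, start_method=None):
    """Map the built-in abs() over a small process pool and return the results.

    start_method=None keeps the platform default, which differs between
    operating systems and Python versions.
    """
    ctx = mp.get_context(start_method)
    with ctx.Pool(processes=n_workers) as pool:
        return pool.map(abs, [-2, -1, 0, 1, 2])


if __name__ == "__main__":
    # The guard matters for the "spawn" start method, where worker processes
    # re-import this module.
    print(run_pool_check())
```

Saving this as a script and running it with the same interpreter that raised the error keeps the comparison fair. Passing `start_method="spawn"` or `start_method="fork"` makes it possible to see whether the behavior depends on how worker processes are created.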

Member

@glemaitre glemaitre commented Jun 11, 2019

This is most likely linked to Accelerate, which makes it a joblib and scikit-learn matter rather than an imbalanced-learn one.
I am therefore closing this issue.

@glemaitre glemaitre closed this Jun 11, 2019