
do not enable Parallel if multiprocessing.synchronize is N/A #39

Closed
wants to merge 1 commit into from

Conversation

@yarikoptic (Contributor)

Having finally tracked down the non-working multiprocessing.cpu_count on kfreebsd Debian systems, I ended up with
https://buildd.debian.org/status/fetch.php?pkg=statsmodels&arch=kfreebsd-i386&ver=0.4.2-1&stamp=1342080335
(see the bottom of the log).

So I wondered: shouldn't I also patch joblib to require 'import multiprocessing.synchronize' to complete successfully as the check for multiprocessing availability?

…herwise

The problem is currently present with Python on BSD systems.
@GaelVaroquaux (Member)

Having finally tracked down not working multiprocessing.cpu_count

Thanks a lot!

on kfreebsd Debian systems I ended up with
https://buildd.debian.org/status/fetch.php?pkg=statsmodels&arch=kfreebsd-i386&ver=0.4.2-1&stamp=1342080335
(see the bottom of the log)

So I wondered: shouldn't I also patch joblib to require 'import multiprocessing.synchronize' to complete successfully as the check for multiprocessing availability?

Another option would be to provide only limited parallel functionality, for instance having joblib.cpu_count just return 1 with a warning. Then 'n_jobs=-1' wouldn't work, but 'n_jobs=2' would.

Would that solve the problem on kfreebsd?

Gael
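The degraded-mode fallback Gael describes could be sketched like this (a hypothetical illustration, not joblib's actual code; the module-level `multiprocessing = None` flag mirrors joblib's existing availability check):

```python
import warnings

# Mirror of the availability check: the name is bound to None
# when the multiprocessing package cannot be imported.
try:
    import multiprocessing
except ImportError:
    multiprocessing = None

def cpu_count():
    """Return the number of CPUs, degrading to 1 with a warning
    when multiprocessing is unavailable."""
    if multiprocessing is None:
        warnings.warn("multiprocessing unavailable: assuming a single CPU")
        return 1
    return multiprocessing.cpu_count()
```

With this scheme, n_jobs=-1 ("use all CPUs") resolves to a single job, while an explicit n_jobs=2 would still attempt to start two worker processes.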

@yarikoptic (Contributor, Author)

Sorry Gael -- I am confused...

For a moment I thought there was some other implementation in Parallel that allows avoiding multiprocessing.Pool, but that seems not to be the case, so what did you mean by "limited parallel functionality"?

If joblib.cpu_count just returns 1, that would indeed take effect when no explicit n_jobs is given, but Parallel would still fail for both n_jobs=-1 and n_jobs=2.

@GaelVaroquaux (Member)

On Fri, Jul 13, 2012 at 11:21:08AM -0700, Yaroslav Halchenko wrote:

For a moment I thought there was some other implementation in Parallel that allows avoiding multiprocessing.Pool, but that seems not to be the case, so what did you mean by "limited parallel functionality"?

Indeed, it is not the case. Are you saying that multiprocessing.Pool is
completely broken on kfreebsd?

G

@yarikoptic (Contributor, Author)

To tell the truth -- I do not know for sure (I am not a FreeBSD user/developer, yet ;)) -- but that is what the error message/bug report suggests:
http://bugs.python.org/issue3770

@yarikoptic (Contributor, Author)

I guess there should be no harm if I push this patch into the Debian package (I need it to fix the FTBFS for statsmodels), right? ;-)

@GaelVaroquaux (Member)

Sorry for the delay, Yarik. I am merging this in.

Do you need me to quickly cut a bug-fix release, or can it wait a bit? I'd like to merge a pull request by Olivier Grisel before the next release, but it's not quite ready.

@yarikoptic (Contributor, Author)

And I was slow myself: only yesterday I discovered that this fix is not effective with Python 2.6 (it works fine with 2.7) -- I will look into it shortly.
No release is necessary for my purposes -- I can always carry such patches in Debian, and a new release (unless it is purely a bugfix) might be trickier to persuade the release team to let in.

@GaelVaroquaux (Member)

And I was slow myself: only yesterday I discovered that this fix is not effective with Python 2.6 (it works fine with 2.7) -- I will look into it shortly.

OK, just do a new pull request, and I'll try to be quicker to merge it.

@yarikoptic (Contributor, Author)

Not sure what would be the clean way now... I see two possibilities:

  1. Check the type of system and disable multiprocessing on BSD systems.

     pros -- descriptive and quick

     cons -- BSDs might gain a working locking mechanism later on, so for transparent support you would need another solution (e.g. 2.) anyway

  2. Elaborate the availability test to actually try constructing a dummy pool, e.g. something like (just typing here, untested):

try:
    _ = mp.Pool()
except (ImportError, OSError):
    # ImportError is raised by Python >= 2.7 (including 3.x),
    # OSError by Python 2.6
    multiprocessing = False

What do you think? Should I proceed with 2?
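A self-contained version of that option-2 probe might look like the following (a sketch under the thread's assumptions; the `mp_works` flag name is illustrative, not joblib's actual variable):

```python
try:
    import multiprocessing as mp
except ImportError:
    mp = None

mp_works = False
if mp is not None:
    try:
        pool = mp.Pool(1)  # fails on platforms without a working sem_open
    except (ImportError, OSError):
        # ImportError on Python >= 2.7 (including 3.x), OSError on 2.6
        pass
    else:
        pool.close()  # shut the probe pool down immediately...
        pool.join()   # ...so no zombie worker processes are left behind
        mp_works = True
```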

@GaelVaroquaux (Member)

On Mon, Jul 23, 2012 at 10:31:29AM -0700, Yaroslav Halchenko wrote:

  1. Elaborate the availability test to actually try constructing a dummy pool, e.g. something like (just typing here, untested):

try:
    _ = mp.Pool()
except (ImportError, OSError):
    # ImportError is raised by Python >= 2.7 (including 3.x),
    # OSError by Python 2.6
    multiprocessing = False

I think that 2 guarantees we always have a workable system and do not prevent
multiprocessing when it could work. One problem I see is that it would
significantly slow down the import, and I know that some people care
about this. I think that we should probably either run this test only on
systems that require it, or try to test a more atomic error. Would you
mind pasting the full traceback that this creates on BSD with Python 2.6?

Also, if we go down that route, we need to make really sure that the
temporary pool gets closed immediately, to avoid leaving zombie
processes.

Thanks a lot for your efforts!

@yarikoptic (Contributor, Author)

I think that 2 guarantees we always have a workable system and do not prevent
multiprocessing when it could work. One problem I see is that it would
significantly slow down the import, and I know that some people care
about this. I think that we should probably either run this test only on
systems that require it

Good idea -- so 1. + 2. then ;) (i.e. run test 2. only on BSD systems)

or try to test a more atomic error.

There might indeed be a better check -- I just failed to find it, since it
is somewhere deeper in the _multiprocessing extension.

Would you mind pasting the full traceback that this creates on BSD with Python 2.6?

yoh@nebsd:~/deb/statsmodels-0.4.2$ python2.6 -c 'import multiprocessing as mp; _ = mp.Pool()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.6/multiprocessing/__init__.py", line 227, in Pool
    return Pool(processes, initializer, initargs)
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 84, in __init__
    self._setup_queues()
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 131, in _setup_queues
    self._inqueue = SimpleQueue()
  File "/usr/lib/python2.6/multiprocessing/queues.py", line 328, in __init__
    self._rlock = Lock()
  File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 117, in __init__
    SemLock.__init__(self, SEMAPHORE, 1, 1)
  File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 49, in __init__
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 78] Function not implemented

Also, if we go down that route, we need to make really sure that the
temporary pool gets closed immediately, to avoid leaving zombie
processes.

Do you mean _.close() for that pool? (if it succeeded to be created, of course ;))

Yaroslav O. Halchenko
Postdoctoral Fellow, Department of Psychological and Brain Sciences
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik

@GaelVaroquaux (Member)

On Mon, Jul 23, 2012 at 10:42:42AM -0700, Yaroslav Halchenko wrote:

There might indeed be a better check -- I just failed to find it, since it
is somewhere deeper in the _multiprocessing extension.

yoh@nebsd:~/deb/statsmodels-0.4.2$ python2.6 -c 'import multiprocessing as mp; _ = mp.Pool()'
[...]
OSError: [Errno 78] Function not implemented

How about trying to instantiate a multiprocessing.Semaphore? It is part
of the public API and should be significantly faster to instantiate than
a full 'Pool'. The question is: will it fail when a Pool fails?

Also, if we go down that route, we need to make really sure that that
temporary pool gets closed immediately, to avoid leaving zombie
processes.

do you mean _.close() for that pool? (if it succeeded to be created, of course ;))

Yes, join and close.
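The clean-up order for such a throwaway pool would be close() first, then join() (a generic multiprocessing sketch, not joblib code):

```python
import multiprocessing as mp

pool = mp.Pool(processes=2)
try:
    # any small picklable task works as a smoke test of the pool
    result = pool.map(abs, [-1, -2, 3])
finally:
    pool.close()  # stop accepting new work
    pool.join()   # wait for the workers to exit, avoiding zombie processes
```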

@yarikoptic (Contributor, Author)

On Mon, 23 Jul 2012, Gael Varoquaux wrote:

File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 49, in __init__
sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 78] Function not implemented

How about trying to instantiate a multiprocessing.Semaphore? It is part
of the public API and should be significantly faster to instantiate than
a full 'Pool'. The question is: will it fail when a Pool fails?

Good call -- it does fail the same way:

yoh@nebsd:~/deb/statsmodels-0.4.2$ python2.6 -c 'import multiprocessing as mp; _ = mp.Semaphore()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.6/multiprocessing/__init__.py", line 192, in Semaphore
    return Semaphore(value)
  File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 81, in __init__
    SemLock.__init__(self, SEMAPHORE, value, SEM_VALUE_MAX)
  File "/usr/lib/python2.6/multiprocessing/synchronize.py", line 49, in __init__
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 78] Function not implemented

yoh@nebsd:~/deb/statsmodels-0.4.2$ python2.7 -c 'import multiprocessing as mp; _ = mp.Semaphore()'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/multiprocessing/__init__.py", line 196, in Semaphore
    from multiprocessing.synchronize import Semaphore
  File "/usr/lib/python2.7/multiprocessing/synchronize.py", line 59, in <module>
    " function, see issue 3770.")
ImportError: This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770.

yoh@nebsd:~/deb/statsmodels-0.4.2$ python3.2 -c 'import multiprocessing as mp; _ = mp.Semaphore()'
Traceback (most recent call last):
  File "/usr/lib/python3.2/multiprocessing/synchronize.py", line 54, in <module>
    from _multiprocessing import SemLock
ImportError: cannot import name SemLock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.2/multiprocessing/__init__.py", line 195, in Semaphore
    from multiprocessing.synchronize import Semaphore
  File "/usr/lib/python3.2/multiprocessing/synchronize.py", line 59, in <module>
    " function, see issue 3770.")
ImportError: This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770.

Also, if we go down that route, we need to make really sure that the
temporary pool gets closed immediately, to avoid leaving zombie
processes.

do you mean _.close() for that pool? (if it succeeded to be created, of course ;))

Yes, join and close.

With a semaphore it should be easier... would it be ok if I just del it? i.e.

In [17]: sem = multiprocessing.Semaphore()
In [18]: del sem

to leave no trace?


@GaelVaroquaux (Member)

On Mon, Jul 23, 2012 at 10:53:51AM -0700, Yaroslav Halchenko wrote:

How about trying to instantiate a multiprocessing.Semaphore? It is part
of the public API and should be significantly faster to instantiate than
a full 'Pool'. The question is: will it fail when a Pool fails?

Good call -- it does fail the same way:

I do believe that this is the root of our problem. Looks like we have a
winner.

On my box, it takes about 30 µs to create a Semaphore:

In [1]: import multiprocessing

In [2]: %timeit multiprocessing.Semaphore()
10000 loops, best of 3: 28.9 us per loop

Thus, I think that we can put this check on every OS, which avoids ugly
code.
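Putting the thread's conclusion together, the cheap unconditional check could be sketched as follows (the `mp_available` flag name is illustrative, not joblib's actual variable):

```python
try:
    import multiprocessing as mp
    # A Semaphore exercises the same SemLock machinery a Pool needs,
    # but costs only tens of microseconds to create.
    sem = mp.Semaphore()
    del sem  # a Semaphore spawns no child process, so del leaves no trace
    mp_available = True
except (ImportError, OSError):
    # ImportError on Python >= 2.7 (including 3.x), OSError on Python 2.6,
    # on platforms lacking a working sem_open (python.org issue 3770)
    mp = None
    mp_available = False
```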

@yarikoptic (Contributor, Author)

BTW -- while unit-testing the new "fix" on the kfreebsd box (I will send a PR
after testing it enough ;)) on top of current master, I got the following
failure:

======================================================================
FAIL: joblib.test.test_disk.test_disk_used
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/yoh/deb/joblib/joblib/test/test_disk.py", line 41, in test_disk_used
    nose.tools.assert_true(disk_used(cachedir) < target_size + 12)
AssertionError: False is not true

----------------------------------------------------------------------
Ran 294 tests in 23.476s

FAILED (failures=1)

and that is because

print "DISK USED: %r target_size: %r" % (disk_used(cachedir), target_size)

gives

DISK USED: 1042 target_size: 1024


@yarikoptic (Contributor, Author)

btw -- is Python 3 fully supported? I am getting some failures and am not
sure whether that is just my sloppy way of testing (in which case I should
not whine and should not bother about Python 3 for now) or joblib's problem... ?

e.g. this stalls for me:

> nosetests3 -s -v /home/yoh/deb/gits/joblib/build/py3k/joblib
...
joblib.test.test_my_exceptions.test_inheritance ... ok
Failure: NameError (name 'file' is not defined) ... ERROR
joblib.test.test_parallel.test_cpu_count ... ok
joblib.test.test_parallel.test_simple_parallel([0, 1, 4, 9, 16], [0, 1, 4, 9, 16]) ... ok
joblib.test.test_parallel.test_simple_parallel([0, 1, 4, 9, 16], [0, 1, 4, 9, 16]) ... ok
joblib.test.test_parallel.test_simple_parallel([0, 1, 4, 9, 16], [0, 1, 4, 9, 16]) ... ok
joblib.test.test_parallel.test_simple_parallel([0, 1, 4, 9, 16], [0, 1, 4, 9, 16]) ... ok
Unhandled exception in thread started by 


vene pushed a commit to vene/joblib that referenced this pull request Jul 18, 2014
@tlevine tlevine mentioned this pull request Oct 26, 2016