
iostream fix for multiprocessing with no performance hit to single process #2734

Closed
wants to merge 1 commit

Conversation

@piti118 (Contributor) commented Jan 2, 2013

Better version of #2712

@piti118 (Contributor, Author) commented Jan 2, 2013

So, what it does here is keep two buffers. Whenever multiprocessing is detected, it switches to multiprocess mode and uses a Queue as the buffer instead of a StringIO.

This way there is no performance hit for the single-process use case.

%%timeit
for i in range(100):
    print i

Before any kind of fork it gives ~2ms/loop. After something like

import multiprocessing as mp
def f(x):
    print 'hello',x
pool = [mp.Process(target=f,args=(i,)) for i in range(10)]
for p in pool: p.start()
for p in pool: p.join()
%%timeit
for i in range(100):
    print i

will be around 200ms/loop.
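The two-buffer idea can be sketched like this (an editorial toy illustration, not the actual PR code; the class and all names are made up, and it is written in modern Python 3 rather than the Python 2 of the original):

```python
import io
import os
import multiprocessing as mp

class ForkAwareStream:
    """Toy sketch of the two-buffer idea (not IPython's real OutStream):
    the master process keeps the cheap StringIO path, while writes made
    from a forked child go through a fork-safe queue."""

    def __init__(self):
        self._master_pid = os.getpid()   # remember who the master is
        self._local = io.StringIO()      # fast single-process buffer
        self._queue = mp.SimpleQueue()   # created up front so children inherit it

    def write(self, text):
        if os.getpid() == self._master_pid:
            self._local.write(text)      # no slowdown for the common case
        else:
            self._queue.put(text)        # fork-safe path for children

    def getvalue(self):
        # merge anything the children pushed into the local buffer
        while not self._queue.empty():
            self._local.write(self._queue.get())
        return self._local.getvalue()
```

Routing on a pid comparison is what keeps the single-process path cheap: the master never touches the queue until something has actually been pushed into it.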

@piti118 (Contributor, Author) commented Jan 2, 2013

Still couldn't quite figure out how to write a test for this though....

@minrk (Member) commented Jan 3, 2013

Why switch modes for the master process? Why doesn't the master process only write to the regular buffer, and just use the Queue for subprocesses? I know it ensures ordering of print statements between the master process and the subprocesses, but that's not a guarantee that I feel like we need to make (it's not even a guarantee that the terminal makes, really).

@piti118 (Contributor, Author) commented Jan 3, 2013

import multiprocessing as mp
from time import sleep
manager = mp.Manager()
v = manager.Value('i',9)
def f(x, v):
    while(v.value!=x):
        pass
    print 'hello',x
    v.value -= 1
pool = [mp.Process(target=f,args=(i,v)) for i in range(10)]
for p in pool: p.start()
for p in pool: p.join()

It's not 100% guaranteed (since the buffers are merged only when the master process knows what's going on), but it definitely gives a more sensible ordering.

@minrk (Member) commented Jan 3, 2013

I think this approach makes sense. If you make any print statements from a forked process you will trigger (forever) the super slow MP-safe approach. It's unfortunate that it will stick around forever, even while no subprocesses are running, but I think it's acceptable for now.

Things to check:

  • fast-printing subprocesses (while True: print i)
  • non-printing subprocesses shouldn't trigger this
  • Windows (doesn't have a proper fork, so mp.Process is a totally different beast)

One alternative would be to only write to the Queue on flush, rather than on every write. This would change the sync point for interleaved output, but would result in many fewer Queue items.
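The flush-batching alternative could look roughly like this (a hypothetical sketch in modern Python 3, not code from either branch):

```python
from multiprocessing import SimpleQueue

class FlushBatchingWriter:
    """Sketch of the 'write to the Queue on flush' alternative: a forked
    child buffers writes locally and ships one joined string per flush(),
    so the queue sees one item per flush instead of one per write."""

    def __init__(self, queue):
        self._queue = queue
        self._pending = []

    def write(self, text):
        self._pending.append(text)   # cheap local append, no IPC

    def flush(self):
        if self._pending:
            self._queue.put("".join(self._pending))   # single queue item
            self._pending = []
```

The trade-off is exactly the one noted above: interleaving is only resolved at flush boundaries, but the number of queue operations drops dramatically.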

@piti118 (Contributor, Author) commented Jan 3, 2013

Could you write one simple test to do the testing? I can implement the others, but I'm not sure where to start writing tests.

@minrk (Member) commented Jan 3, 2013

> Could you write one simple test to do the testing? I can implement the others, but I'm not sure where to start writing tests.

Sure, no problem.

@piti118 (Contributor, Author) commented Jan 3, 2013

Found one bug with the current implementation though.

If I let the forked process run and kill the IPython notebook server before it joins... it segfaults. I'm guessing it's trying to do something funny with that socket.

minrk added a commit to minrk/ipython that referenced this pull request Jan 3, 2013
tests new mp.Process behavior for PR ipython#2734
@minrk (Member) commented Jan 3, 2013

If you pull from my '2734' branch: https://github.com/minrk/ipython/tree/2734, you will get a few basic tests.

@minrk (Member) commented Jan 4, 2013

Hm, there is an issue with the mp manager, though. mp.Manager creates subprocesses, and they don't get properly cleaned up if the master process is terminated:

import os, signal
from multiprocessing import Manager
m = Manager()
os.kill(os.getpid(), signal.SIGTERM)

will always leave the manager's process running. This will have to be addressed (it may be another reason that using a Manager will simply not be acceptable).

@piti118 (Contributor, Author) commented Jan 4, 2013

Thanks, will pull that soon. As for the issue you raise - I'm not sure about the inner workings of IPython's signal handling and the Manager, but wouldn't something like this work?

def __del__(self):
    self._manager.shutdown()

Not sure what happens if the old process is still trying to write to it though.

@piti118 (Contributor, Author) commented Jan 4, 2013

Or maybe we should implement a poor man's queue from mp.sharedctypes.

@minrk (Member) commented Jan 4, 2013

> Or maybe we should implement a poor man's queue from mp.sharedctypes.

It seems every mp datatype that doesn't use a Manager has corruption issues, and a Manager can't be used unless we can fix the zombie-manager issue.

@piti118 (Contributor, Author) commented Jan 4, 2013

mp.sharedctypes has locks though (those without "Raw" in the name). What do you mean by corruption issues?

http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.sharedctypes

Although I'm not sure about dynamic allocation...

@minrk (Member) commented Jan 4, 2013

> mp.sharedctypes has locks though (those without "Raw" in the name). What do you mean by corruption issues?
>
> http://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.sharedctypes

The mp docs are littered with warnings about corruption issues.

But even without corruption, there are other issues. For instance, what happens when a subprocess is terminated while it holds a lock?

@piti118 (Contributor, Author) commented Jan 4, 2013

The docs say you are screwed:

> Avoid terminating processes
>
> Using the Process.terminate() method to stop a process is liable to cause any shared resources (such as locks, semaphores, pipes and queues) currently being used by the process to become broken or unavailable to other processes.
>
> Therefore it is probably best to only consider using Process.terminate() on processes which never use any shared resources.

I think that's what manager.shutdown() is for though; if it gets called in the right places we should be OK.

@minrk (Member) commented Jan 4, 2013

> I think that's what manager.shutdown() is for though; if it gets called in the right places we should be OK.

But that's exactly the point - this simply cannot be guaranteed. That means a Manager can never be an acceptable tool for this.

@piti118 (Contributor, Author) commented Jan 4, 2013

Why wouldn't something like this work?

def __del__(self):
    if self.is_masterprocess():
        self._manager.shutdown()

Wouldn't __del__ get called when terminating the IPython kernel? I'm not talking about kill -9, but SIGTERM or something proper.

@minrk (Member) commented Jan 4, 2013

> Why wouldn't something like this work?
>
>     def __del__(self):
>         if self.is_masterprocess():
>             self._manager.shutdown()
>
> Wouldn't __del__ get called when terminating the IPython kernel?

No. __del__ is only called on a clean exit (sys.exit()).

@piti118 (Contributor, Author) commented Jan 4, 2013

Hmmm, that cuts the Manager out of the picture completely....

I'm thinking about a memory-mapped file solution. That way I delegate the problem to the OS level instead.

@piti118 (Contributor, Author) commented Jan 4, 2013

Just thinking out loud here:

ZMQ with pull on flush... would screw up the order.
But with a poller or listener... we have the same zombie problem as the Manager.

@minrk (Member) commented Jan 4, 2013

> Just thinking out loud here:
>
> ZMQ with pull on flush... would screw up the order.
> But with a poller or listener... we have the same zombie problem as the Manager.

Not true - a Manager spawns a new process; with zmq that is not necessary. For instance, you could have PULL running in a thread in the master, and write() could PUSH.

@piti118 (Contributor, Author) commented Jan 4, 2013

That brings me to one of the previous questions: whether a forked process will have a copy of a thread from the master process attached to it or not. (Sorry, I learned this a long time ago and can't remember the finer details.)

Concretely:

Our master process is p0.
p0 spawns a thread t0.
Now I fork p1 from p0.
The question is: does p1 have a t1, a copy of t0, attached to it or not?

@minrk (Member) commented Jan 4, 2013

> That brings me to one of the previous questions: whether a forked process will have a copy of a thread from the master process attached to it or not.
>
> Our master process is p0.
> p0 spawns a thread t0.
> Now I fork p1 from p0.
> The question is: does p1 have a t1, a copy of t0, attached to it or not?

Yes, it will have that thread, but it will be defunct (you shouldn't have to worry about it).

I have a prototype working with zmq push/pull that I can show you, if you like.

@piti118 (Contributor, Author) commented Jan 4, 2013

I stopped writing the zmq-with-polling-thread version midway because of the following concern.

def poller():
    lock()
    write_to_buffer() #<<< suppose a fork happens while this thread is here
    unlock()

Is the lock then held forever in the forked process, or do we get a double unlock, or what happens?

@piti118 (Contributor, Author) commented Jan 4, 2013

> Not true - a Manager spawns a new process; with zmq that is not necessary. For instance, you could have PULL running in a thread in the master, and write() could PUSH.

Hmm, does the thread get automatically killed when the main process is killed? If not, then we have a zombie.

@minrk (Member) commented Jan 4, 2013

> Hmm, does the thread get automatically killed when the main process is killed?

Yes - threads are part of a process; when a process dies, all its threads die.

@piti118 (Contributor, Author) commented Jan 4, 2013

You may be right. From my vague memory I recall that a child thread has no idea when the parent gets killed, and needs a signal handler for that. I may be wrong though; let me consult the pthread manual.

@piti118 (Contributor, Author) commented Jan 4, 2013

Or maybe Python threads are something different... I give up. I trust you though 😉

@minrk (Member) commented Jan 4, 2013

You don't need any inter-process locks at all.
Here's how my test works:

  • main process:
    • a PULL socket in a background thread polls
      • it acquires a thread lock while writing to the buffer
    • the main thread writes directly to the buffer as before (taking the lock, if multiprocessing)
  • background process:
    • writes to a PUSH socket

So the background process never grabs any lock.
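The shape of that design can be sketched with a plain os.pipe() standing in for the zmq PUSH/PULL pair (an editorial Python 3 illustration - the real branch uses zmq sockets, not a pipe):

```python
import os
import threading

# Master-side reader thread is the only thing that ever takes the lock;
# forked children just write to their end of the pipe, lock-free.
read_fd, write_fd = os.pipe()
buffer_lock = threading.Lock()
buffer = []

def pull_loop():
    # master-only background thread: drain messages into the buffer
    reader = os.fdopen(read_fd)
    while True:
        line = reader.readline()
        if not line:               # writer side closed
            break
        with buffer_lock:          # thread lock, never seen by children
            buffer.append(line.rstrip("\n"))

def child_write(text):
    # this is all a (forked) child ever does: one lock-free write
    os.write(write_fd, (text + "\n").encode())

listener = threading.Thread(target=pull_loop, daemon=True)
listener.start()
```

The key property is the one stated above: the lock is a plain thread lock confined to the master process, so a child terminated at any moment can never leave it held.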

@piti118 (Contributor, Author) commented Jan 4, 2013

Good point. I will open another PR (hopefully fixing this for real this time). Thanks for being so patient with me.

@piti118 (Contributor, Author) commented Jan 4, 2013

Now that I think about it again, it wouldn't work though. Here is the problem. In polling thread t0 of the main process p0:

def poll(self):
    while True:
        if ismasterprocess(): # doesn't guard sufficiently though
            thread_lock()
            pull_socket.pull() #<<< suppose a fork of p0 happens while t0 is deep inside this call
            write_to_buffer()
            thread_unlock()
        else:
            break

Now, after the fork of p0, won't thread t1, a copy of t0, still be running at that same point? (You said defunct, but I guess it's still running at the same place, right?)

This would mean that it might reuse the zmq socket and cause the segfault again.

@piti118 (Contributor, Author) commented Jan 4, 2013

From the fork man page:

> A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. [THR] Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls.
>
> When the application calls fork() from a signal handler and any of the fork handlers registered by pthread_atfork() calls a function that is not async-signal-safe, the behavior is undefined.

Exactly one thread - the calling thread - is duplicated. I worried too much.

http://pubs.opengroup.org/onlinepubs/009695399/functions/fork.html
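(Editorial aside: modern Python exposes the pthread_atfork() mechanism mentioned above directly as os.register_at_fork(), added in Python 3.7 - so not available when this discussion happened. A minimal sketch of resetting inherited state in the child:)

```python
import os

# State that must be re-initialized in a forked child (in the iostream
# case, a zmq socket would be re-created here instead of reused).
state = {"pid": os.getpid(), "fresh_child": False}

def _after_fork_in_child():
    # runs in the child immediately after fork(), like pthread_atfork's
    # child handler: drop/rebuild anything inherited from the parent
    state["pid"] = os.getpid()
    state["fresh_child"] = True

os.register_at_fork(after_in_child=_after_fork_in_child)
```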

@minrk (Member) commented Jan 4, 2013

If you want to pull from my forksafe_nomp branch, that should get us most of the way there.

@piti118 (Contributor, Author) commented Jan 4, 2013

I also have my zmq version. The implementation is a little different, and a bit shorter. The major difference from yours is that a child-process flush gets cascaded to the parent. Mine doesn't have a uuid though.

However, for me, if the fork and join are fast, messages drop like crazy.

https://github.com/piti118/ipython/tree/mp_fix_iostream_zmq <<< a bunch of debug messages are still there

@piti118 (Contributor, Author) commented Jan 4, 2013

Hmmm, actually I'm not sure if your solution will have this problem. Run this like 3 times and it will definitely hang.
The problem, I think, is that the socket has LINGER by default. If I try to close them in order but the message queue is in a different order, it will go into a deadlock. Without linger, messages might get dropped.

import zmq
import multiprocessing as mp
import threading as t

def listener(quit):
    context = zmq.Context()
    puller = context.socket(zmq.PULL)
    puller.bind('tcp://127.0.0.1:5555')
    i=0
    while not quit.is_set():
        i+=1
        print 'poll',i
        s = puller.poll(1000)
        if s:
            m = puller.recv_string()
            print m


def sender(i):
    context = zmq.Context()
    pusher = context.socket(zmq.PUSH)
    pusher.connect('tcp://127.0.0.1:5555')
    pusher.send_string(str(i))


quit = t.Event()
listen_thread = t.Thread(target=listener, args=(quit,))
listen_thread.start()
numsend=100
pool = [mp.Process(target=sender, args=(i,)) for i in range(numsend)]
for i, p in enumerate(pool):
    print 'start', i
    p.start()
for i, p in enumerate(pool):
    print 'join',i
    p.join()
print 'hey'
quit.set()
listen_thread.join()

@minrk (Member) commented Jan 4, 2013

> Hmmm, actually I'm not sure if your solution will have this problem. Run this like 3 times and it will definitely hang. The problem, I think, is that the socket has LINGER by default.

I thought it might be LINGER as well, but that doesn't appear to be the case. The default linger is -1 (infinite) on zeromq-2, and 1000 (1 second) on zeromq-3, but the subprocess is clearly exiting before LINGER is able to have any effect (mp.Process must be forcing it to halt, rather than allowing a clean exit). This is what my tracker/wait code in flush() for subprocesses works around.

@piti118 (Contributor, Author) commented Jan 7, 2013

Feel free to close this PR since your solution is clearly superior.

@minrk mentioned this pull request Jan 15, 2013
@minrk (Member) commented Jan 18, 2013

superseded by #2791

@minrk minrk closed this Jan 18, 2013
minrk added a commit to minrk/ipython that referenced this pull request Feb 4, 2013
tests new mp.Process behavior for PR ipython#2734
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
tests new mp.Process behavior for PR ipython#2734