Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gather potential user input #2

Closed
richardsheridan opened this issue Jan 30, 2021 · 4 comments
Closed

Gather potential user input #2

richardsheridan opened this issue Jan 30, 2021 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@richardsheridan
Copy link
Owner

Having split this off from python-trio/trio#1783, some people that have been interested in python-trio/trio#1781 should be aware of this new home for my multiprocessing-based Trio worker process code.

CC @asvetlov, @belm0, @auvipy, @goodboy

I realize y'all are busy people but your input would make all the difference! 😁

@goodboy
Copy link

goodboy commented Jan 31, 2021

I think this looks great.

I did a brief once over of the code base and I actually think this adds even more incentive to decipher what low level parts of multiprocessing can be rebuilt outside the stdlib for use in consuming async IO frameworks.

I really like that you were able to leverage multiprocessing.Process (and thus os.fork()) by running in a new thread and requiring sync functions be submitted; this is something we obviously can't do in trio directly due to the global scheduler state.

I think you've helped settled some issues we have in tractor as well:

  • we were thinking of removing support for invoking sync functions (Do we need to support invoking sync functions? goodboy/tractor#77)
    • seeing your approach here I actually think it makes more sense to point users who want this in tractor to come here instead since,
      1. the spawning time should be faster since you're able to use os.fork()
      2. sync functions block the trio scheduler in subactors anyway meaning they effectively block the entire runtime; users might as well allocate a dedicated process using your api for sync work that comes with free cancellation handling (which I'm not sure we've even tested very much in tractor)
      3. you can always call a sync function from an async one in tractor if you desire.
  • we were thinking of rolling our own forkserver (Roll our own os.fork() server on *nix? goodboy/tractor#146) since the stdlib's version is not resilient to nested subprocesses. This might be a (sub)project which would allow us to better collaborate on simpler forking systems that can be used more flexibly in async and non-async code bases.

Let me know what you think 🏄🏼

@goodboy
Copy link

goodboy commented Jan 31, 2021

Hmm thinking about this further I wonder if we could use this as a backend as well to sidestep the issues with the current mp forkserver method?

That should work right?

Is there any reason we can't invoke a new trio.run() in the subproc?
Actually, is there any reason we can't use this as the base for rolling a fork server?

The only drawback I can think of is a lot of lingering threads and that we can't do true sub-proc nesting (which you don't really need as long as you do it logically from tractor's perspective, i think).

I would be super interested to benchmark spawn times versus the current forkserver in mp.

@richardsheridan
Copy link
Owner Author

richardsheridan commented Jan 31, 2021

At the moment, the following code produces the trio wakeup socketpair error on linux

import trio
import trio_parallel
import os
import multiprocessing

async def in_worker():
    await trio.sleep(.1)
    print("look, i'm in worker PID", os.getpid())

async def in_main():
    print("started in main PID", os.getpid())
    await trio_parallel.run_sync(trio.run, in_worker)
    print("back in main PID", os.getpid())

if __name__ == "__main__":
    # multiprocessing.set_start_method("forkserver") # fix the problem!
    trio.run(in_main)
started in main PID 324676
/home/rjs80/miniconda3/envs/trio-test/lib/python3.7/site-packages/trio/_core/_wakeup_socketpair.py:85: RuntimeWarning: It looks like Trio's signal handling code might have collided with another library you're using. If you're running Trio in guest mode, then this might mean you should set host_uses_signal_set_wakeup_fd=True. Otherwise, file a bug on Trio and we'll help you figure out what's going on.
  "It looks like Trio's signal handling code might have "
look, i'm in worker PID 324746
back in main PID 324676

So fixing that might be a quick solution for a Trio forkserver, if you like living on the edge with risk of deadlock. See #4 for more!

@richardsheridan richardsheridan added the help wanted Extra attention is needed label Feb 3, 2021
@richardsheridan
Copy link
Owner Author

Some chatter indicates that a worker initialization function could be useful, and possibly dispatching work across a network. These options might better be part of #12.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants