Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Forkserver singleton #21

Merged
merged 7 commits into from
Aug 4, 2018
Merged

Forkserver singleton #21

merged 7 commits into from
Aug 4, 2018

Conversation

goodboy
Copy link
Owner

@goodboy goodboy commented Jul 25, 2018

Temporary solution for #6 by overriding some stdlib code in multiprocessing.forkserver.py.
A proper patch should be made eventually to get this more granular server management into the stdlib. For more detailed discussion see #20.

@vodik ping ping!

@goodboy goodboy force-pushed the forkserver_singleton branch 2 times, most recently from caa232c to 47c2e47 Compare July 25, 2018 05:35
)
from multiprocessing.forkserver import (
ForkServer, MAXFDS_TO_SEND, write_unsigned,
# _serve_one,
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be relied on in 3.6?

SIGNED_STRUCT = struct.Struct('q') # large enough for pid_t


class AdultForkServer(ForkServer):
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe SingletonForkServer is a better name?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PatchedForkServer?

# Incoming fork request
with listener.accept()[0] as s:

# Thing changed - be tolerant of socket disconnects
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This the 2nd change.

semaphore_tracker.getfd()]
allfds += fds

# This is the only part changed
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First change.

The calling process should write to data_w the pickled preparation and
process data.
'''
# self.ensure_running() # treat our users like adults expecting
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expect client code to call this manually, once, at server startup.

@@ -26,6 +28,7 @@ def __init__(self, actor, supervisor=None):
# self.cancel_scope = cancel_scope
self._children = {}
self.cancelled = False
self._fs = None
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe _forkserver is better.

@goodboy goodboy mentioned this pull request Jul 30, 2018
Copy link

@vodik vodik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me short of one style issue, but I kinda want to dig into the weird OSError/BADF error with you sometime.

'''
# self.ensure_running() # treat our users like adults expecting
# them to spawn the server on their own
if len(fds) + 4 >= MAXFDS_TO_SEND:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, what's the fourth descriptor?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea - ain't my code. There's 2 pipes and a socket iirc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely 3 pipes: stdin, stdout, stderr.

reduction.sendfds(client, allfds)
break
except OSError as err:
if err.args[0] == 9:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really should use errno.EBADF instead of 9. Also errno attribute maps to args[0] - imho clearer:

import errno

...

except OSError as err:
    if err.errno = errno.EBADF

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

SIGNED_STRUCT = struct.Struct('q') # large enough for pid_t


class AdultForkServer(ForkServer):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PatchedForkServer?

print(f"Bad FD {err}")
client = socket.socket(socket.AF_UNIX)
client.connect(self._forkserver_address)
continue
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How many times does this hit? Might be interesting doing an os.walk on `/proc/self/fd/'

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iirc a couple times per proc. Yeah definitely worth further investigation.
Fwiw it seems forkserver.py is under pretty active development in cpython 3.8.

fs._forkserver_pid,
semaphore_tracker._semaphore_tracker._pid,
semaphore_tracker._semaphore_tracker._fd,
) = self._actor._forkserver_info
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two assignments are kinda tricky: you're building a tuple, unpacking tuples two different ways, while also chaining assignment. Its clever, but maybe borderline not readable?

At the very least, I could see someone assuming these two statements as similar - as they're written to look parallel - but they're not. The dominating block of the top constructs a tuple, while its unpacking one in the bottom. That last assignment could be visually easy to overlook.

Personal rule of thumb (imho) - the further you get to the end of a line, the less important the details should be.

I'd consider rewriting this, even if it makes it more verbose - but I don't necessarily consider this an egregiously offense either.

@goodboy
Copy link
Owner Author

goodboy commented Aug 3, 2018

but I kinda want to dig into the weird OSError/BADF error with you sometime.

@vodik definitely. We can do that as part of #20!

Tyler Goodlet added 7 commits August 4, 2018 18:15
The stdlib insists on creating multiple forkservers and semaphore trackers
for each sub-sub-process launched. This isn't ideal since it costs each
`tractor` sub-actor an additional 2 more processes then necessary and is
confusing when viewed as a process tree (eg. via `pstree`).

The majority of the change is simply avoiding the call to
`forkserver.ensure_running()` and `semaphore_tracker.ensure_running()`
in `ForkServer.connect_new_process()` and instead treating the user like
an adult and expecting those calls to be made *once* in the parent most
process (i.e. what `multiprocessing` calls the `MainProcess`).

Really a proper patch should be made against cpython which allows for
similar manual management of the server along with a mechanism to communicate
forkserver and semaphore tracker fd info to sub-processes such that
further calls to `Process.start()` work as expected.

Relates to #6
Start a forkserver once in the main (parent-most) process
and pass ipc info (fds) to subprocesses manually such that embedded
calls to `multiprocessing.Process.start()` just work. Note that this
relies on our overridden version of the stdlib's
`multiprocessing.forkserver` module.

Resolves #6
@goodboy goodboy merged commit 7f0f2e5 into master Aug 4, 2018
@goodboy goodboy deleted the forkserver_singleton branch July 21, 2020 04:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants