Intra-host communication #2046

Open
mrocklin opened this issue Jun 11, 2018 · 3 comments

Comments

@mrocklin
Member

Currently we have two kinds of comms:

  1. inproc:// for intra-process communication with queues
  2. tcp:// and tls:// for inter-process or inter-node communication with sockets

There is also a case in between: inter-process, intra-node communication, that is, processes communicating with each other on the same machine but in different memory spaces.

Do we expect to see performance improvements from handling this? How expensive would this be to implement?

cc @pitrou in case he has general thoughts

@adamklein
Contributor

This is a great topic and could greatly benefit me in certain cases, especially if we could avoid the overhead of serializing and copying. I have been generally curious about using the Plasma store from Arrow to share data across process boundaries but haven't had a chance to play with it.

@mrocklin
Member Author

To be clear, serialization will always be necessary if you want to move data between processes. However, serialization is also pretty much free in the case of numpy arrays, pandas dataframes, or anything else that is mostly binary data.
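
As a rough illustration of why serialization can be nearly free for binary data, here is a minimal sketch using the standard library's pickle protocol 5 (Python 3.8+) with out-of-band buffers. This is an assumption chosen for illustration, not necessarily how distributed serializes frames; `serialize` and `deserialize` are hypothetical helper names, and a `bytearray` stands in for the memory behind a numpy array.

```python
import pickle


def serialize(payload: bytearray):
    """Split a binary payload into a tiny pickle header plus raw frames.

    The bulk data is handed to buffer_callback instead of being copied
    into the pickle stream, so the header stays a few bytes long.
    """
    frames = []
    header = pickle.dumps(
        pickle.PickleBuffer(payload),
        protocol=5,
        buffer_callback=frames.append,
    )
    return header, frames


def deserialize(header: bytes, frames):
    """Rebuild the object from the header and the out-of-band frames."""
    return pickle.loads(header, buffers=frames)


payload = bytearray(b"x" * 1_000_000)  # stand-in for array memory
header, frames = serialize(payload)
restored = deserialize(header, frames)
print(len(header), len(frames))
```

The header carries only metadata, while the frame is a view onto the original memory. The cost of moving the data then comes from the transport (socket copy, shared memory, and so on) rather than from serialization itself.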

The use of POSIX shared memory (the trick that Plasma uses) would probably be the biggest benefit here if it ends up being worthwhile.
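
For a sense of what the shared memory approach looks like, here is a minimal sketch using `multiprocessing.shared_memory` from the standard library (Python 3.8+, so newer than this issue). It is not Plasma's API; `share_bytes` and `read_shared` are hypothetical helpers, and the reader runs in the same process only to keep the example short. A second process would attach using the same block name.

```python
from multiprocessing import shared_memory


def share_bytes(payload: bytes) -> shared_memory.SharedMemory:
    """Create a named shared memory block and copy the payload into it."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[: len(payload)] = payload
    return shm


def read_shared(name: str, size: int) -> bytes:
    """Attach to an existing block by name and copy `size` bytes out.

    The size is passed explicitly because the OS may round the block
    up to a whole number of pages.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()


owner = share_bytes(b"frame data")
received = read_shared(owner.name, len(b"frame data"))
owner.close()
owner.unlink()  # the creator is responsible for freeing the block
print(received)
```

Only the block name and size need to travel over the regular comm. The payload itself is written once by the producer and read in place by the consumer.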

@jakirkham
Member

UNIX domain sockets (#3630) would be one case of this. Though I guess the idea here is to handle any platform?
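
For reference, a minimal sketch of a UNIX domain socket round trip using only the standard library. It assumes a platform where `socket.AF_UNIX` is available; the `comm.sock` file name and the `roundtrip` helper are made up for illustration, and a thread stands in for the peer process.

```python
import os
import socket
import tempfile
import threading


def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo back a single small message."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def roundtrip(message: bytes) -> bytes:
    """Send a message over a UNIX domain socket and return the echo."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "comm.sock")  # filesystem path, not host:port
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            peer = threading.Thread(target=echo_once, args=(server,))
            peer.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(message)
                reply = client.recv(1024)
            peer.join()
    return reply


print(roundtrip(b"ping"))
```

The programming model is the same as TCP, so a comm built on it could likely reuse most of the socket handling while skipping the network stack.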

One option would be to copy frames into `multiprocessing.Array` objects before transmitting them. This may require some knowledge of where the data is going (#400) in order to benefit from this feature.
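
A minimal sketch of the copy step, assuming frames are plain `bytes`. The helper names are hypothetical, and the read-back happens in the same process for brevity.

```python
import multiprocessing


def frame_to_shared_array(frame: bytes):
    """Copy a frame into a shared array of unsigned bytes.

    lock=False returns the raw ctypes array without a wrapping lock,
    which is enough for a write-once, read-many frame.
    """
    return multiprocessing.Array("B", frame, lock=False)


def shared_array_to_bytes(arr) -> bytes:
    """Copy the contents of the shared array back out as bytes."""
    return bytes(arr)


shared = frame_to_shared_array(b"\x00\x01\x02frame")
print(len(shared), shared_array_to_bytes(shared))
```

These arrays are meant to be handed to a child process when it is created (for example as an argument to `multiprocessing.Process`), which is one reason the sender would need to know the destination ahead of time.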
