ENH: Numpy arrays shareable among related processes. #7533
Conversation
Can you send a mail to the mailing list about this? My first observations:
In the end the big question is probably whether or not to include it into NumPy proper (especially if it only works on some systems) and the answer to that should be discussed on the list.
Note that if tests are in
…ponding test suite.
Thank you for your suggestions. I will certainly post to the mailing list, but polishing the request a bit first seems like a good idea to me. Anyway, good news everyone --- I have checked today and I have been able to get it working on MS Windows too.
to stay the same even if you use the :mod:`threading` module due to a
CPython feature called `GIL <https://wiki.python.org/moin/GlobalInterpreterLock>`_
(global interpreter lock). GIL ensures that only one thread is active
at a time, so threre is no true multitasking.
This is not quite right. Many operations with NumPy/SciPy/pandas release the GIL, which makes multi-threading quite viable. IO also generally releases the GIL. So multi-threading is only not viable if the inner loop of your code is in pure Python.
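The point above can be illustrated with a short sketch (illustrative only, not part of the PR; assumes NumPy >= 1.17 for `default_rng`): a large NumPy reduction releases the GIL, so two threads computing it can genuinely overlap, whereas a pure-Python inner loop would serialize on the GIL.

```python
import threading
import numpy as np

def worker(a, b, out, i):
    # A large dot product releases the GIL inside the BLAS call,
    # so two of these threads can actually run concurrently.
    out[i] = float(a @ b)

rng = np.random.default_rng(0)
a = rng.random(500_000)
b = rng.random(500_000)

results = [None, None]
threads = [threading.Thread(target=worker, args=(a, b, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads compute the same value; the benefit is that wall-clock time scales with cores despite `threading`, because the hot loop is not pure Python.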
Also there is a typo in threre
@Nodd The typo is fixed.
@shoyer As I have explained in the ML, I see the main benefit of this PR in use cases involving third-party modules that do complicated calculations (involving the creation of Python objects) and that have a numpy interface.
I totally agree with you that NumPy/SciPy/pandas don't fall into this category.
This needs tests, and justification for the custom pickling methods, which are not used in any of the current examples. I'm pretty sure the reason why they exist is so you can pass these arrays into methods like

I do see some value in providing a canonical right way to construct shared memory arrays in NumPy, but I'm not very happy with this solution, because it appears to require convoluted and error-prone setup (you definitely need to create shared memory arrays in the parent process before launching any subprocesses) and terrible code organization (with the global variables). Frankly, I would switch to using something like Numba or Cython for my inner loop (or another programming language without the GIL) before I would suggest adopting the current approach.

If there's some way we can paper over the boilerplate such that users can use it without understanding the arcana of multiprocessing, then yes, that would be great. But otherwise I'm not sure there's anything to be gained by putting it in a library rather than referring users to the examples on StackOverflow [1] [2]. In my mind, there is roughly an appropriate level of boilerplate for user code:
Finally, even if this can be done cleanly, it's not clear to me that this belongs in NumPy. Joblib already solves exactly this sort of problem. Because it writes temporary files to

[1] http://stackoverflow.com/questions/10721915/shared-memory-objects-in-python-multiprocessing
The discussion not concerning implementation details is likely to move to the ML:
I have added a quite complete example of a module that wraps around
The concurrent module is new in Python 3.2, so it is not available in 2.7. Tests and whatever else needs to be fixed to account for that.
Please squash the commits into relevant bits and follow the commit message format in Also, the name
@charris Can I assume that merging this to numpy is not ruled out? I will gladly work on improving this PR, but obviously not if it is guaranteed that it won't be accepted.
It is not my area, but the comments on the mailing list were somewhat skeptical. I'd be inclined to leave this out unless several people make a strong case for its inclusion. @njsmith @sturlamolden Thoughts?
@matejak please take a look at the recent mailing list discussion. That's where you need to convince (some) people that this is useful. But my opinion is that this abstraction is too leaky to be a good fit for numpy.
It might fit better in SciPy, if there were such a package as scipy.parallel, but there isn't. On the other hand, joblib is in a package of its own, so maybe it fits better outside of everything? I don't know.
As the original author of shmarray, I think it's unfortunately too much of a kludge to go in either numpy or scipy. It's useful for what I do, and I'd love to see an easy and robust way of handling shared numpy arrays, but whilst shmarray is easy, it's not robust or particularly intuitive, and you really need to know the limitations/internal architecture to be able to use it. To clear up a few of the points made above and add some context:
def rendJitTri(im, x, y, jsig, mcp, imageBounds, pixelSize, n=1):
'''Helper function which runs on each spawned process'''
for i in range(n):
scipy.random.seed()
Imc = scipy.rand(len(x)) < mcp
if isinstance(jsig, numpy.ndarray):
jsig2 = jsig[Imc]
else:
        jsig2 = float(jsig)
T = delaunay.Triangulation(x[Imc] + jsig2*scipy.randn(Imc.sum()), y[Imc] + jsig2*scipy.randn(Imc.sum()))
rendTri(T, imageBounds, pixelSize, im=im)
def rendJitTriang(x,y,n,jsig, mcp, imageBounds, pixelSize):
'''Perform kernel density estimation using jittered triangulation'''
sizeX = int((imageBounds.x1 - imageBounds.x0)/pixelSize)
sizeY = int((imageBounds.y1 - imageBounds.y0)/pixelSize)
im = shmarray.zeros((sizeX, sizeY))
x = shmarray.create_copy(x)
y = shmarray.create_copy(y)
if type(jsig) == numpy.ndarray:
jsig = shmarray.create_copy(jsig)
nCPUs = multiprocessing.cpu_count()
tasks = (n/nCPUs)*numpy.ones(nCPUs, 'i')
tasks[:(n%nCPUs)] += 1
processes = [multiprocessing.Process(target = rendJitTri, args=(im, x, y, jsig, mcp, imageBounds, pixelSize, nIt)) for nIt in tasks]
for p in processes:
p.start()
for p in processes:
p.join()
    return im/n
Anonymous shared memory must be allocated before the fork call. Named shared memory can be allocated after the fork call. We have to use named segments (System V IPC
Joblib is also crippled on Mac and FreeBSD, since /tmp is not backed by tmpfs on these systems. In practice, joblib only provides shared memory on Linux.
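For modern readers: the named-segment approach discussed above was later added to the standard library as `multiprocessing.shared_memory` (Python 3.8+, well after this thread). A segment created in one process can be attached in another using only its name, so it does not need to exist before the fork. A minimal sketch (illustrative, not from this PR):

```python
from multiprocessing import shared_memory
import numpy as np

# Create a named segment (the name is auto-generated here).
shm = shared_memory.SharedMemory(create=True, size=8 * 8)
a = np.ndarray((8,), dtype=np.float64, buffer=shm.buf)
a[:] = np.arange(8)

# A second process would attach by name after it starts; shown
# in-process here for brevity.
other = shared_memory.SharedMemory(name=shm.name)
b = np.ndarray((8,), dtype=np.float64, buffer=other.buf)
total = float(b.sum())

# Release the NumPy views before closing, then unlink the segment.
del a, b
other.close()
shm.close()
shm.unlink()
```

On POSIX this is backed by shm_open (i.e. tmpfs on Linux), which addresses the /tmp caveat raised for joblib above.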
Taking all this together, I'm going to close this. @matejak If you want to pursue the matter, feel free to yell.
Not to necro an old thread, but more to share something with those that may look for this sort of thing. One can build a custom ctypes array that has all the advantages of
This pull request introduces work that has been seen here earlier:
https://bitbucket.org/cleemesser/numpy-sharedmem/overview (shmarray.py)
Originally written by David Baddeley, then maintained by Chris Lee-Messer, and now maintained by me. The license is BSD.
This pull request introduces an np.shm module with empty, ones, zeros and copy functions that behave the same as their ordinary numpy counterparts with one exception: the output array can be shared between processes, allowing for streamlined parallel data processing with low overhead. This PR also features a test suite and documentation.
Accepting it would open the door to parallelizing complicated data processing (for instance, processing using external modules) if the problem has a well-defined array-in, array-out interface.
What I am not too sure about:
The shm.py file and the __new__ method of the shmarray class are still a mystery to me, although I have read the "subclassing ndarray" guide. Could you please take a look at it?