
ENH: Numpy arrays shareable among related processes. #7533

Closed
wants to merge 11 commits

Conversation

@matejak

@matejak matejak commented Apr 9, 2016

This pull request introduces work that has been seen here earlier:
https://bitbucket.org/cleemesser/numpy-sharedmem/overview (shmarray.py)
Originally written by David Baddeley, then maintained by Chris Lee-Messer, and finally treated by me. The license is a BSD one.

This pull request introduces an np.shm module with empty, ones, zeros, and copy functions that behave the same as their ordinary NumPy counterparts, with one exception: the output array can be shared between processes, allowing for streamlined parallel data processing with low overhead.

This PR also features a test suite and documentation.
Accepting it would open the door to parallelizing complicated data processing (for instance, processing that uses external modules) whenever the problem has a well-defined array-in, array-out interface.
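For context, the kind of allocation the proposed np.shm module performs can be sketched on top of the standard library. This is an illustrative assumption, not the PR's actual code: the function name shm_zeros and its implementation are stand-ins.

```python
import numpy as np
from multiprocessing import sharedctypes

def shm_zeros(shape, dtype=np.float64):
    """Illustrative stand-in for the proposed np.shm.zeros: allocate the
    backing buffer in anonymous shared memory, so processes forked later
    see the same physical pages instead of private copies."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    # RawArray allocates zero-initialized anonymous shared memory.
    raw = sharedctypes.RawArray('b', nbytes)
    return np.frombuffer(raw, dtype=dtype).reshape(shape)

a = shm_zeros((2, 3))
a[0, 0] = 7.0  # would be visible to any process forked after this point
```

The array behaves like an ordinary ndarray; only the location of its buffer differs.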

What I am not too sure about:

  • Where to place the shm.py file,
  • Does it really work on non-Unix platforms?
  • Contents of the __new__ method of the shmarray class are still a mystery to me, although I have read the "subclassing ndarray" guide. Could you please cast an eye over it?

@seberg
Member

seberg commented Apr 10, 2016

Can you send a mail to the mailing list about this? My first observations:

  • This does not work on Windows, I assume (see the test errors)?
  • The name(s) seem unnecessarily shortened.
  • Can you explain why the ndarray subclass is needed? Subclasses can be rather annoying to get right, and for other reasons as well.

In the end the big question is probably whether or not to include it into NumPy proper (especially if it only works on some systems) and the answer to that should be discussed on the list.

@charris
Member

charris commented Apr 10, 2016

Note that if tests are in numpy/core, the implementation should also be there.

@matejak
Author

matejak commented Apr 10, 2016

Thank you for your suggestions. I will certainly post to the mailing list, but polishing the request a bit first seems like a good idea to me.

Anyway, good news everyone --- I have checked today and I have been able to get it working on MS Windows too.

to stay the same even if you use the :mod:`threading` module due to a
CPython feature called `GIL <https://wiki.python.org/moin/GlobalInterpreterLock>`_
(global interpreter lock). GIL ensures that only one thread is active
at a time, so threre is no true multitasking.
@shoyer
Member

This is not quite right. Many operations in NumPy/SciPy/pandas release the GIL, which makes the benefit of multi-threading quite variable. IO also generally releases the GIL. So multi-threading is only non-viable if the inner loop of your code is in pure Python.
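To illustrate this point: large BLAS-backed operations run with the GIL released, so plain threads already overlap them. A minimal sketch (the sizes and thread count here are arbitrary):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)
mats = [rng.standard_normal((200, 200)) for _ in range(4)]

# The matrix products execute inside BLAS with the GIL released,
# so the four workers can run concurrently despite CPython's lock.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda m: m @ m, mats))
```

If the per-item work were pure Python instead, the threads would serialize on the GIL and shared memory between processes would be the more attractive route.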

@Nodd
Contributor

Also, there is a typo in "threre".

@matejak
Author

@Nodd The typo is fixed.
@shoyer As I explained on the ML, I see the main benefit of this PR in use cases involving third-party modules that do complicated calculations (involving creation of Python objects) behind a NumPy array interface.
I totally agree with you that NumPy/SciPy/pandas don't fall into this category.

@shoyer
Member

shoyer commented Apr 11, 2016

This needs tests and justification for the custom pickling methods, which are not used in any of the current examples. I'm pretty sure the reason they exist is so you can pass these arrays into methods like multiprocessing.Pool.apply_async, concurrent.futures.Executor.submit or joblib.Parallel as an argument. If that works, I would definitely use it for the examples -- it's much cleaner to pass arrays around as arguments than to rely on global variables.

I do see some value in providing a canonical right way to construct shared-memory arrays in NumPy, but I'm not very happy with this solution, because it appears to require convoluted and error-prone setup (you definitely need to create shared-memory arrays in the parent process before launching any subprocesses) and terrible code organization (with the global variables). Frankly, I would switch to something like Numba or Cython for my inner loop (or another programming language without the GIL) before I would suggest adopting the current approach.

If there's some way we can paper over the boilerplate such that users can use it without understanding the arcana of multiprocessing, then yes, that would be great. But otherwise I'm not sure there's anything to be gained by putting it in a library rather than referring users to the examples on StackOverflow [1] [2].

In my mind, this is roughly the appropriate level of boilerplate for user code:

def do_something(shape, args):
    shared_array = np.shared_empty_array(shape)
    joblib.Parallel()(delayed(func)(shared_array, other, ...) for other in args)

Finally, even if this can be done cleanly, it's not clear to me that this belongs in NumPy. Joblib already solves exactly this sort of problem. Because it writes temporary files to /dev/shm/ if available, this is basically a solved problem on Linux.

[1] http://stackoverflow.com/questions/10721915/shared-memory-objects-in-python-multiprocessing
[2] http://stackoverflow.com/questions/7894791/use-numpy-array-in-shared-memory-for-multiprocessing
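The StackOverflow pattern referred to above boils down to wrapping a multiprocessing.Array in an ndarray view on both sides of a fork. A minimal sketch, assuming the Unix-only "fork" start method (the function names here are illustrative):

```python
import multiprocessing as mp
import numpy as np

def fill_row(shared, shape, row, value):
    # Re-wrap the shared buffer as an ndarray inside the child process.
    arr = np.frombuffer(shared.get_obj()).reshape(shape)
    arr[row, :] = value

def parallel_fill(shape):
    ctx = mp.get_context("fork")                   # children inherit the mapping
    shared = ctx.Array('d', shape[0] * shape[1])   # float64 buffer plus a lock
    procs = [ctx.Process(target=fill_row, args=(shared, shape, r, float(r)))
             for r in range(shape[0])]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # The parent sees the children's writes through the same buffer.
    return np.frombuffer(shared.get_obj()).reshape(shape)

grid = parallel_fill((4, 5))  # row r holds the value r
```

Each child writes a disjoint row, so no locking is needed here; overlapping writes would need `shared.get_lock()`.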

@matejak
Author

matejak commented Apr 11, 2016

The discussion not concerning implementation details is likely to move to the ML:
https://mail.scipy.org/pipermail/numpy-discussion/2016-April/075342.html

@matejak
Author

matejak commented Apr 18, 2016

I have added a fairly complete example of a module that wraps multiprocessing.Pool and can also perform progress notification. I believe that some of the boilerplate can be avoided by introducing factory functions; would you like me to look into that?

@charris
Member

charris commented May 9, 2016

The concurrent module is new in Python 3.2, so it is not available in 2.7. The tests and whatever else uses it need to be fixed to account for that.

@charris
Member

charris commented May 9, 2016

Please squash the commits into relevant bits and follow the commit message format in doc/source/dev/gitwash/development_workflow.rst.

Also, the name shm is not informative, perhaps a simple shared_memory or something along those lines?

@charris charris changed the title Numpy arrays shareable among related processes. ENH: Numpy arrays shareable among related processes. May 9, 2016
@matejak
Author

matejak commented May 16, 2016

@charris Can I assume that merging this to numpy is not ruled out? I will gladly work on improving this PR, but obviously not if it is guaranteed that it won't be accepted.

@charris
Member

charris commented May 16, 2016

It is not my area, but the comments on the mailing list were somewhat skeptical. I'd be inclined to leave this out unless several people make a strong case for its inclusion. @njsmith @sturlamolden Thoughts?

@shoyer
Member

shoyer commented May 16, 2016

@matejak please take a look at the recent mailing list discussion. That's where you need to convince (some) people that this is useful. But my opinion is that this abstraction is too leaky to be a good fit for NumPy.

@sturlamolden
Contributor

It might fit better in SciPy, if there were such a package as scipy.parallel, but there isn't. On the other hand, joblib is in a package of its own, so maybe it fits better outside of everything? I don't know.

@David-Baddeley

As the original author of shmarray, I think it's unfortunately too much of a kludge to go in either NumPy or SciPy. It's useful for what I do, and I'd love to see an easy and robust way of handling shared NumPy arrays, but while shmarray is easy, it's not robust or particularly intuitive, and you really need to know its limitations and internal architecture to be able to use it.

To clear up a few of the points made above and add some context:

  • joblib is not really an option if you want shared memory under Windows.
  • I'd only ever use this approach when it's unavoidable. In my use case, the inner loop was already coded in C and called a C library which is not thread-safe.
  • Global variables are not necessary. The "canonical" test case (i.e. the use case I wrote the module for) does not use globals (see the snippet below, or the full file at https://bitbucket.org/david_baddeley/python-microscopy/src/tip/PYME/LMVis/visHelpers.py). The args to multiprocessing.Process are, however, pickled within the multiprocessing module before being sent to the child processes, hence the need for the custom pickling behaviour.
  • when writing the module I'd originally hoped that the custom pickling would also allow the object to be passed into Queues and Pipes, but this doesn't work (it's been a while, but I think that the memory needs to be allocated before the 'fork' call).
def rendJitTri(im, x, y, jsig, mcp, imageBounds, pixelSize, n=1):
    '''Helper function which runs on each spawned process'''
    for i in range(n):
        scipy.random.seed()

        Imc = scipy.rand(len(x)) < mcp

        if isinstance(jsig, numpy.ndarray):
            jsig2 = jsig[Imc]
        else:
            jsig2 = float(jsig)

        T = delaunay.Triangulation(x[Imc] +  jsig2*scipy.randn(Imc.sum()), y[Imc] +  jsig2*scipy.randn(Imc.sum()))

        rendTri(T, imageBounds, pixelSize, im=im)


def rendJitTriang(x,y,n,jsig, mcp, imageBounds, pixelSize):
    '''Perform kernel density estimation using jittered triangulation'''
    sizeX = int((imageBounds.x1 - imageBounds.x0)/pixelSize)
    sizeY = int((imageBounds.y1 - imageBounds.y0)/pixelSize)

    im = shmarray.zeros((sizeX, sizeY))


    x = shmarray.create_copy(x)
    y = shmarray.create_copy(y)
    if type(jsig) == numpy.ndarray:
        jsig = shmarray.create_copy(jsig)


    nCPUs = multiprocessing.cpu_count()

    tasks = (n/nCPUs)*numpy.ones(nCPUs, 'i')
    tasks[:(n%nCPUs)] += 1

    processes = [multiprocessing.Process(target = rendJitTri, args=(im, x, y, jsig, mcp, imageBounds, pixelSize, nIt)) for nIt in tasks]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    return im/n

@sturlamolden
Contributor

sturlamolden commented May 19, 2016

> when writing the module I'd originally hoped that the custom pickling would also allow the object to be passed into Queues and Pipes, but this doesn't work (it's been a while, but I think that the memory needs to be allocated before the 'fork' call).

Anonymous shared memory must be allocated before the fork call. Named shared memory can be allocated after the fork call. multiprocessing.Array allocates anonymous shared memory. It must be instantiated before forking, and can only be shared by inheritance (not by pickling).

We have to use named segments (System V IPC shmget, POSIX shm_open, or mmap of a file in /tmp on Linux) in order for multiprocessing.Queue to work.
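Named segments are exactly what the later multiprocessing.shared_memory module (added in Python 3.8, so after this discussion) provides: another process can attach purely by name, and the name is a plain string that pickles trivially through a Queue. A minimal single-process sketch:

```python
import numpy as np
from multiprocessing import shared_memory

# Create a named segment and view it as a float64 array of length 4.
seg = shared_memory.SharedMemory(create=True, size=4 * 8)
a = np.ndarray((4,), dtype=np.float64, buffer=seg.buf)
a[:] = [1.0, 2.0, 3.0, 4.0]

# Attaching by name is all another process would need to do.
peer = shared_memory.SharedMemory(name=seg.name)
b = np.ndarray((4,), dtype=np.float64, buffer=peer.buf)
b[0] = 42.0                  # visible through the original view

shared_value = a[0]          # read back through the original mapping

# Views must be dropped before the segments can be closed and unlinked.
del a, b
peer.close()
seg.close()
seg.unlink()
```

The second attachment here happens in the same process for brevity, but the mechanism is identical across processes.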

@sturlamolden
Contributor

joblib is not really an option if you want shared memory under windows

Joblib is also crippled on Mac and FreeBSD, since /tmp is not backed by tmpfs on these systems. In practice, joblib only provides shared memory on Linux.

@charris
Member

charris commented May 19, 2016

Taking all this together, I'm going to close this. @matejak If you want to pursue the matter, feel free to yell.

@charris charris closed this May 19, 2016
@jakirkham
Contributor

Not to necro an old thread, but more to share something with those who may look for this sort of thing.

One can build a custom ctypes array that has all the advantages of multiprocessing's RawArray. Plus it keeps its metadata (shape, type, etc.) even when being pickled, which makes it easy to share between processes. One can also easily use the buffer of this object with NumPy. Here's an example. The code needed to implement this is fairly simple, as it has multiprocessing do all the heavy lifting.
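The core of that idea can be sketched as a small wrapper that keeps the shape and dtype next to the RawArray, so the ndarray view can be rebuilt wherever the buffer ends up. The class name SharedNDArray here is a hypothetical illustration, and the custom pickling via multiprocessing's reduction machinery from the linked example is omitted:

```python
import numpy as np
from multiprocessing import sharedctypes

class SharedNDArray:
    """Sketch: a RawArray plus the metadata needed to rebuild an ndarray view.
    The linked example additionally customizes pickling; not shown here."""

    def __init__(self, shape, dtype=np.float64):
        self.shape = tuple(shape)
        self.dtype = np.dtype(dtype)
        nbytes = int(np.prod(self.shape)) * self.dtype.itemsize
        self._raw = sharedctypes.RawArray('b', nbytes)

    def asarray(self):
        # Every call returns a fresh view over the same shared buffer.
        return np.frombuffer(self._raw, dtype=self.dtype).reshape(self.shape)

s = SharedNDArray((2, 2))
s.asarray()[:] = [[1.0, 2.0], [3.0, 4.0]]
view = s.asarray()   # independent view, same underlying memory
```

Writes made through any one view are visible through all the others, since they alias the same RawArray.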
