problem using multiprocessing with really big objects? #61760
I ran into a problem using multiprocessing to create large data objects (in this case numpy float64 arrays with 90,000 columns and 5,000 rows) and return them to the original Python process. It breaks in both Python 2.7 and 3.3, using numpy 1.7.0 (but with different error messages). Is it possible the array is too large to be serialized (450 million 64-bit numbers exceed a 32-bit limit)?

Python 2.7:

Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 99, in worker
    put((job, i, result))
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
    return send(obj)
SystemError: NULL result without error in PyObject_Call

Python 3.3:

Traceback (most recent call last):
File "multi.py", line 18, in <module>
results = pool.map_async(make_data, range(5)).get(9999999)
File "/usr/lib/python3.3/multiprocessing/pool.py", line 562, in get
raise self._value
multiprocessing.pool.MaybeEncodingError: Error sending result: '[array([[ 0.74628629, 0.36130663, -0.65984794, ..., -0.70921838,
0.34389663, -1.7135126 ],
[ 0.60266867, -0.40652402, -1.31590562, ..., 1.44896246,
-0.3922366 , -0.85012842],
[ 0.59629641, -0.00623001, -0.12914128, ..., 0.99925511,
-2.30418136, 1.73414009],
...,
[ 0.24246639, 0.87519509, 0.24109069, ..., -0.48870107,
-0.20910332, 0.11749621],
[ 0.62108937, -0.86217542, -0.47357384, ..., 1.59872243,
0.76639995, -0.56711461],
[ 0.90976471, 1.73566475, -0.18191821, ..., 0.19784432,
-0.29741643, -1.46375835]])]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647",)'
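The root cause is easy to demonstrate in isolation: struct's 'i' format only accepts values that fit in a signed 32-bit integer, which is exactly the error in the 3.3 traceback above:

import struct
struct.pack("!i", 2**31 - 1)  # fine: largest signed 32-bit value
struct.pack("!i", 2**31)      # struct.error: 'i' format requires -2147483648 <= number <= 2147483647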
A multiprocessing queue currently uses a 32-bit signed int to encode object length (in bytes):

def _send_bytes(self, buf):
    # For wire compatibility with 3.2 and lower
    n = len(buf)
    self._send(struct.pack("!i", n))
    # The condition is necessary to avoid "broken pipe" errors
    # when sending a 0-length buffer if the other end closed the pipe.
    if n > 0:
        self._send(buf)

I *think* we need to keep compatibility with the wire format, but perhaps we could use a special length value (-1?) to introduce a longer (64-bit) length value.
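A minimal sketch of that suggestion (illustrative only; _send is assumed to behave as in the snippet above): lengths that fit in a signed 32-bit int keep the existing wire format, and larger buffers are announced with a -1 sentinel followed by an unsigned 64-bit length:

import struct

def _send_bytes_large(self, buf):
    n = len(buf)
    if n <= 0x7FFFFFFF:
        # Small payloads use the existing wire format unchanged.
        self._send(struct.pack("!i", n))
    else:
        # Sentinel -1 tells a new receiver to read a 64-bit length next.
        self._send(struct.pack("!i", -1))
        self._send(struct.pack("!Q", n))
    if n > 0:
        self._send(buf)

An old receiver would misread the sentinel, so both ends must understand the extension before it is used.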
Yes we could, although that would not help on Windows pipe connections (where byte oriented messages are used instead). Also, does pickle currently handle byte strings larger than 4GB?

But I can't help feeling that multigigabyte arrays should be transferred using shared mmaps rather than serialization. numpy.frombuffer() could be used to recreate the array from the mmap. multiprocessing currently only allows sharing of such shared arrays using inheritance. Perhaps we need a picklable mmap type which can be sent over pipes and queues. (On Unix this would probably require fd passing.)
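A minimal sketch of the mmap-plus-frombuffer idea (assuming the Unix fork start method, so children inherit the mapping):

import mmap
import numpy as np

rows, cols = 5000, 90000
nbytes = rows * cols * 8                 # float64, ~3.6 GB; pages allocated lazily
buf = mmap.mmap(-1, nbytes)              # anonymous mapping, MAP_SHARED by default
arr = np.frombuffer(buf, dtype=np.float64).reshape(rows, cols)
arr[:] = 0.0                             # parent and fork()ed children see the same pages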
On a machine with 256GB of RAM, it makes more sense to send arrays of this size than, say, on a laptop...
On 27/03/2013 5:13pm, mrjbq7 wrote:
I was thinking more of speed than memory consumption.
The 2.7 failure is indeed a pickle limitation, which should now be fixed by issue bpo-13555.

Richard was saying that you shouldn't serialize such a large array; that's just a huge performance bottleneck. The right way would be to use shared memory.

You mean through fork() COW?

If you use POSIX semaphores, you could pass the semaphore path and use sem_open in the other process (but that would mean you can't unlink it right after open).
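For the fork/inheritance route, a minimal sketch using multiprocessing.Array plus a zero-copy numpy view (Unix fork start method assumed):

from multiprocessing import Process, Array
import numpy as np

shared = Array('d', 1000000, lock=False)         # raw shared ctypes buffer
parent_view = np.frombuffer(shared, dtype=np.float64)

def worker():
    # The child inherits the same pages; the data is never pickled.
    child_view = np.frombuffer(shared, dtype=np.float64)
    child_view[:] = 1.0

if __name__ == "__main__":
    p = Process(target=worker)
    p.start()
    p.join()
    print(parent_view[:3])                       # [1. 1. 1.]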
Gotcha, for clarification, my original use case was to *create* them
On 27/03/2013 5:47pm, Charles-François Natali wrote:

Through fork, yes, but "shared" rather than "copy-on-write".

I assume you mean "shared memory" and shm_open(), not "semaphores" and sem_open(). By using fd passing you can get the operating system to do the ref counting.
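Fd passing is already available via multiprocessing.reduction; a minimal sketch of sending the descriptor of an mmap-backed buffer to a child process (Unix only):

import mmap
import os
import tempfile
from multiprocessing import Pipe, Process
from multiprocessing.reduction import recv_handle, send_handle

def child(conn):
    fd = recv_handle(conn)              # kernel duplicates the descriptor for us
    with mmap.mmap(fd, 8) as m:         # maps the same physical pages as the parent
        print(m[:5])                    # b'hello'
    os.close(fd)

if __name__ == "__main__":
    f = tempfile.TemporaryFile()        # backing file, already unlinked
    f.truncate(8)
    m = mmap.mmap(f.fileno(), 8)
    m[:5] = b"hello"
    parent_conn, child_conn = Pipe()
    p = Process(target=child, args=(child_conn,))
    p.start()
    send_handle(parent_conn, f.fileno(), p.pid)
    p.join()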
There's a subtlety: because of refcounting, just treating a COW object as read-only still writes to its pages (the refcount field gets updated), so the pages are copied anyway.

Yes ;-)

File-backed mmap() will incur disk I/O (although some of the data will stay in the page cache).
On 27/03/2013 7:27pm, Charles-François Natali wrote:

I mean "write-through" (as opposed to "read-only" or "copy-on-write").

Apart from creating, unlinking and resizing the file, I don't think there should be any disk I/O. On Linux disk I/O only occurs when fsync() or close() are called. Once the file has been unlink()ed, any sensible operating system should not bother writing the dirty pages to disk (unless there is memory pressure).
What?

It's the same on Linux: depending on your mount options, data will be written back to disk periodically anyway.

Why?
On 27/03/2013 8:14pm, Charles-François Natali wrote:

I meant when there is no memory pressure.

Googling suggests that MAP_SHARED writeback on Linux behaves the same as on other systems, and the Linux man page refuses to specify when the underlying file is actually updated (quoted below).

Can you demonstrate a slowdown with a benchmark?
http://lwn.net/Articles/326552/

This just means that it will reduce synchronous writeback, but writeback still occurs. On Linux, writeback can be done by background kernel threads. The "mount option" thing is the following, from man mount:

data={journal|ordered|writeback}
    Specifies the journalling mode for file data. Metadata is always
    journaled. To use modes other than ordered on the root filesystem,
    pass the mode to the kernel as boot parameter, e.g.
    rootflags=data=journal.
    journal
        All data is committed into the journal prior to being written
        into the main filesystem.
    ordered
        This is the default mode. All data is forced directly out to
        the main file system prior to its metadata being committed to
        the journal.
    writeback
        Data ordering is not preserved - data may be written into the
        main filesystem after its metadata has been committed to the
        journal.
commit=nrsec
    Sync all data and metadata every nrsec seconds. The default value
    is 5 seconds. Zero means default.

> The Linux man page refuses to specify
>
> MAP_SHARED
>     Share this mapping. Updates to the mapping are visible to other
>     processes that map this file, and are carried through to the
>     underlying file. **The file may not actually be updated until
>     msync(2) or munmap() is called.**

*may*: just as fsync() is required to make sure data is committed to disk.

> Can you demonstrate a slowdown with a benchmark?

I could, but I don't have to: a shared memory area won't incur any I/O or writeback (unless there is memory pressure).
On 27/03/13 21:09, Charles-François Natali wrote:
You are right.
I have implemented a custom subclass of the multiprocessing Pool to be able to plug a custom pickling strategy for this specific use case in joblib: https://github.com/joblib/joblib/blob/master/joblib/pool.py#L327 In particular it can:

Here is some doc: https://github.com/joblib/joblib/blob/master/doc/parallel_numpy.rst

I could submit the part that makes it possible to customize the picklers of multiprocessing.pool.Pool instances to the standard library if people are interested. The numpy-specific stuff would stay in third-party projects such as joblib, but at least that would make it easier for people to plug their own optimizations without having to override half of the multiprocessing class hierarchy.
I forgot to end a sentence in my last comment.
2.7 and 3.3 are in bugfix mode now, so they will not change. In 3.3 you can do

from multiprocessing.forking import ForkingPickler
ForkingPickler.register(MyType, reduce_MyType)

Is this sufficient for your needs? This is private (and its definition has moved in 3.4) but it could be made public.
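For illustration, a sketch of what a registered reducer for numpy arrays could look like (rebuild_ndarray and reduce_ndarray are made-up names; this still copies the bytes, it only replaces the default pickling path):

import numpy as np
from multiprocessing.forking import ForkingPickler  # multiprocessing.reduction in 3.4+

def rebuild_ndarray(data, dtype, shape):
    # Runs in the receiving process to reconstruct the array.
    # The rebuilt view is read-only; call .copy() if mutation is needed.
    return np.frombuffer(data, dtype=dtype).reshape(shape)

def reduce_ndarray(a):
    # Tells the pickler how to take the array apart.
    return (rebuild_ndarray, (a.tobytes(), a.dtype, a.shape))

ForkingPickler.register(np.ndarray, reduce_ndarray)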
Indeed, I forgot that the multiprocessing pickler was already made pluggable.
This is still an issue in Python 2.7.5. Will it be fixed?
@artxyz: The current release of 2.7 is 2.7.13 -- if you are still using 2.7.5 you might consider updating to the latest release.

As pointed out in the text of the issue, the multiprocessing pickler has been made pluggable in 3.3 and it's been made more conveniently so in 3.6. The issue reported here arises from the constraints of working with large objects and pickle, hence the enhanced ability to take control of the multiprocessing pickler in 3.x applies.

I'll assign this issue to myself as a reminder to create a blog post around this example and potentially include it as a motivating need for controlling the multiprocessing pickler in the documentation.
@davin Thanks for your answer! I will update to the current version.
Pickle currently handles byte strings and unicode strings larger than 4GB only with protocol 4, but multiprocessing currently uses the default protocol, which currently equals 3. There were suggestions to change the default pickle protocol (bpo-23403), the pickle protocol for multiprocessing (bpo-26507), or to customize the serialization method for multiprocessing (bpo-28053). There is also a patch that implements support for byte strings and unicode strings larger than 4GB with all protocols (bpo-25370).

Besides this, I think that using some kind of shared memory is a better way of transferring large data between subprocesses.
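Later Python versions grew direct support for this idea: a minimal sketch using multiprocessing.shared_memory (added in 3.8, after this discussion), with a small array for brevity:

from multiprocessing import shared_memory
import numpy as np

shape, dtype = (1000, 1000), np.float64
shm = shared_memory.SharedMemory(create=True, size=int(np.prod(shape)) * 8)
arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
arr[:] = 0.0                        # fill in place; nothing is pickled

# A worker attaches by name instead of receiving a serialized copy:
# other = shared_memory.SharedMemory(name=shm.name)
# view = np.ndarray(shape, dtype=dtype, buffer=other.buf)

shm.close()
shm.unlink()                        # release the segment when done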
There's also the other multiprocessing limitation Antoine mentioned early on, where queues/pipes used a 32-bit signed integer to encode object length. Is there a way or plan to get around this limitation?
As I said above in https://bugs.python.org/issue17560#msg185345, it should be easy to improve the current protocol to allow for larger than 2GB data.
Davin, when you write up a blog post, I think it would be helpful to mention that creating really large objects with multiprocessing is mostly an anti-pattern (the cost of pickling and interprocess communication tends to drown out the benefits of parallel processing).
I apologize if this isn't the right forum to post this message, but Davin, if you have since put together the guide/blogpost mentioned in your message, would you be able to share a link to the material please? I am interested in gaining a better understanding of multiprocessing's pluggable pickler and, more generally, dealing with large objects (the comment about this being an anti-pattern is noted). Thank you.