[3.6] bpo-34134: Advise to use imap or imap_unordered when handling long iterables. (gh-8324)#11674

Closed
miss-islington wants to merge 1 commit into python:3.6 from miss-islington:backport-3bab40d-3.6

Conversation

@miss-islington
Contributor

@miss-islington miss-islington commented Jan 25, 2019

(cherry picked from commit 3bab40d)

Co-authored-by: Windson yang wiwindson@outlook.com

https://bugs.python.org/issue34134

…erables. (pythongh-8324)

(cherry picked from commit 3bab40d)

Co-authored-by: Windson yang <wiwindson@outlook.com>
@miss-islington
Contributor Author

@Windsooon and @pitrou: Status check is done, and it's a success ✅ .

@pitrou
Member

pitrou commented Jan 25, 2019

@ned-deily I'll let you make a call on this. Obviously this is a very optional change :-)

@rhettinger
Contributor

-1 on this change.

  • The difference between functions that return lists and functions that return iterators is a generic topic belonging in a FAQ or tutorial -- it is not multiprocessing specific.

  • The tone of the new paragraph makes map() seem like it is broken or that using it is ill-advised. The actual situation is that sometimes you want map() and sometimes you don't. Usually our docs try to avoid being preachy. The suggestion in the dev guide is to take an affirmative tone showing what a function does and how to use it rather than create an unnecessary sense of risk.

  • The new paragraph skirts a more essential issue. In general, mapped functions in multiprocessing should only be returning small volumes of data. If result objects are large, performance is always impacted due to pickling and unpickling. Since there is no shared memory between processes, moving data between processes has a significantly different cost model than that observed in threading, where data can be returned from functions without being copied.
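
The pickling cost described in the last point can be sketched roughly as follows. This is only an illustration under assumptions: the worker functions `summarize` and `transform` and the helper `transfer_size` are hypothetical names, and the serialized size is used as a stand-in for the real transfer cost between processes, which also includes the unpickling work and the pipe I/O.

```python
import pickle


def summarize(chunk):
    # Hypothetical worker that returns a small result (a single integer).
    return sum(chunk)


def transform(chunk):
    # Hypothetical worker that returns a result as large as its input.
    return [x * 2 for x in chunk]


def transfer_size(result):
    # Results travel between processes as pickled bytes, so the pickled
    # length is a rough proxy for how much data has to be moved.
    return len(pickle.dumps(result, protocol=pickle.HIGHEST_PROTOCOL))


chunk = list(range(100_000))
small = transfer_size(summarize(chunk))
large = transfer_size(transform(chunk))

print(f"small result: {small} bytes, large result: {large} bytes")
```

In a thread, both return values would simply be handed back by reference; across processes, the second one has to be serialized, sent, and rebuilt on the other side.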

@pitrou
Member

pitrou commented Jan 28, 2019

The difference between functions that return lists and functions that return iterators is a generic topic belonging in a FAQ or tutorial

The difference between Pool.map and Pool.imap is multiprocessing-specific. It doesn't exist in the Python 3 builtins, where map is lazy.
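
A minimal sketch of that difference, with one loudly stated assumption: it uses `multiprocessing.pool.ThreadPool`, which exposes the same `map` and `imap` methods as `Pool`, so that the example runs without the `if __name__ == "__main__":` guard and picklable top-level functions a process pool needs. The worker `square` is a hypothetical placeholder.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    # Hypothetical placeholder for the real work.
    return x * x


with ThreadPool(2) as pool:
    # map() blocks until every result is ready and returns a complete list.
    eager = pool.map(square, range(5))

    # imap() returns an iterator; results are fetched one at a time, in order.
    lazy = pool.imap(square, range(5))
    lazy_is_list = isinstance(lazy, list)
    first = next(lazy)
    rest = list(lazy)

# The builtin map() is lazy in Python 3, so it has no eager/lazy split.
builtin_is_list = isinstance(map(square, range(5)), list)
```

With a very long iterable, the list built by `map()` holds every result at once, while the iterator from `imap()` lets the caller consume results as it goes.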

The tone of the new paragraph makes map() seem like it is broken

I don't read it like this. It mentions a specific situation ("very long iterables"). The wording is very prudent.

The new paragraph skirts a more essential issue. In general, mapped functions in multiprocessing should only be returning small volumes of data.

Perhaps, perhaps not. In scientific computing, many people are using multiprocessing (or concurrent.futures) with non-trivial volumes of data. It really depends on the computation time / transfer time ratio.

@Windsooon
Contributor

Windsooon commented Jan 31, 2019

Maybe we can treat this note like the one in the glob documentation:

Note: Using the “**” pattern in large directory trees may consume an inordinate amount of time.

That is, just as a warning.

@ned-deily
Member

This doc change is technically out of scope for 3.6, which is now in security-fix mode, so I'm going to close this backport. Keep in mind that the proposed wording has already been committed to master and 3.7 in other PRs.

@ned-deily ned-deily closed this Feb 16, 2019
@miss-islington miss-islington deleted the backport-3bab40d-3.6 branch February 16, 2019 07:21

Labels

awaiting review docs Documentation in the Doc dir skip news

7 participants