`return_generator={True,False}` -> `return_as={'list','generator'}` #1458

fcharras · 2023-06-22T09:54:11Z

Change the boolean `return_generator` keyword to `return_as`, which is expected to take values in {'list', 'submitted'}, in anticipation of a future 'completed' value when implementing #1449.

codecov · 2023-06-22T12:21:00Z

Codecov Report

Patch coverage: 92.30% and project coverage change: -0.02 ⚠️

Comparison is base (5d88860) 94.88% compared to head (8862e02) 94.87%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1458      +/-   ##
==========================================
- Coverage   94.88%   94.87%   -0.02%     
==========================================
  Files          45       45              
  Lines        7471     7474       +3     
==========================================
+ Hits         7089     7091       +2     
- Misses        382      383       +1

| Impacted Files | Coverage Δ | |
|---|---|---|
| joblib/_parallel_backends.py | `93.47% <ø> (+1.08%)` | ⬆️ |
| joblib/parallel.py | `96.90% <83.33%> (+0.01%)` | ⬆️ |
| joblib/test/test_parallel.py | `96.11% <100.00%> (ø)` | |

... and 2 files with indirect coverage changes


tomMoral

LGTM, a few nitpicks. Happy to have the opinion of @GaelVaroquaux on this one.

CHANGES.rst

doc/parallel.rst

examples/parallel_generator.py

joblib/parallel.py

betatim · 2023-06-23T08:36:22Z

Came here via the sprint/Franck at the sprint.

How about `Parallel(..., results=...)`, with possible values: `immediate` for "return results in any order as soon as they are ready", `ordered` for "return results in the order in which tasks were submitted" and `complete` for "return all results, in the order in which tasks were submitted, once all tasks are done". Could also use "unordered", "ordered" and "all" (or something?). I think my main comment is that I prefer `results=` over `return_as=`.
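To make the three behaviours described above concrete, here is a minimal sketch that uses only the standard library's `concurrent.futures`, not joblib itself. The `slow_square` helper and the delays are made up for illustration; the labels "immediate", "ordered" and "complete" are the ones from the proposal.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x, delay):
    """Hypothetical task: sleep for `delay` seconds, then return x squared."""
    time.sleep(delay)
    return x * x


# (value, delay) pairs: the first task submitted is also the slowest.
tasks = [(1, 0.3), (2, 0.1), (3, 0.2)]

with ThreadPoolExecutor(max_workers=3) as pool:
    # "immediate": consume results as soon as they are ready, in any order.
    futures = [pool.submit(slow_square, x, d) for x, d in tasks]
    immediate = [f.result() for f in as_completed(futures)]

    # "ordered": an iterator that yields results in submission order.
    ordered_iter = pool.map(slow_square, *zip(*tasks))

    # "complete": exhaust the iterator to get every result as one list.
    ordered = list(ordered_iter)

print(sorted(immediate))  # same values as `ordered`; raw order may differ
print(ordered)
```

The only difference between the "ordered" and "complete" variants in this sketch is whether the caller iterates lazily or collects everything first.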

rth · 2023-06-23T08:39:59Z

As @adrinjalali mentioned, it could be worth at least considering adding an extra method for a different output type. Something like this (though the name could certainly be better):

joblib.Parallel(...).call_as_generator(delayed(sqrt)(i**2) for i in range(10))

It also feels like something that returns a different type depending on an input parameter is not great API-wise, and would confuse type checkers. I mean, between a list and an iterator of results it could still work. But if you want to add an iterator of future results later, that's a completely different return type.
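For the list-versus-iterator case, one common way to describe a return type that depends on a string argument is `typing.overload` combined with `Literal`. The sketch below illustrates that on a hypothetical, sequential `run_tasks` function; it is not joblib's signature and says nothing about how joblib annotates `Parallel`.

```python
from typing import Callable, Iterable, Iterator, List, Literal, TypeVar, overload

T = TypeVar("T")


@overload
def run_tasks(
    tasks: Iterable[Callable[[], T]], return_as: Literal["list"] = ...
) -> List[T]: ...


@overload
def run_tasks(
    tasks: Iterable[Callable[[], T]], return_as: Literal["generator"]
) -> Iterator[T]: ...


def run_tasks(tasks, return_as="list"):
    """Hypothetical sequential stand-in for a Parallel-like call."""
    if return_as not in ("list", "generator"):
        raise ValueError(f"Unknown return_as value: {return_as!r}")
    results = (task() for task in tasks)  # lazy: nothing has run yet
    return list(results) if return_as == "list" else results


# A type checker can pick the matching overload from the literal argument.
as_list = run_tasks([lambda: 1, lambda: 2])
as_gen = run_tasks([lambda: 1, lambda: 2], return_as="generator")
```

This only works when the argument is a literal the checker can see, which is part of why a separate method is also a reasonable option.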

ogrisel · 2023-06-23T09:17:38Z

return_type="list"/"generator" (or return_as?)
return_order="as_submitted"/"as_completed" (or collection_order?)

raising a ValueError when return_type="list" and return_order="as_completed", because that combination is useless.

At least it's very explicit. It's a bit verbose but I think I prefer this side of the tradeoff.
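A minimal sketch of the validation this two-keyword proposal implies, assuming the `return_type` / `return_order` spelling. The function `check_return_options` is hypothetical and is not part of joblib.

```python
def check_return_options(return_type="list", return_order="as_submitted"):
    """Hypothetical validator for the two-keyword proposal."""
    if return_type not in ("list", "generator"):
        raise ValueError(f"Unknown return_type: {return_type!r}")
    if return_order not in ("as_submitted", "as_completed"):
        raise ValueError(f"Unknown return_order: {return_order!r}")
    # A list is only returned once every task is done, so completion
    # order brings no benefit there: reject the combination.
    if return_type == "list" and return_order == "as_completed":
        raise ValueError(
            "return_order='as_completed' is only meaningful with "
            "return_type='generator'"
        )
    return return_type, return_order


print(check_return_options("generator", "as_completed"))
```

With two keywords, three of the four combinations are valid, which is what the single three-valued keyword discussed elsewhere in this thread encodes directly.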

ogrisel · 2023-06-23T09:33:44Z

> But if you want to add an iterator of future results later that's a completely different return type.

If we ever go into returning "future" or "promise" objects, I think we should introduce a completely new API (and probably mimic that of concurrent.futures / dask).
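For reference, this is roughly what a future-based API looks like in the standard library's `concurrent.futures`. It is shown only to illustrate the style being referred to, not as a proposal for joblib.

```python
from concurrent.futures import ThreadPoolExecutor
from math import sqrt

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a Future, a handle on a pending result.
    futures = [pool.submit(sqrt, i ** 2) for i in range(5)]
    # result() blocks until that particular task has finished.
    values = [f.result() for f in futures]

print(values)
```

The caller holds one handle per task and decides when to wait on each, which is a different contract from getting back a list or a generator of plain results.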

fcharras · 2023-06-23T10:05:07Z

Some more online discussion later, we converge on return_as=list/generator/unordered_generator. I'll update the PR in this direction.

(About futures/promises, returning such objects is definitely not in the scope of what Parallel offers, the misunderstanding comes from bad wording from me early on, sorry about that.)

ogrisel · 2023-06-23T10:05:34Z

After discussion at the scikit-learn sprint, I also prefer the following:

return_as="list"/"generator"/"unordered_generator"

This is explicit enough, technically correct and concise enough.

tomMoral

LGTM, just one nitpick

joblib/_parallel_backends.py

Co-authored-by: Thomas Moreau <thomas.moreau.2010@gmail.com>

tomMoral · 2023-06-28T07:56:36Z

Merging as the failure on sklearn is not related to this PR.

fcharras added 4 commits June 22, 2023 11:49

return_generator={True,False} -> return_as={'list','submitted'}

27629e0

Add the link to the PR to CHANGES.rst

e98a91f

backquote formatting

debd352

Merge branch 'master' into enh/change_generator_api

e4ffb3f

linting

542d081

tomMoral approved these changes Jun 22, 2023


apply review suggestions

7fad99d

fcharras added 2 commits June 23, 2023 15:22

"submitted/completed" replaced with "generator/generator_unordered"

efcdb90

minor fixups

1e717d2

tomMoral approved these changes Jun 23, 2023


fcharras changed the title ~~return_generator={True,False} -> return_as={'list','submitted'}~~ return_generator={True,False} -> return_as={'list','generator'} Jun 23, 2023

Typo

7c2043f

Co-authored-by: Thomas Moreau <thomas.moreau.2010@gmail.com>

ogrisel approved these changes Jun 23, 2023


fcharras mentioned this pull request Jun 26, 2023

FEA Implement generator unordered parameter #1463

Merged

Merge branch 'master' into enh/change_generator_api

8862e02

tomMoral merged commit 83f9169 into joblib:master Jun 28, 2023
13 of 16 checks passed
