Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand concurrent.futures.Future's public API #83826

Open
aeros opened this issue Feb 16, 2020 · 10 comments
Open

Expand concurrent.futures.Future's public API #83826

aeros opened this issue Feb 16, 2020 · 10 comments
Assignees
Labels
3.9 only security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement

Comments

@aeros
Copy link
Contributor

aeros commented Feb 16, 2020

BPO 39645
Nosy @gvanrossum, @brianquinlan, @pitrou, @aeros, @iforapsy, @jakirkham

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = 'https://github.com/aeros'
closed_at = None
created_at = <Date 2020-02-16.07:38:05.012>
labels = ['type-feature', 'library', '3.9']
title = "Expand concurrent.futures.Future's public API"
updated_at = <Date 2020-06-12.21:00:12.550>
user = 'https://github.com/aeros'

bugs.python.org fields:

activity = <Date 2020-06-12.21:00:12.550>
actor = 'jakirkham'
assignee = 'aeros'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2020-02-16.07:38:05.012>
creator = 'aeros'
dependencies = []
files = []
hgrepos = []
issue_num = 39645
keywords = []
message_count = 10.0
messages = ['362047', '362110', '362163', '363114', '363117', '363119', '363121', '363124', '363205', '368440']
nosy_count = 7.0
nosy_names = ['gvanrossum', 'bquinlan', 'pitrou', 'Big Stone', 'aeros', 'iforapsy', 'jakirkham']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue39645'
versions = ['Python 3.9']

@aeros
Copy link
Contributor Author

aeros commented Feb 16, 2020

Based on the following python-ideas thread: https://mail.python.org/archives/list/python-ideas@python.org/thread/LMTQ2AI6A7UXEFVHRGHKWD33H24FGM6G/#ICJKHZ4BPIUMOPIT2TDTBIW2EH4CPNCP.

In the above ML thread, the author proposed adding a new cf.SerialExecutor class, which seems to be not a great fit for the standard library (based on the current state of the discussion, as of writing this). But, Guido mentioned the following:

IOW I'm rather lukewarm about this -- even if you (Jonathan) have found use for it, I'm not sure how many other people would use it, so I doubt it's worth adding it to the stdlib. (The only thing the stdlib might grow could be a public API that makes implementing this feasible without overriding private methods.)

Specifically, the OPs proposal should be reasonably possible to implement (either just locally for themselves or a potential PyPI package) with a few minor additions to cf.Future's public API:

  1. Add a means of *publicly* accessing the future's state (future._state) without going through the internal condition's RLock.

This would allow the developer to implement their own condition or other synchronization primitive to access the state of the future. IMO, this would best be implemented as a separate future.state() and future.set_state().

  1. Add a means of *publicly* accessing the future's result (future._result) without going through the internal condition's RLock.

This would be similar to the above, but since there's already a future.result() and future.set_result(), I think it would be best implemented as an optional sync parameter that defaults to True. When set to False, it directly accesses future._result without the condition; when set to True, it has the current behavior.

  1. Add public global constants for the different possible future states: PENDING, RUNNING, CANCELLED, CANCELLED_AND_NOTIFIED, and FINISHED.

This would be useful to serve as a template of possible future states for custom implementations. I also find that fut.set_state(cf.RUNNING) looks better than fut.state("running") from an API design perspective.

Optional addition: To make fut.state() and fut.set_state() more useful for general purposes, it could have a single sync boolean parameter (feel free to bikeshed over the name), which changes whether it directly accesses future._state or does so safely through the condition. Presumably, the documentation would explicitly state that with sync=False, the future's state will not be synchronized across separate threads or processes. This would also allow it to have the same API as future.result() and future.set_result().

Also, as for my own personal motivation in expanding upon the public API for cf.Future, I've found that directly accessing the state of the future can be incredibly useful for debugging purposes. I made significant use of it while implementing the new *cancel_futures* parameter for executor.shutdown(). But, since future._state is a private member, there's no guarantee that it will continue to behave the same or that it can be relied upon in the long-term. This may not be a huge concern for quick debugging sessions, but could easily result in breakage when used in logging or unit tests.

@aeros aeros added the 3.9 only security fixes label Feb 16, 2020
@aeros aeros self-assigned this Feb 16, 2020
@aeros aeros added stdlib Python modules in the Lib dir type-feature A feature request or enhancement 3.9 only security fixes labels Feb 16, 2020
@aeros aeros self-assigned this Feb 16, 2020
@aeros aeros added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Feb 16, 2020
@gvanrossum
Copy link
Member

I'll leave it to Brian and/or Antoine to review this. Good luck!

@aeros
Copy link
Contributor Author

aeros commented Feb 17, 2020

Upon further consideration and based on recent developments in the python-ideas thread (mostly feedback from Antoine), I've decided to reduce the scope of this issue to remove future.set_state() and the *sync* parameters.

This leaves just future.state() and having the states as publicly accessible module-level constants.

@aeros
Copy link
Contributor Author

aeros commented Mar 2, 2020

This leaves just future.state() and having the states as publicly accessible module-level constants.

After reading over the python-ideas again and having some time to reflect, I think we can start with just adding future.state(), as that had the most value. Since future.state() is primarily intended for debugging/informational purposes as an approximation (similar to queue.qsize()) rather than something to be consistently relied upon, I don't see a strong practical use case for returning the states as enums (instead of a string).

If we consider adding a means to directly modify the state of the future in the future or providing an option to safely read the state of the future (through its RLock) later down the road, it may be worth considering. But not at the moment, IMO.

I'll update the name of the issue accordingly, and should have time to open a PR in the next few days. It should be rather straightforward, with the main emphasis being on the documentation to ensure that it clearly communicates the purpose of future.state(); so that users don't assume it's anything more than an approximation.

@aeros aeros changed the title Expand concurrent.futures.Future's public API Read approximate state of concurrent.futures.Future Mar 2, 2020
@aeros aeros changed the title Expand concurrent.futures.Future's public API Read approximate state of concurrent.futures.Future Mar 2, 2020
@gvanrossum
Copy link
Member

I’m a bit disappointed, since it looks like this won’t allow implementing the OP’s classes without using private APIs. The debugging and loggin use cases aren’t very compelling to me.

@aeros
Copy link
Contributor Author

aeros commented Mar 2, 2020

I’m a bit disappointed, since it looks like this won’t allow implementing the OP’s classes without using private APIs. The debugging and loggin use cases aren’t very compelling to me.

Yeah, I had every intention when I initially proposed the idea on the python-ideas thread to provide an extensive means of implementing custom Future and Executor classes, but Antoine brought up a very valid concern about users shooting themselves in the foot with it:

I'm much more lukewarm on set_state(). How hard is it to reimplement
one's own Future if someone wants a different implementation? By
allowing people to change the future's internal state, we're also
giving them a (small) gun to shoot themselves with.

In terms of a cost-benefit analysis, I'd imagine that it's going to be a rather small portion of the concurrent.futures users that would actually have a genuine use case for implementing their own custom Future or Executor.

He was still approving of a future.state() though, which is why I considered implementing it alone:

That sounds useful to me indeed. I assume you mean something like a
state() method? We already have Queue.qsize() which works a bit like
this (unlocked and advisory).

Although not nearly as significant (or interesting), I've personally encountered situations where it would be useful for logging purposes to be able to read the approximate state of the future. But if the consensus ends up being that it's not useful enough to justify adding compared to the original purpose of the issue, I would certainly understand.

@gvanrossum
Copy link
Member

But note my response to Antoine at the time, mentioning that implementing ‘as_completed()’ is impossible that way. Antoine then backtracked somewhat.

@aeros
Copy link
Contributor Author

aeros commented Mar 2, 2020

But note my response to Antoine at the time, mentioning that implementing
‘as_completed()’ is impossible that way. Antoine then backtracked somewhat.

Ah, I had seen that but for some reason hadn't considered that Antoine might have also changed his stance on a means of modifying the state of the future:

> It's actually really hard to implement your own
> Future class that works
> well with concurrent.futures.as_completed() -- this is basically what
> complicated the OP's implementation. Maybe it would be useful to look into
> a protocol to allow alternative Future implementations to hook into that?

Ah, I understand the reasons then. Ok, it does sound useful to explore
the space of solutions. But let's decouple it from simply querying the
current Future state.

In that case, I'll revert the title and leave the issue open for further discussion; but I'll hold off on any PRs until we have some consensus regarding the direction we want to go in with regards to potential new future protocols. Apologies for the misunderstanding, thanks for clarifying. :)

I'd also be interested in hearing Brian Quinlain's thoughts on the matter.

@aeros aeros changed the title Read approximate state of concurrent.futures.Future Expand concurrent.futures.Future's public API Mar 2, 2020
@aeros aeros changed the title Read approximate state of concurrent.futures.Future Expand concurrent.futures.Future's public API Mar 2, 2020
@brianquinlan
Copy link
Contributor

I'll try to take a look at this before the end of the week, but I'm currently swamped with other life stuff :-(

@BigStone
Copy link
Mannequin

BigStone mannequin commented May 8, 2020

it seems this feature would interest Dask team.
dask/distributed#3695

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 only security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-feature A feature request or enhancement
Projects
Status: No status
Development

No branches or pull requests

4 participants