Thaw queues on awaiting stuck futures #365

Merged
merged 1 commit into pulp:master on Dec 13, 2019
Conversation

@mdellweg (Member) commented Nov 6, 2019

fixes #5668

mdellweg pushed a commit to mdellweg/pulp_deb that referenced this pull request Nov 6, 2019
@bmbouter (Member) commented Nov 6, 2019

@mdellweg can you give me more insight into this change? It's a plugin-writer-facing feature, so we also need a pulp.plan.io entry for it.

@mdellweg (Member Author) commented Nov 6, 2019

@gmbnomis care to take a look?

mdellweg pushed a commit to mdellweg/pulp_deb that referenced this pull request Nov 13, 2019
@@ -0,0 +1 @@
Allow get_or_create_future without deactivating does_batch on DeclarativeContent from the beginning.
Member

This release note is for the plugin API; can it be moved to the plugin API folder in CHANGES?

@@ -86,6 +86,25 @@ def __init__(self, artifact=None, url=None, relative_path=None, remote=None, ext
return download_result


class DCFuture(asyncio.Future):
Member

This will need an autodoc entry so it shows up in the docs, somewhere on this page: https://github.com/pulp/pulpcore/blob/master/docs/plugins/api-reference/stages.rst

Member Author

That is actually something we need to discuss, because as I see it, this is internal plumbing. The user does not even need to know about that class.

@@ -113,7 +132,7 @@ class DeclarativeContent:
ValueError: If `content` is not specified.
"""

__slots__ = ('content', 'd_artifacts', 'extra_data', 'does_batch', 'future')
__slots__ = ('content', 'd_artifacts', 'extra_data', 'does_batch', 'future', 'thaw_queue_event')
Member

The future and the thaw_queue_event seem very similar. I'd like to think more about how we can provide this value to plugin writers without adding more facilities they need to use. Part of the concern is that if the plugin writer needs to take extra care to use this facility, how is that different from the plugin writer taking extra care to clear the does_batch attribute, as we document here?

Contributor

I wonder whether thaw_queue_event needs to be stored in content at all. If I understand your PR correctly, we need thaw_queue_event only if there is a future. As we use our own future type here anyway, couldn't we store it in the future?

Member Author

@bmbouter The way this change works, the user isn't supposed to do more than before. The only new thing is that they can omit setting does_batch=False. So we can even safely remove the warning that using futures can jam the queue.
@gmbnomis That is a matter of perspective, but this version allows creating the future even after the d_content has been injected into the queue. Also, just setting it might not even be slower than checking whether there is a future and then conditionally setting it.
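For context, a rough sketch of what this means for a plugin writer (an approximation, not a quote from the docs or the merged code; unit stands for some plugin-specific content instance):

# Before this change: batching had to be disabled up front, otherwise the
# d_content could stay stuck in a queue while its future was being awaited.
d_content = DeclarativeContent(content=unit, does_batch=False)
fut = d_content.get_or_create_future()
await self.put(d_content)
content = await fut

# With this change: does_batch keeps its default, because awaiting the future
# thaws whichever queue the d_content is currently waiting in.
d_content = DeclarativeContent(content=unit)
fut = d_content.get_or_create_future()
await self.put(d_content)
content = await fut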

Contributor

@mdellweg I thought about this difference, but injecting the d_content into the queue and adding the future later is a bug IMO. The queue operation should be async and we don't know when it comes back. The DC may have passed the entire pipeline before one can actually attach a future to it.

My argument was not about speed (which is probably in the same ballpark), but about isolation. If the implementation of the "thaw queue" feature can be self-contained in DCFuture, I think we should do that.

@bmbouter (Member)

@mdellweg how would you feel about us including this in 3.1? I see the benefits so that is clear, but I'm concerned about how we're offering them. I want to find a way to bring this benefit to plugin writers without adding to the options/facilities they need to think about.

@@ -86,6 +86,25 @@ def __init__(self, artifact=None, url=None, relative_path=None, remote=None, ext
return download_result


class DCFuture(asyncio.Future):
Contributor

This class looks like a mixture of a proxy to a future and a subclassed future, but I don't get why. Wouldn't the following work?

class DCFuture(asyncio.Future):
    """A special future that can thaw the queue it is stuck in."""
    __slots__ = ["d_content"]

    def __init__(self, d_content):
        super().__init__()  # initialize the underlying asyncio.Future
        self.d_content = d_content

    def __await__(self):
        # Stop batching and wake up the queue this content is waiting in.
        self.d_content.does_batch = False
        if self.d_content.thaw_queue_event:
            self.d_content.thaw_queue_event.set()
        return super().__await__()

Member Author

Right. I wanted to make it a subclass (so that it is a future). But then I didn't see a way to inject it into:
self.future = asyncio.get_event_loop().create_future()
So it became a proxy, and I forgot about the inheritance. Thinking about this now, we can probably make the DC the proxy itself.

Member Author

Thinking about this now, we can probably make the DC the proxy itself.

This would turn the DeclarativeContent into an awaitable future-like object itself. Though I like the concept, it would be backwards incompatible and rather involved. I am not sure it's worth pursuing today.

Contributor

I like the proper proxy you changed it into now even better. The async trigger mechanism is an optional feature of DC, not its main feature.

Member Author

I've always seen the DC as some kind of promised future, so I think it feels kind of natural to await them. As a result of a discussion with @bmbouter, I prepared another (breaking) solution, where the DCs can be awaited directly.
https://github.com/mdellweg/pulpcore/tree/awaitable_dc

@gmbnomis (Contributor) commented Nov 19, 2019

Oh, we are in "we can have a nicer (but breaking) API" land?

It's great that your change gets rid of the exposed Future, since Futures are considered low-level objects that should not be exposed in user level APIs (https://docs.python.org/3/library/asyncio-future.html#future-object).

But, as said, let's please not await DC directly:

  1. It is a pretty uncommon thing to do. The basic building block of an async API seems to be the coroutine, not an awaitable object instance. See for example the high-level APIs offered by asyncio (https://docs.python.org/3/library/asyncio.html). None of them seem to offer a directly awaitable object instance.
  2. It is more explicit; we could e.g. have async def wait_for_resolution(self), which is clearer IMHO than just awaiting the DC (and can have a nice docstring).
  3. It allows us to enhance the API later; we could e.g. add an async def wait_for_download(self) to await an earlier event. Or we can add optional parameters if the need arises.

wdyt?

Member Author

Oh, we are in "we can have a nicer (but breaking) API" land?

Since it was rescheduled to 3.1, that is at least an option. And I am throwing out ideas here (which is why I did not just replace this branch).
If what you are saying is that you would like to use it like

await self.put(d_content)
content = await d_content.resolution()

instead of

await self.put(d_content)
content = await d_content

or the current version

fut = d_content.get_or_create_future()
await self.put(d_content)
content = await fut

I think I can have a working example in no time.
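A minimal sketch of what such a resolution() coroutine could look like internally, assuming the private attributes visible in this diff (_resolved, _future, _thaw_queue_event); an illustration only, not the merged implementation:

import asyncio

class DeclarativeContent:
    # ... constructor and other methods elided ...

    async def resolution(self):
        """Wait until this content is resolved by a later stage and return it."""
        if self._resolved:
            # Already resolved, no need to wait.
            return self.content
        if self._future is None:
            self._future = asyncio.get_event_loop().create_future()
        if self._thaw_queue_event is not None:
            # Wake the queue this DC is currently batching in, so the pipeline
            # does not hold it back waiting for a full batch.
            self._thaw_queue_event.set()
        return await self._future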

@bmbouter (Member)

I'm branching 3.0 now. If we are able to resolve this before the actual GA I'd be OK cherry picking it since it's a) additive only and b) only affects plugin writers. Let's continue the collaboration!

mdellweg pushed a commit to mdellweg/pulp_deb that referenced this pull request Nov 14, 2019
@mdellweg (Member Author)

@gmbnomis is this what you expected to see?
Sure, documentation still needs to be adjusted...


def __init__(self, content=None, d_artifacts=None, extra_data=None, does_batch=True):
Member

@ipanova does docker use does_batch?

    [thaw_event_listener, get_listener], return_when=asyncio.FIRST_COMPLETED
)
if thaw_event_listener in done:
    await thaw_event_listener
Member

Like we mentioned on the call, I don't see how awaiting here is productive, since thaw_event_listener is already in done.
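For readers following the excerpt above, a hedged sketch of the overall wait pattern in a batching stage: listen on the inbound queue and on a "thaw" event at the same time, and flush the current batch as soon as the event fires (the helper name is made up for this sketch; this is not the merged code):

import asyncio

async def _queue_get_or_thaw(in_q, thaw_event):
    """Return the next queue item, or None if the thaw event fired first."""
    thaw_event_listener = asyncio.ensure_future(thaw_event.wait())
    get_listener = asyncio.ensure_future(in_q.get())
    done, pending = await asyncio.wait(
        [thaw_event_listener, get_listener], return_when=asyncio.FIRST_COMPLETED
    )
    if thaw_event_listener in done:
        # Someone awaited a stuck d_content: clear the event and flush the batch.
        # A real implementation must also handle a get_listener that completed
        # at the same moment instead of simply cancelling it.
        thaw_event.clear()
        get_listener.cancel()
        return None
    thaw_event_listener.cancel()
    return get_listener.result()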

if content is None:
    shutdown = True
    log.debug(_('%(name)s - shutdown.'), {'name': self})
else:
    if not content.does_batch:
    if not content._resolved and content._future is not None:
@simon-baatz commented Nov 21, 2019

I think the stage should not access _resolved and _future directly (this is different from _thaw_queue_event which serves as the communication means between Stage and DC).

A does_batch property in DC would be a nice way to keep this in DC I guess. A plugin writer could even derive from DC and use a different mechanism to decide what to batch.

Member Author

Ah, I see now where this is going. But I do not want to leave it as a variable on the DC, because we cannot just clear it after resolving without surprises. So I'm going to make it a property. A derived DC can then alter the logic to its needs (for example, wait only until artifacts are ready).

@gmbnomis (Contributor)

@gmbnomis is this what you expected to see?

Yes, I like the interface much more than the old one.

@mdellweg (Member Author)

@gmbnomis, @bmbouter ^


@property
def does_batch(self):
    return not self._resolved and self._future is None
@gmbnomis (Contributor) commented Nov 28, 2019

Should we add a note here that an overridden method must never 'overrule' the original implementation (i.e. never turn a "does not batch" into a "does batch")?
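To make that constraint concrete, a hedged sketch of a hypothetical plugin-side subclass (the class name and the flush_early flag in extra_data are invented for illustration) that adds its own batching condition without overruling the base class:

class EarlyFlushContent(DeclarativeContent):
    @property
    def does_batch(self):
        # Never overrule the base class: if it says "do not batch" (someone is
        # already awaiting this content), that decision must stand.
        if not super().does_batch:
            return False
        # Additional, plugin-specific condition (hypothetical flag in extra_data).
        return not self.extra_data.get("flush_early", False)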


...

foo_a_content = await foo_a_future # awaits until the foo_a reaches this stage
foo_a_content = await foo_a.resolution() # awaits until the foo_a reaches this stage
Contributor

Should we mention somewhere that, as a general rule, the pipeline will work more efficiently if waiting for resolution is done as late as possible? (That's terrible wording, but hopefully you understand what I am trying to say.)

Member Author

I know exactly what you mean, and I will try to write something.
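As a rough illustration of that guidance (a sketch only: the stage name is invented, and Stage, self.items() and self.put() are assumed from the pulpcore plugin API):

from pulpcore.plugin.stages import Stage  # assumed import path

class DeferredResolutionStage(Stage):
    async def run(self):
        awaiting_resolution = []
        async for d_content in self.items():
            await self.put(d_content)
            awaiting_resolution.append(d_content)
        # Waiting only after everything has been put lets downstream batching
        # stages fill whole batches instead of being thawed for every single item.
        for d_content in awaiting_resolution:
            content = await d_content.resolution()
            # ... use the resolved content here ...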

@gmbnomis (Contributor)

@gmbnomis, @bmbouter ^

As said, I like the new interface much better than the old one. It's @bmbouter's call whether this backwards incompatible change is ok at this point.

Regarding the "thaw queue" feature, I am still struggling with the added complexity (given that the benefit will only be seen in very specific situations and if the plugin writer knows what she/he is doing). OTOH, this may be the price we have to pay to ease the use of pipeline feedback loops without risking hangs all the time.

@bmbouter (Member) commented Dec 2, 2019

@mdellweg Travis is failing, maybe try rebasing?

I'm ok adding it at this point in time because we are still going to have breaking changes in the plugin API for a few Y releases. When merged, this would just go into 3.1 and not be backported to 3.0.

This overall seems safer so I'm inclined to accept it. Last call for blocking concerns.

@mdellweg (Member Author) commented Dec 3, 2019

@bmbouter rebasing helped.

@bmbouter (Member) commented Dec 7, 2019

I want to do a final review on this soon. Sorry for the delay.

mdellweg pushed a commit to mdellweg/pulp_deb that referenced this pull request Dec 10, 2019
If it is currently waiting in a queue, unfreeze that.
Remove old interfaces does_batch and get_or_create_future.

fixes #5668
@bmbouter (Member) left a comment

@mdellweg sorry this has taken so long. Now that 3.0.0 is released I had enough time to review this once more. I'm going to add a 5668.feature release note separately, but I'm merging this for now. Thank you!

@bmbouter bmbouter merged commit 2d415c3 into pulp:master Dec 13, 2019
@bmbouter (Member)

@mdellweg to complete this work, please send an email to pulp-dev identifying this as a breaking change, with the subject starting with [breaking]. All plugin API breaking changes need to follow that policy once merged.

@mdellweg mdellweg deleted the thaw_queues branch December 16, 2019 10:21
mdellweg pushed a commit to mdellweg/pulp_deb that referenced this pull request Dec 16, 2019
lubosmj added a commit to lubosmj/pulp_container that referenced this pull request Dec 16, 2019