Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop stream shielding; it was from a legacy api design #230

Merged
merged 3 commits into from Sep 2, 2021

Conversation

goodboy
Copy link
Owner

@goodboy goodboy commented Sep 1, 2021

Before we had async with Portal.open_stream_from() which has explicit enter/exit semantics, we had a implicit portal-returns-stream-reference thing which required that if a stream.receive() was cancelled, we relayed that via a stream closure to the other side.

Since we have the new API now we don't need this shielding (aka we won't .aclose() the stream on cancellation) any more.

This PR removes all that and the CI should prove that all things are well.

Once CI is clean I'll just remove the first patch draft which comments all the unneeded code.

@goodboy goodboy force-pushed the drop_stream_shielding branch 2 times, most recently from 0c6e7ca to 8a57cb4 Compare September 1, 2021 11:59
@goodboy goodboy mentioned this pull request Sep 1, 2021
3 tasks
if self._shielded:
log.warning(f"{self} is shielded, portal channel being kept alive")
return

# XXX: This must be set **AFTER** the shielded test above!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This XXX note is obsolete now? also, what happens when await self._ctx.send_stop() below is cancelled? I can't tell locally if that is handled internally, or for that matter if cancelling it is not a problem.

Copy link
Owner Author

@goodboy goodboy Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh good catch and good question.

Technically if we're cancelled there's not a lot that can be done anyway since you run into the 2-general's problem in terms of getting back an ack to the stop msg. So even if we were to wait, how long would we wait, and further if we're cancelled the transport layer may still stay active but the far end task may have already terminated (depending on which side of the stream was the context opener).

One thing we can think about is a stricter ack to the surrounding context cancellation maybe?

Copy link
Owner Author

@goodboy goodboy Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah so i think the short answer is, if we're cancelled on .send_stop() then it's as if it wasn't sent and the surrounding context is responsible for relaying that cancellation back to the other side.

So you aren't really supposed to be able to tell what happens from here, you'd want to look at .open_stream() i think.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a similar call is cancelled there, so it seems like the far side is abandoned IIUC. to maintain your one-shot invariant you'll need a shielded scope with a deadline, and some way to configure that deadline, and then some way to mark the other side as lost (and maybe panic and blow up the whole portal?) when the deadline passes.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yes sorry so in the bidir case this block isn't invoked on the caller (opener of context) side since self._ctx._portal will be None. So if the far end callee task is cancelled while closing, i'm not sure it improves anything to try-to-ensure the stop is relayed given that we'll get the context cancel bubble across anyway on the caller side. Maybe this is not clear enough or should change though?

Copy link
Owner Author

@goodboy goodboy Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So i guess the good question you've made is should there be attempted stream state consistency on cancellation during .aclose() yes?

If we stick with the an op that is cancelled should be treated like it never happened then i think the way this is implemented is fine right?

If the callee tried to close but was cancelled does that mean it closed or not?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right and it seems to me that's fine as long as it's consistent with the rest of the system design.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this was the whole point of Portal.open_context() being necessary before you can open a stream - exactly to handle the cancellation/failure case (in one place) without having to try and jam it into the stream state tracking.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as in, does any other part of the system need to know if someone wanted to close the stream but couldn't do it in time? probably not but I wouldn't know.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah i don't think so.

As per discussion in chat, if you have a stream state desync there's not a ton can be done without a supervisory strategy to handle the cancellation (could be parent(caller) or child(callee) side) and either way, the stream shouldn't be closed implicitly by cancellation (what this patch changes) so it's up to whoever handles that trio.Cancelled to determine what to do about a cancellation - in the most default case the stream is closed more thoroughly in the exit step and later in the context closure.

For our current support of only a one shot streaming phase in the dialog i think this is sufficient since we shouldn't be attempting to re-open the same stream without a new context.

The whole origin was not having an explicit open/close semantic for
streams. We have that now so this internal mechanic isn't needed and
further our streams become more correct by having `.aclose()` be
independent of cancellation.
@goodboy goodboy merged commit e5845b5 into master Sep 2, 2021
@goodboy goodboy deleted the drop_stream_shielding branch September 2, 2021 20:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants