Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changes resumability spec to make it compatible with QUIC #312

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

OlegDokuka
Copy link
Member

@OlegDokuka OlegDokuka commented Aug 23, 2020

This PR introduces breaking changes to resumability spec which opens the following features:

  1. Fine-grained resumption control: A Responder can decide whether a REQUEST should be resumable or not. For that purpose, a list of frames (PAYLOAD, CANCEL, REQUEST_N) now includes a new flag R which indicates whether this is a resumable stream or not. Such a design makes it possible to apply resumability for individual streams rather than storing all frames on the connection level.
  2. The resumption progress is done on the stream basis: Instead of storing a global position, each stream counts the local (P)position, (I)mplied Position, (R)etained Position. Which makes it possible for a Client to encode part of this information into the KEEPALIVE frame using the defined format, so a Server can decode it back and release a required number of frames for every individual stream. Such a design makes it possible to use resumability for transports, which does not guarantee contiguous frames sending but contiguous frame sending for individual streams (e.g. QUIC, HTTP/3)

@OlegDokuka
Copy link
Member Author

cc @tmontgomery

@OlegDokuka OlegDokuka linked an issue Aug 23, 2020 that may be closed by this pull request
@OlegDokuka OlegDokuka added this to the 1.0 milestone Aug 23, 2020
Protocol.md Outdated
Comment on lines 471 to 482
|0|0| Number of Entries |
+-------------------------------+-------------------------------+
|0| Stream ID |
+-------------------------------+-------------------------------+
| Bytes To Skip |
+-------------------------------+-------------------------------+
|0| Stream ID |
+-------------------------------+-------------------------------+
| Bytes To Skip |
+-------------------------------+-------------------------------+
| ... |
+-------------------------------+-------------------------------+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also define the meaning of these new fields below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, if this draft sounds like a good start, I will finalize it

@stevegury
Copy link
Member

so a Server can decode it back and skip a specified number of bytes for every individual stream

Shouldn't it be: "skip a number of frames"?

@OlegDokuka
Copy link
Member Author

so a Server can decode it back and skip a specified number of bytes for every individual stream

Shouldn't it be: "skip a number of frames"?

I guess it can be frames. But the current implementations count position in bytes (for some reasons), thus I did not change that behavior

@OlegDokuka
Copy link
Member Author

@stevegury @linux-china @nebhale @benjchristensen @tmontgomery @rstoyanchev I have finalized the spec changes, feel free to propose more changes in the idea / wording

@OlegDokuka OlegDokuka marked this pull request as ready for review October 1, 2020 12:49
@OlegDokuka OlegDokuka force-pushed the enhancement/resumability-changes branch from 450fe76 to 11c2889 Compare October 1, 2020 13:30
Protocol.md Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Copy link
Contributor

@rstoyanchev rstoyanchev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About KEEPALIVE becoming potentially large:

  1. In my mind per-stream resumption guides towards being more selective to enable resumption only for "expensive" streams. That means the overall number shouldn't be anywhere near the current max allowed. I wonder if we shouldn't go further and choose a much lower limit on the max entries that that still allows plenty of resumable streams while also keeping the size of KEEPALIVE reasonable?
  2. By default RSocket Java doesn't cleanup the store based on positions from KEEPALIVE. If are going to expand the amount of data for implied positions, then perhaps there should be a mode in which KEEPALIVE omits sending implied positions, if those are going to be ignored anyway.

About where (R)esume flag applies:

  1. If we take resumption to the level of a stream, I wonder if it shouldn't only apply to actual "streams" as in REQUEST_STREAM and REQUEST_CHANNEL? In other words how much value is there from having it on REQUEST_RESPONSE and REQUEST_FNF.
  2. I can see that PAYLOAD needs to have the (R)esume flag so that a Responder can agree to make the stream resumable but I don't entirely see the reason for having {R}resume on REQUEST_N or CANCEL?

Protocol.md Outdated Show resolved Hide resolved
@OlegDokuka
Copy link
Member Author

@stevegury @benjchristensen @yschimke @rstoyanchev @simonbasle @nkristek @smaldini @spencergibb

folks, I have finalized spec changes after our last discussion. I have added a resumption scenario that corresponds to what is described in the new spec. Please have a look to get more ideas about the new Resumption spec

Copy link
Contributor

@rstoyanchev rstoyanchev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to do another round but for now sending through some minor edits.

Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
Protocol.md Outdated Show resolved Hide resolved
OlegDokuka and others added 6 commits March 5, 2021 21:31
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
Co-authored-by: Rossen Stoyanchev <rstoyanchev@users.noreply.github.com>
@@ -91,7 +91,7 @@ The following are features of Data and Metadata.
The RSocket protocol uses a lower level transport protocol to carry RSocket frames. A transport protocol MUST provide the following:

1. Unicast [Reliable Delivery](https://en.wikipedia.org/wiki/Reliability_(computer_networking)).
1. [Connection-Oriented](https://en.wikipedia.org/wiki/Connection-oriented_communication) and preservation of frame ordering. Frame A sent before Frame B MUST arrive in source order. i.e. if Frame A is sent by the same source as Frame B, then Frame A will always arrive before Frame B. No assumptions about ordering across sources is assumed.
1. Stream-Oriented (similar to [connection-oriented](https://en.wikipedia.org/wiki/Connection-oriented_communication)) preservation of frame ordering for the same stream within a multiplexed connection. If frame A is sent before frame B, within the same stream, on the same connection, then frame A MUST arrive before frame B. No assumptions are made about ordering across different streams within the same multiplexed connection.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should drop connection-oriented completely here. Isn't a TCP socket still an option but with known limitations like head of line blocking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the point here is to emphasize that we don't need connection-oriented message sending anymore but we rather need stream-oriented. If connection support stream-oriented message sending via connection-oriented message sending - it is fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth a clarifying note. What does this spec mean for something without streams?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. Make sense

* __Last Received Position__: (63 bits = max value 2^63-1) Unsigned 63-bit long of Resume Last Received Position. Value MUST be > 0. (optional. Set to all 0s when not supported.)
* __Number of Entries__: (24 bits = max value 2^20 = 1,048,576) (optional. Set to all 0s when not supported.)
* __Stream ID__: (31 bits = max value 2^31-1 = 2,147,483,647) Unsigned 31-bit integer representing the resumable stream Identifier. (present if Number of Entries > 0)
* __Implied Position__: (64 bits = max value 2^64) Unsigned 64-bit value of the individual Stream Resume Last Received Frame Position from the remote party. Value MUST be > 0. (present if Number of Entries > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any real world use cases and can understand the size of these messages? Seems like they could be dauntingly large? I'll defer to other directly involved in the discussion, but looks like a huge jump in size and processing complexity here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing a random idea out there, for a connection-oriented stream, should we still support something without per stream accounting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Will clarify that

Protocol.md Outdated Show resolved Hide resolved
* __Last Received Position__: (63 bits = max value 2^63-1) Unsigned 63-bit long of Resume Last Received Position. Value MUST be > 0. (optional. Set to all 0s when not supported.)
* __Number of Entries__: (24 bits = max value 2^20 = 1,048,576) (optional. Set to all 0s when not supported.)
* __Stream ID__: (31 bits = max value 2^31-1 = 2,147,483,647) Unsigned 31-bit integer representing the resumable stream Identifier. (present if Number of Entries > 0)
* __Implied Position__: (64 bits = max value 2^64) Unsigned 64-bit value of the individual Stream Resume Last Received Frame Position from the remote party. Value MUST be > 0. (present if Number of Entries > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will defer to others, to my mind the limiting to 2^63 makes it way simpler to safely store in a Long. 2^64 is really awkward in Java.

Co-authored-by: Yuri Schimke <yuri@schimke.ee>
Protocol.md Outdated Show resolved Hide resolved
Co-authored-by: Niclas Kristek <dev@nkristek.com>
@@ -732,19 +751,25 @@ RSocket resumption exists only for specific cases. It is not intended to be an
1. Resumption is always initiated by the client and either allowed or denied by the server.
1. Resumption makes no assumptions of application state for delivered frames with respect to atomicity, transactionality, etc. See above.

### Resume Agreement

If communication with resumption support has been successfully established, the decision on whether a particular REQUEST should be resumed is made for every stream individually. To make resumption enabled for a REQUEST, the Responder MUST explicitly set the (R)esume flag in the first response FRAME to the Requester. Once, the Requester received the first FRAME from the Responder and that frame has (R)esume flag set, then the current stream is assumed to be resumable. Otherwise, if the first Responder FRAME does not have (R)esume flag set, the stream is MUST be considered as not resumable. If a Requester has sent a REQUEST frame and then the connection was lost leaving the resumption for that stream in the unknown state, such a stream should wait until the connection is reestablished, and the RESUME | RESUME_OK frame is received. If RESUME | RESUME_OK includes the mentioned stream in the list of resumable streams, then the resumption is assumed to be enabled for that stream. Otherwise, the stream may be terminated with the __REJECTED__ or retransmit frame again.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please improve the wording of the last sentence?

Otherwise, the stream may be terminated with the REJECTED or retransmit frame again.

What frame is retransmitted?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non, it should be terminated locally since the remote party decided to discontinue that stream. I will improve wording, had the same that feeling of absence "locally only" words but did not how to keep it short

* Client waits for either a RESUME_OK or ERROR[CONNECTION_ERROR|REJECTED_RESUME] frame from the server.
* On receiving an ERROR[REJECTED_RESUME] frame, the client MUST NOT attempt resumption again.
* On receiving a RESUME_OK, the client:
* MUST assume that the next REQUEST, CANCEL, ERROR, and PAYLOAD frames have an implied position commencing from the last implied positions
* MAY retransmit *all* REQUEST, CANCEL, ERROR, and PAYLOAD frames starting at the RESUME_OK Last Received Position field value from the server.
* MUST assume that the next REQUEST, CANCEL, ERROR, and PAYLOAD frames continuously increasing their corresponding Received Frame Count

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do RSocket fragments (frames with fragmentFollows flag) also increase the frame count? Imho we should explicitly write it here for additional clarification.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fragments follow flag is a part of Payload frame so it implicitly means that it has to be counted since fragment is still a payload frame

@OlegDokuka
Copy link
Member Author

@stevegury @benjchristensen do you have any feedback to share or shell we proceed with what we have and you will share yours afterward?


```
Client | Server
1 |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add numbers on the Server side, either replacing or in addition to the vertical bars, just like the line numbers on the Client side. It will make it easier to match.

18 <- PAYLOAD (Resume(1) StreamID(3)) // RPosition(1) Position(1) IPosition(1) |
19 <- REQUEST(N) (Resume(0) StreamID(3)) // RPosition(1) Position(1) IPosition(2) |
20 -> REQUEST_STREAM (StreamID(5)) // RPosition(0) Position(1) IPosition(0) |
21~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~DISCON|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be spelled out fully on both sides I think, it's clear that is it is the same event. Or perhaps also use directional arrows at the beginning of the line (like for payloads).

29 -> RESUME (NoE(2), { StreamID(1) : 1, StreamID(3) : 2 }) |
30 | -> RESUME (NoE(2), { StreamID(1) : 1, StreamID(3) : 2 })
31 | <- RESUME_OK (NoE(2), { StreamID(1) : 1, StreamID(3) : 1 })
32 | <- REDELIVER 3 FRAMES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use a comment for "REDELIVER" ?

<- // REDELIVER 3 FRAMES 

according to the above sequence we can describe it as the following:
<pre>
<code>
[<span style="color:red"><b>Line 3</b></span>][<span style="color:green"><i>Client</i></span>][StreamID(0)] sends Setup with Resumability enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be good to have comments above their respective lines:

Client                                                                                         | Server 
1                                                                                                |
2                                                                                                |
      // send Setup with Resumability enabled
3  -> SETUP(ResumeFlag(1))                                                  |
4                                                                                               | -> SETUP
...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Weaker frame ordering in Connection-Oriented communication
6 participants