Goals and Motivation #1

Open
benjchristensen opened this Issue Feb 26, 2015 · 56 comments

@benjchristensen (Contributor) commented Feb 26, 2015

Due to the successful collaborations on Reactive Streams for the JVM and community involvement in Reactive Extensions (RxJava and friends) I want to pursue greater polyglot support of the Reactive Stream semantics. This includes both language specific interfaces and over-the-network protocols. I propose that we collaborate as a community to achieve the use cases I list below, along with any others I'm missing that we derive together.

In full disclosure, personally I am building systems that need polyglot, stream-oriented network interaction, primarily between Java and JavaScript in the near future. I prefer to collaborate and design the solution openly rather than reinvent yet another competing solution. I am unsatisfied with current solutions or unaware of better ones. Teams at Netflix are creating custom one-off solutions based on ReactiveX/Reactive-Stream semantics and I'd prefer we not do this in isolation. Selfishly I want the input of people far better at this domain than I am since I am out of my league in defining network protocols and interfaces in non-Java languages. I also want to avoid NIH (not-invented-here) and solve these problems across organizations and companies since systems I'm building will most likely outlive my involvement in them and community support and involvement in core, foundational networking and messaging layers is far better than home grown solutions in the long run. I expect this to slow me down in the near term, but greatly accelerate and improve the medium and long-term accomplishments, and significantly improve the final outcome.

The timelines I'm hoping for would be functioning prototypes and protocol drafts in a few months, with release candidates in 6-9 months (Q3/Q4-2015) and a GA release in early 2016. The team I work with at Netflix and I intend to build our systems along these timelines to prove out what we design here.

Additionally, I hope for collaboration across expertise domains to allow for debate, critiques, ideas and solutions that would not occur while staying in our individual silos.

Use Cases

The intent is to enable Reactive Stream semantics for async, stream-oriented IO supporting backpressure and cancelation.

On top of protocols such as TCP, WebSockets and possibly HTTP/2 it would allow bi-directional, multiplexed communication for these semantics:

  • subscribe, request(n), cancel
  • onNext, onError, onComplete

Usage patterns would include:

Scalar Request, Scalar Response

This would behave similarly to RPC/IPC calls.

For example:

  • UP subscribe("hello", 1) // to eliminate round-trip, the initial request(n) could be included in the subscribe
  • DOWN onNext("World!")
  • DOWN onComplete

Scalar Request, Vector Response

This would behave similarly to HTTP Server-Sent-Events.

For example:

  • UP subscribe("names", 100) // to eliminate round-trip, the initial request(n) could be included in the subscribe
  • DOWN onNext("Dave")
  • DOWN onNext("Tom")
  • DOWN onNext("Sarah")
  • DOWN onComplete

Or with request(n) and unsubscribe on an infinite stream:

  • UP subscribe("increment", 3) // to eliminate round-trip, the initial request(n) could be included in the subscribe
  • DOWN onNext(1)
  • DOWN onNext(2)
  • DOWN onNext(3)
  • UP request(2)
  • DOWN onNext(4)
  • DOWN onNext(5)
  • UP unsubscribe
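The demand accounting implied by the example above can be sketched as a publisher-side credit counter that emits only while request(n) credits remain. Every name here (DemandGate and its methods) is invented for illustration and is not part of any protocol:

```java
import java.util.ArrayList;
import java.util.List;

// Invented sketch: a server-side counter that only emits while the
// subscriber has outstanding demand, mirroring the "increment" example.
public class DemandGate {
    private long outstanding;          // unfulfilled request(n) credits
    private long next = 1;             // next value of the infinite stream
    private boolean cancelled;
    final List<Long> emitted = new ArrayList<>();

    // UP subscribe(..., n): initial demand piggybacked on subscribe
    void subscribe(long initialDemand) { outstanding = initialDemand; drain(); }

    // UP request(n): add credits and resume emitting
    void request(long n) { outstanding += n; drain(); }

    // UP unsubscribe
    void unsubscribe() { cancelled = true; }

    private void drain() {
        while (!cancelled && outstanding > 0) {
            emitted.add(next++);       // DOWN onNext(value)
            outstanding--;
        }                              // stop and wait for more demand
    }

    public static void main(String[] args) {
        DemandGate gate = new DemandGate();
        gate.subscribe(3);             // DOWN onNext(1), onNext(2), onNext(3)
        gate.request(2);               // DOWN onNext(4), onNext(5)
        gate.unsubscribe();
        gate.request(10);              // ignored after cancellation
        System.out.println(gate.emitted); // [1, 2, 3, 4, 5]
    }
}
```

The interesting property is the pause between subscribe(3) and request(2): the stream is infinite, but nothing flows without demand.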

Bidirectional Streams

This would behave more like raw TCP or WebSockets.

The following example is contrived, but it is representative of the desire for messaging UP with event propagation DOWN across multiple subscriptions.

  • UP subscribe("user-events-XYZ", 100)
  • UP subscribe("data-updates", 100)
  • UP msg("eventA", "abc") // fire-and-forget a message
  • DOWN onNext("user-events-XYZ: eventA Completed")
  • UP msg("/do/something", "args")
  • DOWN onNext("user-events-XYZ: x-updated")
  • DOWN onNext("data-event: 8756-modified")

Possible Outcomes

Intended outcomes of this pursuit are:

  1. Discover there is already a solution for this and we can shut this down and use it.
  2. Decide we can't agree and we go off and build our own custom things.
  3. We determine this is a useful and newish thing, collaborate and build the above.

Artifacts

Following are artifacts envisioned from this collaboration during this first phase.

Network Protocol

This is expected to be purely a network protocol. Due to my ignorance I can't specify more, but I expect variations for:

  • binary and text (for example, into JavaScript apps it may be valuable to support text/JSON whereas interprocess Java/C/Go/etc would benefit from binary)
  • unidirectional and bidirectional transport layers (TCP vs HTTP/1 vs WebSockets vs HTTP/2 etc as transport layers)
  • how serialization and protocol negotiation should work

Ultimately the desire is for protocols to be defined that can work on top of TCP, HTTP/1, HTTP/2, WebSockets and possibly others like UDP.

Java Interfaces and Reference Implementation

Java interfaces exposing the various use cases via the Reactive Streams interfaces would be very powerful, allowing standard interop for Reactive Stream IO.

Nothing more than interface definitions is expected, though a reference implementation with unit tests to prove functionality should be included.
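As a rough illustration of what such interfaces might look like, here is a minimal sketch. Every name below (ReactiveConnection, requestStream, etc.) is invented, and java.util.concurrent.Flow stands in for the Reactive Streams types so the example is self-contained; the main method is a toy in-memory demonstration of the Scalar Request, Vector Response case:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

// Hypothetical interop surface for the use cases above (names invented).
interface ReactiveConnection {
    Flow.Publisher<String> requestResponse(String request);      // scalar -> scalar
    Flow.Publisher<String> requestStream(String request);        // scalar -> vector
    Flow.Publisher<String> channel(Flow.Publisher<String> in);   // bidirectional
    void fireAndForget(String message);                          // msg() above
}

public class InterfaceSketch {
    public static void main(String[] args) throws InterruptedException {
        List<String> seen = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        // In-memory stand-in for the server side of subscribe("names", 100)
        SubmissionPublisher<String> names = new SubmissionPublisher<>();
        names.subscribe(new Flow.Subscriber<String>() {
            public void onSubscribe(Flow.Subscription s) { s.request(100); } // UP request(100)
            public void onNext(String item) { seen.add(item); }              // DOWN onNext
            public void onError(Throwable t) { done.countDown(); }
            public void onComplete() { done.countDown(); }                   // DOWN onComplete
        });
        for (String n : List.of("Dave", "Tom", "Sarah")) names.submit(n);
        names.close();
        done.await();
        System.out.println(seen); // [Dave, Tom, Sarah]
    }
}
```

A real reference implementation would put a network protocol behind the same Publisher/Subscriber contract, which is exactly what makes the interfaces valuable for interop.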

JavaScript Interfaces and Reference Implementation

Similar desire as for Java above.

Network TCK

Along with the network protocol I would expect a test suite to validate implementations.

Moving Forward

As a first step I'd like to determine if there is sufficient interest and that this is not insane, completely naive and wrong, or reinventing something that already exists.

If we get through that part, I'll work with you all to create more concrete Github issues to start designing and making this happen.

@benjchristensen benjchristensen referenced this issue in reactive-streams/reactive-streams-jvm Feb 26, 2015

Closed

Polyglot Support #45

@benjchristensen (Contributor) commented Feb 26, 2015

cc @blesh as this relates to the discussion about involving JS in Reactive Streams reactive-streams/reactive-streams-jvm#45 (comment)

@benjchristensen (Contributor) commented Feb 26, 2015

Also related to this is the newly created repo for collaborating on Reactive Streams for JavaScript: https://github.com/reactive-streams/reactive-streams-js

@tmontgomery commented Feb 26, 2015

I am in. I'll help!

@jbrisbin

👍

@maniksurtani commented Feb 26, 2015

Interesting. Count me in.

@benlesh commented Feb 26, 2015

"In full disclosure, personally I am building systems that need polyglot, stream-oriented network interaction, primarily between Java and JavaScript in the near future. I prefer to collaborate and design the solution openly rather than reinvent yet another competing solution."

Ditto... but completely different app.

We have a primitive implementation in our project at Netflix (unrelated to @benjchristensen's project), but I don't think it's directly usable in a generic way in its current form.

Count me in.

@JakeWharton commented Feb 26, 2015

"binary and text (for example, into JavaScript apps it may be valuable to support text/JSON whereas interprocess Java/C/Go/etc would benefit from binary)"

Be very careful with this distinction; in fact, I think you want a regular content-type mechanism instead. One of the things WebSockets got horribly wrong is this arbitrary separation of text and binary frames with no additional metadata. I would stick to just saying data and allowing the request to negotiate the format of that data.
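One way to picture that suggestion, as a hypothetical sketch rather than any agreed design: a one-time setup exchange negotiates the content type, and every data frame afterward is just length-prefixed opaque bytes with no text/binary flag. The layout and names below are entirely invented:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DataFrame {
    // One-time setup payload carries the negotiated content type for the
    // whole connection (name and shape invented for illustration)
    record Setup(String dataMimeType) {}

    // Data frames are length-prefixed opaque bytes: no text/binary flag
    static ByteBuffer encode(byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(4 + data.length);
        buf.putInt(data.length);   // length prefix for framing
        buf.put(data);             // payload is always "just bytes"
        buf.flip();
        return buf;
    }

    static byte[] decode(ByteBuffer buf) {
        byte[] data = new byte[buf.getInt()];
        buf.get(data);
        return data;
    }

    public static void main(String[] args) {
        Setup setup = new Setup("application/x-custom+json"); // negotiated once
        byte[] payload = "{\"name\":\"Dave\"}".getBytes(StandardCharsets.UTF_8);
        byte[] back = decode(encode(payload));
        // Whether these bytes are JSON text or a binary encoding is the
        // content type's business, not the framing layer's.
        System.out.println(setup.dataMimeType() + ": "
                + new String(back, StandardCharsets.UTF_8));
    }
}
```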

@jbrisbin commented Feb 26, 2015

I agree with @JakeWharton about treating content as simply data and figuring out what to do with it based on the negotiated Content-Type. There's no benefit IMO to designating some payloads String and others SomethingElse[]. The need to do charset decoding in many situations means one must rely on the full power of the Content-Type.

I don't just mean the simple type and subtype, either. I mean the full capability, with x-custom+json and quality factors, etc.

@benjchristensen (Contributor) commented Feb 26, 2015

"I would stick to just saying data and allowing the request to negotiate the format of that data."

Great feedback @JakeWharton, and thanks for getting involved.

@danarmak commented Feb 26, 2015

I think to begin with, the scope of the project should be better defined. At one extreme, it could be a very barebones protocol that doesn't do anything beyond RS semantics, message framing and maybe multiplexing. At the other extreme, it could know about Content-Types, bidi-ness negotiation, publisher discovery and creation (e.g. 'give me a publisher that reads this file'), pushing events without subscriptions, optional features that depend on the underlying protocol...

@benjchristensen (Contributor) commented Feb 26, 2015

@tmontgomery Todd, how do you recommend we start defining this? I think you're the right one to start defining the network protocol. I opened #2 for that discussion to begin and would appreciate you taking a lead role in that.

@benjchristensen (Contributor) commented Feb 26, 2015

@danarmak What do you recommend the scope be defined as to achieve the types of use cases I mentioned?

@danarmak

@benjchristensen will reply in #3

@tmontgomery commented Feb 26, 2015

For a protocol, there is a nice tool that can be used to specify behavior and test against implementations: K3PO.

I.e. if you want a TCK-like test kit, specify your k3po scripts and any implementation can test against them.

@MarkusJais commented Feb 27, 2015

I like the idea of including TCP as a base protocol. Not everything has to be built on top of HTTP :-)

@experquisite commented Feb 27, 2015

I'm psyched; I'll be following this closely, if not helping out. I have been thinking about my need to do this in Scala; I was planning on using scodec to serialize my messages and then persist/replicate my reactive streams using Chronicle or perhaps Aeron. However, it would be much more convenient if subscription and back-pressure were all integrated.

@benjchristensen benjchristensen referenced this issue in reactive-ipc/reactive-ipc-jvm Mar 11, 2015

Open

Goals & Motivations for Reactive IPC #1

@moredip commented Mar 12, 2015

Would it make sense to build Rx stream semantics on top of a well-established base protocol, rather than starting from first principles?

ZeroMQ springs to mind. It's well established and seems to have good language support. An issue is that there doesn't seem to be any support for ZeroMQ over HTTP, and I'm not sure how baked the ZeroMQ over WebSocket spec is.

@jbrisbin commented Mar 12, 2015

@moredip I agree with you. When you look at the basic interaction patterns of TCP communication, they all fall into one of the categories implemented in ZMQ. IMO having PUSH/PULL and REQ/REP embedded at a level higher than the underlying transport library would make using those patterns much easier in composition libraries.

Someone will inevitably want to use REQ/REP in TCP using the Netty transport layer, so why not encode that interaction in a reusable way that is Reactive?

@moredip commented Mar 12, 2015

At a minimum, perhaps building a reference protocol implementation on top of ZeroMQ would be a cheap way to flesh out some ideas and/or get something concrete where some language-binding APIs could be played with.

@jbrisbin jbrisbin referenced this issue in reactive-ipc/reactive-ipc-jvm Mar 12, 2015

Closed

Reactive Streams Signal Processor #5

@jbrisbin commented Mar 12, 2015

I created an issue at reactive-ipc-jvm [1] to suggest ZeroMQ as a transport implementation in the Reactive IPC kernel. I think working backwards from the implementation there would suggest the appropriate ways of exposing those patterns in a more consumable way via RS.IO.

[1] - reactive-ipc/reactive-ipc-jvm#12

@danarmak commented Mar 12, 2015

ZMQ has some wonderful features: disconnected sockets (i.e. transparent recovery from underlying transport errors), framing (atomic message delivery), and transport independence. The first one would be a lot of work to specify and implement ourselves, and the second makes our protocol simpler.

However, ZMQ also leaves a lot of freedom for specifying e.g. the reliability semantics (queueing and blocking). I tend to think the RS.io protocol should standardize these as far as possible (and I hope that's very far), otherwise the spec will become either very complex or fragmented.

@jbrisbin how do you see PUSH/PULL or REQ/REP being used for an RS abstraction? Naively, it seems to me that only an async bidi message stream like DEALER/DEALER would work. With PUSH/PULL you'd need a separate socket for each direction. But I don't have a lot of experience with zmq patterns, so maybe I'm missing something.

@pidster commented Mar 13, 2015

Per @benjchristensen's original post, which explored the intended behaviour, would you consider whether the logical functions detailed in the 'Scalability Protocols' section of the Nanomsg homepage are of interest?

"The communication patterns, also called 'scalability protocols', are basic blocks for building distributed systems. By combining them you can create a vast array of distributed applications. The following scalability protocols are currently available:

PAIR - simple one-to-one communication
BUS - simple many-to-many communication
REQREP - allows to build clusters of stateless services to process user requests
PUBSUB - distributes messages to large sets of interested subscribers
PIPELINE - aggregates messages from multiple sources and load balances them among many destinations
SURVEY - allows to query state of multiple applications in a single go

Scalability protocols are layered on top of the transport layer in the network stack. At the moment, the nanomsg library supports the following transport mechanisms:

INPROC - transport within a process (between threads, modules etc.)
IPC - transport between processes on a single machine
TCP - network transport via TCP"

See also: Nanomsg vs ZeroMQ.

Offering a Reactive-based implementation of these would be interesting, because it would enable higher-level functionality to be assembled on top, e.g. HTTP on top of REQREP, or membership functions over SURVEY.

In this case the reactive processing mechanism is orthogonal to the choice of network protocol (and probably transport).

@tmontgomery commented Mar 13, 2015

I believe it is important to leave as little dependency on the underlying transport as possible. ZeroMQ or Nanomsg, while tremendously awesome, would add some dependencies that, while possibly convenient, limit the use cases slightly, especially for running over top of some useful protocols like CoAP, MQTT, etc. later. Excluding those by adding dependencies up front seems unnecessary and would paint us into a corner.

Dependency on the underlying transport protocol should be minimal for a well-designed protocol at this layer... not sure if that is Layer 5 or 7, actually.

Due to the nature of the use cases outlined, I think the transport can only be assumed to provide simplex, best-effort delivery.

This means the only assumptions would be:

  • the transport will attempt to recover lost data within a session invocation
  • the transport will provide at least one-way communication

Why not bi-directional? Some "protocols" like ZeroMQ, JMS, and most streaming messaging systems don't have easy-to-define bi-directional semantics. I believe it is possible to accommodate a lot of these systems by specifying a separation of "UP" and "DOWN", as @benjchristensen mentions, that can be mapped to two transport sessions or to the same one (TCP, for example, by utilizing bi-directionality).

The fewer dependencies required, the better a protocol we will have. A clear separation of concerns is going to make for a much more robust and simpler solution.
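The UP/DOWN separation described above can be sketched as follows; all names here are invented for illustration. The protocol layer only ever writes frames to an abstract one-way session, and the transport binding decides whether both directions map onto one duplex connection or onto two simplex ones:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Invented abstraction: the protocol layer only needs "send a frame toward
// the peer"; how that maps onto real connections is the binding's concern.
interface SimplexSession { void send(String frame); }

public class SessionMapping {
    final SimplexSession up;    // carries subscribe / request(n) / cancel
    final SimplexSession down;  // carries onNext / onError / onComplete

    SessionMapping(SimplexSession up, SimplexSession down) {
        this.up = up;
        this.down = down;
    }

    public static void main(String[] args) {
        // Case 1: one bi-directional session (e.g. a single TCP connection)
        Queue<String> tcp = new ArrayDeque<>();
        SessionMapping duplex = new SessionMapping(tcp::add, tcp::add);
        duplex.up.send("UP subscribe(\"increment\", 3)");
        duplex.down.send("DOWN onNext(1)");
        System.out.println(tcp); // both directions interleaved on one wire

        // Case 2: two simplex sessions, one per direction (e.g. JMS-style)
        Queue<String> upWire = new ArrayDeque<>();
        Queue<String> downWire = new ArrayDeque<>();
        SessionMapping simplex = new SessionMapping(upWire::add, downWire::add);
        simplex.up.send("UP subscribe(\"increment\", 3)");
        simplex.down.send("DOWN onNext(1)");
        System.out.println(upWire + " / " + downWire); // one wire per direction
    }
}
```

The protocol code above is identical in both cases, which is the point: keeping the binding behind a minimal interface is what lets the same semantics run over TCP, two message queues, or anything else.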

@rlankenau commented Mar 13, 2015

I agree with @tmontgomery regarding the separation of concerns and only relying on simplex communication, but the abstractions that @pidster posted are great.

I don't know that they all have to be presented as first-class actions in the protocol, but they provide a very nice way of considering systems that will use the protocol.

@danarmak commented Mar 13, 2015

@tmontgomery Using two sessions, one in either direction, tends to create network routing problems. Often A can connect to B, but not B to A, due to NAT or firewalling. Even if both A and B can connect to the other, when A connects to B, the messages B receives won't necessarily be seen to come from the public address it should use to connect back to A. A node may not even know its own public IP address or whether or not it has one (e.g. in an AWS VPC).

RS semantics require duplex communication: elements in one direction, demand the other way. How can you do RS over a single simplex channel? By requiring the underlying transport to handle backpressure, like TCP does? Or do you mean that it would always require two channels, but they might be separate?

Incidentally, glancing at the MQTT docs, it seems it's duplex?

"I think the transport can only be assumed to provide simplex, best effort delivery."
"the transport will attempt to recover lost data within a session invocation"

How can you track RS demand on a best-effort transport, without notifications of whether delivery actually occurred? The session might get stuck with the publisher waiting for more demand, and the subscriber waiting for more items, because the earlier items were lost in transmission.

@rkuhn (Member) commented Mar 13, 2015

@danarmak It would of course be trivial to implement RS semantics on a lossless duplex transport, but as far as I understand this is not the only goal of this effort. It is meaningful to layer RS semantics on top of a transport that is by itself not yet reliable or back-pressured—I’d venture to say that this is the more interesting goal.

@danarmak

This comment has been minimized.

Show comment
Hide comment
@danarmak

danarmak Mar 13, 2015

@rkuhn could you please present a motivating use case of a transport that is not lossless and/or not duplex and why it's needed to use it?

The one obvious example from my personal experience is HTTP/1, which is not fully duplex (and apparently CoAP which has similar semantics). I listed some reasons to use HTTP at the start of #6: it can be used from inside browsers, the server components can be hosted in existing HTTP frameworks alongside other software, and it's already integrated with SSL, authentication etc.

Are there other prominent non-duplex use cases?


@rkuhn
Member

rkuhn commented Mar 13, 2015

@danarmak Yes, the prototypical example is UDP: not having to maintain a lot of connection-based state in network endpoints and intermediate components can be imperative; there are limits of scale implied by using TCP as the basis for everything you do in a large system.


@tmontgomery

tmontgomery commented Mar 13, 2015

@danarmak I was thinking of the semantics on top of MQTT, i.e. basic simplex messaging operation. I forgot it actually has request/response. My fault. I blame my cold. :)

I wasn't thinking of the direction of connectivity being an issue. I.e., connectivity can happen in either direction as needed by the network, and data flow can happen in whatever direction is needed. In this way, NAT traversal would not be an issue. I.e. both TCP sessions, for example, would connect in the same direction, but flow would go in each respective direction. TCP is a bad example, since it wouldn't be necessary to have two connections; they can use bi-directionality.

@rkuhn I would be OK with not back-pressured. Non-reliable is problematic.

HTTP/1 and HTTP/2 (depending on how it is mapped) are ones that are problematic to handle. HTTP/1 will require 2 TCP connections as you point out in #6 . So might HTTP/2 depending on implementation limitations. Any messaging system using JMS would have to map onto topics. By extension, CoAP would be another one. Although, it does have Request/Response as well. But it might be simpler to map it as two publish/subscribes instead.


@danarmak

danarmak commented Mar 13, 2015

@tmontgomery in simplex protocols, which side initiates the connection is usually tied to the directionality, i.e. which side can send messages. In a req/rep model like HTTP/1, the client would have to poll constantly for server messages, which is of course inefficient. So for real duplex transports we would still need to specify a different protocol.

@rkuhn with UDP, how do you propose to handle the routing issue?

The most general option is for the initiator to tell the other side, in its first message, what address to send replies to. If it doesn't know, it defaults to saying 'null', in which case the second party sends to the address it sees the messages as coming from.

But it's hard, sometimes impossible, to figure out dynamically what address to specify. And with a transport like UDP where there are no delivery acks, if the initiator isn't getting replies, it can't even tell if it's a routing problem in the forward direction, or in the other direction, or if the other party is just down.

Or we could punt it to the user and not worry about it, if they want to use routed UDP for bidi communications, it's on their own head :-)
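A minimal sketch of the "reply address in the first message" option, with all names hypothetical: the initiator may name an explicit return address, or leave it null, in which case the responder replies to the source address it observed.

```java
import java.net.InetSocketAddress;

// Hypothetical first-frame layout for the reply-address idea above.
// A null replyTo means "reply to wherever you saw this frame come from".
public class SetupFrame {
    final InetSocketAddress replyTo; // null => use observed source address

    SetupFrame(InetSocketAddress replyTo) { this.replyTo = replyTo; }

    /** The address the responder should actually send replies to. */
    InetSocketAddress effectiveReplyAddress(InetSocketAddress observedSource) {
        return replyTo != null ? replyTo : observedSource;
    }

    public static void main(String[] args) {
        InetSocketAddress observed = new InetSocketAddress("192.0.2.10", 40000);
        // Initiator knows its public address and states it explicitly:
        SetupFrame explicit = new SetupFrame(new InetSocketAddress("198.51.100.7", 9000));
        System.out.println(explicit.effectiveReplyAddress(observed));
        // Initiator behind NAT can't know it: defaults to the observed source.
        SetupFrame defaulted = new SetupFrame(null);
        System.out.println(defaulted.effectiveReplyAddress(observed));
    }
}
```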


@tmontgomery

tmontgomery commented Mar 13, 2015

@danarmak I'm not a fan of long polling. But just playing devil's advocate, wouldn't HTTP streaming be a way to do the backchannel? Inefficient, yes, but better than some options.

For UDP, the receiver can send back to the sender's IP and port, as DNS and other protocols do. This is how DNS traverses firewalls.
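The reply-to-observed-source behavior can be demonstrated over loopback with plain DatagramSocket (a toy demo; real NAT behavior obviously depends on the middleboxes in the path):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Loopback demo: the server needs no prior state about the client; it
// simply replies to the IP and port the datagram arrived from, the same
// mechanism that lets DNS answers traverse stateful firewalls.
public class UdpReplyDemo {
    public static String roundTrip(String msg) throws Exception {
        try (DatagramSocket server = new DatagramSocket(0);
             DatagramSocket client = new DatagramSocket(0)) {
            client.setSoTimeout(2000);
            byte[] out = msg.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length,
                    new InetSocketAddress("127.0.0.1", server.getLocalPort())));

            byte[] buf = new byte[1024];
            DatagramPacket req = new DatagramPacket(buf, buf.length);
            server.receive(req);
            // Reply to wherever the datagram came from -- no prior state needed.
            server.send(new DatagramPacket(req.getData(), req.getLength(),
                    req.getSocketAddress()));

            DatagramPacket resp = new DatagramPacket(new byte[1024], 1024);
            client.receive(resp);
            return new String(resp.getData(), 0, resp.getLength(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("ping")); // echoes the message back
    }
}
```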


@danarmak

danarmak commented Mar 13, 2015

We've been discussing this from two directions at once: on the one hand, how easy different transports are to support; on the other hand, the use cases that require specific transports. Should one consideration trump the other?

What are the protocols that you all think we must support even at very high cost, and not just if it's easy and neat? (For myself, lossless duplex and HTTP/1 would be enough.)

Conversely, at what point does the scope or difficulty become too great, so the project would likely stall or last too long, and so we should not try to support such scenarios? Would we know it without trying out a protocol draft / implementation? (For myself, I don't have a good idea.)

Finally, should we try to release and implement a restricted, focused first version quickly, and plan to support other usecases later? Assuming the protocol would be specified a bit differently for different transports in any case.


@tmontgomery

tmontgomery commented Mar 13, 2015

I would be OK with a lossless duplex first version (suitable for TCP, WebSocket, and possibly HTTP/2). But would like to see us extend to HTTP/1 and other not-fully-duplex transports (like Aeron) afterward. My motivation is that I want to run this over Aeron. But realistically, I know that is a minor use case for many.


@danarmak

danarmak commented Mar 13, 2015

@tmontgomery with HTTP request+response streaming, I feel you'd give up some of HTTP's semantics. For instance, if there were a network error (as there eventually will be for a very long-lived connection), you wouldn't know which messages were received. So you'd have to implement explicit recovery/negotiation semantics instead of just mapping RS messages to HTTP ones. This is tricky if the client sends a lot of demand messages but doesn't actually track what the total current demand should be.

(Actually, my own #6 has the same problem in miniature, because it allows for message batching. I'll have to add a note that the server must buffer the whole batch and resend it if it's not delivered safely the first time.)


@tmontgomery

tmontgomery commented Mar 13, 2015

Doesn't TCP have the same problem, though? A TCP connection can be cut in the middle of a frame. It would seem to be the same issue. Maybe there is a subtlety I am missing...


@danarmak

danarmak commented Mar 13, 2015

I thought that with UDP, the only reason services like DNS work is that the firewall (and all routers along the way) allow them specifically and track request/reply state. Or, more commonly, there's a DNS server inside the local network. Or DNS over TCP is used.

If a host doesn't even have a public IP address, how could a reply UDP packet possibly reach it, no matter what address it's sent to?


@tmontgomery

tmontgomery commented Mar 13, 2015

Usually, though this is not the case with all firewalls or configs, the outbound UDP packet creates forwarding state in the router that has a lifetime. If the router doesn't see a UDP message back, or more UDP packets sent out to the same destination, then it deletes the forwarding state. However, if it sees a UDP packet back, it forwards it on to the original source. Notice the firewall can just use its own IP and manage its own UDP port, just as it would with a TCP connection.

Lots of UDP-based games would not work if this didn't work...

A local DNS cache (like one inside the firewall) still normally uses UDP for outbound queries. TCP is normally only used (my experience may be dated, though) for zone transfers and other bulk requests.

UDP is treated exactly like TCP wrt IP address and port usage.

NB: this assumes UDP is allowed through the firewall.


@danarmak

danarmak commented Mar 13, 2015

@tmontgomery It's true, TCP has a similar problem. Which is why #4 doesn't specify recovery in case the connection is broken, because there's no way for each side to know for sure what data the other side received before the error. To overcome this we'd need to add something like explicit numbering of the frames, so after reconnecting each side could tell the other which frames it received. (Tempting to just use ZMQ which already implements frame re-delivery...)

Also, each side needs to buffer the messages it sent until the other side acknowledges them, in case it needs to resend. These explicit acks should be included in other messages and also in a new message type in case there's no other message going that way.

I didn't include this in #4 because it adds a lot of complexity. Plenty of TCP-based protocols use long-lived sessions without recovery support, like SSH for instance, and I thought we could get away with it too, but maybe there are use cases where it's important to have. What do you think?

My point about HTTP was that one of the reasons to use it instead of TCP was to reuse its semantics, like framing messages and only acting on completely delivered ones. HTTP streaming is a step back in that sense. But many other HTTP features remain, so it might well still be better than using websockets.
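The frame-numbering-plus-ack scheme sketched above could look roughly like this (hypothetical API, not part of any draft): frames carry sequence numbers, the sender retains everything un-acked, and a cumulative ack releases the buffer.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical resend buffer for session recovery: each frame gets a
// sequence number and stays buffered until the peer's cumulative ack
// covers it. After a reconnect, the remaining frames are retransmitted.
public class ResendBuffer {
    private final SortedMap<Long, byte[]> unacked = new TreeMap<>();
    private long nextSeq = 0;

    /** Assign a sequence number and retain the frame until acked. */
    long send(byte[] frame) {
        long seq = nextSeq++;
        unacked.put(seq, frame);
        return seq;
    }

    /** Cumulative ack: the peer has everything up to and including `seq`. */
    void ack(long seq) {
        unacked.headMap(seq + 1).clear();
    }

    /** After reconnecting, these frames must be retransmitted in order. */
    Iterable<byte[]> framesToResend() {
        return unacked.values();
    }

    int pending() { return unacked.size(); }

    public static void main(String[] args) {
        ResendBuffer buf = new ResendBuffer();
        buf.send("a".getBytes());
        buf.send("b".getBytes());
        buf.send("c".getBytes());
        buf.ack(1);                        // peer confirmed frames 0 and 1
        System.out.println(buf.pending()); // only "c" is left to resend
    }
}
```

Carrying acks piggybacked on other messages, as suggested above, would just mean calling `ack` whenever any inbound message arrives.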


@danarmak

danarmak commented Mar 13, 2015

@tmontgomery If we can just assume UDP (and any other transport under consideration) is routed correctly both ways, and not address routing at all in the protocol, that would be great. Presumably this depends on the usecases involved.

@rkuhn using UDP instead of TCP for performance implies allowing some messages to be lost and not retransmitted. But then how do you calculate demand correctly, assuming the goal is to layer any valid RS traffic over RS.io? If a stream element (onNext) is lost, the session would deadlock, with the publisher waiting for more demand and the subscriber waiting for more items.


@tmontgomery

tmontgomery commented Mar 13, 2015

I think we should consider whether recovery across transport sessions is in scope... my opinion is that it is a higher-level concern, i.e. it shouldn't be handled in RS.io. It's best handled closer to the application, where semantics can be integrated more cleanly. Doing it generically for all applications is a nightmare.


@danarmak

danarmak commented Mar 13, 2015

I agree. I just have a nagging suspicion that long-lived HTTP streaming is likely to induce errors due to HTTP proxies or HTTP-aware firewalls applying default timeout rules and such, if both sides don't transmit anything for a while.


@rkuhn
Member

rkuhn commented Mar 13, 2015

As you guys have mentioned already, there are very different transport mechanisms that we might consider, and I concur that starting out with a relatively simple one is probably a good strategy. There is a cost to using a fairly complex protocol like TCP or even HTTP, though, in that these address some concerns that overlap with what we want to achieve. It might boil down to the good old difference between ease and simplicity. To answer your point about UDP’s lossy nature, TCP’s head-of-line blocking and the “TCP incast” problem illustrate why building upon a simpler foundation with different trade-offs than TCP might be worthwhile. Another consideration is that streaming use-cases that do not cross network boundaries (firewalls, NAT, “the internet”) are numerous enough to warrant a closer look.


@tmontgomery

tmontgomery commented Mar 13, 2015

You are right, they will. Hostile intermediaries will terminate HTTP sessions they "think" are hung. WebSocket deployments have shaken a lot of that out, and HTTP/2 will shake out more of them. But they exist. In that case, I think it should be treated as a transport session being cut unexpectedly and handled by the application.

At the end of the day, the application is the one that is dealing with its own semantics. What does the end of data look like? Have I processed this before? etc.


@pidster

pidster commented Mar 13, 2015

So I was trying to ask a question; I will try to be more specific.

I read the initial post to include some kind of pub/sub pattern.

Is the intent to offer reactive implementations of certain patterns, or various functions over HTTP? Or is the intent to offer functionality over a given transport?



@danarmak

danarmak commented Mar 13, 2015

@rkuhn I'm afraid I don't understand what you're proposing. Is it to build our own ack mechanism on top of UDP, but more lightweight than TCP? If so, wouldn't one of the existing UDP-based protocols be a better starting point?


@danarmak

danarmak commented Mar 13, 2015

@tmontgomery some RS use cases don't make it easy to tell the publisher where to restart the stream, or at least that requires explicit support in the publisher implementation (and also getting the needed arguments to the publisher factory). If the underlying connection is often broken, RS.io over HTTP won't work as a generic RS transport in some scenarios.

If the suggested solution in those cases is to use something else like websockets, then I ask myself if people in practice would always use websockets as a matter of habit once they notice HTTP is sometimes unreliable.


@experquisite

experquisite commented Mar 13, 2015

@tmontgomery just FYI, I too am interested in Rx over Aeron, or some other chronicled reliable UDP.


@rkuhn
Member

rkuhn commented Mar 13, 2015

@danarmak You are understanding correctly, although I am not proposing any specific solution. I am just raising possible use-cases in order to determine the scope of this effort.


@viktorklang

viktorklang commented Mar 15, 2015

Sorry for being late to the party, but I brought beer....

How about we take a little step back and start by defining goals, prioritizing them, and then see where the scope ends up; then we can start looking at solutions to address it.


@danarmak

danarmak commented Mar 15, 2015

I think the core use case is "connect any existing RS publisher and subscriber across an IP network", which needs at least these features:

  1. Carry any valid RS traffic, don't change the RS semantics.
  2. Multiplex multiple RS streams over a single transport (if using a non-multiplexing transport), because some usecases involve subscribing to 100s/1000s of 'slow' streams and opening e.g. 1000s of sockets is wasteful.
  3. Either allow for protocol extensions of some kind, or specify more (possibly optional) features, e.g. for discovery of publishers. Otherwise most deployments will need to communicate out of band as well.
  4. Transport support is a bit controversial. Personally I think supporting TCP is necessary because it's the only IP-based transport that can guarantee bidi traffic will work correctly on every possible network configuration. UDP-based solutions require stateful firewalls on every hop, and even then some firewalls are configured to block returning UDP traffic.

4b. If we support TCP, we might as well generalize to all bidi duplex streams, such as pipes.

4c. It's also possible to specify use of some other protocol that can run on top of TCP, such as ZMQ, which provides transparent reconnection and native multiplexing and framing. But then, if we support non-TCP transports that ZMQ doesn't, we either forfeit those features or have to reimplement them.

Other proposed transports/usecases have been:

  5. HTTP and/or WebSockets, to allow integration with existing HTTP stacks and use from browsers. Personally, I like the idea that WebSockets are enough: they're a bidi duplex bytestream transport which can be used from HTTP servers and browsers, and is preceded by HTTP negotiation which can include authentication etc.
  6. UDP-based transports which are more efficient or scalable than TCP in some use cases. Since RS streams can't be lossy or out-of-order and have to calculate correct demand, we'd need to implement these features on top of UDP or use a protocol that does, while still being more efficient than TCP.
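For example, the multiplexing feature could be served by a minimal frame header that tags each RS signal with a stream id so many logical streams share one connection. The layout and signal names below are purely illustrative, not a proposed wire format.

```java
import java.nio.ByteBuffer;

// Illustrative multiplexing frame: [streamId:int][type:byte][len:int][payload].
// Hypothetical signal types mirroring RS semantics.
public class MuxFrame {
    static final byte ON_NEXT = 0, ON_COMPLETE = 1, ON_ERROR = 2,
                      REQUEST_N = 3, CANCEL = 4;

    static ByteBuffer encode(int streamId, byte type, byte[] payload) {
        return (ByteBuffer) ByteBuffer.allocate(4 + 1 + 4 + payload.length)
                .putInt(streamId)        // which logical stream
                .put(type)               // which RS signal
                .putInt(payload.length)  // payload framing
                .put(payload)
                .flip();
    }

    /** A demultiplexer reads the stream id to route the frame. */
    static int decodeStreamId(ByteBuffer frame) {
        return frame.getInt(0);
    }

    public static void main(String[] args) {
        ByteBuffer f = encode(42, ON_NEXT, "hello".getBytes());
        System.out.println(decodeStreamId(f)); // routes to logical stream 42
    }
}
```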


@benjchristensen

benjchristensen Mar 18, 2015

Contributor

> I agree. I just have a nagging suspicion that long-lived HTTP streaming is likely to induce errors due to HTTP proxies or HTTP-aware firewalls applying default timeout rules and such, if both sides don't transmit anything for a while.

Yes, this can happen, but I almost certainly do need this protocol over an HTTP transport such as WebSockets or HTTP/2 for external use. Internally I intend to use whatever is best (TCP, UDP, Aeron, etc.).

> I think it should be treated as a transport session being cut unexpectedly and handled by the application.

I agree with this. This is how we deal with SSE today, where we have streams open 24/7.

Reactive Streams semantics with request(n) behavior should help solve an issue we've struggled with using SSE: buffer bloat in proxies. For example, Amazon ELBs do not work well for SSE because the producer can fill the ELB buffers. We have seen ELBs buffer up to 15 minutes' worth of one of our streams and then "blow up" as we overwhelmed their memory. We had to stop using ELBs for SSE streams of this nature. Putting backpressure into the application-level semantics using RS.io should help solve this. Then the only issue should be occasional disconnects, and that's fine for our application to deal with, as we must always handle that anyway.

> I like the idea that WebSockets are enough

If HTTP/2 can't work but WebSockets do, that is sufficient for me to achieve my use cases. We need to address HTTP connectivity, though, as external communication (WAN, over the internet) effectively requires it to get through firewalls, NATs, etc.
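The request(n) behavior referred to above can be illustrated with the JDK's `java.util.concurrent.Flow` types, which mirror the Reactive Streams interfaces. This is a local stand-in sketch (RS.io would carry the same demand signal over the network): the publisher emits only what the subscriber has requested, so nothing upstream can flood an intermediary buffer. The class names are assumptions for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

// Hypothetical sketch: a synchronous Publisher that emits only when the
// subscriber has signalled demand via request(n) -- the application-level
// backpressure that keeps intermediary buffers (e.g. an ELB) from filling.
class DemandBoundedPublisher implements Flow.Publisher<Integer> {
    private final int total;

    DemandBoundedPublisher(int total) { this.total = total; }

    @Override
    public void subscribe(Flow.Subscriber<? super Integer> sub) {
        sub.onSubscribe(new Flow.Subscription() {
            int next = 0;

            @Override
            public void request(long n) {
                // Emit no more than the demand just signalled.
                for (long i = 0; i < n && next < total; i++) {
                    sub.onNext(next++);
                }
                if (next == total) sub.onComplete();
            }

            @Override
            public void cancel() { next = total; }
        });
    }
}

public class BackpressureDemo {
    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        new DemandBoundedPublisher(100).subscribe(new Flow.Subscriber<Integer>() {
            public void onSubscribe(Flow.Subscription s) {
                s.request(5); // initial demand: at most 5 elements in flight
            }
            public void onNext(Integer item) { received.add(item); }
            public void onError(Throwable t) { t.printStackTrace(); }
            public void onComplete() { }
        });
        // Only 5 of the 100 available items were emitted, because no
        // further demand was signalled -- the rest stay at the source
        // instead of piling up in an intermediary.
        System.out.println(received.size()); // prints 5
    }
}
```

With SSE the producer pushes unconditionally, so the equivalent of the other 95 items would sit in the proxy's buffer; with request(n) composed end to end, they never leave the producer.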

@benlesh

benlesh Mar 19, 2015

Why would you have different results buffering WebSockets vs. SSE? How do ELB buffers treat WebSockets differently?

@benjchristensen

benjchristensen Mar 19, 2015

Contributor

I have never run WebSockets through an ELB; for one thing, ELBs don't support them. The example I was giving was that the application-level semantics of request(n) would compose through proxies, so buffer bloat wouldn't occur. SSE is just a firehose without application-level backpressure, so it is vulnerable to buffering.

@pk11

pk11 Apr 7, 2015

We are running WebSockets behind ELB in production (using the TCP backdoor) and while we ran into all sorts of issues, we did not notice any buffering.

As per this doc:

HTTP/S:

> When you use HTTP (layer 7) for both front-end and back-end connections, your load balancer parses the headers in the request and terminates the connection before re-sending the request to the back-end instance(s). This is the default configuration provided by Elastic Load Balancing.

TCP/SSL:

> When you use TCP (layer 4) for both front-end and back-end connections, your load balancer forwards the request to the back-end instances without modification to the headers. After getting the request, your load balancer attempts to open a TCP connection to the back-end instance on the port specified in the health check configuration. If the load balancer fails to connect with the instance at the specified port within the configured response timeout period, the instance is considered unhealthy.

As for whether HTTP/2 or WebSockets should be used for the underlying protocol: personally, I would recommend trying to make HTTP/2 work first. Obviously, requirements may vary, but WebSockets require a bigger infrastructure investment upfront.

@joelhandwell

joelhandwell Jul 18, 2018

What is the status of this now? Is it still actively being discussed elsewhere?
