
Add WebSocket and SSE options #1272

Closed
wants to merge 7 commits into from

Conversation

@ohler55
Contributor

ohler55 commented Apr 28, 2018

This suggested addition is the result of a collaboration between @boazsegev of Iodine fame and myself. Both Agoo and Iodine will support these suggested additions. At this point I don't think any other servers are supporting WebSockets and SSE so this would set the standard for others to follow.

@ioquatix
Member

ioquatix commented May 1, 2018

Interesting.

I already support completely transparent asynchronous web sockets in falcon. It uses rack.hijack just fine. An example is given here: https://github.com/socketry/async-websocket/blob/master/examples/chat/config.ru

Of course with this model supporting SSE is also trivial.

Regarding the event driven model which is being exposed via rack.upgrade: I don't think exposing an event driven API is a great way to make systems asynchronous. It's kind of limiting.

Regarding the overall design - careful thought needs to be given to how HTTP/2 fits into this picture. Ideally, SSE, WebSockets, and other "interactive" streams, transit via reading from the request body and writing to the response body. That's the model which I feel works best with HTTP/2, where each request/response is encapsulated into a full duplex stream.
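The duplex model described here can be sketched in plain Rack terms. This is an illustrative sketch, not code from the thread: the app reads from env['rack.input'] and returns a body that streams chunks through #each, without ever touching the socket.

```ruby
# Illustrative sketch of the "full duplex stream" model: the app only reads
# the request body and returns a streaming response body. The server owns
# the socket, whether that socket is HTTP/1 or an HTTP/2 stream.
streaming_app = lambda do |env|
  request_body = env["rack.input"].read # may block until data arrives

  # Any object responding to #each is a valid Rack body; the server writes
  # each chunk to the peer as it is yielded.
  body = Enumerator.new do |yielder|
    yielder << "echo: #{request_body}\n"
    yielder << "done\n"
  end

  [200, { "Content-Type" => "text/plain" }, body]
end
```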

@ioquatix
Member

ioquatix commented May 1, 2018

Bear in mind that async-websocket doesn't support web sockets over HTTP/2; this is something I'll investigate in the coming months.

This is also one of the challenges of implementing rack.hijack for HTTP/2. You need to be careful of what it means, exactly, to hijack a connection (or stream in the case of HTTP/2).

@boazsegev

boazsegev commented May 1, 2018

I think the HTTP/2 layer is something that's handled only by the server. It shouldn't change the rack.upgrade semantics since it isn't one of the protocols / behaviors that should be handled by the application.

I think the idea is great because it decouples the application from any network logic.

While hijack solutions remove the server, changing the relationship from network<=>server<=>application to network<=>application, this approach keeps the application away from the network, keeping the initial (and desired) relationship intact.

EDIT:

I also published a blog post about this PR and tried to explain this difference in a Reddit thread. I'm not sure how well I explained my thinking on the matter, but I think it's a wonderful approach.

@ioquatix
Member

ioquatix commented May 1, 2018

@boazsegev While I see where you are coming from, I think that your semantic model sits at a much higher level than what is exposed by HTTP.

Given the direction of HTTP/2, it's clear to me that the base HTTP semantic model is a full-duplex un-buffered stream of chunks. Whether you implement this with multiple HTTP/1 connections or HTTP/2 streams (which to me is the network layer), is irrelevant to the application.

The application simply sees a full duplex stream of chunks. That's the semantic model for the application. On top of this you can implement your desired web sockets, sse, and other interesting things, like streaming responses, and so on.

Your semantic model has a tight coupling with your desired application level protocols, but I think that's a very limiting design. For example, you can't implement general streaming responses like https://github.com/socketry/falcon/blob/master/examples/beer/config.ru (the fact it uses async is irrelevant, you could just as easily do it with threads).

@ioquatix
Member

ioquatix commented May 1, 2018

I think I should mention this discussion here since it seems relevant: #1148

@ohler55
Contributor Author

ohler55 commented May 1, 2018

I'd like to point out that Rack is event based as are the WebSocket and SSE API/specs. The Rack hijack offers a streaming back door that is not addressed by the spec. This PR attempts to formalize an approach to using WebSocket and SSE that is consistent with the use pattern common in modern web applications with JavaScript front ends. It is not an attempt to get rid of the hijack option. There is certainly reason to continue support for hijacking but that should not be a reason to reject this PR.
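The callback style the PR proposes can be sketched as follows. This is a hedged illustration, not code from the PR itself: EchoHandler and FakeClient are hypothetical names, and in practice the client object would be supplied by the server.

```ruby
# Hedged sketch of the rack.upgrade callback pattern. The handler never
# touches a socket; the server invokes the on_* callbacks as events occur.
class EchoHandler
  def on_open(client)
  end

  def on_message(client, data)
    client.write("echo: #{data}") # echo the payload back to the peer
  end

  def on_close(client)
  end
end

# Illustrative stand-in for whatever client object a server would provide:
class FakeClient
  attr_reader :sent

  def initialize
    @sent = []
  end

  def write(data)
    @sent << data
  end
end

# The app opts in by assigning a handler during #call, if the server
# advertises support via env["rack.upgrade?"]:
upgrade_app = lambda do |env|
  env["rack.upgrade"] = EchoHandler.new if env["rack.upgrade?"]
  [200, {}, []]
end
```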

@boazsegev

boazsegev commented May 1, 2018

@ioquatix , you're totally right, I can't implement that code... nobody can unless they implement Async::Task and add that as a Rack dependency.

As for streaming solutions: SSE is a streaming solution.

If you refer to video streaming or binary data requirements, WebSockets are perfect for that.

If you meant old-school HTTP streaming... yes, that's true. I'm not even sure that I should support it, but I'm aware that I don't. In fact, I'm pretty sure the Rack model wouldn't be ideal for streaming anyway.

@ioquatix
Member

ioquatix commented May 1, 2018

I'd like to point out that Rack is event based

I'm not convinced by this

If anything, rack is based on composable middleware. There is no event handling.
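The composable-middleware point can be illustrated with a minimal (hypothetical) stack: each layer wraps the next, and #call flows linearly down and back up with no event dispatch anywhere.

```ruby
# Minimal illustration of Rack's composable-middleware model. Upcase is a
# hypothetical middleware that post-processes the inner app's body.
class Upcase
  def initialize(app)
    @app = app
  end

  def call(env)
    # Control flows linearly: down into the inner app, then back up.
    status, headers, body = @app.call(env)
    [status, headers, body.map(&:upcase)]
  end
end

inner = ->(env) { [200, { "Content-Type" => "text/plain" }, ["hello"]] }
wrapped = Upcase.new(inner)
```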

The Rack hijack offers a streaming back door that is not addressed by the spec

Sorry, but this is complete dump trucks:

rack/SPEC

Lines 87 to 96 in 154ac52

<tt>rack.hijack?</tt>:: present and true if the server supports
connection hijacking. See below, hijacking.
<tt>rack.hijack</tt>:: an object responding to #call that must be
called at least once before using
rack.hijack_io.
It is recommended #call return rack.hijack_io
as well as setting it in env if necessary.
<tt>rack.hijack_io</tt>:: if rack.hijack? is true, and rack.hijack
has received #call, this will contain
an object resembling an IO. See hijacking.

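For context, here is an illustrative sketch (not code from the thread) of what full hijack under the spec quoted above asks of an application: after calling env['rack.hijack'], the app owns the raw IO and must write the status line, headers, and any framing itself.

```ruby
# Hedged sketch of a full rack.hijack application. Once hijacked, the app
# speaks raw HTTP/1.1 on the socket; the returned triple is ignored.
hijack_app = lambda do |env|
  io = env["rack.hijack"].call

  # The application is now responsible for the wire protocol itself:
  io.write("HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\n")
  io.write("hello from the raw socket\n")
  io.close

  [200, {}, []] # ignored once the connection has been hijacked
end
```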
This PR attempts to formalize an approach to using WebSocket and SSE that is consistent with the use pattern common in modern web applications with JavaScript front ends

This is the guts of the matter. Why is this necessary? It's clearly already possible to do it - so we have to ask the question, do we need to formalise this in rack, or is there something simpler that we could formalise which allows more possibilities. We should accept or reject this PR based on the merits of what it adds to Rack, as a specification. I think there are better, more generic, options.

@ioquatix
Member

ioquatix commented May 1, 2018

@ioquatix , you're totally right, I can't implement that code... nobody can unless they implement Async::Task and add that as a Rack dependency.

I don't think I proposed that anywhere. I just said, with the current Rack spec, you can do it, and it scales well. We don't need to change the rack spec to have a great implementation of web sockets.

As for streaming solutions: SSE is a streaming solution.

It only works in one direction and it only works in a very specific context. It's not general response streaming, as in body.each {|chunk| ... peer.write(chunk) ...}

If you refer to video streaming or binary data requirements, WebSockets are perfect for that.

If you meant old-school HTTP streaming... yes, that's true. I'm not even sure that I should support it, but I'm aware that I don't. In fact, I'm pretty sure the Rack model wouldn't be ideal for streaming anyway.

The Rack model is great for streaming responses. That's the whole point of the response body responding to each.

@ioquatix
Member

ioquatix commented May 1, 2018

By the way, I'm not trying to shit all over your PR, although it probably feels like it. Congratulations for making something, and implementing it and proposing it. Basically, I just don't see the logic of it though. I think there are better things in Rack that need to be addressed, and I think the direction of HTTP/2 confirms this.

@ohler55
Contributor Author

ohler55 commented May 1, 2018

Of course Rack is event based. The trigger for all Rack code is the #call(env) callback where env is the encapsulation of the event that has occurred. That event being an HTTP request. After that you are correct, the handling is based on a middleware model.

@boazsegev

boazsegev commented May 1, 2018

I'm paraphrasing slightly to focus on your question, forgive me and correct me if I misunderstood:

Why is this necessary? ... is there something simpler that we could formalise which allows more possibilities.

I believe this is necessary because hijack is complex.

There's a difference between simple and easy. hijack is easy - but not simple.

It might seem simple for Rack and maybe even simple in the server, but it's a super complex solution that requires applications to add IO handling code.

The hijack pattern is an anti-DRY and anti-SOLID pattern that breaks away from object oriented approaches and requires applications to become familiar with the networking layers.

Sure, we might need to keep hijack around because some things are just impossible without it, but that doesn't make it the best solution for everything.

On the other hand, the callback approach in this PR is simple, but it's clearly not easy.

Servers will have to put in some work (less work than applications need to put in for hijack, but still).

However, the solution is simple and follows both DRY and SOLID principles.

The callback approach isn't a generic solution, it's just good for things that lend themselves to using callbacks, but this is the advantage of this approach.

IMHO, this is a perfect fit for everything callback related, and it's extensible (if WebSockets 2.0 comes around, nothing needs to be changed on the Rack/application side, just the server).

@boazsegev

boazsegev commented May 1, 2018

@ioquatix , I was looking through the falcon repo and you seem to have done great work there.

I understand your concerns and your wish to have a better solution. It seems to me that you yourself put a lot of time and effort into improving the state of affairs and making things easier and better for everyone.

I appreciate your approach and I find myself slightly surprised by the challenges you raise against this PR.

It seems to me that you're using similar semantics with different method names (i.e., your WebSocket next_message seems to block the current thread, allowing other threads to handle other server events until a new message becomes available... or maybe I misunderstood).

Sure, my implementation is evented and your implementation is multi-threaded, but both lend themselves to the same approach - application side "callbacks".

Your implementation could just as easily remain threaded and hide the next_event loop away so it calls the correct callback.

I'm not sure I understand your reservations.

@ioquatix
Member

ioquatix commented May 1, 2018

I believe this is necessary because hijack is complex.

Those are all really good points and I appreciate your thinking.

To be honest, in falcon, implementing your proposed API wouldn't be more than about 100 lines of code, perhaps less. So, it's not complicated to implement, because falcon already has a well defined and good concurrency model. That's the context in which I'm thinking about your PR, by the way.

Whether the complexity is in the server, or the application code, is an interesting point and one that I can agree with to a certain extent. I don't think there is a strong argument either way.

@ioquatix , I was looking through the falcon repo and you seem to have done great work there.

Thanks, I can tell we are going to get along well now.

I understand your concerns and your wish to have a better solution. It seems to me that you yourself put a lot of time and effort into improving the state of affairs and making things easier and better for everyone.

Yes, but I still haven't made a 1.0 release yet. It's coming soon hopefully.

I appreciate your approach and I find myself slightly surprised by the challenges you raise against this PR.

Fair enough.

It seems to me that you're using similar semantics with different method names (i.e., your WebSocket next_message seems to block the current thread, allowing other threads to handle other server events until a new message becomes available... or maybe I misunderstood).

That's almost right. Async doesn't use threads, it uses fibers which are cooperatively scheduled. They have much less overhead than threads.

Sure, my implementation is evented and your implementation is multi-threaded, but both lend themselves to the same approach - application side "callbacks".

Async is event driven. When a fiber performs an operation that would block, it yields back to the reactor which resumes once the operation can continue.

I deliberately try to avoid callbacks because in my experience they lead to callback hell. The difference is that with callbacks you need to use variables to track state, and each time the callback is invoked you have to process that state to figure out what to do next. With fibers, you naturally resume where you were with your stack intact. You can implement complicated state machines with ease.
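This point can be illustrated with a plain Ruby Fiber (an illustrative toy, not async itself): the fiber's stack holds the protocol state, so each resume continues exactly where the previous message left off, with no state variables to re-examine.

```ruby
# A fiber-based parser keeps its position in the protocol on its own stack.
# Contrast with a callback: there you would need a @state variable checked
# on every invocation to decide what the incoming data means.
reader = Fiber.new do
  header = Fiber.yield(:want_header) # suspend until a "header" arrives
  body   = Fiber.yield(:want_body)   # suspend again until the "body" arrives
  "#{header}:#{body}"                # stack intact across both suspensions
end

first  = reader.resume         # runs up to the first yield
second = reader.resume("HEAD") # delivers the header, suspends again
result = reader.resume("BODY") # delivers the body, fiber finishes
```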

Your implementation could just as easily remain threaded and hide the next_event loop away so it calls the correct callback.

Yes, this is feasible.

I'm not sure I understand your reservations.

I like this PR from the point of view that it tries to provide some generic interface for web sockets and server sent events. Making a generic interface for that is both simple and difficult. I admire that you've designed and implemented it.

My main reservation is that your proposed API is incomplete. The rack specification as it stands, is really simple and allows for a lot of flexibility. What you've proposed is an extension which fundamentally encodes a model for concurrency into Rack, something that it hasn't had except in a very precise location (response body #each).

To look at it another way, falcon passes the rack linter, yet it implements real-time streaming requests and responses. So, within the current confines of Rack, we can enjoy these advanced features.

Your proposed API exposes an incomplete model for concurrency and the phrase that comes to my mind is "thar be the dragons". What happens if you stick a RestClient.get into on_message - does it stall the entire server? How do you communicate between web sockets (in your PR I assume you'd need a shared backend like redis)? Whether or not you agree with it, the node model of single process, multiple connections is one that works very well - you can have an array of web socket connections, send messages to all of them, etc. This model can work well with Rack as shown by falcon (e.g. https://github.com/socketry/async-websocket/tree/master/examples/chat).
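The single-process model mentioned here can be sketched in a few lines (FakeSocket is an illustrative stand-in for a connection object): connections live in a plain array, and broadcast is just iteration over in-process state, with no external pub/sub backend.

```ruby
# Illustrative stand-in for a per-connection socket/stream object:
class FakeSocket
  attr_reader :messages

  def initialize
    @messages = []
  end

  def write(msg)
    @messages << msg
  end
end

# Single process, multiple connections: shared state is just an Array.
connections = Array.new(3) { FakeSocket.new }

# Broadcasting to every connected client is plain iteration:
connections.each { |conn| conn.write("announcement") }
```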

Whatever way you cut it, Ruby has a pretty frustrating model for concurrency right now (and IO too, but it's slowly getting better). People are working on it, myself, others (Eric, https://bugs.ruby-lang.org/issues/13618), ko1 (guilds) and I have no doubt that good things are coming to this space.

But, Rack, right now, has to be a specification that works with Ruby as it is today. If Thribers become part of Ruby (I personally hope not), the way servers should implement web sockets would be totally different. It might be possible to make an MVP, like this PR, but I think it's loading Rack up with too much complexity.

@boazsegev

boazsegev commented May 1, 2018

@ioquatix interesting, thanks for taking the time to explain.

I love the fiber approach, as it hides away the event reactor and makes things more comfortable to manage.

My main reservation is that your proposed API is incomplete.

Yes, you are right that there are concerns related to the proposed approach. It doesn't deal with IPC or client<=>client communications. It also allows developers to violate multi-threading best practices.

However, this proposal seems pretty balanced. It doesn't enforce a huge change on existing servers and the "missing" pieces are easily complemented by existing solutions (Redis being just one of them).

Personally, I implement pub/sub in iodine to "fill the gap". However, I'm not sure this should be part of the specification since other developers might wish to "fill the gap" using a different approach, such as limiting the server to a single process and using an array.

What you've proposed is an extension which fundamentally encodes a model for concurrency into Rack...

I'm not sure I understand this part.

The server can be a single threaded server, a thread-per connection server, a fiber based server and practically any type of concurrency model can be implemented.

I'm not sure I see where the PR requires a specific concurrency model.

What happens if you stick a RestClient.get into on_message - does it stall the entire server?

That really depends on the server, I guess.

What happens if I call sleep(10) while handling an HTTP request (except getting fired, that is)?

It's probably the same answer.

Iodine, for example, supports multi-threading and cluster mode (pretty much the same model as Puma). Agoo (I think) supports multi-threading.

For both of these servers I would recommend avoiding blocking calls within the thread, but they both offer some level of protection against DoS.

I don't use fibers so I'm not sure how they would react to this scenario. Hopefully they will react better.

Either way, I doubt the question of having fibers replace threads is part of the discussion here.

Whether or not you agree with it, the node model of single process, multiple connections is one that works very well...

I agree with it, but I don't see why that matters.

Even the node model needs to be adjusted when scaling horizontally.

All models have properties that define their strengths and weaknesses.

IMHO, this PR proposes a mechanism that's independent of model used by the server, making it flexible enough for everyone to implement.

@ioquatix
Member

ioquatix commented May 1, 2018

However, this proposal seems pretty balanced. It doesn't enforce a huge change on existing servers and the "missing" pieces are easily complemented by existing solutions (Redis being just one of them).

How do servers support your proposal? By pulling in websocket-driver? It's a large surface area, and it does get pretty tricky. I wouldn't call it balanced - it's heavily biased towards implementing WebSockets.

I'm not sure I understand this part.
The server can be a single threaded server, a thread-per connection server, a fiber based server and practically any type of concurrency model can be implemented.
I'm not sure I see where the PR requires a specific concurrency model.

Invoking a Rack app is for the most part trivial. It's a function call that returns a result. No real model for concurrency is needed for this basic definition. You scale up by executing the function on different CPU cores, but fundamentally you can't change it to a non-linear event driven callback model or some other asynchronous model (async provides transparent inversion of control back to the reactor, but it doesn't change or require changes in flow control).

This proposal embeds non-linear flow control into the Rack spec. What I mean is, it's impossible to implement the given proposal without some kind of concurrency. I'm not saying that any particular model for concurrency is being encoded, just that with your approach a model for concurrency is now necessary.

This has a huge down-stream effect, since all code that depends on Rack now has to be aware of and capable of asynchronous execution. For example, how would you change rack-test to support this PR?

I think this equally applies to rack.hijack, and I also don't like that approach, but it's pretty much set in stone now. For example, how do you implement rack.hijack with HTTP/2? It's probably not possible in the general sense.

What happens if I call sleep(10) while handling an HTTP request (except getting fired, that is)?

In async, provided you call task.sleep(10), the fiber defers for 10 seconds. It doesn't block the server. In puma it would block the worker thread.

Have you tried using ActiveRecord in your on_message callback? How did that scale for you? My experience is that the ActiveRecord ConnectionPool design is very poor for highly concurrent workloads. These are very tricky issues to get right (although you can do it as shown here: https://github.com/socketry/async-postgres and it does scale up pretty well).

[regarding node] I agree with it, but I don't see why that matters.
Even the node model needs to be adjusted when scaling horizontally.

Because with your proposed API, implementing a basic Node style server like this is not possible. Right from the get go you need additional machinery to do pub/sub or other kinds of communication. Even the situation with streaming responses is not improved without additional work. It's a very specific proposal designed for a very specific kind of scalability. It's far too specific IMHO.

What we need is a proposal that better aligns with HTTP/2 streams since it's clear to me that it should be the future of HTTP and Rack. It should have a clear model for concurrency that fits with the existing multiprocess/multithread/worker implementations - i.e. reading from rack.input might block the request and writing to the response body might block the request due to buffering. On top of that, as already demonstrated, you can implement highly scalable asynchronous servers.

Streaming request and response bodies in falcon directly improve the latency of existing apps with no changes. Here is an implementation of rack.input which streams (and buffers) the input: https://github.com/socketry/falcon/blob/master/lib/falcon/adapters/input.rb

But this PR requires significant changes to existing apps for any kind of benefit. Not only that, but it only supports a very specific kind of scalability with an under-specified concurrency model (i.e. what happens if you block in on_message).

Let me finish with the following question: do you even think there is a future for WebSockets? https://datatracker.ietf.org/doc/draft-hirano-httpbis-websocket-over-http2/ hasn't been touched since 2014. There is an interesting write-up here: https://daniel.haxx.se/blog/2016/06/15/no-WebSockets-over-http2/ - WebSockets have never been a good fit for the request/response paradigm, and that's something which fundamentally underpins Rack (assuming that rack.hijack is a poorly thought out addition to the spec :).

SPEC Outdated
on_shutdown() # may be called before a connection is closed due to server shutdown.
on_close()    # called when the connection is closed.
on_drained()  # may be called when the number of pending writes drops to zero.
The object will be extended by the server to include:


@tenderlove

tenderlove May 1, 2018 Member

This will break method caches. I'd prefer a solution that doesn't lean on runtime method extension. Maybe something like this:

class MyWSEObject
  def initialize(io)
    @io = io
  end

  def on_open
    @io.write("neat")
  end
end

app = lambda do |env|
  env['rack.upgrade'] = MyWSEObject.new(env['rack.wse.object'])
  [200, {}, []] # a Rack app must still return a response triple
end


@ioquatix

ioquatix May 1, 2018 Member

class MyWSEObject
  def on_open(io)
    io.write("neat")
  end
end

app = lambda do |env|
  env['rack.upgrade'] = MyWSEObject.new
  [200, {}, []] # a Rack app must still return a response triple
end

would be simpler


@tenderlove

tenderlove May 1, 2018 Member

That would be fine too. 😄


@ioquatix

ioquatix May 1, 2018 Member

So, the next question: is it a good idea to write into the request env to essentially send a response? To me, that's confusing. Perhaps at best it could be something like

if upgrade = env['rack.upgrade']
  return upgrade.call(MyWSEObject.new)
end

return [400, {}, []]

Perhaps the server could signal exactly what APIs are supported as in:

if upgrade = env['rack.upgrade.web_socket'] # or rack.upgrade.event_stream
  return upgrade.call(MyWSEObject.new)
end

return [400, {}, []]


@tenderlove

tenderlove May 2, 2018 Member

TBH it doesn't really matter to me. Unfortunately though, env is really the only way that middleware can communicate. So writing to env is the only way that one middleware could indicate to the next "I'm responding by writing to a websocket". If the intention is that any "callable" (middleware / app) that sets the WSE object must not call anything else in the middleware chain, then that needs to be explicitly stated in SPEC.


@ohler55

ohler55 May 2, 2018 Author Contributor

@ioquatix having the env['rack.upgrade'] respond to some kind of registration method would be a reasonable approach in my opinion. I feel a little uncomfortable having it provide a return value since it makes it less obvious what is being returned but my feelings are not very strong on that.


@tenderlove

tenderlove May 2, 2018 Member

@ohler55 ya, I think passing the IO object to every callback (as @ioquatix suggested) is my preferred approach.


@boazsegev

boazsegev May 2, 2018

For C extensions, there's double the amount of objects per connection.

Why is this?

@tenderlove , I assume you know that the C server doesn't need an IO object, it uses a simple integer to keep track of IO.

This means that a C extension just needs to keep the callback object itself and add a secret variable to that object with the socket number.

This secret variable isn't possible in Ruby, but C extensions are allowed to create them. Since that variable is a Number, it's very small (it doesn't create an Object in the ObjectSpace realm and it's immutable).

This means a single object in C (the callback object). If we add the IO object, it's two objects - a 200% increase in object count.

In Ruby, we will have the callback handler, the raw IO object (both pre-existing) and an IO wrapper, a 150% increase in object count.


@tenderlove

tenderlove May 2, 2018 Member

This means a single object in C (the callback object). If we add the IO object, it's two objects - a 200% increase in object count.

What % increase is that in an actual app?


@boazsegev

boazsegev May 2, 2018

What % increase is that in an actual app?

That really depends on the app's use-case.

A micro services app running a WebSocket API will get hit harder than an HTTP app with the occasional WebSocket upload manager.

I think the issue is more pronounced because we are talking about long term objects.

Short term object count doesn't affect memory fragmentation as much as the fact that there's a long-term object blocking a memory "block" / "arena".

The more long-term objects are created (and eventually destroyed), the faster the fragmentation.

Having said that, even a 200% increase in object count is probably better than the solutions we currently employ when implementing WebSockets using hijack.

I won't make an issue out of it. I'll do what everyone thinks best, but I know in very practical terms (having coded both implementations in C) that extend is cheaper both in memory and in the number of method tree traversals required.

@evanphx
Contributor

evanphx commented May 2, 2018

Hi all! Websockets and rack again, let's do this!

Up front, last time this came up, I prototyped it in puma (puma/puma#1054). It was an experiment we didn't merge in, but it totally works and we might revisit it.

The API of on_* methods for events is totally fine, no issue there.

Extending the object to decorate it with a write method is pretty ugly, from both an implementation standpoint and a performance one. Passing an object that implements #write to the on_* methods to write back is less code and performs better.

If frameworks want to take that object and use extend to make its methods available on the handler, that's totally fine and up to the framework. Rack should not do that on its own; it should provide the lower-level building block.

@SamSaffron
Contributor

SamSaffron commented May 2, 2018

You should be able to opt into the "type" of upgrade you are after.

I can already see 2 use cases that this protocol does not help at all:

  1. https://github.com/discourse/discourse/blob/master/lib/hijack.rb

With the current example use of:

https://github.com/discourse/discourse/blob/master/app/controllers/user_avatars_controller.rb#L71-L78

  2. Messagebus also controls transport via: https://github.com/SamSaffron/message_bus/blob/master/lib/message_bus/client.rb and there is no clean protocol here for a "chunked encoding" HTTP request short of SSE (which happens to sort of fit, except for demanding "text/event-stream" as the content type).

I would recommend

class Raw
  def type
    :raw # or :sse or :socket
  end

  def on_open(io)
    # custom header support
    io.write_headers({})

    # transparent chunked encoding support
    io.write_chunk("a chunk")
  end
end

env['rack.raw'] = Raw.new

I am not super happy about rack.upgrade as a name because it's very tied to WebSocket-style terminology.

@ioquatix
Member

ioquatix commented May 2, 2018

Rather than env['rack.raw'] = Raw.new, why not just return [200, {}, Raw.new] and have servers do different things depending on #type.
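That alternative could look roughly like this (RawBody and its #type values are illustrative, following the Raw sketch above): the handler doubles as the response body, so a server that understands #type can take over the connection, while one that doesn't can still iterate the body.

```ruby
# Illustrative sketch: the response body itself signals the desired
# transport via #type, instead of the app writing into env.
class RawBody
  def type
    :raw # or :sse, or :socket
  end

  # Still a valid Rack body for servers that ignore #type:
  def each
  end
end

raw_app = lambda do |env|
  [200, {}, RawBody.new]
end
```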

@SamSaffron
Contributor

SamSaffron commented May 2, 2018

Another thing that makes me somewhat uneasy here is that it still suffers from the very ugly trait that, deep down the middleware stack, someone returns rubbish to all the rest of the pieces, which then gets thrown in the bin.

So... to mitigate this very bad problem I would recommend:

# only gets called AFTER everything walked down the middleware stack.
def on_open(io, env, response)
  status, headers, result = response

  # custom header support
  io.write_headers({})

  # transparent chunked encoding support
  io.write_chunk("a chunk")
end

Overall I am leaning toward not even mucking with Rack long term for a lot of this stuff and moving to fiber servers that can pause and resume more cleanly, multiplexing tons of connections. A big blocker there though is that MRI does not let you ship fibers between threads right now, which is not ideal.

That said, this works... so... yay? https://github.com/SamSaffron/performance/blob/master/fiber_server/server.ru

@SamSaffron
Contributor

SamSaffron commented May 2, 2018

Also... one extra big question here is... do we want to bloat all of our Rack servers out there with all this extra logic when simply adding a single dependency to middleware that implements this today is already feasible on top of hijack?

should we not start with gem install magic_generic_web_socket_support_middleware prior to pushing on implementers to all implement the same thing?

@ioquatix
Member

ioquatix commented May 2, 2018

A big blocker there though is the MRI does not let you ship fibers between threads now which is not ideal.

Fibers are actually bound to the thread that creates them by definition.

Some designs use a more generic coroutine structure (e.g. green threads) which you can move between threads. If you are interested in this, you might find https://bugs.ruby-lang.org/issues/13618 interesting.

If you are interested in Fiber based servers, check out falcon. https://github.com/socketry/falcon/blob/master/examples/beer/config.ru and https://github.com/socketry/ which is a complete stack of asynchronous components.

Also... one extra big question here is ... do we want to bloat all of our Rack servers out there with all this extra logic when simply adding a single dependency to middleware that implements this today is already feasible on top of hijack?

Yes, I strongly agree with this, and the answer IMHO is no.

@ohler55
Contributor Author

ohler55 commented May 2, 2018

@evanphx I appreciate your opinions and it is good to have a different perspective from what we had. I can't agree that the performance would be any different in one case over the other, though. The other points are certainly worth considering.

@SamSaffron
Contributor

SamSaffron commented May 2, 2018

Yes, I am across Eric's work there and hope we get something into MRI. I also think Koichi is open to having a protocol for allowing Fibers to be moved between threads, and I want to see proper green threads back.

I've also seen Falcon. I agree that a fiber-based server is very appealing for a bunch of workloads; it simplifies so much of the mess we have now with pure threaded servers and slow requests, especially because you can walk down the middleware stack properly at the right time.

@tenderlove
Member

tenderlove commented May 2, 2018

should we not start with gem install magic_generic_web_socket_support_middleware prior to pushing on implementers to all implement the same thing?

TBH I'm OK with it if it's an opt-in. This proposal seems simple enough that someone could implement a "hijack" based approach using a middleware. The upside of formalizing it is that if webservers want to implement a high performance version of magic_generic_web_socket_support_middleware they can. I think of it as similar to X-Sendfile: yes we have a middleware that can do it, but webservers can implement an accelerated version since we have the standard.

@ohler55
Contributor Author

ohler55 commented May 2, 2018

@SamSaffron in regard to on_open taking an io, env, and response: by the time on_open is called, the connection has already been established. Since it is either WebSocket or SSE, a response and env are no longer relevant. The intent was to allow middleware to decide whether a connection should be established based on the return status of the #call(env) method that asked for the upgrade.
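As a sketch of that intent (the `rack.upgrade?` / `rack.upgrade` env keys follow this PR's proposal; `ChatHandler` and the `maybe_upgrade` helper are mine, approximating what a server would do internally):

```ruby
# Sketch only: the env keys follow the proposal; maybe_upgrade stands in
# for the server-side decision, it is not a real server API.
class ChatHandler
  attr_reader :opened

  def on_open(client)
    @opened = true
  end
end

class App
  def call(env)
    if env['rack.upgrade?'] == :websocket
      env['rack.upgrade'] = ChatHandler.new
      [101, {}, []]             # success: the server completes the upgrade
    else
      [403, {}, ['forbidden']]  # failure: no upgrade, on_open never fires
    end
  end
end

# Stand-in for the server: upgrade only when a handler was assigned and
# the app returned a successful status.
def maybe_upgrade(app, env)
  status, _headers, _body = app.call(env)
  handler = env['rack.upgrade']
  handler.on_open(:client_stub) if handler && status < 300
  handler
end
```

So middleware that returns an error status effectively vetoes the upgrade, even if a handler was assigned further down the stack.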

@SamSaffron
Contributor

SamSaffron commented May 2, 2018

@tenderlove My issue though is that this encourages a whole bunch of code duplication... say Puma takes this on: now it needs to ship with a websocket protocol as a strong dependency, or worse still it will carry its own duplicate websocket protocol thingy. Chunked encoding is easy enough, but there is a fair amount of code to do websocket upgrade and encoding, depending on how far you want to take it and how many protocol variants you want to support.

@ioquatix
Member

ioquatix commented May 2, 2018

I think on_open should take only an IO. It should be the responsibility of the instance to store the IO, IMHO. This way you can call @io.write at any time, e.g. from a timer.
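That shape, as a minimal sketch (the handler, stub IO, and timer method are mine, purely illustrative):

```ruby
# The handler stores the IO it receives in on_open; any later code path
# (a timer, a pub-sub fanout) can then write without being re-handed the IO.
class TimerHandler
  def on_open(io)
    @io = io
  end

  def tick
    @io.write("ping")
  end
end

# Stub standing in for whatever object the server passes to on_open.
class FakeIO
  attr_reader :sent

  def initialize
    @sent = []
  end

  def write(msg)
    @sent << msg
  end
end
```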

@boazsegev

boazsegev commented May 4, 2018

write - exception vs. true/false?

Reading through the thread I noticed how hard it might be to predict a write failing.

This means that user code must use begin/rescue blocks to address network issues.

Since a failing write is a predictable and possible result in the normal flow, wouldn't it be better if the server handled any exceptions and simply returned true if the data was written (or scheduled to be written) to the socket and false otherwise?

I think this will improve performance and also prevent the obvious nesting of exception handling code (on the server and the app).

Thoughts?
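To make the trade-off concrete, here is a sketch of both error models at the call site (the stub client classes and helper names are mine; which model real servers should adopt is exactly the open question):

```ruby
# Stub clients: one fails a write by raising, one by returning false.
class RaisingClient
  def write(_msg)
    raise IOError, "connection lost"
  end
end

class BooleanClient
  def write(_msg)
    false # data was not (and will not be) written
  end
end

# Exception style: user code must wrap every write in begin/rescue.
def send_with_rescue(client, msg)
  client.write(msg)
  true
rescue IOError
  false
end

# Boolean style: the server swallows network errors; user code just branches.
def send_with_bool(client, msg)
  client.write(msg) ? true : false
end
```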

@ohler55
Contributor Author

ohler55 commented May 4, 2018

I had assumed raising an exception was the natural behavior, but maybe that is not the best approach. Even returning true or false does not necessarily indicate the message was delivered. I think the real question is: what information would be useful to the app? Maybe nothing. Callbacks exist for closing, which lets the app know writes will no longer be delivered. Is an acknowledged protocol desired on top of WebSockets? I think that might be a step too far.

@ioquatix
Member

ioquatix commented May 4, 2018

I think the next steps are:

  • To finish several implementations of the proposed spec, this will expose any obvious design issues with the spec.
  • To implement several practical applications, and test interop (including periodic/timer events, pub/sub, and upstream interaction, e.g. database/redis). If we can't make useful applications that work on different servers (i.e. if you start relying on iodine's pub/sub, which can't be implemented in puma), it's going to be a problem.
  • To prepare a finalised spec which takes into account the above and then have a formal vote, probably only involving actual stake-holders: Ruby server implementors and Rack contributors.
@ohler55
Contributor Author

ohler55 commented May 4, 2018

There are currently two implementations of the originally proposed spec, Agoo and Iodine. I'd think that is proof that the spec is workable.

Waiting for several applications to be implemented against a spec that has not been approved seems rather unlikely. That pretty much means no changes can be made to the spec because there is no spec to follow. Rather circular logic.

So the purpose of this PR was to refine the proposal and then get a vote, which would result in this PR being either approved or rejected. How do we get to that phase?

@ioquatix
Member

ioquatix commented May 4, 2018

Waiting for several applications to be implemented against a spec that has not been approved seems rather unlikely.

I don't think so. Making some sample apps is a great way to document the API, provides a great starting point for new users, etc. And finally, if the sample apps can't work between servers, what's the point of a shared SPEC?

@ioquatix
Member

ioquatix commented May 4, 2018

There are currently two implementations of the originally proposed spec, Agoo and Iodine. I'd think that is proof that the spec is workable.

I think we need to see a working implementation in at least Puma & Passenger as a baseline, since those are what's most common for Ruby development and production.

@ohler55
Contributor Author

ohler55 commented May 4, 2018

I believe the suggested bar was several practical applications. There are already sample applications in the Agoo and Iodine repositories.

@ioquatix
Member

ioquatix commented May 4, 2018

To implement several practical applications

Something more practical than "Hello World". For example, a shared real-time Todo list :p

I think we should be involving some database activity in the callback, probably using ActiveRecord or Redis.

If you've already got several sample applications, once we finalise a draft SPEC and we all implement it, we should be able to all run those apps on our respective servers, right? That's the whole point of having a shared SPEC.

@matthewd
Member

matthewd commented May 4, 2018

I still don't see a statement about whether write can block.

I'm also unconvinced that pending/drained is an effective solution for non-blocking writes.

I remain of the opinion that this should remain a draft / small-e extension while real-world implementations adopt it and fully exercise it.

Easy-mode for convincing me would be a PR to Rails adapting Action Cable to this API -- which would also allow like-for-like benchmarking.

@boazsegev

boazsegev commented May 4, 2018

@matthewd ,

Although I understand the suggestion (adding a Rails PR), I find that quite unfair.

I can program a server in C and I can wrap it in a gem for Ruby MRI... but I know nothing about ActionCable, its inner workings, or the Client<=>Application protocol used above the WebSocket layer.

The learning curve and time spent on a Rails patch would only be viable if the PR is conceptually acceptable and we all intend to move forward with the proposal and implement it.

I really doubt if pushing a PR to Rails is something that should happen before we agree that the PR is conceptually acceptable (i.e., contingent on benchmarks, or whatever).

As a side note, I also don't think this should enter a Rails 5.x update release. This PR has the potential to retire the Rails IO handling layer (the one based on nio4r) and reduce the existing codebase significantly - at least where WebSockets and SSE are concerned (and long polling could be achieved by JavaScript SSE polyfills, retiring hijack altogether).

@boazsegev

boazsegev commented May 4, 2018

I still don't see a statement about whether write can block.

@matthewd , I assumed (since there's no statement) that it would be implementation defined.

I'm assuming that blocking servers will be avoided by applications that are affected by the behavior.

But I totally understand if you believe there should be a concurrency model requirement regarding write and close not blocking. After all, if write blocks, pub/sub performance might grind to a halt.

on_drained(client) # may be called when the number of pending writes drops to zero.
The `client` is used for writing and checking the status of the upgraded connection. It has these methods:
write(message) # writes to the WebSocket or SSE connection
close() # forces a close of the WebSocket or SSE connection


@boazsegev

boazsegev May 4, 2018

After reading through @ioquatix and @matthewd 's comments, I wonder:

Perhaps close() should "schedule the connection to close once all pending write calls have been performed"?


@ioquatix

ioquatix May 4, 2018 Member

If you want to have that, implement it as close_write which is what Ruby calls shutdown(SHUT_WR). close should guarantee after returning that the underlying socket is closed. Otherwise, you are in for a world of pain.
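For reference, Ruby's standard library already exposes this half-close as BasicSocket#close_write, a thin wrapper over shutdown(SHUT_WR); a small socket-pair demonstration:

```ruby
require 'socket'

# Half-close demonstration: after close_write the socket has sent FIN, so
# the peer's read runs to EOF, yet data can still flow the other way.
def half_close_demo
  a, b = UNIXSocket.pair

  a.write("bye")
  a.close_write            # shutdown(SHUT_WR): no more writes from a

  from_a = b.read          # returns once EOF is seen (a half-closed)

  b.write("ack")
  from_b = a.read(3)       # a can still read after close_write

  a.close
  b.close
  [from_a, from_b]
end
```

A plain close, by contrast, tears down both directions at once, which is why keeping the two operations distinct matters here.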


@ohler55

ohler55 May 4, 2018 Author Contributor

Agoo treats a call to close as a request, just like a write, and places the close on the queue.


@boazsegev

boazsegev May 4, 2018

@ohler55, so does iodine. I place a "close" marker on the packet queue, so close is always performed after all scheduled data is sent.

@ioquatix , I think you're missing the point of abstracting away the network and the protocol.

My suggestion was about clarifying this little bit, not changing what both iodine and agoo already implement.

We aren't authoring a network layer. We are authoring an abstracted application side API.

The reasonable expectation is that the write is performed before the close. i.e., if my code is:

write "hello"
close

The reasonable expectation is that "hello" is actually written.

There's no force_close or close_write in the specification because the application shouldn't be concerned with these things. If the application doesn't want data written, it can avoid writing it.


@ioquatix

ioquatix May 4, 2018 Member

You can make close call flush, or you can flush after every write. But if you make close call flush, you'd better be careful about EPIPE.


@ioquatix

ioquatix May 4, 2018 Member

For a high level protocol like this, calling flush after each write would make sense to me.

It provides the user with a strong expectation, that after calling write, the data has been sent, and pending a network failure, would arrive, or else the write fails right then, with, say, EPIPE. Otherwise you'll just end up with a spaghetti state machine trying to handle all these conditions.


@boazsegev

boazsegev May 4, 2018

@ioquatix ,

I hope I don't seem too blunt or crude. I very much appreciate the interest and willingness to polish the specification and make it both as clear and as practical as can be.

However, please consider the model to be totally separate from the network - there is no network. There's only this API.

We can change the API if we need features, but we don't expose network bits or logic because part of our job is to abstract these things away - so there is no network, there is no protocol (as much as possible).

In this sense, flush doesn't exist. It's a network / server detail that the application never sees, abstracted away by the server.

The closest an application can come to asking about these things is to ask about the pending outgoing write events that haven't yet completed. This allows an application to know if the on_drained callback is somewhere in its future.

The pending query doesn't expose the network; it exposes the progress of existing API calls. This provides important information about the possibility of a slow client (or a slowly clearing "queue"), allowing an application to stop serving a resource-hungry "client".
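As a sketch of that use (the threshold and the broadcast helper are mine; pending, write, and close are from the proposed client API):

```ruby
MAX_PENDING = 16 # threshold is arbitrary, for illustration

# Broadcast to a set of clients, dropping any whose outgoing queue has
# grown past the threshold, i.e. a slow, resource-hungry client.
def broadcast(clients, msg)
  clients.each do |c|
    if c.pending > MAX_PENDING
      c.close
    else
      c.write(msg)
    end
  end
end

# Stub implementing the relevant slice of the proposed client API.
class StubClient
  attr_reader :pending, :sent, :closed

  def initialize(pending)
    @pending = pending
    @sent = []
    @closed = false
  end

  def write(msg)
    @sent << msg
  end

  def close
    @closed = true
  end
end
```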


@ohler55

ohler55 May 4, 2018 Author Contributor

If I understand your response you are answering yes to the question of whether or not a flush method should be added. Then in any application you wrote you would block until the write completes instead of making use of the on_drained callback. That is your choice of course.


@ioquatix

ioquatix May 4, 2018 Member

However, please consider the model to be totally separate from the network - there is no network. There's only this API.

Fair enough.


@ioquatix

ioquatix May 4, 2018 Member

If I understand your response you are answering yes to the question of whether or not a flush method should be added. Then in any application you wrote you would block until the write completes instead of making use of the on_drained callback. That is your choice of course.

I really find the inverted flow control of callback style programming horrible. So, I prefer #flush over an #on_drained callback. Callbacks = state machine spaghetti = hard to maintain/buggy code. It's just my personal opinion, FYI.

@ohler55
Contributor Author

ohler55 commented May 4, 2018

@matthewd would a statement leaving the blocking behavior up to the server be enough, or maybe a method to ask whether a write would block other threads?

@boazsegev

boazsegev commented May 8, 2018

I have an initial development implementation of the updated draft, placed in iodine's 0.6.0 version branch (installable from source).

There are some features being re-written (such as pub/sub extensions, including Redis integration), but the core WebSocket/SSE and pub/sub API will work.

I think this would be enough to author a test application using a monkey-patched Rails...

@matthewd , any final changes / requests / tips before we start working on a "like-for-like benchmarking" example app? How do you want to benchmark this? Should we use the websocket-shootout approach?

(it might not test what we're looking at, as memory consumption and performance will still depend heavily on the pub/sub system rather than the connection system...)

@ohler55
Contributor Author

ohler55 commented May 8, 2018

The Agoo develop branch is also compatible with the spec as described in the current state of this PR. Should make a versioned release later this week once the named pub-sub feature is completed.

@ohler55
Contributor Author

ohler55 commented May 9, 2018

I cleaned up the formatting of the SPEC around the upgrade section. An HTML version is at http://www.ohler.com/agoo/rack/file.SPEC.html.

@boazsegev

boazsegev commented May 14, 2018

@evanphx , how is the Puma adaptation going?

So far, the API iodine implements for the client object looks like this.

It includes a number of methods not listed in the specification besides the Pub/Sub extensions (the timeout, timeout= and protocol methods).

I also added the env accessor so I could avoid making objects just to keep the env alive. This is something I need for an upcoming plezi.io release and I'm not sure if it should be adopted (as it adds longevity to all the objects in the env Hash, which is bound to increase memory fragmentation).

Anyway, I hope to see this move forward. Iodine has been serving clients like this for the last couple of years or so (albeit using extend rather than a client object) and so far the feedback is positive and the performance is promising.

@ioquatix
Member

ioquatix commented May 14, 2018

If you need extensions to make anything useful they need to be part of the spec IMHO.

@boazsegev

boazsegev commented May 15, 2018

@ioquatix , this was my thought as well, which is why I wrote it here. Perhaps these added methods (timeout, timeout= and protocol) should be part of the specification.

On the other hand, I don't have to have the env variable; I could just as easily have created a callback object (using a class instance) instead of a static module... it was more of a design preference that I thought I should mention.

EDIT I should note that the timeout methods aren't required to make WebSockets / SSE work. It's just that iodine also supports raw TCP/IP connections and these connections require access to timeout management.

@ohler55
Contributor Author

ohler55 commented May 15, 2018

The entire pub-sub pattern can be implemented by keeping an Array of connections. A publish would then call write on each one. The reason pub-sub was not included in the PR was that it added another new use pattern to the spec that would take even more effort for servers to comply with.

Like @boazsegev's Iodine, Agoo includes a pub-sub implementation that is compatible to a large degree with Iodine's. In the end we wanted something basic that could be used with a minimum of changes in Rack and the way developers use it.

The current PR focuses just on Push without also introducing pub-sub. Maybe a future PR can propose the addition of pub-sub. One step at a time.
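A minimal sketch of that Array-of-connections pattern (class and method names are mine; only write from the proposed client API is assumed):

```ruby
# A naive in-process pub-sub: subscribers are an Array of connections and
# publish is a write to each one.
class Channel
  def initialize
    @clients = []
  end

  def subscribe(client)
    @clients << client
  end

  def unsubscribe(client)
    @clients.delete(client)
  end

  def publish(msg)
    @clients.each { |c| c.write(msg) }
  end
end

# Stub standing in for an upgraded connection.
class Conn
  attr_reader :inbox

  def initialize
    @inbox = []
  end

  def write(msg)
    @inbox << msg
  end
end
```

Note that this naive version only shares state within a single process.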

@ioquatix
Member

ioquatix commented May 15, 2018

The entire pub-sub pattern can be implemented by keeping an Array of connections. A publish would then call write on each one. The reason pub-sub was not included in the PR was that it added another new use pattern to the spec that would take even more effort for servers to comply with.

This wouldn't work with falcon in forked mode, since it forks one process per CPU core and they all handle connections independently.

Even in threaded mode, some kind of synchronisation is required. I believe the same problem would affect puma in cluster mode, for example.

My opinion is that this spec needs to be feature complete, as in, it's possible within the confines of the spec to implement actual useful websocket applications. Otherwise, what's the point? If every server supports the core spec but provides their own incompatible models on top of that, this isn't going to work out very well.

@ohler55
Contributor Author

ohler55 commented May 15, 2018

The PR, as it stands, is feature complete for WebSocket and SSE. Sure, extending the PR to include pub-sub would be a problem for servers that don't share state, but that has nothing to do with the WebSocket and SSE support proposed here.

@ioquatix, Slamming @boazsegev's example code, which is meant to demonstrate not only the proposed Rack additions but also other features of Iodine, and then claiming that example invalidates this PR, is ludicrous. The PR offers a clean and simple way for even new developers to take advantage of WebSockets and SSE. Some of the comments, such as those from @tenderlove, were constructive and genuinely aimed toward enhancing the current Rack spec with the addition of WebSockets and SSE. You have fought tooth and nail against it from the start with an extreme passion to block any variation on the PR. I don't understand why you feel so threatened by this PR. Can you explain why you have such strong feelings about offering Rack users a simple-to-use approach to WebSocket and SSE?

@rack rack locked as too heated and limited conversation to collaborators May 15, 2018
@matthewd
Member

matthewd commented May 15, 2018

Okay, this conversation clearly needs to go back on ice for a bit.

I don't agree that pub/sub is within scope, but if we can't do each other the courtesy of assuming genuine opinions and positive/constructive intentions, then we're not going to get anywhere useful.

@tenderlove
Member

tenderlove commented May 15, 2018

TBH I'm kind of worried about committing to an API without a real world app using it. The reason is because I want to make sure we don't forget anything we need to implement actual apps. If we could get Action Cable running on top of a server that used this API, then I'd be satisfied to merge.

@ioquatix
Member

ioquatix commented Feb 5, 2020

There has been some great discussion here. Thanks everyone for their effort and passion. I am going to close this issue, because I don't think we reached a consensus.

That being said, further discussion is required, backed by concrete implementations. Specifically, I think more emphasis should be placed on HTTP/2 (and HTTP/3) and union of the semantic model when HTTP/1 is also included in the mix.

We welcome proposals for a better stream model for Rack 3.0 and we are currently considering how to make rack.hijack a first class citizen: #1550

@ioquatix ioquatix closed this Feb 5, 2020