Skip to content

Conversation

palkan
Copy link
Contributor

@palkan palkan commented Feb 6, 2024

Motivation / Background

This PR aims to make Action Cable more flexible and extensible as a library/framework by separating low-level, implementation specific details (e.g., handling WebSockets, dealing with concurrency) from abstractions (Connection, Channel classes). That would make it possible to use user-level Action Cable code with alternative web servers or execution models (e.g., Fibers) without rewriting it or monkey-patching internals.

Tip

You can find a more detailed description of the problem, motivation and potential benefits of such refactoring in this document.

Note

This PR does not introduce any significant new features or public API breaking changes. It doesn't affect the existing functionality of built-in Action Cable server.

Important

This functionality from this PR is already available for Rails 7+ application through the actioncable-next gem. Please, give it a try and share your feedback here, this would help to get this PR merged into Rails.

Detail

The changes are grouped into commits to reflect the following major changes.

  1. Reorganizing threaded executors.

Action Cable relies on two threaded executors right now: one to perform user-space code (Worker) and another one to perform internal async operations around pub/sub. The latter was for some (historical?) reason hidden within the StreamEventLoop class that implements and IO loop to perform read/write operations on sockets. I've extracted the executor and promoted it into a server attribute independent of IO event loop. The main motivation of this extraction is not only in enforcing SoC but to separate WebSocket-specific code from more generic async executor used throughout the codebase.

I've also wrapped the Concurrent Ruby threaded executor into a custom class with a minimal interface (#post, #timer and #shutdown methods), so it will be possible to configure a custom executor for a server in the future (one use-case we have in mind is using a Fiber-based executor instead of a Thread-based one).

Note that this PR doesn't include any configuration-related code (e.g., to support smth like config.action_cable.async_executor = :fiber) but I'm considering adding something like that in the follow-up PRs.

  1. Limiting the scope of async operations.

This step is the follow-up to the async executor extraction. We have a few places in the Channel internals which directly refer to the event_loop but they're all related to pub/sub operations. So, to avoid unnecessary async operations at the abstraction level, I moved all it to the pub/sub layer by and standardized it across adapters by using a SubscriberMap::Async subclass.

Previously, we might have double async calls when performing #stream_from (one from the channel and one from the pub/sub adapter); now it's up to the adapter whether to create subscriptions asynchronously or not. We shouldn't duplicate this logic.

  1. Splitting ActionCable::Connection::Base into ActionCable::Connection::Base and ActionCable::Server::Connection

Today, the Connection class knows about everything from performing user code to setting up WebSocket event listeners to dealing with Rails executor wrapping. That prevents from reusing this class in environments when we don't need to deal with all of these (non-WebSocket servers or unit tests, more about them below).

This change is mostly code reorganization and minor tests updates (e.g., we no longer need to always use run_in_eventmachine, because now we deal with pure Ruby classes).

So, now we have two types of connections: low-level Server::Connection and application-level Connection::Base. The communicate with each other via the following interfaces:

classDiagram
  class _PubSub {
    subscribe: (channel, handler, callback) -> void
    unsubscribe: (channel, handler) -> void
  }
  
  class _Executor {
    post: (&block) -> void
    timer: (interval, &block) -> handle
  }
  
  class _Server {
    pubsub: _PubSub
    executor: _Executor
    config: Configuration 
  }
  
  class _Socket {
    logger: Logger
    env: Hash[String, untyped]
    request: Request
    protocol: String
    initialize: (_Server, env) -> void
    transmit: (untyped) -> void
    close: () -> void
    perform_work: (untyped receiver, method_name Symbol, *untyped args) -> void
  }
  
  class _Connection {
    initialize: (_Server, _Socket) -> void
    handle_open: () -> void
    handle_incoming: (payload) -> void
    handle_close: () -> void
    beat: () -> void
    statistics: () -> Hash[Symbol, untyped]
  }
  
  _Server --> _PubSub
  _Server --> _Executor
  _Connection --> _Socket
  _Connection --> _Server
Loading

Examples / follow-ups

Important

This repository contains examples, tests, and benchmarks for this PR: https://github.com/anycable/action-cable-next-playground

Action Cable testing

Here is an example commit (one of the future PRs) demonstrating how this refactoring simplifies Action Cable testing and make it more robust (no patches, real connection and channel classes, etc.)

Merged into this PR: commit

The corresponding rspec-rails upgrade is tracked here (minor changes): palkan/rspec-rails#1

Action Cable over SSE

Here is a sketch of how SSE transport for Action Cable can be implemented (based on the changes from this PR): https://gist.github.com/palkan/270a192e79b05d5601fe497ad3ec83b5

Iodine server implementation

Iodine comes with native (meaning implemented within the C extension) WebSockets and pub/sub support. We can now leverage these features for Action Cable in an adapteresque way: https://github.com/anycable/action-cable-next-playground/blob/master/lib/servers/iodine.rb

AnyCable integration

An upgraded (and less hacky) AnyCable Rails integration is in development here: anycable/anycable-rails#205

Other changes

  • Improved error reporting for subscriptions(AlreadySubscribed, ChannelNotFound, etc.)

Previously, we ignored some of this malfunction examples or simply wrote error log messages. With the new design, Subscriptions shouldn't make assumptions of how to deal with this violations, they must raise exceptions. Rescue handlers have been moved up the stack for the same reasons.

Additional information

Here is a Discord thread where we discuss the project.

There is an old PR attempting to achieve the same goal: #27648.

Checklist

Before submitting the PR make sure the following are checked:

  • This Pull Request is related to one change. Changes that are unrelated should be opened in separate PRs.
  • Commit message has a detailed description of what changed and why. If this PR fixes a related issue include it in the commit message. Ex: [Fix #issue-number]
  • Tests are added or updated if you fix a bug or add a feature.
  • CHANGELOG files are updated for the changed libraries if there is a behavior change or additional feature. Minor bug fixes and documentation changes should not be included.

/cc @ioquatix @matthewd @rafaelfranca

@rails-bot rails-bot bot added the actioncable label Feb 6, 2024
@ioquatix
Copy link
Contributor

ioquatix commented Feb 6, 2024

I am so going to make time today to review this!!!!

respond_to_invalid_request
end
def handle_open
connect if respond_to?(:connect)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about just having an empty connect method rather than using if respond_to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good question. I ported it from the current version; I will try to dig why it has been implemented this way in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!


@connection_class = -> { ActionCable::Connection::Base }
@worker_pool_size = 4
@executor_pool_size = 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This configuration seems specific to threaded execution, does it make sense in general or should we consider server specific configuration?


# Executor is used by various actions within Action Cable (e.g., pub/sub operations) to run code asynchronously.
def executor
@executor || @mutex.synchronize { @executor ||= ThreadedExecutor.new(max_size: config.executor_pool_size) }
Copy link
Contributor

@ioquatix ioquatix Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this - and the pool size, should be provided entirely by the user, e.g. a traits class. Certainly for Falcon, ideally this is very different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is to adapterize this as well (I mentioned in the description, "support smth like config.action_cable.async_executor = :fiber"), so it will be possible to plug in any executor.

But I decided to leave it for follow-up PRs to focus only on refactoring what we already have right now. I'd better introduce features one by one after we finalize the refactoring part.

@message_buffer = Server::Connection::MessageBuffer.new(self)

@protocol = nil
@app_conn = config.connection_class.call.new(server, self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should cache the result of config.connection_class.call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess, the primary reason why it's implemented like that is code reloading. We can check the app configuration value and decide if we need to cache it or not; however, I'm not sure it worth the effort.

Also, I see a potential extension here—we can turn it into a connection class selector depending on the request information:

config.connection_class = ->(request) {
  if request.path == "/admin/cable"
    Admin:Connection
  else
    ApplicationCable::Connection
  end 
}

So, keeping it dynamic makes sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful, I'm working on a NoStr server, I hack a NostrConnection, and inject to ActionCable in this way

ActionCable.module_eval do
  module_function def nostr_server
    @nostr_server ||=
      begin
        config = ActionCable::Server::Base.config.dup
        config.connection_class = -> { "NostrCable::Connection".safe_constantize }
        ActionCable::Server::Base.new config: config
      end
  end
end

# This class is heavily based on faye-websocket-ruby
#
# Copyright (c) 2010-2015 James Coglan
class ClientSocket # :nodoc:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will want to use an entirely different implementation of WebSocket - is this going to be a problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope.

The way I see Falcon integration is as follows: (almost) nothing from the action_cable/server folder is gonna be used. This is the code specific to Rails WebSocket server implementation. Other server are free (and encouraged) to use low-level code which works better for them.

We will probably extract some parts from the Rails server code along the way (e.g., access logs, etc.). But today, you can already start building a Falcon-specific version and use only the action_cable/channel and action_cable/connection code from Rails.

# This class is heavily based on faye-websocket-ruby
#
# Copyright (c) 2010-2015 James Coglan
class Stream # :nodoc:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class also seems extremely specific to the underlying implementation - again, is this something we can replace/bypass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto


module ActionCable
module Connection
module Server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can move the default server implementation into it's own directory, so that it's clear where the boundary between "the client/server interface" and the actual "rack hijack server" is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where the boundary between "the client/server interface" and the actual "rack hijack server" is.

(Already answered above but here is a shorter version 😁)

The line is drawn at the file structure level: action_cable/server is the Rails default rack hijack server, everything else is to be shared between different implementations.

@ioquatix
Copy link
Contributor

ioquatix commented Feb 6, 2024

This PR looks like a good start to me.

I would like to understand a little more how I might implement async-actioncable according to the changes presented in this PR.

I would like to propose that we use some kind of traits class to control more of the internal execution model, and ideally, a clear separation of "interface" vs "rack hijack_io implementation". That would help me understand how things fit together (layered).

@palkan
Copy link
Contributor Author

palkan commented Feb 7, 2024

Added one more example of using new architecture—Action Cable over SSE the Rails way (via ActionController::Live): https://gist.github.com/palkan/270a192e79b05d5601fe497ad3ec83b5

We can see some copy-pasting from Server::Connection (server.config.connection_class.call, new_tagged_logger) that makes me think that we should move this logic to the server object itself 🤔

@jasl
Copy link
Contributor

jasl commented Feb 11, 2024

I need to build a WebSocket service, and I'm trying to find a modern WebSocket framework in Ruby,
unfortunately, I don't see such an active Gem for this.
After several hours of research, I found ActionCable has potential.
The AC original protocol can easily be overridden.

@ioquatix
Copy link
Contributor

@jasl Are you building a backend service or are you looking for full stack?

For the backend, people have been having great success with https://github.com/socketry/async-websocket + falcon.

For full stack, https://github.com/socketry/live provides both a front end and a backend library. I'm going to fix the linked example so you can try it out.

@jasl
Copy link
Contributor

jasl commented Feb 12, 2024

@jasl Are you building a backend service or are you looking for full stack?

For the backend, people have been having great success with https://github.com/socketry/async-websocket + falcon.

For full stack, https://github.com/socketry/live provides both a front end and a backend library. I'm going to fix the linked example so you can try it out.

I already tried async-websocket + falcon. IMO it's much better than Faye stack, but it's low-level.
For a real product, I need to handle connection lifecycle, ping-pong, broadcast, etc.

that's why I seek this PR, if this works, async-websocket + falcon can be the backend (the executor)?

@jasl
Copy link
Contributor

jasl commented Feb 12, 2024

@ioquatix
I made a PoC https://github.com/jasl/dephy-relay/blob/main/app/channels/nostr_cable/connection.rb

Most of my codes are under https://github.com/jasl/dephy-relay/blob/main/app/channels/nostr_cable

This sample replaces the ActionCable protocol with the NoStr protocol, including subscription and broadcasting

@ioquatix
Copy link
Contributor

ioquatix commented Apr 7, 2024

@palkan How can we move forward with this proposal?

I'm running into more issues, for example, when running Rails on HTTP/2, ActionCable does not work correctly, because the browser tries to connect using HTTP/2 WebSockets :(

    3m    error: Rails: Incoming request: CONNECT "/cable". [ec=0x8cdc] [pid=178916] [2024-04-07 16:34:32 +1200]
               | Failed to upgrade to WebSocket (REQUEST_METHOD: CONNECT, HTTP_CONNECTION: , HTTP_UPGRADE: )

@palkan
Copy link
Contributor Author

palkan commented Apr 16, 2024

Hey @ioquatix! Just in time: I planned to work on the next iteration this week (already started). Stay tuned!

@palkan palkan force-pushed the refactor/action-cable-server-adapterization branch from 4104394 to e01afaa Compare April 16, 2024 02:55
@ioquatix
Copy link
Contributor

Awesome, if there is anything I can do to help, please let me know!

@palkan
Copy link
Contributor Author

palkan commented Apr 16, 2024

Did one more round:

@ioquatix
Copy link
Contributor

Wow, awesome work! Thanks. If I can make time, I'll try it out.

@ioquatix
Copy link
Contributor

@matthewd any chance you can review this PR? Thanks!

@matthewd
Copy link
Member

Yep, not sure how much time I'll have in the next week or two, but failing that, I'm sure we can practice some Conference Driven Development. 👍

@ioquatix
Copy link
Contributor

@palkan will you be at RubyKaigi?

@palkan
Copy link
Contributor Author

palkan commented Apr 26, 2024

@palkan will you be at RubyKaigi?

Yep. I'm also coming to RailsConf before that, so we I can try to find someone else to pitch this PR 🙂

end

# Tags are declared in the server but computed in the connection. This allows us per-connection tailored tags.
# You can pass request object either directly or via block to lazily evaluate it.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem lazy at the moment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block is only called if any of the config.log_tags is a Proc. Otherwise, we don't need to initialize a request object. So, the "lazy" refers to the difference between:

new_tagged_logger(request)

# and
new_tagged_logger { request }

@ioquatix
Copy link
Contributor

ioquatix commented May 8, 2024

Is there any chance we can merge this before 7.2 is released?

@palkan palkan force-pushed the refactor/action-cable-server-adapterization branch from 1e53f93 to f0d0f3f Compare September 13, 2024 02:34
@palkan palkan force-pushed the refactor/action-cable-server-adapterization branch from f3c418e to 486e24e Compare September 13, 2024 04:13
@palkan
Copy link
Contributor Author

palkan commented Sep 13, 2024

Some updates:

@ioquatix
Copy link
Contributor

Super excited to see renewed effort on this PR, thanks everyone and especially thanks @palkan for all your effort.

@palkan palkan force-pushed the refactor/action-cable-server-adapterization branch from e36c71a to 258d20c Compare September 13, 2024 19:50
@palkan
Copy link
Contributor Author

palkan commented Sep 14, 2024

@matthewd @ioquatix Okay, I think, I've reached the state I consider good and usable enough to let you do the final reviews 😀

Here is the project with examples, tests and benchmarks: https://github.com/anycable/action-cable-next-playground (helped me to find and fix some bugs). Now I feel pretty confident about the refactoring. Also, updated the PR description.

@ioquatix I would like you to take a look at the falcon integration and.. well, rewrite it, probably. I'm pretty sure my async code is far from perfect. I aimed to demonstrate the technique of using Action Cable with other servers. Feel free to propose a PR :)


Regarding the async compatibility of Action Cable I'd like to note is that most production subscription adapters uses the same Thread.new + #listen pattern. And it doesn't play well with Async, since we perform broadcasts (via the async executor) from a separate thread. In other words, it hits the "No async task available!" situation. Not sure if we can generalize them to work with both threads and fibers (I ended up writing a custom async-redis subscription adapter).

@ioquatix
Copy link
Contributor

ioquatix commented Sep 28, 2024

I tried out your implementation and I think it's okay.

I think the only thing missing is a schedule operation on a public interface for the executor, which is used to replace Thread.new{#listen} -> @executor.schedule{...}.

Thinking of this as a queueing problem, extracting out the logic into queue style operations might not be a bad idea, e.g. redis is providing a queue, but any suitable system can provide a queue, e.g. a database, in-memory, shared memory, unix IPC, etc. In other words, extracting the queue logic into (1) I want to publish an event and (2) I want to consume events would be a nice abstraction (I know some of this already exists).

@palkan
Copy link
Contributor Author

palkan commented Sep 30, 2024

@matthewd @ioquatix I made this PR available as a gem, so we can collect more feedback from the apps running on Rails 7+: https://github.com/anycable/actioncable-next. Please, spread the word 🙂

I think the only thing missing is a schedule operation on a public interface for the executor, which is used to replace Thread.new{#listen} -> @executor.schedule{...}.

Thanks! I'll take a look at this a bit later.

@ioquatix
Copy link
Contributor

With the above gem, it is now straight forward to implement https://github.com/socketry/async-cable

If anyone is able to test async-cable in a real application, it would help validate this PR.

@palkan
Copy link
Contributor Author

palkan commented May 9, 2025

Hey @matthewd,
Hope you're doing well.

Let's figure out what we need to make it a part of 8.1. With the release of async-cable, the primary (but not the only) target of this refactoring, we see more and more people switching to it or at least considering it. I'm pretty sure it's been in production for months (should we ask folks to show here and testify?).

Even though everything works through the actioncable-next gem, I consider it a temporary band-aid and not the final solution to the problem. There should be one Action Cable, not two.

@yasoob
Copy link

yasoob commented Aug 12, 2025

Hello folks! Any plans on getting this in the next release :)

@jasl
Copy link
Contributor

jasl commented Aug 13, 2025

Kindly ping, what's progress now?

@ioquatix
Copy link
Contributor

The way for us to move forward here, is for people to test out Falcon + Async::Cable and report back that it is working in production (ideally with a little bit of evidence).

@gjtorikian
Copy link
Contributor

I don’t know what evidence is required, but I’ve built a small LLM chat app, deployed to Fly.io, with that setup. It works, and it’s fantastic. I became interested in the async stack after reading these docs from the RubyLLM gem.

Happy to run a stress test or whatever is needed. I’ve been lurking as a subscriber on this thread, so I’m not sure how aware the broader Rails/Ruby+AI community is of this work.

@ioquatix
Copy link
Contributor

If we can collect a few data points, then I'll try to escalate up.

@andreaslillebo
Copy link

I'm really grateful for all the effort being put into this. We've been running Falcon + Async::Cable in production now since August 19th. Have not encountered any issues yet.

We are mainly using it for LLM chat streaming. 6.2k assistant messages, about 1.4M turbo-streams.

That's at least one more data point. Please let me know if there is anything else I can do to help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants