This repository was archived by the owner on Feb 7, 2024. It is now read-only.

Redis as a replication backend for scalability #140

Merged: 32 commits, Aug 13, 2020

francislavoie
Contributor

@francislavoie francislavoie commented Mar 25, 2019

This is a continuation of @snellingio's work in #61 and supersedes it.

Disclaimer: this is still WIP, I still have some work to do here before it's ready to go.

Things that are done:

  • Some general cleanup
    I fixed some typos here and there, added some additional type hints to make my IDE happy, added @mixin on Facades, etc.
  • Rewrote RedisClient to use lazy clients (thanks @WyriHaximus for implementing that feature!) and implemented pub/sub.
  • If client push is enabled, that should also work, via publishing to Redis. RedisClient will ignore messages from itself.
  • RedisPusherBroadcaster is implemented.
    This is a hybrid of the Redis and Pusher broadcasters that are shipped with Laravel. This is needed because we still want to use the Pusher auth logic (signing the broadcasted messages) but we want to broadcast via Redis instead of doing an HTTP request to the websocket server to push out messages.
  • Pub/sub logic is implemented under a feature flag in config, which gets checked at every entry point into the replication logic. This means that nothing should change for users who don't need replication.
  • Scope the pub/sub channels on Redis by app ID
    This is needed so that channels from different apps don't cross-talk when they aren't supposed to. This is done in the Broadcaster and RedisClient. Redis channels are named "$appId:$channel" wherever needed.
  • Implement storing Presence Channel information in Redis
    This one was tricky, because among other things, it required rewriting some of the HTTP controller logic to support Redis' async IO. The replication feature flag complicates this a bit as well because we end up with two code paths throughout, wherever it's enabled. I'll probably need the most help reviewing this portion due to its complexity.
  • Tests
    I went the route of extending some of the existing tests, only running the tests with replication enabled as well, to hit the relevant code paths. RedisClient is not covered, a LocalClient mock is used instead.
  • Just a note: I found that it doesn't make sense to put any of the logic in the channel manager (e.g. RedisChannelManager) because it doesn't itself do anything. Channel and PresenceChannel are where the interesting things happen. Maybe these classes could be split up into replicated versions of each, but it doesn't seem entirely necessary yet.
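
A minimal sketch of the "$appId:$channel" naming convention mentioned in the list above. The helper names are hypothetical and not taken from this PR, and the sketch assumes app IDs never contain a ":" character.

```php
<?php

// Hypothetical helpers, not part of the package: they only illustrate
// the "$appId:$channel" naming convention.

/** Build the Redis channel name for a channel that belongs to one app. */
function scopedChannelName(string $appId, string $channel): string
{
    return "{$appId}:{$channel}";
}

/**
 * Split a Redis channel name back into [appId, channel].
 * Only the first ":" is treated as the separator, so channel names
 * that themselves contain ":" survive the round trip.
 */
function parseScopedChannelName(string $topic): array
{
    return explode(':', $topic, 2);
}

echo scopedChannelName('1234', 'private-orders'), PHP_EOL;
```

Prefixing with the app ID means two apps can both use a channel called private-orders without receiving each other's messages.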

Things that are still to do:

  • Improve reliability via Redis reconnect logic
    In case Redis goes down, RedisClient should attempt to reconnect and, if successful, re-subscribe to all the channels on behalf of the users. This shouldn't be too hard, since there's already local storage for the list of channels (see protected $subscribedChannels in RedisClient).
  • Documentation
    We'll need new sections in the documentation to describe how to set this up. Notably, users will need to add a new driver in broadcasting.php due to the hybrid broadcaster I implemented.
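
To make the re-subscribe idea above more concrete, here is a rough sketch of the kind of local bookkeeping it relies on. SubscriptionRegistry is a hypothetical class, not the actual RedisClient; the only detail borrowed from the PR is that channel names are the keys of $subscribedChannels. Storing a subscriber count as the value is an assumption.

```php
<?php

// Hypothetical stand-in for the channel bookkeeping described above.
// The real RedisClient may keep different values under each channel key.
final class SubscriptionRegistry
{
    /** @var array<string, int> Redis channel name => local subscriber count */
    private $subscribedChannels = [];

    /** Returns true when this is the first local subscriber (SUBSCRIBE needed). */
    public function subscribe(string $channel): bool
    {
        $count = ($this->subscribedChannels[$channel] ?? 0) + 1;
        $this->subscribedChannels[$channel] = $count;

        return $count === 1;
    }

    /** Returns true when the last local subscriber left (UNSUBSCRIBE needed). */
    public function unsubscribe(string $channel): bool
    {
        if (!isset($this->subscribedChannels[$channel])) {
            return false;
        }

        if (--$this->subscribedChannels[$channel] > 0) {
            return false;
        }

        unset($this->subscribedChannels[$channel]);

        return true;
    }

    /** @return string[] channels to re-subscribe to after a reconnect */
    public function channelsToResubscribe(): array
    {
        return array_keys($this->subscribedChannels);
    }
}
```

After a reconnect, looping over channelsToResubscribe() and issuing one subscribe call per channel would restore the subscriptions without involving the connected websocket clients.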

@francislavoie francislavoie force-pushed the redis-replication branch 3 times, most recently from 1e2838f to 2e4569e Compare March 29, 2019 14:25
@francislavoie
Contributor Author

@WyriHaximus if you have the time, I wouldn't mind some clarification on how to handle reconnection with lazy clients. I think we'd only need to re-subscribe to the channels on reconnect (i.e. array_keys($this->subscribedChannels)). I think I'd want to bind to the close event then attempt to open a new connection and subscribe, but I'm not sure how that would look. Wouldn't mind a code snippet as an example.

@francislavoie
Contributor Author

francislavoie commented Mar 29, 2019

Alright, I have the presence channel logic in. That was probably the hardest part to write for this feature. I don't love it, I'm hooking in all over the place to reach out to the replication backend, and had to change how some methods work such that they can return promises, but it looks like it should be functional (no pun intended).

At this point, it's mostly ready for code review, I'll start writing tests next chance I get to work on this. Sometime next week probably.

@WyriHaximus

I'll try to get an answer about that tomorrow by doing some experiments because I'm already interested in that 👍

@francislavoie
Contributor Author

Awesome, thank you!

@francislavoie
Contributor Author

I added tests to cover Replication. I went the route of extending some of the existing tests, only running the tests with replication enabled as well, to hit those relevant code paths. RedisClient is not covered, a FakeReplication mock is used instead.

The implementation is looking pretty good at this point, I'll just need to do some actual end-to-end testing to confirm everything is working as expected, and I'm waiting to hear back from @WyriHaximus on reconnection (which is mostly just a nice-to-have, IMO not a requirement to merge).

@mpociot or @freekmurze I'd really appreciate it if this could be reviewed soon, because #6 seemed pretty popular! Let me know if there's anything I can do to make this easier to review, I realize it's quite a big patch. I'm willing to chat about it out-of-band if it can help.

@WyriHaximus

Sorry it took a bit longer than expected. Based on some experiments and digging through the code, you only need to resubscribe and maybe redo any operations that haven't completed yet. Be careful with this though: your state in Redis might have changed compared to what you want to execute. Also keep an eye on the error event to know what went wrong, and adjust your reconnect strategy based on the error you're getting.

@francislavoie
Contributor Author

Sorry, I'm not sure I fully understand. Do you mean doing something like this should be fine?

// Pseudocode
$conn->on('close', function($conn) use ($subscriptions) {
    foreach ($subscriptions as $sub) {
        $conn->subscribe($sub);
    }
});

I don't see a way to explicitly reconnect for lazy clients. I'm not sure how I'd reconcile the close and error events, since those are separate things.

@WyriHaximus

There is no need to explicitly reconnect for lazy clients, that's built in. Here are my two experiment files. You can restart Redis without issue, but note the timer in the subscriber: it is required because the Redis server needs some time to restart, and if we resubscribe right away our connection might be refused:

publisher.php:

<?php

use Clue\React\Redis\Factory;

require 'vendor/autoload.php';

$loop = React\EventLoop\Factory::create();
$factory = new Factory($loop);

$client = $factory->createLazyClient('localhost');
$client->on('error', function (Throwable $throwable) {
    echo (string)$throwable;
});

$loop->addPeriodicTimer(1, function () use ($client) {
    $message = json_encode(['id' => 10, 'time' => time()]);
    $client->publish('user', $message);
});

$loop->run();

subscriber.php:

<?php

use Clue\React\Redis\Factory;

require 'vendor/autoload.php';

$loop = React\EventLoop\Factory::create();
$factory = new Factory($loop);

$client = $factory->createLazyClient('localhost');
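// Print every pub/sub message received on a subscribed channel.
$client->on('message', function ($channel, $payload) {
    var_dump($channel, json_decode($payload));
});

$client->subscribe('user');

// In this experiment, "unsubscribe" fires when Redis restarts and the
// connection drops. Wait a second before re-subscribing so the restarting
// Redis server has time to accept connections again.
$client->on('unsubscribe', function () use ($client, $loop) {
    echo '__unsubscribe__', PHP_EOL;
    $loop->addTimer(1, function () use ($client) {
        $client->subscribe('user');
    });
});

// Print connection errors so failures are visible.
$client->on('error', function (Throwable $throwable) {
    echo (string)$throwable;
});

$loop->run();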

@francislavoie
Contributor Author

Awesome, that's very helpful! Reading that along with looking at LazyClient.php cleared up how that should work. That's pretty clean!

I was thinking, would there be a need for exponential backoff for connecting at the LazyClient level? Seems to me like it only tries to connect again one time in __call. Could be configurable like the idle timeout. I could open an issue for that if you think that would make sense.

@WyriHaximus

shrug I'm unsure tbh, because this blurs the line between providing a simple client for a remote networked service, which errors out when it keeps failing for a while, and something that totally takes care of everything you do with it. I'm also unsure what @clue's view is on that with this library, but you can always open an issue about it. My suggestion would be to handle that in this library.

@francislavoie
Contributor Author

Fair enough. But I think that enough of the implementation of the connection is handled by the LazyClient class that I don't think it's really possible to implement that sort of thing without extending or changing that class.

Anyways, thanks a bunch for your help! Very much appreciated.

@WyriHaximus

You could do a decorator or an adapter implementing that behaviour. But that might be something for a follow-up PR tbh.
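
As a rough illustration of the exponential backoff idea discussed above, a decorator would mainly need a delay schedule like the following. backoffDelay is a hypothetical helper, not part of clue/redis-react or this package, and the base and cap values are arbitrary assumptions.

```php
<?php

// Hypothetical helper, not part of clue/redis-react or this package.
// $attempt is the zero-based number of failed attempts so far.
function backoffDelay(int $attempt, float $baseSeconds = 1.0, float $maxSeconds = 30.0): float
{
    // Double the delay on every failed attempt, but never exceed the cap.
    return min($maxSeconds, $baseSeconds * (2 ** $attempt));
}

// Print the delays for the first six attempts.
foreach (range(0, 5) as $attempt) {
    echo $attempt, ' => ', backoffDelay($attempt), 's', PHP_EOL;
}
```

In an event loop, the returned value could be passed to $loop->addTimer() in place of the fixed one-second delay used in the subscriber experiment, so that retries never block the loop.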

Any time, feel free to ping me when you need help with anything ReactPHP-related 😎

@francislavoie
Contributor Author

Alright, I have good news: I finally got around to trying this branch out in my own project to test things out for real. It works!

I didn't try out all the features, because my app doesn't cover everything. I almost exclusively use private channels, no presence channels... so I'd need someone else to try this out with presence channels to confirm that those work as intended.

I'll make a PR on the docs repo later, but here's a quick writeup to start. I figure this'll be added under the "Advanced usage" section.

Configure broadcasting.php

Since the replication mode uses Redis as a pub-sub backend but still uses Pusher for authentication and such, a hybrid redis-pusher broadcast driver must be used, which is provided with the library's service provider.

Add the following in connections in your broadcasting.php config file:

        'redis-pusher' => [
            'driver' => 'redis-pusher',
            'key' => env('PUSHER_APP_KEY'),
            'secret' => env('PUSHER_APP_SECRET'),
            'app_id' => env('PUSHER_APP_ID'),
            'options' => [
                'cluster' => env('PUSHER_APP_CLUSTER'),
                'encrypted' => true,
                'host' => '127.0.0.1',
                'port' => 6001,
                'scheme' => 'http',
            ],
            'connection' => 'default',
        ],

Note that when replicated, the host field should point to one of your websocket instances. Currently, however, the configured host does not matter, because this broadcast driver uses Redis rather than the Pusher HTTP API to broadcast messages. The Pusher part of the driver exists to conform with the Pusher API, while Redis remains the communication mechanism.

Remember to configure BROADCAST_DRIVER=redis-pusher in your .env to enable usage of this new driver!

Configure websockets.php

Now in your websockets.php config file, add the following: (NOTE: this should already exist once the PR is merged, only the enabled flag should be toggled)

    /*
     * You can enable replication to publish and subscribe to messages across the driver
     */
    'replication' => [
        'enabled' => true,

        'driver' => 'redis',

        'redis' => [
            'connection' => 'default',
        ],
    ],

This will enable the replication mode and use BeyondCode\LaravelWebSockets\PubSub\Redis\RedisClient as a replication driver. Currently, Redis is the only supported driver.

If you wish to implement your own replication driver, you could implement BeyondCode\LaravelWebSockets\PubSub\ReplicationInterface and register your implementation in the Laravel service container as a singleton.

Scaling

Now that you have things configured, there are a few things to know to run effectively with replication.

You'll need to be running a Redis instance which is reachable by all the websocket server instances. Redis is used to allow the websocket servers to inter-communicate when messages are to be broadcast to and from websocket clients.

So that browser clients don't need to know which websocket server they're connecting to, it's best to use a load balancer in front. If you plan on using the Pusher client-side HTTP API for querying the server for channel information, you'll also need to make sure that your load-balancing configuration lets those requests through along with the websocket connections. If your existing load balancer logic matches websocket connections on the Connection and Upgrade headers, you may need an additional rule to load balance requests to /apps as well (/app is the websocket route, /apps is the HTTP API).

You may also want to terminate TLS at your load balancer instead of configuring it in websockets.php. This makes it easier to handle certificate renewals in one place instead of on each websocket server instance and simplifies configuration for new instances.

@francislavoie francislavoie changed the title from "WIP: Redis as a replication backend for scalability" to "Redis as a replication backend for scalability" Apr 22, 2019
@snellingio
Contributor

@francislavoie This looks great. I will give it a go in one of my apps this week. Great work!

@phantom8805

phantom8805 commented Apr 23, 2019

@francislavoie what about this trait? It was only added in 5.8:
https://laravel.com/api/5.8/Illuminate/Broadcasting/Broadcasters/UsePusherChannelConventions.html
You use it in src/PubSub/Redis/RedisPusherBroadcaster.php:15.
The retrieveUser method, used in src/PubSub/Redis/RedisPusherBroadcaster.php:74, was also only added in 5.8: https://laravel.com/api/5.8/Illuminate/Broadcasting/Broadcasters/Broadcaster.html

@francislavoie
Contributor Author

@phantom8805 you're mentioning that as a concern for compat with older Laravel versions? That's a good point, I'll take a look soon. I really just copied the source of the Pusher broadcaster and replaced the broadcast function with Redis instead.

@phantom8805

@francislavoie and thanks for this pull request. It fixed my problem.

@vesper8

vesper8 commented May 15, 2019

this looks great.. I wish we would hear from @mpociot in this and other awesome PRs to get an idea of what he likes/doesn't like and if this is on its way to ever being merged

@mpociot
Member

mpociot commented May 15, 2019

I’ve already started reviewing and merging a couple of PRs in the last days.
Sorry, but this PR is huge and simply requires a lot of manual testing - even with a lot of automated tests included.

@francislavoie
Contributor Author

I'll say that from my part, at the very least I appreciate that you mentioned that this inspired your work rather than not giving any attribution - I sent you an email in case you want to chat about it.

@m1stermanager

gonna throw a big +1 out there for this

@rennokki
Collaborator

rennokki commented Aug 12, 2020

I assume this package is no longer maintained or something. A few days ago I urgently needed to fix some WebSocket issues, so I had to re-create a package from scratch, using most of this codebase's code: https://github.com/renoki-co/sock. The only thing I haven't replicated in the package is the dashboard, which is less of a concern right now.

Why rewrite it? Most of the code wasn't commented, and there was no code coverage to show what has and has not been tested. I have added more tests to cover more code and use cases, and I'm now trying to make it run well in a horizontally scaled environment.

It's 11 PM and I have been working since 8 AM on trying to make it scalable, until I hit a wall: I couldn't serialize the connections when storing channels, and they were too big to store, even more so when there are hundreds of connections.

I have opened a PR here with the changes from contributor's PR: https://github.com/renoki-co/sock/pull/4

I will be able to bring up the feature testing with a Redis instance and test it with both Local and Redis instances to ensure they both work as stated in this PR.

In case you're asking, I haven't covered up the documentation since the primary concern is to make it work.

@vesper8

vesper8 commented Aug 12, 2020

@rennokki I wouldn't be so quick to call this package abandoned but unfortunately this seems to be a trend with beyondcode/marcel's packages.. he's so busy making new things all the time that there isn't much time left for improving/maintaining his old packages.. I absolutely love his stuff and use almost everything he's put out and purchased every product beyondcode has sold so far.. but I do wish he hired help to maintain his old packages such as botman, this and others. This PR in particular should have been merged ages ago.. it's such a must-have for anyone that needs to work with bigger numbers. Anyway I appreciate your attempt to fix this problem and thank you for creating https://github.com/renoki-co/sock, I've starred it and am watching it, I hope you add some documentation and perhaps a migration guide for migrating from laravel-websockets or at least some docs to explain if this is also a pusher drop-in replacement and what similarities/differences it has with laravel-websockets.

@rennokki
Collaborator

Just to be clear - I'm not trying to get people to use it, I offer it as an alternative. I know how it feels when scaling things out to millions of requests every day with improper resources.

I honestly hope this package will get proper maintenance asap and that someone will look after issues/PRs.

@okaufmann

As maintainer of one of Marcel's smaller packages (https://github.com/mpociot/teamwork) I can really recommend being a maintainer to free him up, so he can do even more cool stuff 😍

@rennokki
Collaborator

rennokki commented Aug 12, 2020

@okaufmann @mpociot I'd like to offer to be a maintainer too on packages that need support. 🤔

@mpociot
Member

mpociot commented Aug 12, 2020

@rennokki I just sent you a message via Twitter DM, but as I said in a previous comment:

If anybody wants to help me maintain this repo, feel free to get in touch with me.

Just send me a Twitter DM (they are open) and we can talk. I just can't find the time to work on laravel websockets at the moment tbh

Update: @rennokki is now added as a maintainer 🎉

@francislavoie
Contributor Author

For transparency, I spoke with @mpociot as well and I said no because I'm no longer actually using this library in an active project at my company, so I wouldn't really be able to dogfood. I don't think I'd be that effective as a maintainer.

That said, @rennokki, please feel free to @ me for reviews/feedback if you want, I'm willing to help out in that way.

I'm glad to see this PR and project finally see some movement after nearly two years with the original proposal for horizontal scaling being in #6. I hope it sees a proper revival. 📈

@rennokki rennokki added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels Aug 13, 2020
@rennokki rennokki changed the base branch from master to 2.x August 13, 2020 10:47
@rennokki
Collaborator

@francislavoie I changed the base branch from master to 2.x as it's a breaking change, and it can still be used with 2.x in Composer (in case the minimum stability is not stable) to test out the changes.

Looking forward to fixing the StyleCI pipeline checks in 2.x and adding tests as well down there.

@rennokki rennokki mentioned this pull request Aug 13, 2020
14 tasks
@rennokki rennokki merged commit 8dc2856 into beyondcode:2.x Aug 13, 2020
@francislavoie francislavoie deleted the redis-replication branch May 15, 2021 14:07
@athamidn

In Laravel 8 I got an error with redis-pusher as the broadcast driver in .env:
BROADCAST_DRIVER=redis-pusher
I changed the existing pusher config to your config and it works.

@trocker

trocker commented Jul 29, 2022

@athamidn - So for horizontal scaling, did you just do the following?

'redis-pusher' => [
    'driver' => 'pusher', // Instead of redis-pusher
    'key' => env('PUSHER_APP_KEY'),
    'secret' => env('PUSHER_APP_SECRET'),
    'app_id' => env('PUSHER_APP_ID'),
    'options' => [
        'cluster' => env('PUSHER_APP_CLUSTER'),
        'encrypted' => true,
        'host' => '127.0.0.1',
        'port' => 6001,
        'scheme' => 'http',
    ],
    'connection' => 'default',
],


and then edit websockets.php to add the following?


'replication' => [
    'enabled' => true,

    'driver' => 'redis',

    'redis' => [
        'connection' => 'default',
    ],
],


This weirdly does not work for me. Any idea why? Also, if you could point me to any documentation, that'll be helpful!

@trocker

trocker commented Jul 29, 2022

@francislavoie - wondering if you still have any comments on this? The work done here is fabulous but unfortunately nowhere mentioned in any docs. Any pointers will be super helpful.

@francislavoie
Contributor Author

I'm not involved in maintenance of this project anymore. Sorry.
