Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Redis Sentinel #131

Closed
vmattila opened this issue Jul 13, 2013 · 70 comments
Closed

Support for Redis Sentinel #131

vmattila opened this issue Jul 13, 2013 · 70 comments
Labels
feature needs feedback redis-sentinel Replication (managed by redis-sentinel)
Milestone

Comments

@vmattila
Copy link

I would be happy to contribute to predis development by implementing automatic master/slave resolution and connection based on the configured sentinels. I tried to explore the Predis codebase, but unfortunately could not figure out where to start.

In practice, we should be able to enter the sentinel addresses instead direct master/slave ip addresses and provide the name of the master. For example:

$sentinels = array(
    'tcp://10.0.0.1:26379/',
    'tcp://10.0.0.2:26379/'
);
$redis = new Predis\Client($sentinels, array('sentinel_master' => 'mymaster');

One of the configured sentinels should be connected and sent queries to get the master ip: SENTINEL get-master-addr-by-name mymaster, slaves SENTINEL slaves mymaster and finally create the actual aggregated connection based on the received information.

Should I create a completely new connection type, extend an existing class or modify something? I also expect that I should implement two new command classes for the two sentinel commands.

Some references:

@nrk
Copy link
Contributor

nrk commented Jul 15, 2013

Hi @vmattila,

Having support for redis-sentinel in Predis would be awesome, unfortunately I don't have the time to investigate an implementation for it right now and I think it's not so trivial to get it right, so any contribution would be greatly appreciated.

I'd start by implementing a whole new connection class, we can eventually refactor the code later to optimize everything for reusing purposes. The second parameter of Predis\Client::__construct() should be used only for client options so it would be cool to play with connection parameters and do something like:

$sentinels = array(
    'tcp://10.0.0.1:26379',
    'tcp://10.0.0.2:26379/'
    'master' => 'mymaster',
);

We just need to pay attention here in the client class when initializing the right connection class for the job, but I think it's not that much of a problem. Anyway to keep things simple I'd just leave changes to Predis\Client for a later stage, you can feed Predis\Client::__construct() with an instance of Predis\Connection\ConnectionInterface after all:

$connection = new Predis\Connection\SentinelReplicationConnection($sentinels);
$client = new Predis\Client($connection);

PS: I just used the first name I could think of, there are probably better names for such a connection class.

As for the SENTINEL command, you only need to implement one class just like we did for CLIENT which supports four different subcommands ([see here]([like we did]%28https://github.com/nrk/predis/blob/v0.8/lib/Predis/Command/ServerClient.php%29).

Feel free to ask if you have doubts, I may not be able to get back to you immediately but I'll do my best! There's no hurry after all, I plan to finally ship v0.8.4 after delaying it for months for no valid reasons.

EDIT: if you start working on this one please use v0.8 as a base branch for you feature branch, and not master.

vmattila pushed a commit to vmattila/predis that referenced this issue Jul 20, 2013
@vmattila
Copy link
Author

Thanks @nrk for your comments and tips, I started working with this feature (based on v0.8) as you can see from the commit above. The code is already working, my branch contains a temporary test script sentinel_test.php in root of the repository.

SentinelBackedReplication simply extends MasterSlaveReplication but at check() queries the sentinels for actual configuration. This works fine in normal circumstances, but I doubt that in case of an already picked and used connection drops, this does not work well anymore. If the dropped connection is master, we should requery a sentinel and rebuild the whole connection stack. If the dropped connection is slave, we could try other slaves (I think this is handled already by MasterSlaveReplication, isn't it?) - if none of them works, then requery sentinel.

I had some problems with getting Sentinel command class working, saying first it cannot be used "with a cluster of connections". I just overrode getHash() to overcome this error for now, but how this should be defined for real?

At least this is a good start. =)

@nrk
Copy link
Contributor

nrk commented Aug 6, 2013

Seeing an actual implementation attempt made me aware of some issues coming with the overall approach, and as it turns out it isn't as easy as I thought to get sentinel right in Predis. This is a list of the problems I've found so far in you branch, minor issues and more serious ones are mixed as I wrote them down with no clear order in mind:

  • The class name for the SENTINEL command should be ServerSentinel for consistency with the naming of other commands.
  • We can't have a client instance inside a connection: the client class is an higher-level abstraction compared to a connection class. Using SingleConnectionInterface::executeCommand($command) to fire commands is enough in this case, but the downside is that we'd need to create the instance for the SENTINEL command by hardcoding a new ServerSentinel() (ugly, but more bearable than having a client inside the connection). BTW your problem with getHash() is related to the fact that the "wrapped" client instance thinks it's operating in a client-sharded cluster.
  • Hard-coded connection initialization assuming that it will always be Predis\Connection\StreamConnection, I might be wanting to use Predis\Connection\PhpiredisConnection after all. The same approach used for Predis\Connection\RedisCluster (that is, passing a ConnectionFactory instance which is used to create connection instances) could work here.

I'm starting to wonder if it'd be better to have a specialized client to handle sentinel-backed replication, more than a specialized connection class. Hmmm...

@vmattila
Copy link
Author

Thanks @nrk for your comments.

As a starter, I modified the SENTINEL command to ServerSentinel. Makes perfect sense. ;)

I was wondering about using SingleConnectionInterface::executeCommand($command). Could it be possible to use ConnectionFactory to build the actual connection to sentinels here? Something like...

$sentinelConnections = array('tcp://127.0.0.1:6380', 'tcp://127.0.0.1:6381');

// in SentinelBackedReplication::__construct()
$factory = new ConnectionFactory();
$sentinelConnection  = $factory->create($sentinelConnections);

// in SentinelBackedReplication::querySentinels()
$queryMastersCmd = new ServerSentinel('get-master-addr-by-name');
$masterResult = $sentinelConnection->executeCommand($queryMastersCmd);

As we always need more than one sentinel server configured, I'd like to rely on already existing Predis functionality to take care of failover to the next sentinel if one of them fails. This is why \Predis\Client was my very initial and uneducated choice - I see your point here well.

@nrk
Copy link
Contributor

nrk commented Aug 12, 2013

Yes but Predis\Connection\ConnectionFactory::create() only accepts parameters to create a single connection object at time, so you have to iterate over $sentinelConnections and use that method on each of its elements. Ideally, since we should be using only one sentinel server at time and switch to a different one only in case of failure, we just need to lazily create them starting with the connection object to the first sentinel server.

On a related note, I think it's really time for me to start thinking about Predis v0.9 since I have a couple of ideas (requiring a few breaking changes) born out of this discussion and that I'd like to try out as soon as possible to see if they are indeed valid or worth the effort: they could make our life easier with sentinel in the future, let alone giving Predis an even more solid design with its internal API.

@vmattila
Copy link
Author

Ah, indeed. In fact Redis documentation gives already some guidelines how reconnections should be handled, so implementing more specialized logic how $sentinelConnections are prioritized is a good idea. I will work on this.

vmattila pushed a commit to vmattila/predis that referenced this issue Aug 12, 2013
@vmattila
Copy link
Author

That was kind of easy refactoring in the end of the day...

I am not fully satisfied how the connection stack to sentinels are handled right now (array_shift and do { } while(true)), but this works now. This all might be better handled in something like SentinelConnectionStack object that could be passed to the constructor of SentinelBackedReplication. Developers could then create more complex sentinel selection strategies.

But some questions about exceptions at this point:

@nulpunkt
Copy link

I'm very interested in seeing this in Predis, if I can help in any way, say the word!

@nrk
Copy link
Contributor

nrk commented Mar 17, 2014

Hi @nulpunkt, support for redis-sentinel will be investigated as soon as Predis v1.0.0 is released (I'm late, so damn late, but it's finally coming :-)) since the new internals of the library will allow us to implement this feature in a decent way, a thing that simply wasn't possible with v0.8.

EDIT: if anyone wants to get their hands dirty, any work should be done against the current master branch. The initial work done by @vmattila and the notes reported on this issue are a good start.

@nulpunkt
Copy link

Cool :) I might have time to look at it on friday. Thank you for your reply!

@adichiru
Copy link

@integrii, you would please care to share the source of that info?
As far as I know from the official redis site and mailing list sentinel and cluster are 2 very different things. Maybe I missed something though....

Thanks,

@integrii
Copy link

I take it back @adichiru ! I found where I thought I read that and it was out of context. Sorry! I'll delete my post to prevent further confusion.

@nrk nrk added this to the 1.1.0 milestone Jun 1, 2014
@vmattila
Copy link
Author

I spent a couple of minutes to go through this issue and recent modifications from master branch. At least based on my recent work there was no significant changes, namely only namespaces etc. The current setup as in sentinel_test.php works well in different scenarios if somebody wants to already use this feature.

A couple of notes that I found out when working with changes:

Command\ServerSentinel::processMastersOrSlaves does never execute. @nrk, could you check AbstractCommand::readResponse() as it seems to currently just call $this->read(). I think there should be a call to CommandInterface::parseResponse() somewhere... I copied the code to SentinelBackedReplication class.

In case there is no usable slaves the MasterSlaveReplication does not work. In Sentinel case, I simply add the master as a slave too. I think that it would be OK to use the master as a slave instead of throwing an exception.

@vmattila
Copy link
Author

Just a reminder: my feature branch & fork can be found from https://github.com/vmattila/predis/tree/feature-sentinel. I will convert this as a pull request later.

@ThijsFeryn
Copy link

I played around with the feature branch that @vmattila made and I'm quite happy with the results.

The only thing I'm missing is a way to see which node was selected for a specific Redis command. But I guess that issue is not related to the Sentinel feature, but an issue related to replication in general.

I'd love to see this merged in v.1.0. @vmattila can you send in the pull request? Maybe add some tests first.

Some time ago I added Sentinel support to Credis, an alternative PHP client for Redis. I wrote some tests for it. Maybe this can help: https://github.com/ThijsFeryn/credis/blob/master/tests/CredisSentinelTest.php

Thanks for the contribution, I'd be very excited to see this merged into Predis.

@nrk nrk removed the postponed label Oct 21, 2014
@nrk
Copy link
Contributor

nrk commented Oct 21, 2014

All in all I think @vmattila's branch is good (actually I have yet to test it in practice, but the code looks fine with only a few minor adjustments needed).

That said, I'm wondering if it would be possible to avoid using a new connection class that extends Predis\Connection\Aggregate\MasterSlaveReplication and instead go with an approach that can be reused in different contexts (see Predis\Cluster\StrategyInterface and how we use it to implement reusable strategies in the context of clusters of redis nodes), something that can be used by the existing connection class that manages replication.

I'm not sure which is the best approach in practical terms but if we consider that Predis will also need replication for redis-cluster (a feature which is currently missing), it would be cool to investigate the chance of having a reusable class that can be used by both Predis\Connection\Aggregate\MasterSlaveReplication and Predis\Connection\Aggregate\RedisCluster.

@saintberry
Copy link

What's the recommended way to work with a sentinel until 1.1.0? Create a separate client for the sentinel, find master and then kill the sentinel client?

@ThijsFeryn
Copy link

Hi @nrk. I've looked both at Redis Cluster & the Sentinel feature branch of @vmattila and I believe his approach is the right one.

I understand that you're trying to find a "one size fits all" solution that applies to Redis cluster too. But Redis cluster has built-in autofailover and read-write splitting. There is no real need to make Predis master and slave aware because the cluster handles this for you.

As a consequence there is no need for Sentinel if you use Redis in cluster mode, because all the features are handled by the cluster and you can connect to a random node (either a master or a slave) and they'll handle read & write calls appropriately.

Using an Aggregate Connection that inherits from "Predis\Connection\Aggregate\MasterSlaveReplication" seems like a solid decision because that's what Sentinel is for: detecting masters and slaves.

Could you please reconsider accepting the pull request and merging the branch? @vmattila has done good work and a lot of people could benefit from it.

Thanks
Thijs

@nrk
Copy link
Contributor

nrk commented Dec 17, 2014

@ThijsFeryn @vmattila I'll try looking at this again during the first week of January as I will have some time off from work and won't be travelling :-)

@ThijsFeryn
Copy link

Any updates, Daniele?

Thx
Thijs

Op 17-dec.-2014, om 21:58 heeft Daniele Alessandri notifications@github.com het volgende geschreven:

@ThijsFeryn https://github.com/ThijsFeryn @vmattila https://github.com/vmattila I'll try looking at this again during the first week of January as I will have some time off from work and won't be travelling :-)


Reply to this email directly or view it on GitHub #131 (comment).

@integrii
Copy link

Are we sure that you can read/write to any node in the cluster? The six node cluster I have setup replies with MOVED and the correct node's location. I then have to connect to that node to complete the write. Making Predis aware of shards means the process goes faster. When a MOVED comes back from a server, Predis should ask again for the node distribution.

i could see this being slightly wasteful in small applications that only do one read or write, but long running applications would benefit from it greatly. Maybe caching the node information should be an option you can enable?

@ThijsFeryn
Copy link

Hi @nrk & @vmattila

I considered your advice again (using Sentinel as a strategy rather than a replication connection type). I tried implementing it as a strategy, but it felt wrong.

If you look at the problem Sentinel is trying to solve, it's autofailover and that's directly linked to the replication. Unlike the other strategy patterns, this is initializing new connections.

I believe the implementation of @vmattila is still the right decision.

Would you please consider merging this? I could really use this feature right now. And as mentioned in one of the comments above, Redis Cluster has its own autofailover mechanism.

Love to hear you opinion and plans for this feature.

Thanks again
Thijs

@vmattila
Copy link
Author

Just got an email from a user testing my branch. He pointed out a problem with authentication, just copy-pasting the message here so we don't forget the issue:

"I'm testing your feature sentinel for predis.
It works pretty good but did you try redis with AUTH ?
How can you add password to \Predis\Client($sentinel);?

I added $client->auth('mypassword') then it works fine for $client->get() but fails for $client->set()

PHP Fatal error: Uncaught exception 'Predis\Response\ServerException' with message 'NOAUTH Authentication required.' in /home/labuser/predis-feature-sentinel/src/Client.php:348"

@ziv-k
Copy link

ziv-k commented Mar 27, 2016

@nrk, finalizing this feature would be wonderful.

@vmattila
Copy link
Author

I have used the current v1.1-sentinel branch in multiple production environments without real problems.

100 ms default timeout has been slightly too low in our environment (for some reason) when sentinels have been distributed around different AWS availability zones. This is just to give heads up that most likely the timeout needs to be adjusted for each implementation for best performance.

Do you think that sentinels should actually be connected in random order? As now they are connected in the same order as they are defined, there will be kind of "unnecessary" and constant delay if first sentinels are down. Of course we could implement a completely new component to provide the sentinels in any order, just thinking whether it should be part of Predis or not.

@nrk
Copy link
Contributor

nrk commented Mar 27, 2016

I'm currently travelling around Taiwan so I'm basically afk for a while,
I'll be addressing this feature once and for all (along with a much needed
new release of Predis) around mid-April when I'm back home :-)
Il 27/mar/2016 21:07, "Ville Mattila" notifications@github.com ha scritto:

I have used the current v1.1-sentinel branch in multiple production
environments without real problems.

100 ms default timeout has been slightly too low in our environment (for
some reason) when sentinels have been distributed around different AWS
availability zones. This is just to give heads up that most likely the
timeout needs to be adjusted for each implementation for best performance.

Do you think that sentinels should actually be connected in random order?
As now they are connected in the same order as they are defined, there will
be kind of "unnecessary" and constant delay if first sentinels are down. Of
course we could implement a completely new component to provide the
sentinels in any order, just thinking whether it should be part of Predis
or not.


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#131 (comment)

@nrk
Copy link
Contributor

nrk commented May 9, 2016

Alright I'm not dead. I resumed my work on Predis and am planning to tinker on the v1.1-sentinel branch this week and merge it into the master the next weekend, please note that I will most likely push a few minor breaking changes in Predis\Connection\Aggregate\SentinelReplication in the process. After that I'll be working on various fixes (especially for redis-cluster) before releasing v1.1.0 hopefully in a month but in the meanwhile I hope devs will be able to test the changes in redis-sentinel and report issues or missing features before shipping the new release. Thanks!

@nrk
Copy link
Contributor

nrk commented May 9, 2016

@vmattila I think connecting to a random sentinel makes also sense to distribute the load among the various sentinels provided via configuration, I will probably make it the default behaviour in Predis.

@tback
Copy link

tback commented May 9, 2016

Nice to hear this is moving forward. Please consider #291 (database selection) as well. I implemented a redneck version of this in traum-ferienwohnungen@5887201

@nrk
Copy link
Contributor

nrk commented May 9, 2016

@tback don't worry it's on the roadmap, well actually I think I'll just do something on the same line of Predis\Connection\Aggregate\RedisCluster::setDefaultParameters() so that you can actually specify more than just database.

@nrk
Copy link
Contributor

nrk commented May 10, 2016

First step in the right direction: forget the scary block of code previously needed to configure Predis\Client to use redis-sentinel, now you can just do the following:

$sentinels = ['tcp://127.0.0.1:5380', 'tcp://127.0.0.1:5381', 'tcp://127.0.0.1:5382'];
$options = ['replication' => 'sentinel', 'service' => 'mymaster'];

$client = new Predis\Client($sentinels, $options);

You can still use the aggregate option passing a callable if you need more control over the initialization of the underlying connection.

@nrk
Copy link
Contributor

nrk commented May 11, 2016

Just added support for ROLE in f8b3f18 which was an important missing bit for proper support of redis-sentinel (see http://redis.io/topics/sentinel-clients). Please note that the ROLE command was added in Redis 2.8.12 so Predis won't work with older versions of Redis 2.8. Prior to the introduction of ROLE clients had to rely on the output of INFO (definitely suboptimal) and I'm thinking about skipping that and just live with a minimum required version of Redis.

@nrk nrk added the redis-sentinel Replication (managed by redis-sentinel) label May 11, 2016
@nrk
Copy link
Contributor

nrk commented May 12, 2016

In the end I'm rewriting the whole Predis\Connection\Aggregate\SentinelReplication class from scratch without extending Predis\Connection\Aggregate\MasterSlaveReplication in order to squeeze a few more optimizations (e.g. Predis won't ask either for the master or the list of slaves unless necessary, thus avoiding a useless round-trip). All in all I'd say it's turning out really well and I'm pretty much satisfied with the result!

My plan is to merge the v1.1-sentinel into master as-is so that more people can start "playing" ASAP with redis-sentinel and create a new branch v1.1-sentinel-rewrite (or something like that) for the braves that will be kind enough to do some early testing with the new approach. The next and last step will be writing tests for the test suite...

@nrk
Copy link
Contributor

nrk commented May 13, 2016

@tback database selection and authentication will both be possible by setting database and password in an array passed to the parameters client option. This is an example of configuration that will use the same database and password for every connection when using redis-sentinel:

$sentinels = ['tcp://127.0.0.1:5380', 'tcp://127.0.0.1:5381'];

$options = [
    'replication' => 'sentinel',
    'service'     => 'mymaster',
    'parameters'  => ['database' => 2, 'password' => 'MYPASSWORD'],
];

$client = new Predis\Client($sentinels, $options);

@nrk
Copy link
Contributor

nrk commented May 14, 2016

I updated v1.1-sentinel with the rewritten Predis\Connection\Aggregate\SentinelReplication class (see b047e8b for the diff) which means now I need your feedback. If you want to see what happens under the hood using a CLI script you can use this quick&dirty connection class:

class DebugConnection extends StreamConnection
{
    public function connect()
    {
        if (!$this->isConnected()) {
            echo "[$this] ** CONNECTING\n";
            parent::connect();
        }
    }

    public function disconnect()
    {
        if ($this->isConnected()) {
            parent::disconnect();
            echo "[$this] ** DISCONNECTED\n";
        }
    }

    public function executeCommand(CommandInterface $command)
    {
        $response = parent::executeCommand($command);
        echo "[$this] => ", $command->getId(), " ", join(" ", $command->getArguments()), "\n";

        return $response;
    }
}

Then initialize the client like this:

$client = new Predis\Client($sentinels, [
    'replication' => 'sentinel',
    'service' => $yourservice,
    'connections' => ['tcp' => 'DebugConnection'],
]);

@lulyon
Copy link

lulyon commented May 16, 2016

That's Great! I would be very glad to follow the new approach for configuring Predis\Client to use redis-sentinel, and test the frequently used redis io features.

@nrk
Copy link
Contributor

nrk commented May 19, 2016

Now we also have tests for Predis\Connection\Aggregate\SentinelReplication (commit d6d3076), they are already a good bunch but I'll add a few more in the next days. I think the v1.1-sentinel branch is basically almost ready to get merged into master, I'd love to hear some feedback just in case anyone had the chance to give it a spin already in the last 5-6 days. Thanks!

PS: Predis v1.1.0 is planned to be released during the first week of June, we're almost there :)

nrk added a commit that referenced this issue May 21, 2016
This merge resolves #131.
@nrk
Copy link
Contributor

nrk commented May 21, 2016

I went ahead and merged v1.1-sentinel into master because I think everything looks good, consequently I'm closing this issue. If you encounter any problem with this new redis-sentinel connection class please open a new specific issue instead of posting new comments on this one. Thanks for the long wait, I admit it took quite a while 😄

@nrk nrk closed this as completed May 21, 2016
@tom--
Copy link

tom-- commented May 21, 2016

thanks @nrk

@ThijsFeryn
Copy link

Thanks @nrk. Any idea when the new release will be planned? Rather use a new release than dev-master in my composer.json.

Cheers
Thijs

@nrk
Copy link
Contributor

nrk commented May 23, 2016

@ThijsFeryn very soon, I'd say during the he first week of June.

@nrk
Copy link
Contributor

nrk commented Jun 2, 2016

JFYI Predis v1.1.0 is now available so we finally have redis-sentinel support in a stable release! Please read the release notes, it took quite some time (and in case of redis-sentinel, way too much time) but thanks for waiting!

@seoilkyu
Copy link

good! all clear thanks

@andrewmclagan
Copy link

Thank you!!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature needs feedback redis-sentinel Replication (managed by redis-sentinel)
Development

No branches or pull requests