Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perl-redis doesn't receive subcribed message if server restart (and take 100% cpu) #34

Closed
shadowwa opened this issue Feb 1, 2013 · 2 comments
Assignees

Comments

@shadowwa
Copy link

shadowwa commented Feb 1, 2013

Two weeks ago, we used perl-redis in an application that subscribed topic and waiting message, basically something like:

#!/usr/bin/perl
use lib q{lib};
use 5.010;
use strict;
use warnings;

use Redis;

my $rsubcriber = Redis->new(
    reconnect => 1,
);

say 'subscribing';
$rsubcriber->subscribe(
    'topic',
    sub {
        say 'I received a message on my topic';

    },
);
say 'now restart the redis and watch the cpu';
$rsubcriber->wait_for_messages(10) while 1;

we found out that if the redis server was restarted after our application began to wait for messages, our application suddenly began to take 100% of CPU and didn't received message for our topic any more. We solved the problem by reconnect and resending a (P)SUBSCRIBE command to the redis server that we just reconnected (callbacks still were in $self->{subscribers}).

reconnect to a subscribed redis server when this one restart
Jérôme "Melkor" Bourgeois <jerome.bourgeois@sfr.com>
Stéphane "Shad" Pontier <stephane.pontier@free.fr>
Index: b/lib/Redis.pm
===================================================================
--- a/lib/Redis.pm  2013-01-09 18:10:58.801927022 +0100
+++ b/lib/Redis.pm  2013-01-10 11:08:33.769644668 +0100
@@ -274,6 +274,28 @@

   my $count = 0;
   while ($s->can_read($timeout)) {
+    my $status = __try_read_sock($sock);
+    if ( (!defined $status) || $status == 0) {
+      $s->remove($sock);
+      delete $self->{sock};
+      close $sock;
+
+      sleep 1;
+      $self->__connect;
+      while (my ($topic, $cbs) = each %{$self->{subscribers}}) {
+        if ($topic =~ /(p?message):(.*)$/ ) {
+          my $key = $1;
+          my $channel = $2;
+          if ($key eq 'message') {
+            $self->__send_command('SUBSCRIBE', $channel);
+          } else {
+            $self->__send_command('PSUBSCRIBE', $channel);
+          }
+        }
+      }
+      $sock = $self->{sock};
+      $s->add($sock);
+    }
     while (__try_read_sock($sock)) {
       my ($reply, $error) = $self->__read_response('WAIT_FOR_MESSAGES');
       confess "[WAIT_FOR_MESSAGES] $error, " if defined $error;

I saw this morning that since two week the code evolved in the master git but this problem was still there and I didn't had time to rework the patch. this one was applied to the v1.956 branch.

@melo
Copy link
Member

melo commented Feb 1, 2013

Ok, the problem is that after a reconnect the subscriptions are not resent. Makes sense.

I'll take you patch as a base for a solution, but I won't use it directly because I don't want the sleep(1) in there. I'll probably move your code to resubscribe to __build_sock, before the return.

Thanks for the report, I'll try and prep a release during the weekend.

@ghost ghost assigned dams Nov 6, 2013
@dams
Copy link
Member

dams commented Nov 17, 2013

I think that the diagnostic is slightly wrong (sorry!). When looking at the code (in latest version), if the server is restarted, then we go out of the while ($s->can_read($timeout)) loop, so we return from wait_for_message, and we go back to the while(1) from the application code. No reconnect is attempted. So I'll change the patch a bit to call reconnect if immediately after the can_read we have an EOF. And reconnect will re-subscribe.

@dams dams closed this as completed in 849782d Nov 18, 2013
dams added a commit that referenced this issue Nov 29, 2013
    * fix #60: TEST_REQUIRES needs newer MakeMaker
    * fix #34: perl-redis doesn't receive subcribed message if server restart
    * fix #38: select new database doesn't survive after reconnect
    * minor documentation fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants