debug object command issue #633

Closed
AnuragBerdia opened this Issue Aug 16, 2012 · 9 comments

Projects

None yet

4 participants

@AnuragBerdia

Hi,

We are using redis version 2.4.4.
We are using debug object command to get lru_seconds_idle: in redis.
It worked properly without any issue untill system's time changed somehow to some future time.
The system's timestamp again came back to normal in 15-20 mins.
But after this redis is returning lru_seconds_idle:0 for all keys.

This is easily reproducible at my end.

Thanks in Advance,
Anurag Berdia

@pietern
pietern commented Aug 16, 2012

Redis uses the system clock to estimate the idle time, so tampering with the clock will influence this.

Have you tried if this also happens with the latest 2.4 version, and the 2.6 release candidate?

@AnuragBerdia

Hi Pietern,

I agree with what you are saying. But shouldn't it recover from this once the time is again current time? I mean for those keys which inserted in future time it will give idle time 0, But for keys which inserted in current time it should return proper idle time value.
We accidentally changed the time of the system. I will try it with the latest 2.4 version and let you know.

@AnuragBerdia

The issue persist in Redis release 2.4.16.
My observations are:

  1. Before changing the time, I inserted the object into redis. Its lru_seconds_idle time keeps on increasing as normal.
  2. I changed system time to +1 day. The lru_seconds_idle is now increased to 86400 seconds(1 day) and it keeps on increasing normally.
  3. I changed system time back to normal. Now for that key it is same as it was before. For newly inserted keys it is zero.

If system clock changed somehow then restarting redis is the only solution, till now. Even flushing database doesn't help in this situation. May be a command or something could be helpful that one can issue from redis-cli prompt to make it right.
Please comment on above issue.

Thanks,
Anurag Berdia

@AnuragBerdia

Any updates on this? Can you please check this issue?

@jokea
jokea commented Aug 27, 2012

Just looked into this issue and it turns out that this doesnot only impact the object's lru time.
When changing the system time back at step 3, redis is malfunctioning.

The timer event serverCron() will get called +1 day later after you change the system's time back, and some core functionality
of redis will stop working, like replication with master. After step 3, I sent SLAVEOF command to redis and
it did nothing:

[4989] 27 Aug 16:24:00 * SLAVE OF 127.0.0.1:6379 enabled (user request)
[4989] 27 Aug 16:24:10 * SLAVE OF 127.0.0.1:6380 enabled (user request)
[4989] 27 Aug 16:24:17 * SLAVE OF 127.0.0.1:6381 enabled (user request)

??? Nothing happened, the server didn't try to connect to the master.

I think this bug can be avoided by changing the impl of the timer.

@jokea
jokea commented Aug 28, 2012

Here's a temporary fix for this issue. It simply save the last time(second), and if current
time is before the saved time, the timer event is triggered regardless of it's timeout value.

Index: ae.c
===================================================================
--- ae.c    (revision 482786)
+++ ae.c    (working copy)
@@ -54,6 +54,8 @@
     #endif
 #endif

+/* Store the last time(in seconds) when a time event being executed */
+static long last_sec = 0;
+
 aeEventLoop *aeCreateEventLoop(void) {
     aeEventLoop *eventLoop;
     int i;
@@ -235,7 +237,8 @@
         }
         aeGetTime(&now_sec, &now_ms);
         if (now_sec > te->when_sec ||
-            (now_sec == te->when_sec && now_ms >= te->when_ms))
+            (now_sec == te->when_sec && now_ms >= te->when_ms) ||
+            now_sec < last_sec)  
         {
             int retval;

@@ -261,6 +264,8 @@
                 aeDeleteTimeEvent(eventLoop, id);
             }
             te = eventLoop->timeEventHead;
+            /* Save current time */
+            last_sec = now_sec;
         } else {
             te = te->next;
         }
@jokea jokea added a commit to jokea/redis that referenced this issue Aug 28, 2012
@jokea jokea Fix for issue #633
When system time changes back, the timer will not worker properly hence some core functionality of redis will stop working(e.g. replication, bgsave, etc).

The patch saves the previous time and when a system clock skew is detected, it will force expire all timers.

This patch also make the timer list in sorted order, so getting the nearest timer is O(1) now. (Although adding/deleting timers are still O(N)).
75bbf71
@jokea jokea added a commit to jokea/redis that referenced this issue Aug 29, 2012
@jokea jokea Fix for issue #633
When system time changes back, the timer will not worker properly
hence some core functionality of redis will stop working(e.g. replication,
bgsave, etc).

The patch saves the previous time and when a system clock skew is detected,
it will force expire all timers.

This patch also make the timer list in sorted order, so getting the nearest
timer is O(1) now.(Although adding/deleting timers are still O(N)).
03e1f0c
@jokea jokea added a commit to jokea/redis that referenced this issue Aug 30, 2012
@jokea jokea Force expire all timer events when system clock skew is detected.
When system time changes back, the timer will not worker properly
hence some core functionality of redis will stop working(e.g. replication,
bgsave, etc). See issue #633 for details.

The patch saves the previous time and when a system clock skew is detected,
it will force expire all timers.
4274554
@antirez
Owner
antirez commented Oct 4, 2012

@AnuragBerdia I'm checking the issue you described today, news soon.

@antirez antirez added a commit that referenced this issue Oct 4, 2012
@jokea jokea Force expire all timer events when system clock skew is detected.
When system time changes back, the timer will not worker properly
hence some core functionality of redis will stop working(e.g. replication,
bgsave, etc). See issue #633 for details.

The patch saves the previous time and when a system clock skew is detected,
it will force expire all timers.

Modiifed by @antirez: the previous time was moved into the eventLoop
structure to make sure the library is still thread safe as long as you
use different event loops into different threads (otherwise you need
some synchronization). More comments added about the reasoning at the
base of the patch, that's worth reporting here:

/* If the system clock is moved to the future, and then set back to the
 * right value, time events may be delayed in a random way. Often this
 * means that scheduled operations will not be performed soon enough.
 *
 * Here we try to detect system clock skews, and force all the time
 * events to be processed ASAP when this happens: the idea is that
 * processing events earlier is less dangerous than delaying them
 * indefinitely, and practice suggests it is. */
5130b6f
@antirez
Owner
antirez commented Oct 4, 2012

@AnuragBerdia I think that @jokea's patch fixes this issue as well, as it resulted for the time event never being called to update the LRU clock. I'm merging the patch into 2.6 and unstable right now.

@antirez antirez added a commit that referenced this issue Oct 4, 2012
@jokea jokea Force expire all timer events when system clock skew is detected.
When system time changes back, the timer will not worker properly
hence some core functionality of redis will stop working(e.g. replication,
bgsave, etc). See issue #633 for details.

The patch saves the previous time and when a system clock skew is detected,
it will force expire all timers.

Modiifed by @antirez: the previous time was moved into the eventLoop
structure to make sure the library is still thread safe as long as you
use different event loops into different threads (otherwise you need
some synchronization). More comments added about the reasoning at the
base of the patch, that's worth reporting here:

/* If the system clock is moved to the future, and then set back to the
 * right value, time events may be delayed in a random way. Often this
 * means that scheduled operations will not be performed soon enough.
 *
 * Here we try to detect system clock skews, and force all the time
 * events to be processed ASAP when this happens: the idea is that
 * processing events earlier is less dangerous than delaying them
 * indefinitely, and practice suggests it is. */
e480c27
@antirez antirez added a commit that referenced this issue Oct 4, 2012
@jokea jokea Force expire all timer events when system clock skew is detected.
When system time changes back, the timer will not worker properly
hence some core functionality of redis will stop working(e.g. replication,
bgsave, etc). See issue #633 for details.

The patch saves the previous time and when a system clock skew is detected,
it will force expire all timers.

Modiifed by @antirez: the previous time was moved into the eventLoop
structure to make sure the library is still thread safe as long as you
use different event loops into different threads (otherwise you need
some synchronization). More comments added about the reasoning at the
base of the patch, that's worth reporting here:

/* If the system clock is moved to the future, and then set back to the
 * right value, time events may be delayed in a random way. Often this
 * means that scheduled operations will not be performed soon enough.
 *
 * Here we try to detect system clock skews, and force all the time
 * events to be processed ASAP when this happens: the idea is that
 * processing events earlier is less dangerous than delaying them
 * indefinitely, and practice suggests it is. */
b7b2a1c
@antirez
Owner
antirez commented Oct 4, 2012

Patch merged, thank you. Closing this issue :-)

@antirez antirez closed this Oct 4, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment