New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redis did not save neither regularly nor on server reboot, one month of data lost #3000

Closed
Radiergummi opened this Issue Jan 12, 2016 · 9 comments

Comments

Projects
None yet
3 participants
@Radiergummi

Radiergummi commented Jan 12, 2016

I'm a bit in shock right now, you may of course close this at any time. My redis server is using the default save configuration:

save 900 1
save 300 10
save 60 10000

A cronjob executes a file backup every 12 hours, copying the dump.rdb to a remote location which worked flawlessly. When I just rebooted the db server to move to a bigger VPS instance, I noticed the complete last month has not been written to the dumpfile, its timestamp being the same on the server as well as each backup, leaving me with several thousands of datasets lost. This is the absolute worst case imaginable for me right now. Looking at the logfiles, redis reported without exceptions for the last weeks:

[27194] 12 Jan 02:58:06.483 * RDB: 1 MB of memory used by copy-on-write
[8621] 12 Jan 02:58:06.613 * Background saving terminated with success
[8621] 12 Jan 03:03:07.103 * 10 changes in 300 seconds. Saving...
[8621] 12 Jan 03:03:07.131 * Background saving started by pid 27195

Checking the file stamps of the dumpfile now that everything is up again, it seems like redis dumps ordinarily.
I know I won't get my data back, I'm just absolutely stunned by the possibility of this happening.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Jan 12, 2016

Owner

Hello, I'm sorry you faced this severe issue. Redis reporting to persist on disk, while it actually does not, is something we never saw in 6 years. When there is a bug, the child process reports an issue and the parent detects the issue and stops the writes refusing any new update, until the condition that prevents saving on disk is fixed.

Unfortunately, based on the informations you posted above, it is impossible to understand what was happening in your server. Is it possible that the directory where Redis persisted was changed? This is easy to check with the following two commands:

redis-cli CONFIG GET dir
redis-cli CONFIG GET dbfilename

In this case, Redis would still be able to persist correctly, however the dump files will be not where you expect them to be, and the backup will also still fetch the old file.

Without further details, and without ever having observed an issue like this, the highest probability for the medal model I've of how Redis works, is that this was just an operation error of some kind. However if this is not the case in your opinion, I'm very welcome to investigate further, assuming you have still the ability to retrieve informations from the old box.

Owner

antirez commented Jan 12, 2016

Hello, I'm sorry you faced this severe issue. Redis reporting to persist on disk, while it actually does not, is something we never saw in 6 years. When there is a bug, the child process reports an issue and the parent detects the issue and stops the writes refusing any new update, until the condition that prevents saving on disk is fixed.

Unfortunately, based on the informations you posted above, it is impossible to understand what was happening in your server. Is it possible that the directory where Redis persisted was changed? This is easy to check with the following two commands:

redis-cli CONFIG GET dir
redis-cli CONFIG GET dbfilename

In this case, Redis would still be able to persist correctly, however the dump files will be not where you expect them to be, and the backup will also still fetch the old file.

Without further details, and without ever having observed an issue like this, the highest probability for the medal model I've of how Redis works, is that this was just an operation error of some kind. However if this is not the case in your opinion, I'm very welcome to investigate further, assuming you have still the ability to retrieve informations from the old box.

@Radiergummi

This comment has been minimized.

Show comment
Hide comment
@Radiergummi

Radiergummi Jan 12, 2016

Hello antirez, thank you for your quick reply. I know myself this is most probably some kind of configuration error, though I just don't what it is.
I'll try to give you as much information as I can.

Output of redis-cli CONFIG GET dir:

1) "dir"
2) "/var/redis/6379"

Output of redis-cli CONFIG GET dbfilename:

1) "dbfilename"
2) "dump.rdb"

Output of locate *.rdb:

/backup/var/redis/6379/dump.rdb   # my local backup copy
/home/mo/dump.rdb                       # a copy of /var/redis/6379/dump.rdb I made before the reboot
/var/redis/6379/dump.rdb

Output of ls -Alh /var/redis/6379/dump.rdb:

-rw-r--r-- 1 root root 124M Jan 12 17:06 /var/redis/6379/dump.rdb

Output of the redis log file before and after the reboot:

[28204] 12 Jan 12:57:33.511 * RDB: 4 MB of memory used by copy-on-write
[8621] 12 Jan 12:57:33.642 * Background saving terminated with success
[8621] 12 Jan 13:02:34.045 * 10 changes in 300 seconds. Saving...
[8621] 12 Jan 13:02:34.098 * Background saving started by pid 28210
[28210] 12 Jan 13:02:45.191 * DB saved on disk
[28210] 12 Jan 13:02:45.217 * RDB: 2 MB of memory used by copy-on-write
[8621] 12 Jan 13:02:45.332 * Background saving terminated with success
[8621 | signal handler] (1452600307) Received SIGTERM scheduling shutdown...
[8621] 12 Jan 13:05:07.876 # User requested shutdown...
[8621] 12 Jan 13:05:07.877 * Saving the final RDB snapshot before exiting.
[2617] 12 Jan 13:24:27.494 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 2.8.19 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 2617
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

[2617] 12 Jan 13:24:27.499 # Server started, Redis version 2.8.19
[2617] 12 Jan 13:24:27.499 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
[2617] 12 Jan 13:24:27.499 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
[2617] 12 Jan 13:24:27.499 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2617] 12 Jan 13:24:40.332 * DB loaded from disk: 12.832 seconds
[2617] 12 Jan 13:24:40.332 * The server is now ready to accept connections on port 6379
[2617] 12 Jan 13:30:40.724 * 10 changes in 300 seconds. Saving...
[2617] 12 Jan 13:30:40.744 * Background saving started by pid 2991
[2991] 12 Jan 13:30:49.462 * DB saved on disk
[2991] 12 Jan 13:30:49.476 * RDB: 1 MB of memory used by copy-on-write
[2617] 12 Jan 13:30:49.588 * Background saving terminated with success
[2617] 12 Jan 13:35:50.076 * 10 changes in 300 seconds. Saving...
[2617] 12 Jan 13:35:50.093 * Background saving started by pid 3143
[3143] 12 Jan 13:35:58.760 * DB saved on disk

Current redis configuration file (mostly the default one): https://gist.github.com/Radiergummi/c8fcc0f7ec381bb87088

Unfortunately, I just resized my DigitalOcean Droplet, so there is nothing left of the "old" box. Please tell me if I can provide you with any more information.

Radiergummi commented Jan 12, 2016

Hello antirez, thank you for your quick reply. I know myself this is most probably some kind of configuration error, though I just don't what it is.
I'll try to give you as much information as I can.

Output of redis-cli CONFIG GET dir:

1) "dir"
2) "/var/redis/6379"

Output of redis-cli CONFIG GET dbfilename:

1) "dbfilename"
2) "dump.rdb"

Output of locate *.rdb:

/backup/var/redis/6379/dump.rdb   # my local backup copy
/home/mo/dump.rdb                       # a copy of /var/redis/6379/dump.rdb I made before the reboot
/var/redis/6379/dump.rdb

Output of ls -Alh /var/redis/6379/dump.rdb:

-rw-r--r-- 1 root root 124M Jan 12 17:06 /var/redis/6379/dump.rdb

Output of the redis log file before and after the reboot:

[28204] 12 Jan 12:57:33.511 * RDB: 4 MB of memory used by copy-on-write
[8621] 12 Jan 12:57:33.642 * Background saving terminated with success
[8621] 12 Jan 13:02:34.045 * 10 changes in 300 seconds. Saving...
[8621] 12 Jan 13:02:34.098 * Background saving started by pid 28210
[28210] 12 Jan 13:02:45.191 * DB saved on disk
[28210] 12 Jan 13:02:45.217 * RDB: 2 MB of memory used by copy-on-write
[8621] 12 Jan 13:02:45.332 * Background saving terminated with success
[8621 | signal handler] (1452600307) Received SIGTERM scheduling shutdown...
[8621] 12 Jan 13:05:07.876 # User requested shutdown...
[8621] 12 Jan 13:05:07.877 * Saving the final RDB snapshot before exiting.
[2617] 12 Jan 13:24:27.494 * Increased maximum number of open files to 10032 (it was originally set to 1024).
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 2.8.19 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 2617
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

[2617] 12 Jan 13:24:27.499 # Server started, Redis version 2.8.19
[2617] 12 Jan 13:24:27.499 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
[2617] 12 Jan 13:24:27.499 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
[2617] 12 Jan 13:24:27.499 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
[2617] 12 Jan 13:24:40.332 * DB loaded from disk: 12.832 seconds
[2617] 12 Jan 13:24:40.332 * The server is now ready to accept connections on port 6379
[2617] 12 Jan 13:30:40.724 * 10 changes in 300 seconds. Saving...
[2617] 12 Jan 13:30:40.744 * Background saving started by pid 2991
[2991] 12 Jan 13:30:49.462 * DB saved on disk
[2991] 12 Jan 13:30:49.476 * RDB: 1 MB of memory used by copy-on-write
[2617] 12 Jan 13:30:49.588 * Background saving terminated with success
[2617] 12 Jan 13:35:50.076 * 10 changes in 300 seconds. Saving...
[2617] 12 Jan 13:35:50.093 * Background saving started by pid 3143
[3143] 12 Jan 13:35:58.760 * DB saved on disk

Current redis configuration file (mostly the default one): https://gist.github.com/Radiergummi/c8fcc0f7ec381bb87088

Unfortunately, I just resized my DigitalOcean Droplet, so there is nothing left of the "old" box. Please tell me if I can provide you with any more information.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Jan 12, 2016

Owner

Hello @Radiergummi, the new configuration is good indeed. The problem is to understand what happened in the past. Unfortunately a possibility is that you got hacked via this attack: http://antirez.com/news/96

So basically, it is possible that somebody changed your configuration from the outside accessing your Redis instance. The attack was however perhaps not successful and the only result was misconfiguration. Is it possible that the instance was reachable via internet?

Redis 3.2 will contain counter measures in order to avoid this kind of issues, assuming this one is related to that.

Owner

antirez commented Jan 12, 2016

Hello @Radiergummi, the new configuration is good indeed. The problem is to understand what happened in the past. Unfortunately a possibility is that you got hacked via this attack: http://antirez.com/news/96

So basically, it is possible that somebody changed your configuration from the outside accessing your Redis instance. The attack was however perhaps not successful and the only result was misconfiguration. Is it possible that the instance was reachable via internet?

Redis 3.2 will contain counter measures in order to avoid this kind of issues, assuming this one is related to that.

@Radiergummi

This comment has been minimized.

Show comment
Hide comment
@Radiergummi

Radiergummi Jan 12, 2016

Hey @antirez, the machine has only a few selected ports opened (80, 443, 25, 443, 587), SSH is restricted to non-root key-based authentication and all web services run as upstreams behind nginx.
Log files don't show anything suspicious so far, the box is up-to-date.
I believe there was a misunderstanding concerning the config file - with "current", I meant the way the config file always was. I did not change anything since the data loss today.

Radiergummi commented Jan 12, 2016

Hey @antirez, the machine has only a few selected ports opened (80, 443, 25, 443, 587), SSH is restricted to non-root key-based authentication and all web services run as upstreams behind nginx.
Log files don't show anything suspicious so far, the box is up-to-date.
I believe there was a misunderstanding concerning the config file - with "current", I meant the way the config file always was. I did not change anything since the data loss today.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Jan 12, 2016

Owner

Hello @Radiergummi, if ports are closed, unlikely to be an attack. About the configuration, even if the configuration you see in redis.conf is always the same, the configuration can also be modified using the CONFIG SET command. If then CONFIG REWRITE is not used, the configuration in memory is different, but the configuration file still reflects the old config. Now CONFIG SET is sometimes used to mount an attack, but unlikely to happen if everything is firewalled indeed.

The only other thing I can think is that something was happening at filesystem level. Unfortunately without the old instance and state it is extremely hard to understand now what happened, because this failure mode you experienced is not linked to any bug and/or common issue I can think of.

Note that just removing the directory is not enough to cause this:

$ mkdir /tmp/my-redis-dir
$ redis-cli
127.0.0.1:6379> config set dir /tmp/my-redis-dir
OK
127.0.0.1:6379> bgsave
Background saving started
127.0.0.1:6379> config get dir
1) "dir"
2) "/private/tmp/my-redis-dir"
127.0.0.1:6379> ECHO "here I remove the directory my-redis-dir"
"here I remove the directory my-redis-dir"
127.0.0.1:6379> bgsave
Background saving started
127.0.0.1:6379> set foo bar
(error) MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
127.0.0.1:6379>
Owner

antirez commented Jan 12, 2016

Hello @Radiergummi, if ports are closed, unlikely to be an attack. About the configuration, even if the configuration you see in redis.conf is always the same, the configuration can also be modified using the CONFIG SET command. If then CONFIG REWRITE is not used, the configuration in memory is different, but the configuration file still reflects the old config. Now CONFIG SET is sometimes used to mount an attack, but unlikely to happen if everything is firewalled indeed.

The only other thing I can think is that something was happening at filesystem level. Unfortunately without the old instance and state it is extremely hard to understand now what happened, because this failure mode you experienced is not linked to any bug and/or common issue I can think of.

Note that just removing the directory is not enough to cause this:

$ mkdir /tmp/my-redis-dir
$ redis-cli
127.0.0.1:6379> config set dir /tmp/my-redis-dir
OK
127.0.0.1:6379> bgsave
Background saving started
127.0.0.1:6379> config get dir
1) "dir"
2) "/private/tmp/my-redis-dir"
127.0.0.1:6379> ECHO "here I remove the directory my-redis-dir"
"here I remove the directory my-redis-dir"
127.0.0.1:6379> bgsave
Background saving started
127.0.0.1:6379> set foo bar
(error) MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
127.0.0.1:6379>
@Radiergummi

This comment has been minimized.

Show comment
Hide comment
@Radiergummi

Radiergummi Jan 12, 2016

Well, thank you anyway for your quick support... I'll do regular timestamp checks on the file now.

Radiergummi commented Jan 12, 2016

Well, thank you anyway for your quick support... I'll do regular timestamp checks on the file now.

@antirez

This comment has been minimized.

Show comment
Hide comment
@antirez

antirez Jan 12, 2016

Owner

Thanks, please if you observe the same stuff again, notify me before shutting down the server so that we can try to grab full state. Thanks for reporting.

Owner

antirez commented Jan 12, 2016

Thanks, please if you observe the same stuff again, notify me before shutting down the server so that we can try to grab full state. Thanks for reporting.

@hallard

This comment has been minimized.

Show comment
Hide comment
@hallard

hallard Jan 3, 2018

Hi there,
I faced exactly the same issue, I rebooted my VM and just lost 4 months of data because redis database state came back to the same that it was when I migrated to this VM 4 months ago.
The log file of the reboot

[39644] 01 Jan 22:38:28.021 * Background saving started by pid 30262
[30262] 01 Jan 22:38:30.770 * DB saved on disk
[30262] 01 Jan 22:38:30.771 * RDB: 1 MB of memory used by copy-on-write
[39644] 01 Jan 22:38:30.827 * Background saving terminated with success
[39644] 01 Jan 22:42:34.313 # User requested shutdown...
[39644] 01 Jan 22:42:34.313 * Saving the final RDB snapshot before exiting.
[39644] 01 Jan 22:42:38.170 * DB saved on disk
[39644] 01 Jan 22:42:38.170 * Removing the pid file.
[39644] 01 Jan 22:42:38.170 # Redis is now ready to exit, bye bye...
[333] 01 Jan 22:43:02.402 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
[333] 01 Jan 22:43:02.668 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
[333] 01 Jan 22:43:02.668 # Current maximum open files is 1024. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 2.8.17 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 333
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

[333] 01 Jan 22:43:02.669 # Server started, Redis version 2.8.17
[333] 01 Jan 22:43:05.663 * DB loaded from disk: 2.993 seconds
[333] 01 Jan 22:43:05.663 * The server is now ready to accept connections on port 6379
[333] 02 Jan 11:13:01.342 * Background saving started by pid 41017
[41017] 02 Jan 11:13:01.591 * DB saved on disk
[41017] 02 Jan 11:13:01.591 * RDB: 0 MB of memory used by copy-on-write
[333] 02 Jan 11:13:01.684 * Background saving terminated with success
[333] 03 Jan 11:13:02.009 * Background saving started by pid 50778
[50778] 03 Jan 11:13:02.486 * DB saved on disk
[50778] 03 Jan 11:13:02.486 * RDB: 0 MB of memory used by copy-on-write
[333] 03 Jan 11:13:02.498 * Background saving terminated with success
[333] 03 Jan 19:33:18.722 * 1 changes in 900 seconds. Saving...
[333] 03 Jan 19:33:18.722 * Background saving started by pid 11705
[11705] 03 Jan 19:33:19.689 * DB saved on disk

So I though, not a problem, I've got some backup (done every day with 31 days keeping) but the hell is that even is backup are done, the content are stuck of 4 month old !!!

root@srv:/var/lib/redis# l backup/
total 459920
drwxr-xr-x 2 redis redis     4096 Jan  3 21:40 .
drwxrwxr-x 3 redis redis     4096 Jan  3 22:03 ..
-rw-r--r-- 1 root  root  15394559 Dec  4 11:13 6379-dump.201712041113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  5 11:13 6379-dump.201712051113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  6 11:13 6379-dump.201712061113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  7 11:13 6379-dump.201712071113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  8 11:13 6379-dump.201712081113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  9 11:13 6379-dump.201712091113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 10 11:13 6379-dump.201712101113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 11 11:13 6379-dump.201712111113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 12 11:13 6379-dump.201712121113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 13 11:13 6379-dump.201712131113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 14 11:13 6379-dump.201712141113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 15 11:13 6379-dump.201712151113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 16 11:13 6379-dump.201712161113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 17 11:13 6379-dump.201712171113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 18 11:13 6379-dump.201712181113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 19 11:13 6379-dump.201712191113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 20 11:13 6379-dump.201712201113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 21 11:13 6379-dump.201712211113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 22 11:13 6379-dump.201712221113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 23 11:13 6379-dump.201712231113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 24 11:13 6379-dump.201712241113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 25 11:13 6379-dump.201712251113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 26 11:13 6379-dump.201712261113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 27 11:13 6379-dump.201712271113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 28 11:13 6379-dump.201712281113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 29 11:13 6379-dump.201712291113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 30 11:13 6379-dump.201712301113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 31 11:13 6379-dump.201712311113.rdb
-rw-r--r-- 1 root  root  15394559 Jan  3 19:37 6379-dump.201801011113.rdb
-rw-r--r-- 1 root  root   6088333 Jan  2 11:13 6379-dump.201801021113.rdb
-rw-r--r-- 1 root  root   6088333 Jan  3 11:13 6379-dump.201801031113.rdb
-rw-r--r-- 1 root  root   6089159 Jan  1 11:13 6379-dump.201801011113.rdb
-rw-r--r-- 1 root  root   6166505 Jan  3 21:40 6379-dump.201801032140.rdb
root@srv:/var/lib/redis#

as you can see until reboot, backup size was not changed, the last files have lower size because nodebb compacted data in new version but all .rdb file has 4 months old data

my backup script executed in crontab every day

#!/bin/bash
#
## redis backup script
## usage
## redis-backup.sh port backup.dir

port=${1:-6379}
backup_dir=${2:-"/var/lib/redis/backup"}

cli="/usr/bin/redis-cli -p $port"
rdb="/var/lib/redis/dump.rdb"

test -f $rdb || {
  echo "[$port] No RDB Found" ; exit 1
}
test -d $backup_dir || {
  echo "[$port] Create backup directory $backup_dir" && mkdir -p $backup_dir
}

# perform a bgsave before copy
echo bgsave | $cli
echo "[$port] waiting for 5 seconds..."
sleep 5
try=10
while [ $try -gt 0 ] ; do
  ## redis-cli output dos format line feed '\r\n', remove '\r'
  bg=$(echo 'info Persistence' | $cli | awk -F: '/rdb_bgsave_in_progress/{sub(/\r/, "", $0); print $2}')
  ok=$(echo 'info Persistence' | $cli | awk -F: '/rdb_last_bgsave_status/{sub(/\r/, "", $0); print $2}')
  if [ "$bg" = "0" ] && [ "$ok" = "ok" ] ; then
    dst="$backup_dir/$port-dump.$(date +%Y%m%d%H%M).rdb"
    cp $rdb $dst
    if [ $? = 0 ] ; then
      echo "[$port] redis rdb $rdb copied to $dst."

      # delete rdb created 30 days ago
      cd $backup_dir
      find . \( -name "$port-dump*" \) -mtime +30 -exec rm -f {} \;
      exit 0
    else
      echo "[$port] >> Failed to copy $rdb to $dst!"
    fi
  fi
  try=$((try - 1))
  echo "[$port] redis maybe busy, waiting and retry in 5s..."
  sleep 5
done
exit 1

What I'm doing wrong and what I need to do to

  • stopping redis-server write file to disk?
  • backup works as expected ?

Thanks for your help

hallard commented Jan 3, 2018

Hi there,
I faced exactly the same issue, I rebooted my VM and just lost 4 months of data because redis database state came back to the same that it was when I migrated to this VM 4 months ago.
The log file of the reboot

[39644] 01 Jan 22:38:28.021 * Background saving started by pid 30262
[30262] 01 Jan 22:38:30.770 * DB saved on disk
[30262] 01 Jan 22:38:30.771 * RDB: 1 MB of memory used by copy-on-write
[39644] 01 Jan 22:38:30.827 * Background saving terminated with success
[39644] 01 Jan 22:42:34.313 # User requested shutdown...
[39644] 01 Jan 22:42:34.313 * Saving the final RDB snapshot before exiting.
[39644] 01 Jan 22:42:38.170 * DB saved on disk
[39644] 01 Jan 22:42:38.170 * Removing the pid file.
[39644] 01 Jan 22:42:38.170 # Redis is now ready to exit, bye bye...
[333] 01 Jan 22:43:02.402 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
[333] 01 Jan 22:43:02.668 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
[333] 01 Jan 22:43:02.668 # Current maximum open files is 1024. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
                _._
           _.-``__ ''-._
      _.-``    `.  `_.  ''-._           Redis 2.8.17 (00000000/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._
 (    '      ,       .-`  | `,    )     Running in stand alone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
 |    `-._   `._    /     _.-'    |     PID: 333
  `-._    `-._  `-./  _.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |           http://redis.io
  `-._    `-._`-.__.-'_.-'    _.-'
 |`-._`-._    `-.__.-'    _.-'_.-'|
 |    `-._`-._        _.-'_.-'    |
  `-._    `-._`-.__.-'_.-'    _.-'
      `-._    `-.__.-'    _.-'
          `-._        _.-'
              `-.__.-'

[333] 01 Jan 22:43:02.669 # Server started, Redis version 2.8.17
[333] 01 Jan 22:43:05.663 * DB loaded from disk: 2.993 seconds
[333] 01 Jan 22:43:05.663 * The server is now ready to accept connections on port 6379
[333] 02 Jan 11:13:01.342 * Background saving started by pid 41017
[41017] 02 Jan 11:13:01.591 * DB saved on disk
[41017] 02 Jan 11:13:01.591 * RDB: 0 MB of memory used by copy-on-write
[333] 02 Jan 11:13:01.684 * Background saving terminated with success
[333] 03 Jan 11:13:02.009 * Background saving started by pid 50778
[50778] 03 Jan 11:13:02.486 * DB saved on disk
[50778] 03 Jan 11:13:02.486 * RDB: 0 MB of memory used by copy-on-write
[333] 03 Jan 11:13:02.498 * Background saving terminated with success
[333] 03 Jan 19:33:18.722 * 1 changes in 900 seconds. Saving...
[333] 03 Jan 19:33:18.722 * Background saving started by pid 11705
[11705] 03 Jan 19:33:19.689 * DB saved on disk

So I though, not a problem, I've got some backup (done every day with 31 days keeping) but the hell is that even is backup are done, the content are stuck of 4 month old !!!

root@srv:/var/lib/redis# l backup/
total 459920
drwxr-xr-x 2 redis redis     4096 Jan  3 21:40 .
drwxrwxr-x 3 redis redis     4096 Jan  3 22:03 ..
-rw-r--r-- 1 root  root  15394559 Dec  4 11:13 6379-dump.201712041113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  5 11:13 6379-dump.201712051113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  6 11:13 6379-dump.201712061113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  7 11:13 6379-dump.201712071113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  8 11:13 6379-dump.201712081113.rdb
-rw-r--r-- 1 root  root  15394559 Dec  9 11:13 6379-dump.201712091113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 10 11:13 6379-dump.201712101113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 11 11:13 6379-dump.201712111113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 12 11:13 6379-dump.201712121113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 13 11:13 6379-dump.201712131113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 14 11:13 6379-dump.201712141113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 15 11:13 6379-dump.201712151113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 16 11:13 6379-dump.201712161113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 17 11:13 6379-dump.201712171113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 18 11:13 6379-dump.201712181113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 19 11:13 6379-dump.201712191113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 20 11:13 6379-dump.201712201113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 21 11:13 6379-dump.201712211113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 22 11:13 6379-dump.201712221113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 23 11:13 6379-dump.201712231113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 24 11:13 6379-dump.201712241113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 25 11:13 6379-dump.201712251113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 26 11:13 6379-dump.201712261113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 27 11:13 6379-dump.201712271113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 28 11:13 6379-dump.201712281113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 29 11:13 6379-dump.201712291113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 30 11:13 6379-dump.201712301113.rdb
-rw-r--r-- 1 root  root  15394559 Dec 31 11:13 6379-dump.201712311113.rdb
-rw-r--r-- 1 root  root  15394559 Jan  3 19:37 6379-dump.201801011113.rdb
-rw-r--r-- 1 root  root   6088333 Jan  2 11:13 6379-dump.201801021113.rdb
-rw-r--r-- 1 root  root   6088333 Jan  3 11:13 6379-dump.201801031113.rdb
-rw-r--r-- 1 root  root   6089159 Jan  1 11:13 6379-dump.201801011113.rdb
-rw-r--r-- 1 root  root   6166505 Jan  3 21:40 6379-dump.201801032140.rdb
root@srv:/var/lib/redis#

as you can see until reboot, backup size was not changed, the last files have lower size because nodebb compacted data in new version but all .rdb file has 4 months old data

my backup script executed in crontab every day

#!/bin/bash
#
## redis backup script
## usage
## redis-backup.sh port backup.dir

port=${1:-6379}
backup_dir=${2:-"/var/lib/redis/backup"}

cli="/usr/bin/redis-cli -p $port"
rdb="/var/lib/redis/dump.rdb"

test -f $rdb || {
  echo "[$port] No RDB Found" ; exit 1
}
test -d $backup_dir || {
  echo "[$port] Create backup directory $backup_dir" && mkdir -p $backup_dir
}

# perform a bgsave before copy
echo bgsave | $cli
echo "[$port] waiting for 5 seconds..."
sleep 5
try=10
while [ $try -gt 0 ] ; do
  ## redis-cli output dos format line feed '\r\n', remove '\r'
  bg=$(echo 'info Persistence' | $cli | awk -F: '/rdb_bgsave_in_progress/{sub(/\r/, "", $0); print $2}')
  ok=$(echo 'info Persistence' | $cli | awk -F: '/rdb_last_bgsave_status/{sub(/\r/, "", $0); print $2}')
  if [ "$bg" = "0" ] && [ "$ok" = "ok" ] ; then
    dst="$backup_dir/$port-dump.$(date +%Y%m%d%H%M).rdb"
    cp $rdb $dst
    if [ $? = 0 ] ; then
      echo "[$port] redis rdb $rdb copied to $dst."

      # delete rdb created 30 days ago
      cd $backup_dir
      find . \( -name "$port-dump*" \) -mtime +30 -exec rm -f {} \;
      exit 0
    else
      echo "[$port] >> Failed to copy $rdb to $dst!"
    fi
  fi
  try=$((try - 1))
  echo "[$port] redis maybe busy, waiting and retry in 5s..."
  sleep 5
done
exit 1

What I'm doing wrong and what I need to do to

  • stopping redis-server write file to disk?
  • backup works as expected ?

Thanks for your help

@Radiergummi

This comment has been minimized.

Show comment
Hide comment
@Radiergummi

Radiergummi Jan 4, 2018

Just for completeness sake: the application I used with Redis back then was NodeBB, too. It worked properly during the time no background saves were made. It would be strange if an application interfacing with Redis could break the dumps, but it's at least an interesting coincidence.

Radiergummi commented Jan 4, 2018

Just for completeness sake: the application I used with Redis back then was NodeBB, too. It worked properly during the time no background saves were made. It would be strange if an application interfacing with Redis could break the dumps, but it's at least an interesting coincidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment