
Redis 4.0.8 cluster, slave crash during bgsave #5287

Open
qingyuan18 opened this issue Aug 28, 2018 · 2 comments

Comments

@qingyuan18

We have a Redis 4.0.8 cluster with 3 masters and 3 slaves. One or two of the slave nodes always crash during bgsave; the crash report is below:

Suspect RAM error? Use redis-server --test-memory to verify it.

17841:S 28 Aug 02:19:51.394 # Background saving terminated by signal 11
17841:S 28 Aug 02:19:52.021 * 10000 changes in 120 seconds. Saving...
17841:S 28 Aug 02:20:01.355 * Background saving started by pid 21707

=== REDIS BUG REPORT START: Cut & paste starting from here ===
21707:C 28 Aug 02:25:41.946 # ------------------------------------------------
21707:C 28 Aug 02:25:41.947 # !!! Software Failure. Press left mouse button to continue
21707:C 28 Aug 02:25:41.947 # Guru Meditation: Unknown object type #rdb.c:628
21707:C 28 Aug 02:25:41.947 # (forcing SIGSEGV in order to print the stack trace)
21707:C 28 Aug 02:25:41.947 # ------------------------------------------------
21707:C 28 Aug 02:25:41.947 # Redis 4.0.8 crashed by signal: 11
21707:C 28 Aug 02:25:41.947 # Crashed running the instruction at: 0x466fd3
21707:C 28 Aug 02:25:41.947 # Accessing address: 0xffffffffffffffff
21707:C 28 Aug 02:25:41.947 # Failed assertion: (:0)

------ STACK TRACE ------
EIP:
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x466fd3]

Backtrace:
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x466dfc]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x468033]
/lib64/libpthread.so.0[0x30a3e0f790]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x466fd3]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x446ed7]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x4488d7]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x448bc6]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x448d95]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x449060]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x42d4d0]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x424d7d]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x424f2b]
redis-rdb-bgsave 10.146.14.17:8003 cluster[0x42db42]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x30a3a1ed5d]
redis-rdb-bgsave 10.146.14.17:8003 [cluster][0x422319]

The code at rdb.c:628 shows the panic is the serverPanic("Unknown object type") in the default branch, reached only when o->type matches none of the known type constants:

case OBJ_ZSET:
    if (o->encoding == OBJ_ENCODING_ZIPLIST)
        return rdbSaveType(rdb,RDB_TYPE_ZSET_ZIPLIST);
    else if (o->encoding == OBJ_ENCODING_SKIPLIST)
        return rdbSaveType(rdb,RDB_TYPE_ZSET_2);
    else
        serverPanic("Unknown sorted set encoding");
case OBJ_HASH:
    if (o->encoding == OBJ_ENCODING_ZIPLIST)
        return rdbSaveType(rdb,RDB_TYPE_HASH_ZIPLIST);
    else if (o->encoding == OBJ_ENCODING_HT)
        return rdbSaveType(rdb,RDB_TYPE_HASH);
    else
        serverPanic("Unknown hash encoding");
case OBJ_MODULE:
    return rdbSaveType(rdb,RDB_TYPE_MODULE_2);
default:
    serverPanic("Unknown object type");
}
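
Since that default branch is reachable only when o->type holds a value outside the known constants, a corrupted object header in the forked bgsave child (or faulty RAM) is a plausible cause, which is why the log suggests the built-in memory test. A minimal invocation as a sketch (the megabyte count is just an example; size it to the host's memory):

    redis-server --test-memory 4096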

Our data consists of zsets whose members are 18-character strings, plus some plain String keys:

[00.00%] Biggest string found so far 'music_user:a2d4ee50-c0e5-4712-b059-8d79c4e387f8' with 11 bytes
[00.00%] Biggest zset found so far 'video:long:15252799960' with 100 members
[00.00%] Biggest zset found so far 'music:fm:18701531951' with 279 members
[00.00%] Biggest zset found so far 'music:fm:355981050728643' with 289 members
[00.00%] Biggest zset found so far 'music:fm:867483029348786' with 290 members

10.146.14.15:7001> zrange music:fm:867483029348786 0 -1

  1. "600902000007331809"
  2. "600907000007852796"
  3. "600907000004182649"
  4. "600902000009205261"
  5. "600907000008439207"

Can anyone help locate the issue and how to fix it? So far I haven't found any error in our client code. Do I need to upgrade Redis to 4.0.9?

Any suggestion would be appreciated! Thanks

@qingyuan18
Author

Any suggestions?

BTW: we found that master-slave synchronization transfers 40 GB+ every time:

[screenshot _20180829103834: replication transfer volume]

It seems the master performs a full resynchronization of the whole dataset each time, not an incremental (partial) resync.
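
One way to check whether these full transfers come from an exhausted replication backlog is to compare the configured backlog size with the slaves' replication offsets (a sketch against the master; field names as printed by Redis 4.0):

    10.146.14.15:7001> INFO replication
    10.146.14.15:7001> CONFIG GET repl-backlog-size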

@itamarhaber
Member

Hello @qingyuan18

Upgrading to the latest version is generally recommended. Also, please include the full crash report in the issue.

BTW: the following is from Redis' documentation

However, if there is not enough backlog in the master buffers, or if the slave is referring to a history (replication ID) which is no longer known, then a full resynchronization happens: in this case the slave will get a full copy of the dataset, from scratch.
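
If the backlog is indeed too small for the write volume a slave misses while disconnected, enlarging it reduces the chance of falling back to a full sync. A sketch with illustrative values only (512 MB backlog, kept for 2 hours); the same settings should also go into redis.conf to survive restarts:

    10.146.14.15:7001> CONFIG SET repl-backlog-size 536870912
    10.146.14.15:7001> CONFIG SET repl-backlog-ttl 7200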
