Segfault in redis-cli 2.4.14 while dumping a huge key #697

Open
agladysh opened this Issue · 15 comments

2 participants

@agladysh

zzz:chainsv is a hash that contains ~2.8 GB of string data in ~10K keys

$ redis-cli -n 2 hgetall zzz:chainsv >~/chainsv.dump
Segmentation fault
$ redis-cli info
redis_version:2.4.11
redis_git_sha1:00000000
redis_git_dirty:0
arch_bits:64
multiplexing_api:epoll
gcc_version:4.5.2
process_id:22166
uptime_in_seconds:793437
uptime_in_days:9
lru_clock:744582
used_cpu_sys:224.30
used_cpu_user:122.26
used_cpu_sys_children:38.91
used_cpu_user_children:2.68
connected_clients:31
connected_slaves:0
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
used_memory:3223424976
used_memory_human:3.00G
used_memory_rss:2806063104
used_memory_peak:3227161416
used_memory_peak_human:3.01G
mem_fragmentation_ratio:0.87
mem_allocator:jemalloc-2.2.5
loading:0
aof_enabled:1
changes_since_last_save:538431
bgsave_in_progress:0
last_save_time:1348829666
bgrewriteaof_in_progress:0
total_connections_received:39843
total_commands_processed:1871788
expired_keys:1
evicted_keys:0
keyspace_hits:1439075
keyspace_misses:60681
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:727800
vm_enabled:0
role:master
aof_current_size:3217180226
aof_base_size:3102969082
aof_pending_rewrite:0
aof_buffer_length:0
aof_pending_bio_fsync:0
db2:keys=85,expires=0
db3:keys=4201,expires=0
db4:keys=20,expires=0

Please tell me what I can do to help you debug this.

@agladysh

Segfault happened after it worked for some time, so maybe it is something 64-bit-related? Now trying to reproduce the crash once again.

@agladysh

Looks like it is reproducible. I can keep that machine in this state for a day or two in case you need me to reproduce it and gather some data.

@antirez
Owner

Thanks for reporting this issue @agladysh. The problem appears to be in redis-cli, not the server, but it is still important to fix.

Could you please do the following?
. Run redis-cli under gdb (gdb redis-cli) and type 'run -n 2 hgetall zzz:chainsv'.
. Hopefully it segfaults as it did before.
. Type bt to get the backtrace.
. Post the output of bt here. Done! :-)

Thanks

@agladysh

I have been running $ gdb --args redis-cli -n 2 hgetall zzz:chainsv >~/chainsv.dump for quite a while now to get a stack trace. No results yet.

@agladysh

...Hmm. I'm definitely too tired to think straight now. gdb was asking me to type run, but its prompt went to the redirected stdout. :-) Will try again.

@agladysh

$ gdb --args redis-cli -n 2 hgetall zzz:chainsv
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/bin/redis-cli...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/redis-cli -n 2 hgetall zzz:chainsv
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff742b471 in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff742b471 in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000000000404e95 in sdscatlen ()
#2  0x00000000004050a7 in sdscatvprintf ()
#3  0x0000000000405194 in sdscatprintf ()
#4  0x0000000000406798 in ?? ()
#5  0x0000000000406bd5 in ?? ()
#6  0x00000000004070ec in ?? ()
#7  0x00000000004079ab in main ()
(gdb) 

Done.

@agladysh

Just in case, more info:

$ uname -a
Linux MYHOST 2.6.38-15-virtual #59-Ubuntu SMP Fri Apr 27 16:38:04 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 11.04
Release:    11.04
Codename:   natty

$ redis-cli --version
redis-cli 2.4.11

@antirez
Owner

Thanks! Unfortunately this does not help much, because this redis-cli appears to have been compiled without symbols. If it's not too much work, could you please download Redis into /tmp, compile it with make noopt, and retry? There is no need to upgrade Redis; just use the locally compiled redis-cli executable to run the example.

If this is not convenient on a Sunday on a production host, no problem :-) I'll try to simulate the issue; in that case just tell me how many elements there are in that hash. It can probably be explained simply by the output reaching 2GB in size as a plain string in the dump file.

Another way to check this is to see whether the generated dump file is close to 2GB in size. Thanks!

@agladysh

> Thanks! Unfortunately this does not help much, because this redis-cli appears to have been compiled without symbols. If it's not too much work, could you please download Redis into /tmp, compile it with make noopt, and retry? There is no need to upgrade Redis; just use the locally compiled redis-cli executable to run the example.

Will do.

> If this is not convenient on a Sunday on a production host, no problem :-)

This is a hot standby host, so I can poke it within reason.

> I'll try to simulate the issue; in that case just tell me how many elements there are in that hash. It can probably be explained simply by the output reaching 2GB in size as a plain string in the dump file.

$ redis-cli -n 2 hlen zzz:chainsv
(integer) 30869

> Another way to check this is to see whether the generated dump file is close to 2GB in size. Thanks!

No file is generated, unfortunately.

@agladysh

Built redis-cli from the 2.4.14 source zipball from GitHub:

~/build/antirez-redis-d198bfc/src$ gdb --args ./redis-cli -n 2 hgetall zzz:chainsv
GNU gdb (Ubuntu/Linaro 7.2-1ubuntu11) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /home/agladysh/build/antirez-redis-d198bfc/src/redis-cli...done.
(gdb) run
Starting program: /home/agladysh/build/antirez-redis-d198bfc/src/redis-cli -n 2 hgetall zzz:chainsv
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff742b471 in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff742b471 in memcpy () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x0000000000405700 in sdscatlen (
    s=0x7ffe9e400008 "    1) \"1347150305\"\n    2) \"\\x01T\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00S\\x06\\x00\\x00\\x00chainsT \\x00\\x00\\x00\\x00\\x00\\x00\\x00N\\x00\\x00\\x00\\x00\\x00\\x00\\xf0?T\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00S\\b\\x00\\x00\\x00keywor"..., t=0x7ffff6407bb0, len=7) at sds.c:152
#2  0x000000000040577c in sdscat (
    s=0x7ffe9e400008 "    1) \"1347150305\"\n    2) \"\\x01T\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00S\\x06\\x00\\x00\\x00chainsT \\x00\\x00\\x00\\x00\\x00\\x00\\x00N\\x00\\x00\\x00\\x00\\x00\\x00\\xf0?T\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00S\\b\\x00\\x00\\x00keywor"..., t=0x7ffff6407bb0 "24693) ") at sds.c:160
#3  0x0000000000405981 in sdscatvprintf (
    s=0x7ffe9e400008 "    1) \"1347150305\"\n    2) \"\\x01T\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00S\\x06\\x00\\x00\\x00chainsT \\x00\\x00\\x00\\x00\\x00\\x00\\x00N\\x00\\x00\\x00\\x00\\x00\\x00\\xf0?T\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00S\\b\\x00\\x00\\x00keywor"..., fmt=0x7fffffffe3f0 "%s%5d) ", ap=0x7fffffffe2d0) at sds.c:210
#4  0x0000000000405a4c in sdscatprintf (
    s=0x7ffe9e400008 "    1) \"1347150305\"\n    2) \"\\x01T\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00S\\x06\\x00\\x00\\x00chainsT \\x00\\x00\\x00\\x00\\x00\\x00\\x00N\\x00\\x00\\x00\\x00\\x00\\x00\\xf0?T\\x00\\x00\\x00\\x00\\t\\x00\\x00\\x00S\\b\\x00\\x00\\x00keywor"..., fmt=0x7fffffffe3f0 "%s%5d) ") at sds.c:219
#5  0x0000000000407d17 in cliFormatReplyTTY (r=0x631550, prefix=0x425fca "") at redis-cli.c:392
#6  0x0000000000408279 in cliReadReply (output_raw_strings=0) at redis-cli.c:510
#7  0x0000000000408539 in cliSendCommand (argc=2, argv=0x7ffff6407b90, repeat=0) at redis-cli.c:569
#8  0x000000000040906b in noninteractive (argc=2, argv=0x7ffff6407b90) at redis-cli.c:802
#9  0x000000000040a031 in main (argc=2, argv=0x7fffffffe650) at redis-cli.c:1163

@agladysh

Should I keep this machine in that state for a while longer?

@antirez
Owner

Not needed, thanks... I'm pretty sure I know what this is about.
One more question: does it take a few seconds before crashing? Thanks.

@agladysh

It takes a few minutes. Glad to help.

@antirez
Owner

This confirms everything :-) It's just the 2GB overflow of SDS strings. Thank you for your help. I'll fix this ASAP, but it's not a minor fix at all. It happens when the output you request from redis-cli is more than 2GB in size, for instance when dumping a very large hash. Of course, this is not acceptable behaviour...

I'll backport the fix to 2.4 from 2.6 / unstable.

@agladysh

Cool, thank you for the support!

@JackieXie168 JackieXie168 referenced this issue from a commit
@AtnNn AtnNn Remove DEPENDENCIES file
Review 1026 by @neumino
Closes #697
2a93383