hiredis hangs on redisBufferRead #263

saravanan-elangovan · 2014-08-13T07:29:14Z

We use hiredis to connect to redis server in block mode. Our application runs in RHEL 4 32 bit. It works fine and after some time while trying to do an operation, it hangs. On analysis, found it got hanged in the redisBufferRead. This application act like a switch which gets the data from TCP Socket and store the same in the Redis database for recoverability.

On netstat output, We found the SendQ of the our client has non-zero value which indicates that request has not reached redis server. This behaviour happens with INCR & INCRBY mostly.

We opened a redis-client and did the same operation and it works fine.

gdb output
#0 0x008787a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x001de69b in __read_nocancel () from /lib/tls/libpthread.so.0
#2 0x08073a42 in redisBufferRead (c=0x89c2fb8) at hiredis.c:1142
#3 0x08073c59 in redisGetReply (c=0x89c2fb8, reply=0x590dcbc) at hiredis.c:1228
#4 0x08073e99 in redisvCommand (c=0x89c2fb8, format=0x8077f40 "INCRBY %s %ld", ap=0x590dcf8 "t\021\225\b\001") at hiredis.c:1332
#5 0x08073ed2 in redisCommand (c=0x89c2fb8, format=0x8077f40 "INCRBY %s %ld") at hiredis.c:1349

pietern · 2014-08-13T19:24:08Z

It is blocked on a system call. Looks like a kernel issue to me. Have you tried replicating the issue on machines with newer kernel versions?

saravanan-elangovan · 2014-08-14T11:20:45Z

I believe the client has sent the message which has not reached or not handled by redis server. As the client is in the block mode, it waits indefintely for the response from the redis server. Even netstat output indicates SendQ of the tcp socket having 43 bytes ( in our case ).

This behaviour is observed only between hiredis & redis communication. We have couple of program which uses tcp & http and its works fine.

michael-grunder · 2014-12-17T20:30:36Z

@saravanan-elangovan Are you connecting with a timeout? This just looks like it's blocking on the socket which doesn't mean there is any actual issue with hiredis, per se. Now, it's possible that there is some weird edge case where hiredis fails to send the complete command, but we use hiredis to send billions of commands to Redis over several hours and/or days and don't see this behavior.

The only obvious difference is that we're running 64 bit hiredis.

shoutrain · 2015-03-27T08:47:40Z

I met the same issue in 2012 and 2013. And in these days I met them again. Maybe it's due to the old lib, but why this issue is still keeping open status.

enviroment:

Linux tesla001 2.6.32-504.12.2.el6.x86_64 memory leak when dealing with integer response ? #1 SMP Wed Mar 11 22:03:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux.
libhiredis.so.0.10 (yum updated from default centos 6 repo)

thread stacks:

[Switching to thread 3 (Thread 0x7fdbee57b700 (LWP 7795))]#0 0x000000348c40e75d in read () fro m /lib64/libpthread.so.0
(gdb) bt
#0 0x000000348c40e75d in read () from /lib64/libpthread.so.0
#1 0x00007fdbeeddd7bb in redisBufferRead () from /usr/lib64/libhiredis.so.0.10
#2 0x00007fdbeeddefcf in redisGetReply () from /usr/lib64/libhiredis.so.0.10
#3 0x00007fdbeeddf085 in redisvCommand () from /usr/lib64/libhiredis.so.0.10
#4 0x00007fdbeeddf118 in redisCommand () from /usr/lib64/libhiredis.so.0.10
#5 0x0000000000405843 in CRedisOperator::handleGeneratorInfo (this=0x7e5a30, ownerId=3, data=0 x7e59b8) at src/database/CRedisOperator.cpp:146
#6 0x00000000004038a2 in CTransaction::onGeneratorInfo (this=0x7e4d40, sequence=5112, data=0x7 e59b8) at src/transaction/CTransaction.cpp:161
#7 0x0000000000403367 in CTransaction::onMessage (this=0x7e4d40, msg=0x7e59a8) at src/transact ion/CTransaction.cpp:58
#8 0x000000000040c802 in CNodeGroup::working (this=0x7e5858) at src/traffic/CNodeGroup.cpp:70
#9 0x000000000040b5b2 in CWorker::run (object=0x7e5860) at src/common/CWorker.cpp:68
#10 0x000000348c4079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x000000348c0e88fd in clone () from /lib64/libc.so.6

badboy · 2015-03-27T09:20:49Z

No one investigated this in more detail yet, but as per the comments above this might be an issue with the network or the Redis contacting and not a bug in hiredis itself.
Are you using timeouts on the connection? Blocking commands? What's the network status, what does netstat say about the Send-Q buffer?

shoutrain · 2015-03-27T09:29:49Z

I used timeout on the connection, it's 1 second. But I think the timeout only works with connect, not for redisCommand().
Yes, it's a blocking command.
Actually the redis-server and the application using hiredis are in the same machine. netstat outputs related to redis-server below:

[rafael@tesla001 softwares]$ Active Internet connections Proto Recv-Q Send-Q Local Address tcp 0 0 localhost:38836 tcp 0 0 localhost:38839 tcp 0 0 localhost:38838 tcp 0 0 localhost:38842 tcp 0 0 localhost:38845 tcp 0 0 localhost:38852 tcp 0 0 localhost:38854 tcp 0 0 localhost:38851 tcp 0 0 localhost:38847 tcp 0 0 localhost:38849 tcp 0 0 localhost:38837 tcp 0 0 localhost:38848 tcp 0 0 localhost:38840 tcp 0 0 localhost:38841 tcp 0 0 localhost:38844 tcp 0 0 localhost:38843 tcp 0 0 localhost:38855 tcp 0 0 localhost:38850 tcp 0 0 localhost:38853 netstat
(w/o servers)
Foreign Address State
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT
localhost:6379 TIME_WAIT

badboy · 2015-03-27T09:35:27Z

So all connections to the Redis server are already half-closed, which might indicate a completely different problem.

shoutrain · 2015-03-27T09:41:18Z

sorry bad boy, I run netstat again, it looks I deleted some established connections last time. actually there are active connections. paste them all:

[rafael@tesla001 softwares]$ netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:38836 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38839 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38838 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38842 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38845 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38852 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38854 localhost:6379 TIME_WAIT
tcp 0 0 120.24.169.81:10506 49.80.219.76:10174 ESTABLISHED
tcp 0 0 localhost:38851 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38847 localhost:6379 TIME_WAIT
tcp 0 0 localhost:6379 localhost:37590 ESTABLISHED
tcp 0 0 120.24.169.81:ssh 49.80.221.174:28720 ESTABLISHED
tcp 0 0 120.24.169.81:ssh 49.80.219.76:9326 ESTABLISHED
tcp 0 0 localhost:38849 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38837 localhost:6379 TIME_WAIT
tcp 0 0 localhost:35097 localhost:6379 ESTABLISHED
tcp 0 0 localhost:38848 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38840 localhost:6379 TIME_WAIT
tcp 0 0 localhost:6379 localhost:35097 ESTABLISHED
tcp 0 0 localhost:38841 localhost:6379 TIME_WAIT
tcp 0 0 localhost:37590 localhost:6379 ESTABLISHED
tcp 0 0 localhost:38844 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38843 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38855 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38850 localhost:6379 TIME_WAIT
tcp 0 0 localhost:38853 localhost:6379 TIME_WAIT
tcp 0 0 120.24.169.81:53500 42.156.166.25:http ESTABLISHED
tcp 0 0 localhost:38846 localhost:6379 TIME_WAIT
tcp 0 0 120.24.169.81:ssh 49.80.219.76:10273 ESTABLISHED

echoma · 2015-04-26T23:42:08Z

your netstat output shows there is no none-zero sendq.

btw. you can use redisSetTimeout() to set read/write timeout value.

shoutrain · 2015-05-04T00:57:42Z

@echoma Thanks.

Actually I used redisSetTimeout() and I found the reason finally. I used multi-threads with the api, but it doesn't support.

badboy · 2015-05-04T07:05:51Z

So this can be closed?

shoutrain · 2015-05-04T12:33:18Z

@badboy Yes, I have solved the problem by move all api calls in the same thread. But @saravanan-elangovan opened this issue, I am not sure he met the same situation like me.

abigbigbird · 2016-07-06T12:14:42Z

shoutrain commented on 4 May 2015
@echoma Thanks.

Actually I used redisSetTimeout() and I found the reason finally. I used multi-threads with the api, but it doesn't support.

Question：hiredis doesn't support multi-threads? where i can see it? I used multi-threads also!

badboy · 2016-07-06T12:19:25Z

@abigbigbird Can see what?
I added a short note a while back, see the note under Connecting.
The context object has no synchronization of its internals, using it from multiple threads at once will most likely result in race conditions.

abigbigbird · 2016-07-06T12:57:01Z

Oh,I saw the note ! thanks

I use hiredis in my project,It works fine in most of the time，but there are several times ，it hangs。

## the gdb core stack as below:：
#0 redisBufferRead (c=0x7f26c4013a10) at hiredis.c:800
#1 0x00007f27479e9cd0 in redisGetReply (c=0x7f26c4013a10, reply=0x7f26ea4e7ea0) at hiredis.c:903
#2 0x00007f27479ea02b in __redisBlockForReply (c=0x7f26c4013a10, argc=, argv=,
argvlen=) at hiredis.c:1010
#3 redisCommandArgv (c=0x7f26c4013a10, argc=, argv=, argvlen=)
at hiredis.c:1035
#4 0x00007f27479d31cc in bfd::codis::CodisClient::RedisCommand (this=0x200ceb0, command=..., tt=0) at src/CodisClient.cpp:995
#5 0x00007f27479d3662 in bfd::codis::CodisClient::RedisCommand (this=0x200ceb0, command=..., tt=0) at src/CodisClient.cpp:1035
#6 0x00007f27479c9f64 in bfd::codis::CodisClient::get (this=0x200ceb0, key=..., tt=0) at src/CodisClient.cpp:152
#7 0x00007f27479e5a18 in bfd::codis::BfdCodis::get (this=0x2130120, key=..., tt=0) at src/BfdCodis.cpp:60
#8 0x000000000050d89e in RedisClientProxy::get (this=0x20e34f0, key=...)
(gdb) p c
$1 = {err = 0, errstr = '\000' <repeats 127 times>, fd = 72, flags = 3, obuf = 0x7f26b40087c8 "", reader = 0x7f26c4013af0,
connection_type = REDIS_CONN_TCP, timeout = 0x0, tcp = {host = 0x7f26c4013d50 "192.168.*", source_addr = 0x0, port = 19000},
unix_sock = {path = 0x0}}
(gdb) p *buf
Cannot access memory at address 0x7f26ea4e3d80

os：Red Hat Enterprise Linux Server release 6.4 (Santiago)
gcc version:4.6.3
hiredis version:0.13.3

badboy · 2016-07-06T13:16:13Z

@abigbigbird Is this the same as #442? If so, please re-open there. This issue here seems to be another problem.

abigbigbird · 2016-07-06T13:22:02Z

oh，sorry，i saw it core at the same code。my fault！

naugaria7 · 2017-04-03T13:02:02Z

I am running into a similar backtrace on gdb. I am using hiredis for mainly get/set queries. It works fine most of the times but once in a while it hangs on RedisBufferRead().
I am using a connection pool which has Redis Contexts being created and given to threads as and when required. Could this be an issue?

ryancheung · 2017-09-06T06:50:36Z

The same issue here:

(gdb) bt
#0  0x00007f84e681a51d in read () at ../sysdeps/unix/syscall-template.S:84
#1  0x00007f84e5432f8e in redisBufferRead () from /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
#2  0x00007f84e5433280 in redisGetReply () from /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
#3  0x00007f84e54332b5 in ?? () from /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
#4  0x00007f84e5433634 in redisCommand () from /usr/lib/x86_64-linux-gnu/libhiredis.so.0.13
#5  0x00000000006bd86f in DisqueClient::AddJob (this=0x7f84dc0008c0, queueName="umbrella_api", job="device_connected:862670225013751", retry=5, ttl=2592000) at /root/device_manager/src/managerd/DisqueClient.cpp:17
#6  0x0000000000666c79 in WorldSocket::HandleAuth (this=0x7f84c0112f40, recvPacket=...) at /root/device_manager/src/managerd/WorldSocket.cpp:235

The server is not redis, but similar as redis: https://github.com/antirez/disque

EDIT: I use a single instance for redisContext. But I just read Note: A redisContext is not thread-safe. in README of hiredis repo. So this should be a multithreading issue.

xinquanatcolezhu · 2017-11-24T02:22:29Z

same issue here if i run it in a multi- threads

mfmain · 2018-04-19T05:34:52Z

sometimes we are even unaware of threading usage, like being in an callback routine, why dont we introduce a mutex inside redisCommand? I believe there are many products in the market shipped with this bug and the authors have not noticed

michael-grunder · 2018-04-19T15:55:20Z

Making hiredis thread safe would require a lot more than just adding a mutex to redisCommand. Besides, if we added thread safety to hiredis proper it would reduce flexibility.

The solution, I think, is to handle synchronization in each individual application in the best way for that application. There are also thread-safe wrappers to hiredis available but I have not used any of them personally.

michael-grunder · 2019-11-18T20:23:09Z

Closing this old issue, since it's not clear there is actually a bug being described.

michael-grunder closed this as completed Nov 18, 2019

catterer mentioned this issue Oct 3, 2021

Fix hiredis blocking indefinitely despite timeout being set (#593) #986

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

hiredis hangs on redisBufferRead #263

hiredis hangs on redisBufferRead #263

saravanan-elangovan commented Aug 13, 2014

pietern commented Aug 13, 2014

saravanan-elangovan commented Aug 14, 2014

michael-grunder commented Dec 17, 2014

shoutrain commented Mar 27, 2015

badboy commented Mar 27, 2015

shoutrain commented Mar 27, 2015

badboy commented Mar 27, 2015

shoutrain commented Mar 27, 2015

echoma commented Apr 26, 2015

shoutrain commented May 4, 2015

badboy commented May 4, 2015

shoutrain commented May 4, 2015

abigbigbird commented Jul 6, 2016

badboy commented Jul 6, 2016

abigbigbird commented Jul 6, 2016

badboy commented Jul 6, 2016

abigbigbird commented Jul 6, 2016

naugaria7 commented Apr 3, 2017

ryancheung commented Sep 6, 2017 •

edited

xinquanatcolezhu commented Nov 24, 2017

mfmain commented Apr 19, 2018

michael-grunder commented Apr 19, 2018

michael-grunder commented Nov 18, 2019

hiredis hangs on redisBufferRead #263

hiredis hangs on redisBufferRead #263

Comments

saravanan-elangovan commented Aug 13, 2014

pietern commented Aug 13, 2014

saravanan-elangovan commented Aug 14, 2014

michael-grunder commented Dec 17, 2014

shoutrain commented Mar 27, 2015

enviroment:

thread stacks:

badboy commented Mar 27, 2015

shoutrain commented Mar 27, 2015

badboy commented Mar 27, 2015

shoutrain commented Mar 27, 2015

echoma commented Apr 26, 2015

shoutrain commented May 4, 2015

badboy commented May 4, 2015

shoutrain commented May 4, 2015

abigbigbird commented Jul 6, 2016

badboy commented Jul 6, 2016

abigbigbird commented Jul 6, 2016

badboy commented Jul 6, 2016

abigbigbird commented Jul 6, 2016

naugaria7 commented Apr 3, 2017

ryancheung commented Sep 6, 2017 • edited

xinquanatcolezhu commented Nov 24, 2017

mfmain commented Apr 19, 2018

michael-grunder commented Apr 19, 2018

michael-grunder commented Nov 18, 2019

ryancheung commented Sep 6, 2017 •

edited