core dump on calling sdsMakeRoomFor #493

Closed
fenglonz opened this issue Dec 8, 2016 · 15 comments

Comments

@fenglonz

fenglonz commented Dec 8, 2016

I am getting a core dump in the realloc call inside sdsMakeRoomFor. The gdb backtrace is given below:
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff67a5700 (LWP 11207)]
0x0000003fc8c32625 in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0 0x0000003fc8c32625 in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x0000003fc8c33e05 in abort () at abort.c:92
#2 0x0000003fc8c70537 in __libc_message (do_abort=2, fmt=0x3fc8d588c0 "*** glibc detected *** %s: %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:198
#3 0x0000003fc8c75f4e in malloc_printerr (action=3, str=0x3fc8d5687d "corrupted double-linked list", ptr=, ar_ptr=) at malloc.c:6350
#4 0x0000003fc8c763d3 in malloc_consolidate (av=0x7ffff0000020) at malloc.c:5216
#5 0x0000003fc8c79c28 in _int_malloc (av=0x7ffff0000020, bytes=) at malloc.c:4415
#6 0x0000003fc8c7bbda in _int_realloc (av=0x7ffff0000020, oldp=0x7ffff0000b60, oldsize=, nb=1312) at malloc.c:5339
#7 0x0000003fc8c7bf78 in __libc_realloc (oldmem=0x7ffff0000b70, bytes=1299) at malloc.c:3823
#8 0x00000000004c7c06 in sdsMakeRoomFor (s=, addlen=) at sds.c:142
#9 0x00000000004c8044 in sdscatlen (s=, t=0x7fffea673080, len=43) at sds.c:241
#10 0x00000000004c5e00 in __redisAppendCommand (c=0x735090, cmd=, len=) at hiredis.c:910
#11 0x00000000004c703c in redisvAppendCommand (c=0x735090, format=, ap=) at hiredis.c:942
#12 0x00000000004c7208 in redisAppendCommand (c=, format=) at hiredis.c:956
#13 0x0000000000446e88 in DataReader::Do (this=0x7ffff6742f50, task=0x7ffff5da3c50) at schsv_dtsvr_V1.0.0/ds/datareader.cpp:106
#14 0x000000000044d1f9 in TaskHandler::Do (p=) at schsv_dtsvr_V1.0.0/ds/task_handler.cpp:41
#15 0x0000003fc9007a51 in start_thread (arg=0x7ffff67a5700) at pthread_create.c:301
#16 0x0000003fc8ce893d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

I have worked on this issue for a long time without any progress. Any ideas? Thanks!

@badboy
Contributor

badboy commented Dec 8, 2016

Do you have example code triggering the error?

@fenglonz
Author

fenglonz commented Dec 9, 2016

I use pipelining, and the core dump occurs when fetching the replies.

for (i = 0; i < 5; i++)
{
	uint32_t docid = rd->ids(i);
	redisContext* rediscon = rc->connect;
	string command = "HGET " + key + " mulfield";
	int err = redisAppendCommand(rediscon, command.c_str());
}
for (i = 0; i < 5; i++)
{
	uint32_t docid = rd->ids(i);
	stringstream ss;
	ss << docid;

	string key = ss.str();

	redisContext* rediscon = rc->connect;

	void *reply1 = NULL;
	redisGetReply(rediscon, &reply1);  // core dump occurs here
	redisReply *reply = (redisReply*)reply1;
	if (reply == NULL && rediscon->err != 0)
	{
	}
	if (reply)
		freeReplyObject(reply);
}

@fenglonz
Author

fenglonz commented Dec 9, 2016

I have only one connection. I also have a detecting thread that rebuilds the connection when it is closed.

bool RedisConnectPool::detectConnects()
{
	bool bPingRet = true;
	for (size_t i=0; i<m_connects.size(); i++)
	{	
		redisContext *tempRedisConn = m_connects[i].connect;
		if (NULL == tempRedisConn)
		{
			bPingRet = false;
			break;
		}

		string command = "PING";
		redisReply *reply = (redisReply*)redisCommand(tempRedisConn, command.c_str());
		if (reply == NULL && tempRedisConn->err != 0)
		{
			bPingRet = false;
			break;
		}

		if (reply && reply->type == REDIS_REPLY_STATUS)
		{
			string strRet = reply->str;
			if (strRet.find("PONG") != string::npos)
			{
				freeReplyObject(reply);
				continue;
			}
		}

		bPingRet = false;
		freeReplyObject(reply);
		break;
	}
	
	return bPingRet;
}

@fenglonz
Author

fenglonz commented Dec 9, 2016

After I stopped the detecting thread, the core dump no longer occurs.
Maybe something goes wrong when the processing thread and the detecting thread run at the same time.
But I don't know why or how the core dump occurs.

I also looked at your reply in #447:
I only see two ways how this can lead to a failure:

An already broken pointer is passed, making it fail in completely unpredictable ways.
The pointer is concurrently modified from somewhere else, making it a data race.

@badboy
Contributor

badboy commented Dec 9, 2016

Wait, are you using a single hiredis context in multiple threads?

@fenglonz
Author

Yes, I initialize a connection pool. The processing thread takes one connection from the pool to use, and the detecting thread checks whether every connection in the pool is usable.
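
The hiredis README notes that a redisContext is not thread-safe, so a pool like the one described only helps if each connection is owned by one thread at a time, including during health checks. A minimal sketch of that idea follows. It assumes nothing about the reporter's actual RedisConnectPool: `PooledConn`, `ConnPool`, `acquire`, `release`, and `probeIdle` are hypothetical names, and a counter stands in for the real PING so the block compiles without hiredis.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a pooled connection. In a real program this
// would wrap a redisContext*; here it only counts commands.
struct PooledConn {
    int commands_sent = 0;  // placeholder for work done on the connection
    bool in_use = false;    // guarded by ConnPool::mu_
};

// Hypothetical pool: each connection is handed to at most one thread at a time.
class ConnPool {
public:
    explicit ConnPool(std::size_t n) : conns_(n) {}

    // Returns an idle connection, or nullptr if all are checked out.
    PooledConn* acquire() {
        for (std::size_t i = 0; i < conns_.size(); ++i) {
            if (PooledConn* c = tryAcquireAt(i)) return c;
        }
        return nullptr;
    }

    void release(PooledConn* c) {
        std::lock_guard<std::mutex> lock(mu_);
        c->in_use = false;
    }

    // Health check: probes only idle connections and checks each one out
    // for the duration of the probe. Returns how many were probed.
    std::size_t probeIdle() {
        std::size_t probed = 0;
        for (std::size_t i = 0; i < conns_.size(); ++i) {
            PooledConn* c = tryAcquireAt(i);
            if (c == nullptr) continue;  // a worker owns it, so skip it
            ++c->commands_sent;          // stands in for sending PING
            ++probed;
            release(c);
        }
        return probed;
    }

private:
    PooledConn* tryAcquireAt(std::size_t i) {
        std::lock_guard<std::mutex> lock(mu_);
        if (conns_[i].in_use) return nullptr;
        conns_[i].in_use = true;
        return &conns_[i];
    }

    std::mutex mu_;
    std::vector<PooledConn> conns_;  // never resized, so pointers stay valid
};
```

With this shape the detecting thread never touches a connection that a processing thread currently holds. A busy connection is simply skipped and checked on a later pass.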

@dagostinelli

dagostinelli commented Dec 21, 2016

I'm also seeing this. I'm not using threads. In my use case, I start with the redis service down (or I start it up and then take it down while my app is running). I was testing that my app gracefully handles the case where redis becomes unavailable.

I test for redis availability by attempting to send PINGs to redis.

	if (redisAsyncCommand(c, fn, context, command) != REDIS_OK)
	{
		info("Redis: Error sending command %s", command);
		/*  snip - error handling - snip */
	}

The "command" is "PING"

My code reaches this point three times, attempting to send the PING, and then it segfaults in that same sds function.

@dagostinelli

Maybe we're doing something wrong. Are we allowed to keep calling redisAsyncCommand (which seems to append subsequent commands to the output buffer) before we get a reply?

@dagostinelli

FYI, this is the same issue as #460.

@michael-grunder
Collaborator

@dagostinelli OK, since this is happening without threading, I'm more inclined to believe that it might actually be a bug in the library. What event library are you using in your case?

@dagostinelli

@michael-grunder I'm using libevent

@dagostinelli

@michael-grunder If it helps, I'm calling redisAsyncCommand from the callback of an evtimer.

@badboy
Contributor

badboy commented Dec 22, 2016

@dagostinelli Can you provide a minimal example that reproduces the issue? I can take a closer look, but likely not before the Christmas holidays.

@axot

axot commented Jul 24, 2017

We are using hiredis-vip for Redis Cluster, and we see the same bug there as in hiredis.
Our program dispatches redisClusterAsyncCommand from multiple worker threads.
Is that the reason for this issue? See the backtrace below.

==13234==ERROR: AddressSanitizer: attempting double-free on 0x6140004ce440 in thread T30:
#0 0x56a1b3 in realloc (xxx)
#1 0x9706dc in sdsMakeRoomFor /opt/hiredis-vip/sds.c:142
#2 0x970849 in sdscatlen /opt/hiredis-vip/sds.c:241
#3 0x97023f in __redisAppendCommand /opt/hiredis-vip/hiredis.c:926
#4 0x9640c2 in __redisAsyncCommand /opt/hiredis-vip/async.c:646
#5 0x96cc4d in redisClusterAsyncFormattedCommand /opt/hiredis-vip/hircluster.c:4819
#6 0x96d0ca in redisClusterAsyncCommandArgv /opt/hiredis-vip/hircluster.c:4900

0x6140004ce440 is located 0 bytes inside of 431-byte region [0x6140004ce440,0x6140004ce5ef)
freed by thread T28 here:
#0 0x569e39 in free (xxx)
#1 0x9700bc in redisBufferWrite /opt/hiredis-vip/hiredis.c:865

previously allocated by thread T24 here:
#0 0x56a1b3 in realloc (xxx)
#1 0x9706dc in sdsMakeRoomFor /opt/hiredis-vip/sds.c:142

Thread T30 created by T0 here:
#0 0x559472 in pthread_create (xxx)
#1 0x9d99f6 in std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (xxx)
#2 0x6020004ac0ef (+0x4ac0ef)

Thread T28 created by T0 here:
#0 0x559472 in pthread_create (xxx)
#1 0x9d99f6 in std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (xxx)
#2 0x6020004ac0ef (+0x4ac0ef)

Thread T24 created by T0 here:
#0 0x559472 in pthread_create (xxx)
#1 0x9d99f6 in std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) (xxx)
#2 0x6020004ac0ef (+0x4ac0ef)

SUMMARY: AddressSanitizer: double-free ??:0 realloc
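
The report above shows one thread growing the output buffer, another freeing it, and a third appending to it, which is consistent with several threads calling into one context at once. One common way to avoid that is to let worker threads only enqueue commands and have a single owner thread issue them. The sketch below illustrates the pattern in plain C++ and is not taken from hiredis or hiredis-vip: `CommandQueue` and `runSingleOwner` are hypothetical names, and the place where a real program would call its async command function is marked with a comment.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical command queue: any thread may push, only the owner thread pops.
class CommandQueue {
public:
    void push(std::string cmd) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(cmd));
        }
        cv_.notify_one();
    }

    // Signals that no more commands will be pushed.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            closed_ = true;
        }
        cv_.notify_all();
    }

    // Blocks until a command is available. Returns false once the queue
    // has been closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::queue<std::string> q_;
    bool closed_ = false;
};

// Starts `workers` producer threads that each enqueue `per_worker` commands,
// plus one owner thread that drains the queue. Returns the number of
// commands the owner thread handled.
std::size_t runSingleOwner(int workers, int per_worker) {
    CommandQueue queue;
    std::size_t handled = 0;  // written only by the owner thread, read after join

    std::thread owner([&queue, &handled] {
        std::string cmd;
        while (queue.pop(cmd)) {
            // A real program would issue the command here, from the one
            // thread that owns the connection context.
            ++handled;
        }
    });

    std::vector<std::thread> producers;
    for (int w = 0; w < workers; ++w) {
        producers.emplace_back([&queue, per_worker] {
            for (int i = 0; i < per_worker; ++i) queue.push("PING");
        });
    }
    for (auto& t : producers) t.join();
    queue.close();
    owner.join();
    return handled;
}
```

Calling `runSingleOwner(4, 100)` has four workers enqueue 100 commands each while only the owner thread ever touches the spot where the connection would be used. In an event-loop program the owner would typically be the loop thread, though how commands are handed to it depends on the event library.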

@michael-grunder
Collaborator

Going to close this issue, but if it's still a problem, please provide a minimal example that can trigger the segfault.
