This repository has been archived by the owner on Apr 6, 2019. It is now read-only.

linux write data 100% cpu #77

Closed
little-nil opened this issue May 8, 2017 · 28 comments

Comments

@little-nil

Thread 4 (Thread 0x7f6ca5eaa700 (LWP 910)):
#0 0x00007f6cacedc933 in select () from /lib64/libc.so.6
#1 0x00000000004ad6fa in tacopie::io_service::poll() ()
#2 0x00007f6cad77d230 in ?? () from /lib64/libstdc++.so.6
#3 0x00007f6cacbdadf3 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f6cacee51ad in clone () from /lib64/libc.so.6

Threads: 2344 total, 3 running, 2341 sleeping, 0 stopped, 0 zombie
%Cpu(s): 21.0 us, 32.5 sy, 0.0 ni, 46.3 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 16269700 total, 15801636 used, 468064 free, 158872 buffers
KiB Swap: 0 total, 0 used, 0 free. 2272844 cached Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
910 root 20 0 416432 1336 980 R 99.8 0.0 46:30.66 PlayBackServer

@sedapsfognik

Maybe related to: #75
Also interested in this question.

@Cylix
Owner

Cylix commented Jun 21, 2017

Hi,

I submitted a possible fix; you can find more information about it in the #75 thread.

Hope it solves the issue :)

Best!

@LazyPlanet

On Linux, CPU usage also goes up to 100%.

#0 0x00007fbc25bcb6d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007fbc259699ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2 0x000000000053c20d in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (this=0x7fbc00afb070, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3 0x0000000000533500 in std::__future_base::_State_base::wait (this=0x7fbc00afb038) at /usr/include/c++/4.8.2/future:327
#4 0x00000000005449b9 in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7fbc1ddb6770) at /usr/include/c++/4.8.2/future:596
#5 0x000000000053c3de in std::future<cpp_redis::reply>::get (this=0x7fbc1ddb6770) at /usr/include/c++/4.8.2/future:675
#6 0x00000000005347e1 in Adoter::Redis::Get (this=0x7fbc00afacb0, key="user:guest_1179", value=..., async=false) at ../Include/RedisManager.h:184

@LazyPlanet

Multiple threads get data from Redis as follows:

bool Get(const std::string& key, std::string& value, bool async = true)
{
    if (!_client.is_connected())
    {
        _client.connect(_hostname, _port);

        if (!_client.is_connected())
        {
            return false;
        }

        auto has_auth = _client.auth(_password);
        if (has_auth.get().ko())
        {
            return false;
        }
    }

    auto get = _client.get(key);
    cpp_redis::reply reply = get.get();

    if (async) {
        _client.commit();
    }
    else {
        _client.sync_commit(std::chrono::milliseconds(1000));
    }

    if (!reply.is_string())
    {
        return false;
    }

    value = reply.as_string();

    return true;
}

thanks.

@Cylix
Owner

Cylix commented Nov 20, 2017

Hi @LazyPlanet ,

Are you experiencing high CPU?
In the stack trace you are showing, the thread is sleeping on a condition variable, so it can't be the root cause.
Would you mind sharing a small, complete working example reproducing the issue? (Typically a main that calls your Get function and reproduces the 100% CPU.)

Thanks

@LazyPlanet

Most of the threads are in that state. It is caused by multi-threading: Linux CentOS 7.2, built with gcc, 16 cores, about 1000 calls.
Let me test and show you in a demo.
Thanks. @Cylix

@sedapsfognik

@Cylix, after migrating to a condition variable, I'm experiencing 100% CPU only very rarely. It would be great if @LazyPlanet can find a way to reproduce this bug.

@LazyPlanet

For the demo, I define a class as follows; when I want data, I call Redis().Get(key):

class Redis
{
private:
    std::string _hostname = "127.0.0.1";
    int32_t _port = 6379;
    std::string _password = "!QAZ%TGBaa.";

    cpp_redis::future_client _client;

public:

    Redis()
    {
        std::string hostname = ConfigInstance.GetString("Redis_ServerIP", "127.0.0.1");
        int32_t port = ConfigInstance.GetInt("Redis_ServerPort", 6379);
        std::string password = ConfigInstance.GetString("Redis_Password", "!QAZ%TGB&UJM9ol.");

        if (port > 0) _port = port;
        if (!hostname.empty()) _hostname = hostname;
        if (!password.empty()) _password = password;
    }

    bool Get(const std::string& key, std::string& value, bool async = true)
    {
        if (!_client.is_connected())
        {
            _client.connect(_hostname, _port);

            if (!_client.is_connected())
            {
                return false;
            }

            auto has_auth = _client.auth(_password);
            if (has_auth.get().ko())
            {
                return false;
            }
        }

        auto get = _client.get(key);
        cpp_redis::reply reply = get.get();

        if (async) {
            _client.commit();
        } else {
            _client.sync_commit();
        }

        if (!reply.is_string())
        {
            return false;
        }

        value = reply.as_string();

        return true;
    }

    bool Save(const std::string& key, const std::string& value, bool async = true)
    {
        if (!_client.is_connected())
        {
            _client.connect(_hostname, _port);

            if (!_client.is_connected())
            {
                return false;
            }

            auto has_auth = _client.auth(_password);
            if (has_auth.get().ko())
            {
                return false;
            }
        }

        auto set = _client.set(key, value);

        if (async) {
            _client.commit();
        } else {
            _client.sync_commit();
        }

        auto get = set.get();
        std::string result;
        if (get.is_string()) result = get.as_string();

        return true;
    }

};

In multiple threads I call Get or Save 500 times, and it causes the following:
(gdb) bt
#0 0x00007f257c6776d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f257d2129ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2 0x000000000053a01b in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (this=0x7f2558005540, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3 0x000000000053341a in std::__future_base::_State_base::wait (this=0x7f2558005508) at /usr/include/c++/4.8.2/future:327
#4 0x00000000005414c3 in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7f25754440c0) at /usr/include/c++/4.8.2/future:596
#5 0x000000000053a1de in std::future<cpp_redis::reply>::get (this=0x7f25754440c0) at /usr/include/c++/4.8.2/future:675
#6 0x0000000000533c77 in Adoter::Redis::Get (this=0x7f2558004880, key="player:2101280", value=..., async=false) at ../Include/RedisManager.h:186
#7 0x000000000053477e in Adoter::Redis::GetPlayer (this=0x7f2558004880, player_id=2101280, player=...) at ../Include/RedisManager.h:313

Am I misusing it?

@Cylix
Owner

Cylix commented Dec 11, 2017

Hi,

You should call commit or sync_commit before calling .get() on the std::future object returned by the redis client.
Right now, you are waiting indefinitely because you haven't pushed the pipelined commands to the redis server.

Remember, when you call .get, .set or any other redis command, the command is not sent to the redis server yet. Instead, it is buffered and flushed when you call commit or sync_commit.
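
A minimal sketch of that ordering, reusing the _client, key, and async names from the snippets above; the only change is that the pipeline is flushed before the future is awaited:

auto get = _client.get(key);        // the GET is only buffered here, nothing is sent yet

if (async) {
    _client.commit();               // flush the pipelined commands to the server
} else {
    _client.sync_commit();          // flush the pipeline and block until the replies arrive
}

cpp_redis::reply reply = get.get(); // the future can now complete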

@LazyPlanet

Oh, sorry, I was wrong!
Thank you very much. I had thought it was because I did not pass a timeout parameter to sync_commit, such as _client.sync_commit(std::chrono::milliseconds(100));

@LazyPlanet

LazyPlanet commented Dec 13, 2017

I am sorry to bother you again, but now there is a lock. Am I doing something wrong? There are about 100 connections to Redis at this time.
Thank you very much.

void Game::SavePlayBack()
{
    if (!_room) return;

    cpp_redis::future_client client;
    client.connect(ConfigInstance.GetString("Redis_ServerIP", "127.0.0.1"), ConfigInstance.GetInt("Redis_ServerPort", 6379));
    if (!client.is_connected()) return;

    auto has_auth = client.auth(ConfigInstance.GetString("Redis_Password", "!ssl."));
    if (has_auth.get().ko()) return; ///error

    auto set = client.set("playback:" + std::to_string(_room_id) + "_" + std::to_string(_game_id), _playback.SerializeAsString());
    client.commit();

    auto get = set.get();
    std::string result;
    if (get.is_string()) result = get.as_string();

    //auto redis_cli = make_unique<Redis>();
    //std::string key = "playback:" + std::to_string(_room_id) + "_" + std::to_string(_game_id);
    //redis_cli->Save(key, _playback);
}
#0 0x00007f257c6776d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f257d2129ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2 0x000000000053a01b in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (this=0x7f2558005540, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3 0x000000000053341a in std::__future_base::_State_base::wait (this=0x7f2558005508) at /usr/include/c++/4.8.2/future:327
#4 0x00000000005414c3 in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7f25754440c0) at /usr/include/c++/4.8.2/future:596
#5 0x000000000053a1de in std::future<cpp_redis::reply>::get (this=0x7f25754440c0) at /usr/include/c++/4.8.2/future:675

@LazyPlanet

I want to know: if I use _client.sync_commit(std::chrono::milliseconds(100));, will it avoid blocking in pthread_cond_wait?
Thank you.

@Cylix
Owner

Cylix commented Dec 14, 2017

In the new example you showed, you again called commit() after calling get() on the std::future.

You need to call it before.

void Game::SavePlayBack()
{
    if (!_room) return;

    cpp_redis::future_client client;
    client.connect(ConfigInstance.GetString("Redis_ServerIP", "127.0.0.1"), ConfigInstance.GetInt("Redis_ServerPort", 6379));
    if (!client.is_connected()) return;

    auto has_auth = client.auth(ConfigInstance.GetString("Redis_Password", "!ssl."));

    //!
    //! Here you call .get(), but you haven't called commit or sync_commit yet
    //!
    if (has_auth.get().ko()) return; ///error

    auto set = client.set("playback:" + std::to_string(_room_id) + "_" + std::to_string(_game_id), _playback.SerializeAsString());

    //!
    //! Here you call commit, but only after the .get() above
    //!
    client.commit();

    auto get = set.get();
    std::string result;
    if (get.is_string()) result = get.as_string();

    //auto redis_cli = make_unique<Redis>();
    //std::string key = "playback:" + std::to_string(_room_id) + "_" + std::to_string(_game_id);
    //redis_cli->Save(key, _playback);
}

@LazyPlanet

But sometimes it causes an error and sometimes it doesn't; I am very sorry to ask this question again.
I need to check whether the auth succeeded or not; I understand that now.
Thank you very much.

@LazyPlanet

LazyPlanet commented Dec 14, 2017

I have changed my code, but it still does not work, as shown below:
when all worker threads of this process enter this state, CPU usage goes up to 100%.
(gdb) bt
#0 0x00007f71d90576d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f71d9bf29ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2 0x000000000056edc7 in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (
this=0x7f71d0d04e70, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3 0x0000000000569fd8 in std::__future_base::_State_base::wait (this=0x7f71d0d04e38) at /usr/include/c++/4.8.2/future:327
#4 0x000000000057759d in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7f71d662bed0) at /usr/include/c++/4.8.2/future:596
#5 0x000000000056efc2 in std::future<cpp_redis::reply>::get (this=0x7f71d662bed0) at /usr/include/c++/4.8.2/future:675
#6 0x000000000056a89d in Adoter::Redis::Get (this=0x7f71d662c070, key="user:guest_1082", value=..., async=true) at ../Include/RedisManager.h:191
#7 0x000000000056b4a9 in Adoter::Redis::GetUser (this=0x7f71d662c070, username="guest_1082", user=...) at ../Include/RedisManager.h:479
#8 0x000000000055be55 in Adoter::Player::GetWechat (this=0x7f71c0c4cd48) at Player.cpp:4376

Code example:

auto has_auth = _client.auth(_password);
auto get = _client.get(key);

if (async) {
    _client.commit();
} else {
    _client.sync_commit(std::chrono::milliseconds(100));
}

if (has_auth.get().ko()) { return false; } /////////////ERROR

auto reply = get.get();
if (!reply.is_string()) { return false; }

auto value = reply.as_string();

If I delete the line:
if (has_auth.get().ko()) { return false; } /////////////ERROR
it works fine.

@LazyPlanet

LazyPlanet commented Dec 16, 2017

OK, it still causes this bug all the same.

bool Get(const std::string& key, google::protobuf::Message& value, bool async = true)
{
    if (!_client.is_connected()) {
        _client.connect(_hostname, _port);
        if (!_client.is_connected()) {   return false;     }
    }

    auto has_auth = _client.auth(_password);
    auto get = _client.get(key);

    if (async) {
        _client.commit(); 
    } else {
        _client.sync_commit(std::chrono::milliseconds(100)); ////////////Call this.
    }

    /*
    if (has_auth.get().ko()) {
        return false;
    }
    */

    auto reply = get.get(); //////////////////////Wait Error
    if (!reply.is_string()) {  return false;  }

    auto success = value.ParseFromString(reply.as_string());
    if (!success)  {  return false;   }

    return true;
}

bool Save(const std::string& key, const std::string& value, bool async = true)
{
    if (!_client.is_connected()) {
        _client.connect(_hostname, _port);

        if (!_client.is_connected()) {
            return false;
        }
    }

    auto has_auth = _client.auth(_password);
    auto set = _client.set(key, value);

    if (async) {
        _client.commit();
    } else {
        _client.sync_commit(std::chrono::milliseconds(100));
    }

    /*
    if (has_auth.get().ko()) {
        return false;
    }
    */

    auto get = set.get(); //////////////////////Wait Error
    std::string result = "ERROR";
    if (get.is_string()) result = get.as_string();

    return true;
}
Stack:
(gdb) t 7
[Switching to thread 7 (Thread 0x7f23db63a700 (LWP 15595))]
#0  0x00007f23e10596d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f23e10596d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f23e1bf49ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2  0x0000000000539f35 in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (
    this=0x7f23d0005660, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3  0x00000000005335ca in std::__future_base::_State_base::wait (this=0x7f23d0005628) at /usr/include/c++/4.8.2/future:327
#4  0x00000000005414df in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7f23db628e20) at /usr/include/c++/4.8.2/future:596
#5  0x000000000053a122 in std::future<cpp_redis::reply>::get (this=0x7f23db628e20) at /usr/include/c++/4.8.2/future:675
#6  0x0000000000533e7e in Adoter::Redis::Get (this=0x7f23d0004a90, key="player:2101437", value=..., async=false) at ../Include/RedisManager.h:202
#7  0x0000000000534698 in Adoter::Redis::GetPlayer (this=0x7f23d0004a90, player_id=2101437, player=...) at ../Include/RedisManager.h:312
#8  0x00000000005223ce in Adoter::ServerSession::OnCommandProcess (this=0x2c5cbc8, command=...) at ServerSession.cpp:303
#9  0x00000000005208ad in Adoter::ServerSession::OnInnerProcess (this=0x2c5cbc8, meta=...) at ServerSession.cpp:119
#10 0x000000000051ff77 in Adoter::ServerSession::InitializeHandler (this=0x2c5cbc8, error=..., bytes_transferred=22) at ServerSession.cpp:59

(gdb) t 1
[Switching to thread 1 (Thread 0x7f641bf64780 (LWP 23169))]
#0  0x00007f641a7176d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007f641a7176d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f641b2b29ec in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2  0x000000000056edab in std::condition_variable::wait<std::__future_base::_State_base::wait()::{lambda()#1}>(std::unique_lock<std::mutex>&, std::__future_base::_State_base::wait()::{lambda()#1}) (
    this=0x20f2040, __lock=..., __p=...) at /usr/include/c++/4.8.2/condition_variable:93
#3  0x000000000056a288 in std::__future_base::_State_base::wait (this=0x20f2008) at /usr/include/c++/4.8.2/future:327
#4  0x00000000005775f5 in std::__basic_future<cpp_redis::reply>::_M_get_result (this=0x7ffc51d23290) at /usr/include/c++/4.8.2/future:596
#5  0x000000000056efa6 in std::future<cpp_redis::reply>::get (this=0x7ffc51d23290) at /usr/include/c++/4.8.2/future:675
#6  0x000000000056b03f in Adoter::Redis::Save (this=0x7ffc51d23420, key="player:262942", value=..., async=true) at ../Include/RedisManager.h:246
#7  0x000000000056b3f6 in Adoter::Redis::SavePlayer (this=0x7ffc51d23420, player_id=262942, player=...) at ../Include/RedisManager.h:382
#8  0x0000000000543e66 in Adoter::Player::Save (this=0x7f6400517fd8, force=false) at Player.cpp:105
#9  0x0000000000549062 in Adoter::Player::Update (this=0x7f6400517fd8) at Player.cpp:1151
#10 0x000000000060cdb8 in Adoter::CenterSession::Update (this=0x21e0308) at CenterSession.cpp:267
#11 0x000000000059fa11 in Adoter::World::Update (this=0xa64fa0 <Adoter::World::Instance()::_instance>, diff=49) at World.cpp:87
#12 0x0000000000626781 in WorldUpdateLoop () at Main.cpp:51
#13 0x0000000000626e55 in main (argc=2, argv=0x7ffc51d23a68) at Main.cpp:142

Can you help me please? @Cylix

@Cylix
Owner

Cylix commented Dec 16, 2017

Which version of the library are you using? If not the latest, can you try to upgrade to see if it solves your issue?
If you are using the latest, can you try to call cpp_redis::network::set_default_nb_workers(2) (2 or greater) to see if it resolves your issue?
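
A minimal sketch of where that call could go, assuming the 4.x-style cpp_redis::client used later in this thread, and assuming the call is made once in main before the first client is constructed (as in the example further down):

#include <cpp_redis/cpp_redis>

int main()
{
    // raise the number of IO workers before any client exists (2 or greater, as suggested above)
    cpp_redis::network::set_default_nb_workers(2);

    cpp_redis::client client;
    client.connect("127.0.0.1", 6379);

    // ... queue commands, then commit() / sync_commit() as usual ...

    return 0;
}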

Best

@LazyPlanet

LazyPlanet commented Dec 16, 2017

v3.5.3 (July 2nd, 2017)

I will pull master and try it, thank you very much!

@Cylix
Owner

Cylix commented Dec 17, 2017

Yep, if you can try a version 4.0 or above, that would be perfect. There are lots of changes and fixes.

Best

@LazyPlanet

LazyPlanet commented Dec 18, 2017

The problem above has been solved now that I use v4.3.0 and cpp_redis::network::set_default_nb_workers(3).
But when I have 10000 connections (TIME_WAIT) to the redis server, it causes:
terminate called after throwing an instance of 'cpp_redis::redis_error'
what(): connect() failure

and the Redis server goes down.
Can I disconnect from the redis server with _client.disconnect();?
My file is linked below; I think a connection pool is necessary!
https://github.com/LazyPlanet/MX-Architecture/blob/master/Include/RedisManager.h

@Cylix
Owner

Cylix commented Dec 18, 2017

10,000 connections sounds like a design problem in your software, to be honest.

The connect() failure must happen either because you can't create any more sockets (you have used all the fds allowed by the OS) or because the redis-server is no longer accepting connections from you (flooded by the number of connections).

Connection pooling should be done on your side by keeping a pool of cpp_redis clients & subscribers (both classes are thread safe), and you should control how many clients you have.
For most cases, a single client and a single subscriber are enough.

Destroying a client or subscriber instance automatically disconnects the client and cleans up all OS resources (sockets).

I don't know how you handle it, but to reach 10k it seems like you are spawning a new client for each command without deleting it. Try to re-use already existing clients.
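
A minimal sketch of such a pool, purely as an illustration (the RedisPool name and the pool size are arbitrary, not part of cpp_redis): a fixed set of clients is created once and shared by all threads, so the number of connections to redis-server stays bounded.

#include <array>
#include <atomic>
#include <string>
#include <cpp_redis/cpp_redis>

class RedisPool
{
public:
    static constexpr std::size_t kSize = 4; // arbitrary; tune to your workload

    RedisPool(const std::string& host, std::size_t port)
    {
        // connect each pooled client once, up front
        for (auto& c : _clients)
            c.connect(host, port);
    }

    // Hand out clients round-robin; cpp_redis clients are thread safe,
    // so several threads may share the same instance.
    cpp_redis::client& acquire()
    {
        return _clients[_next++ % kSize];
    }

private:
    std::array<cpp_redis::client, kSize> _clients;
    std::atomic<std::size_t> _next{0};
};

Each call site would then use pool.acquire(), queue its command, and commit on that shared client, instead of constructing and destroying a new Redis object per command.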

Best

@LazyPlanet

LazyPlanet commented Dec 19, 2017

I thought that after executing my command the client would disconnect from the redis server automatically, so for every command I create a Redis object that connects to the redis server to do the work.
I will try to use a single client instance across threads. Can the client lose its connection? Should I check whether it is connected every time? I mean, should I maintain the connection myself?
Thank you very much.

@LazyPlanet

LazyPlanet commented Dec 21, 2017

I am sorry, but I have run into this deadlock. I don't know whether it is a bug; the call stack:
(gdb) info threads
Id Target Id Frame
20 Thread 0x7fd46e832700 (LWP 10345) "GmtServer.debug" 0x00007fd46ed1449d in nanosleep () from /lib64/libc.so.6
19 Thread 0x7fd46e031700 (LWP 10346) "GmtServer.debug" 0x00007fd46ed1449d in nanosleep () from /lib64/libc.so.6
18 Thread 0x7fd46d830700 (LWP 10347) "GmtServer.debug" 0x00007fd46ed1449d in nanosleep () from /lib64/libc.so.6
17 Thread 0x7fd46d02f700 (LWP 10348) "GmtServer.debug" 0x00007fd46ed1449d in nanosleep () from /lib64/libc.so.6
16 Thread 0x7fd46c82e700 (LWP 10349) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
15 Thread 0x7fd46c02d700 (LWP 10350) "GmtServer.debug" 0x00007fd46ed4d863 in epoll_wait () from /lib64/libc.so.6
14 Thread 0x7fd46b82c700 (LWP 10351) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
13 Thread 0x7fd46b02b700 (LWP 10352) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
12 Thread 0x7fd46a82a700 (LWP 10353) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
11 Thread 0x7fd46a029700 (LWP 10354) "GmtServer.debug" 0x00007fd46ed4e13b in recv () from /lib64/libc.so.6
10 Thread 0x7fd469828700 (LWP 10355) "GmtServer.debug" 0x00007fd46ed449b3 in select () from /lib64/libc.so.6
9 Thread 0x7fd469027700 (LWP 10356) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
8 Thread 0x7fd468826700 (LWP 10357) "GmtServer.debug" 0x00007fd46ed4e13b in recv () from /lib64/libc.so.6
7 Thread 0x7fd463fff700 (LWP 10358) "GmtServer.debug" 0x00007fd46ea48f4d in __lll_lock_wait () from /lib64/libpthread.so.0
6 Thread 0x7fd4637fe700 (LWP 10359) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
5 Thread 0x7fd462ffd700 (LWP 10360) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
4 Thread 0x7fd4627fc700 (LWP 10361) "GmtServer.debug" 0x00007fd46ea48f4d in __lll_lock_wait () from /lib64/libpthread.so.0
3 Thread 0x7fd461ffb700 (LWP 10362) "GmtServer.debug" 0x00007fd46ed4d863 in epoll_wait () from /lib64/libc.so.6
2 Thread 0x7fd4617fa700 (LWP 10363) "GmtServer.debug" 0x00007fd46ed4d863 in epoll_wait () from /lib64/libc.so.6

* 1 Thread 0x7fd470297780 (LWP 10343) "GmtServer.debug" 0x00007fd46ea466d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

DEAD LOCK:
[Switching to thread 4]
#1 0x00007fd46ea44d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007fd46ea44c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000673d1d in tacopie::tcp_client::clear_read_requests() ()
#4 0x0000000000674aa3 in tacopie::tcp_client::disconnect(bool) ()
#5 0x0000000000667801 in cpp_redis::network::redis_connection::disconnect(bool) ()
#6 0x00000000006226cd in cpp_redis::client::~client() ()
#7 0x0000000000533c78 in Adoter::Redis::~Redis (this=0x7fd44c004a50, __in_chrg=) at ../Include/RedisManager.h:32
#8 0x00000000005448ca in std::default_delete<Adoter::Redis>::operator() (this=0x7fd4627eb070, __ptr=0x7fd44c004a50)
at /usr/include/c++/4.8.2/bits/unique_ptr.h:67
#9 0x000000000053b6d1 in std::unique_ptr<Adoter::Redis, std::default_delete<Adoter::Redis> >::~unique_ptr (this=0x7fd4627eb070, __in_chrg=)
at /usr/include/c++/4.8.2/bits/unique_ptr.h:184
#10 0x0000000000522ecc in Adoter::ServerSession::OnCommandProcess (this=0x7fd42ca8e068, command=...) at ServerSession.cpp:461
#11 0x0000000000520de3 in Adoter::ServerSession::OnInnerProcess (this=0x7fd42ca8e068, meta=...) at ServerSession.cpp:123

[Switching to thread 7 (Thread 0x7fd463fff700 (LWP 10358))]
#0 0x00007fd46ea48f4d in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0 0x00007fd46ea48f4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007fd46ea44d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2 0x00007fd46ea44c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000673d1d in tacopie::tcp_client::clear_read_requests() ()
#4 0x0000000000674aa3 in tacopie::tcp_client::disconnect(bool) ()
#5 0x0000000000667801 in cpp_redis::network::redis_connection::disconnect(bool) ()
#6 0x00000000006226cd in cpp_redis::client::~client() ()
#7 0x0000000000533c78 in Adoter::Redis::~Redis (this=0x7fd45c000a60, __in_chrg=) at ../Include/RedisManager.h:32
#8 0x00000000005448ca in std::default_delete<Adoter::Redis>::operator() (this=0x7fd463fee070, __ptr=0x7fd45c000a60)
at /usr/include/c++/4.8.2/bits/unique_ptr.h:67
#9 0x000000000053b6d1 in std::unique_ptr<Adoter::Redis, std::default_delete<Adoter::Redis> >::~unique_ptr (this=0x7fd463fee070, __in_chrg=)
at /usr/include/c++/4.8.2/bits/unique_ptr.h:184
#10 0x0000000000522ecc in Adoter::ServerSession::OnCommandProcess (this=0x7fd43cabed88, command=...) at ServerSession.cpp:461

Thank you very much! @Cylix

@LazyPlanet

I think a lock should be added in cpp_redis::client::~client().

@Cylix
Owner

Cylix commented Dec 22, 2017

Hi,

Do you have a code example to help me reproduce this issue?
From where do you destroy the redis instance?

Best

@LazyPlanet

LazyPlanet commented Dec 22, 2017

I am not sure it causes this problem 100% of the time. I use your library to build a game server. Most of the time there is no problem, but after running for 2 days it appears and then no player can operate... In multiple threads, many players Save and Get data using Redis().Get(xxx) or Redis().Save(xxx) as follows. Code example:
namespace Adoter
{

class Redis
{
private:
    std::string _hostname = "127.0.0.1";
    int32_t _port = 6379;
    std::string _password = "!s%Ta";

    cpp_redis::client _client;

public:
    ~Redis() { ////Maybe this?
        //_client.disconnect();
    }

    Redis()
    {
        std::string hostname = ConfigInstance.GetString("Redis_ServerIP", "127.0.0.1");
        int32_t port = ConfigInstance.GetInt("Redis_ServerPort", 6379);
        std::string password = ConfigInstance.GetString("Redis_Password", "!QAZ%TGB&UJM9ol.");

        if (port > 0) _port = port;
        if (!hostname.empty()) _hostname = hostname;
        if (!password.empty()) _password = password;
    }

    bool Connect()
    {
        try
        {
            if (_client.is_connected()) return true;

            _client.connect(_hostname, _port, [this](const std::string& host, std::size_t port, cpp_redis::client::connect_state status) {
                if (status == cpp_redis::client::connect_state::dropped) {
                    std::cout << "Connect failed..." << std::endl;
                }
            });
            if (!_client.is_connected()) return false;
        }
        catch (std::exception& e)
        {
            return false;
        }
        return true;
    }

    bool Get(const std::string& key, std::string& value, bool async = true)
    {
        if (!Connect()) return false;

        auto has_auth = _client.auth(_password);
        auto get = _client.get(key);

        if (async) {
            _client.commit();
        } else {
            _client.sync_commit(std::chrono::milliseconds(100));
        }

        if (has_auth.get().ko()) {
            return false;
        }

        auto reply = get.get();
        if (!reply.is_string()) {
            return false;
        }

        value = reply.as_string();
        return true;
    }

    bool Save(const std::string& key, const std::string& value, bool async = true)
    {
        if (!Connect()) return false;

        auto has_auth = _client.auth(_password);
        auto set = _client.set(key, value);

        if (async) {
            _client.commit();
        } else {
            _client.sync_commit(std::chrono::milliseconds(100));
        }

        if (has_auth.get().ko()) {
            return false;
        }

        auto get = set.get();
        std::string result = "ERROR";
        if (get.is_string()) result = get.as_string();

        return true;
    }

};
}

Thank you very much, and sorry for causing this problem.
@Cylix

@Eggache666

int main()
{
    cpp_redis::network::set_default_nb_workers(2);
    cpp_redis::client client;

    client.connect("172.16.2.175", 6379, [](const std::string& host, std::size_t port, cpp_redis::client::connect_state status) {
        if (status == cpp_redis::client::connect_state::dropped) {
            std::cout << "client disconnected from " << host << ":" << port << std::endl;
        }
    });

    // test write speed
    std::vector<std::string> vec_cmds;
    for (int i = 0; i < 500000; ++i) {
        char cmd[100] = { 0 };
        sprintf(cmd, "set %d %d", i, i + 1);
        vec_cmds.emplace_back(cmd);
    }

    boost::timer t;
    client.send(vec_cmds, [](cpp_redis::reply& rep) { std::cout << "send done\n"; });
    client.sync_commit();
    std::cout << t.elapsed() << std::endl;
}

I have been using this framework for a few days; it works on Windows but crashes on Linux (Ubuntu).

info:
terminate called after throwing an instance of 'cpp_redis::redis_error' what(): connect() failure

and sometimes the crash info is:
client disconnected from 172.16.2.175:6379
terminate called after throwing an instance of 'cpp_redis::redis_error'
what(): tcp_client is disconnected

How can I solve this problem? Thank you so much!

@Cylix
Owner

Cylix commented Mar 15, 2018

Hi @Eggache666,

Your issue is unrelated to this thread, so could you open a new issue instead?
I'm going to lock this issue, as it has drifted from the initial topic multiple times and creates confusion.

Concerning your problem: you may first try to connect to your server using redis-cli. If redis-cli fails to connect, double-check that your server is running and, if so, which IP/port it listens on and whether that IP is reachable with your network configuration.

Best

Repository owner locked as off-topic and limited conversation to collaborators Mar 15, 2018