Daemon wont completely shutdown #569

Closed
proletesseract opened this issue Jul 15, 2019 · 6 comments

@proletesseract
Member

proletesseract commented Jul 15, 2019

Version: Running commit 2b13cc9 of the 4.7.0 RC (#545) built locally using the depends folder.

OS: Mac OSX 10.14.5

Steps:

  • Start navcoind: run ./src/navcoind &
  • Wait for it to load.
  • Once the application is fully loaded and the RPC server is available, try to shut down the daemon using the CLI: run ./src/navcoin-cli stop

Result:

  • The RPC server shuts down; navcoin-cli commands return error: couldn't connect to server.
  • Attempting to restart navcoind with ./src/navcoind & results in Error: Cannot obtain a lock on data directory /Users/xxx/Library/Application Support/NavCoin4. NavCoin Core is probably already running.
  • The logs show that the shutdown procedure is resulting in a deadlock:
2019-07-15 21:01:20 POTENTIAL DEADLOCK DETECTED
2019-07-15 21:01:20 Previous lock order was:
2019-07-15 21:01:20  (1) pnode->cs_vSend  net.cpp:1847 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:7762 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:4564
2019-07-15 21:01:20 Current lock order is:
2019-07-15 21:01:20  pnode->cs_vRecvMsg  net.cpp:1828 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:6753
2019-07-15 21:01:20  (1) cs_vSend  net.cpp:2537
2019-07-15 21:01:20 POTENTIAL DEADLOCK DETECTED
2019-07-15 21:01:20 Previous lock order was:
2019-07-15 21:01:20  (1) pnode->cs_vSend  net.cpp:1847 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:7762 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:4564
2019-07-15 21:01:20 Current lock order is:
2019-07-15 21:01:20  pnode->cs_vRecvMsg  net.cpp:1828 (TRY)
2019-07-15 21:01:20  (2) cs_main  main.cpp:6753
2019-07-15 21:01:20  (1) cs_vSend  net.cpp:2537

The only way to stop the daemon entirely is to force-kill the process by its ID or reboot the machine.
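
The log above reports cs_vSend and cs_main being taken in opposite orders on two code paths, which is the classic pattern behind a lock-order deadlock. As a minimal sketch of that pattern and one general way to avoid it (not NavCoin's actual code or fix), the example below uses two hypothetical mutexes, `cs_main_demo` and `cs_vSend_demo`, and acquires them together with C++17 `std::scoped_lock`, which applies a deadlock-avoidance algorithm regardless of the order in which the mutexes are named. All function and variable names here are assumptions made for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks named in the log; these are
// illustration-only globals, not NavCoin's real types.
std::mutex cs_main_demo;
std::mutex cs_vSend_demo;
int shared_counter = 0;

// One code path names cs_vSend first, then cs_main.
void send_path(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // std::scoped_lock acquires both mutexes with a deadlock-avoidance
        // algorithm, so the order in which they are listed does not matter.
        std::scoped_lock lock(cs_vSend_demo, cs_main_demo);
        ++shared_counter;
    }
}

// The other code path names the same two locks in the opposite order.
void process_path(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::scoped_lock lock(cs_main_demo, cs_vSend_demo);
        ++shared_counter;
    }
}

// Runs both paths concurrently and returns the final counter value.
int run_both(int iterations) {
    assert(iterations >= 0);
    shared_counter = 0;
    std::thread a(send_path, iterations);
    std::thread b(process_path, iterations);
    a.join();
    b.join();
    return shared_counter;
}
```

If each path instead took its two locks one after the other with separate `std::lock_guard` objects in the orders shown, the two threads could each hold one lock while waiting for the other. Calling `run_both(10000)` from a test program (built with C++17 and thread support) exercises both paths at once and completes with every increment counted.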

Expected result:

navcoind fully shuts down and exits when the CLI stop command is used.

@mxaddict
Contributor

Can this behavior be replicated with the Gitian-built binary?

@proletesseract
Member Author

not sure, i can try

@mxaddict
Contributor

mxaddict commented Jul 16, 2019

After some digging I found two PRs in Bitcoin upstream that I think would do some good if we ported them into NavCoin:
bitcoin/bitcoin#9289
bitcoin/bitcoin#9441

Basically, those PRs refactor the net.cpp and net.h code to be more efficient and, from what I've read, they also eliminate the thread issue you reported in this issue.

But the amount of code changes required is a lot.

Not sure if it's worth it for this release.

The PRs I mention also come with some performance benefits; apparently they speed up network sync by a large amount. Refer to this PR from Dash for more details on the performance impact: dashpay/dash#1586

@mxaddict
Contributor

mxaddict commented Jul 16, 2019

Basically, this issue seems to have been caused by a peer that the src/net.* code had not yet handled properly, which led to a deadlock.

@mxaddict
Contributor

Since I could not replicate this on my nodes (I could shut them down gracefully), this must only happen under certain conditions, i.e. only when a peer is trying to connect while shutdown is requested on the node via the stop command.
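
To illustrate the scenario described here (a connection thread still busy when stop is requested), the sketch below shows one general technique for letting such a thread exit promptly: an interruptible sleep built on a condition variable. The `InterruptFlag` class, `connect_loop`, and `shutdown_demo` are hypothetical names invented for this example and are not taken from NavCoin's source; treat this as a sketch under those assumptions rather than the project's shutdown logic.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper (not NavCoin code): a flag that worker threads can
// sleep on and that a shutdown request can use to wake them early.
class InterruptFlag {
public:
    // Wake every waiter and make all later sleeps return immediately.
    void interrupt() {
        {
            std::lock_guard<std::mutex> lock(mut_);
            flag_ = true;
        }
        cond_.notify_all();
    }

    // Sleep for up to `d`. Returns true if the full time elapsed,
    // false if the sleep was cut short by interrupt().
    bool sleep_for(std::chrono::milliseconds d) {
        std::unique_lock<std::mutex> lock(mut_);
        return !cond_.wait_for(lock, d, [this] { return flag_; });
    }

private:
    std::mutex mut_;
    std::condition_variable cond_;
    bool flag_ = false;
};

// Simulated peer-connection loop: retries every 50 ms until interrupted.
// Returns the number of retry rounds completed before shutdown.
int connect_loop(InterruptFlag& stop) {
    int rounds = 0;
    while (stop.sleep_for(std::chrono::milliseconds(50))) {
        ++rounds;  // a real loop would attempt an outbound connection here
    }
    return rounds;
}

// Starts the loop, requests shutdown shortly afterwards, and returns the
// number of rounds the worker completed once it has exited.
int shutdown_demo() {
    InterruptFlag stop;
    int rounds = -1;
    std::thread worker([&] { rounds = connect_loop(stop); });
    std::this_thread::sleep_for(std::chrono::milliseconds(120));
    stop.interrupt();
    worker.join();  // returns promptly because the sleep is interruptible
    // After interrupt(), even a long sleep returns immediately.
    assert(!stop.sleep_for(std::chrono::seconds(10)));
    return rounds;
}
```

The design choice is that the worker never blocks in an uninterruptible sleep, so the thread joining it during shutdown is not left waiting on a peer-connection retry.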

@mxaddict
Contributor

mxaddict commented Feb 4, 2022

Closing for now as I think this may have been an isolated issue on the test system

@mxaddict mxaddict closed this as completed Feb 4, 2022