Skip to content
This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Client doesn't get new blocks after reload and is stuck in "Releasing unqued broadcasts"-loops #2558

Closed
punkr0ck opened this issue Nov 19, 2018 · 6 comments
Assignees

Comments

@punkr0ck
Copy link

punkr0ck commented Nov 19, 2018

Expected behavior

After a reload Lisk Core should run as normal and receive blocks again after seconds or at least a minute or so.

Actual behavior

After a reload it can happen, that Lisk Core only shows Releasing unqued broadcasts followed by a Queue empty and this for a long time (I had it three times for 30-40 minutes). This happens not always but for me it happened every 4-5 reloads. Tested on two differnt VPS' on the same provider. Sometimes new blocks come in 10 seconds after a reloads and the other time I don't get new blocks for 40 minutes. Even several reloads seem not to fix this. I'm literally not able to get the client to receive new blocks. While I am writing this it needed six reloads to bring it back working normally. See this example of the issue and when it finally worked: https://pastebin.com/GevReWqe

Again: It's not happening after every reload. Some reloads work fine and the client back in track in seconds, like this example shows: https://pastebin.com/eqV8s2wn

Steps to reproduce

Reload the client several times. I am using this command: bash lisk.sh reload

Which version(s) does this affect? (Environment, OS, etc...)

I noticed this since 1.3.0-rc.0 on testnet and reproduced it on two testnet server. Not tested on mainnet.

@diego-G
Copy link

diego-G commented Dec 14, 2018

We're not able to reproduce it. @punkr0ck could you tell us if you've seen again this issue? Please, send us debug level logs.

@punkr0ck
Copy link
Author

I was able to reproduce it right away on testnet with latest version 1.3.1-rc.0.
See the debug-logs: https://pastebin.com/YjSpq9b2

After 10 minutes I received my first block after the restart:
[inf] 2018-12-14 16:52:40 | Received new block id: 13769382669840184057 height: 6980775 round: 69117 slot: 8069716 reward: 300000000

So, it might be a performance issue but as in the starting post explained it happened that I didn't get new blocks within 40 minutes. It seems to vary.

@diego-G
Copy link

diego-G commented Dec 18, 2018

@punkr0ck do you use any custom configuration? May you please paste it here?

@diego-G
Copy link

diego-G commented Dec 19, 2018

Reading the logs, I tend to relate this issue to the sync process because it's not being triggered. Two PRs recently cleaned, fix and improve the synchronisation process. They were merged into 1.4.0-rc.0. Please @punkr0ck , try to reproduce it upgrading your node to the new release candidate. If you are able to reproduce it, reopened this issue.
Close by #2555 & #2607

@diego-G diego-G closed this as completed Dec 19, 2018
@punkr0ck
Copy link
Author

punkr0ck commented Dec 20, 2018

I reproduced it again after the first reload with teh latest version 1.4.0-rc.0: https://pastebin.com/MN10Y78q
No new block after reload for 5 minutes now.

[inf] 2018-12-19 13:24:45 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:24:45 | Queue empty
[inf] 2018-12-19 13:24:50 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:24:50 | Queue empty
[inf] 2018-12-19 13:24:55 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:24:55 | Queue empty
[WRN] 2018-12-19 13:25:00 | Discarded block that does not match with current chain: 7768623374994220208 height: 7022567 round: 69531 slot: 8111670 generator: a6d8d164261a16be65b06325f5f2cc0cfc9562b1abc264939ad80ebad5401452
[WRN] 2018-12-19 13:25:00 | Discarded block that does not match with current chain: 7768623374994220208 height: 7022567 round: 69531 slot: 8111670 generator: a6d8d164261a16be65b06325f5f2cc0cfc9562b1abc264939ad80ebad5401452
[inf] 2018-12-19 13:25:00 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:25:00 | Queue empty
[inf] 2018-12-19 13:25:04 | INSERT peer - 149.28.241.114:7001 success
[inf] 2018-12-19 13:25:05 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:25:05 | Queue empty
[inf] 2018-12-19 13:25:09 | INSERT peer - 87.98.234.61:7001 success
[inf] 2018-12-19 13:25:10 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:25:10 | Queue empty
[WRN] 2018-12-19 13:25:10 | Discarded block that does not match with current chain: 10245240082677178737 height: 7022568 round: 69531 slot: 8111671 generator: 8100bb532f85183cbf6e99f9fcc5893f45536fbc9dbf7a80cfe1c1e165203ec7
[inf] 2018-12-19 13:25:15 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:25:15 | Queue empty
[inf] 2018-12-19 13:25:20 | Releasing enqueued broadcasts
[inf] 2018-12-19 13:25:20 | Queue empty
[WRN] 2018-12-19 13:25:20 | Discarded block that does not match with current chain: 15782196282065557493 height: 7022569 round: 69531 slot: 8111672 generator: 3697a4f8c74cb21949eec31fddde190c16ab2497709fb503c567d3a9e6a6e989
[WRN] 2018-12-19 13:25:20 | Discarded block that does not match with current chain: 15782196282065557493 height: 7022569 round: 69531 slot: 8111672 generator: 3697a4f8c74cb21949eec31fddde190c16ab2497709fb503c567d3a9e6a6e989

After 9 minutes I got the first new block. I guess it's really a problem with the VPS performance but it's just a guess.

@punkr0ck
Copy link
Author

As you can see, everything is peaked while realoading Lisk Core. After it got new blocks the stats are back to normal. Node version is 8.14.0. Lisk version is 1.4.0 rc.0.

System: Ubuntu 16.04, 1 vCore, 1 GB RAM

grafik

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants