Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Block sync silently stalls on Schlesi #1740

Closed
q9f opened this issue May 8, 2020 · 10 comments · Fixed by #1779
Closed

Block sync silently stalls on Schlesi #1740

q9f opened this issue May 8, 2020 · 10 comments · Fixed by #1779

Comments

@q9f
Copy link
Contributor

q9f commented May 8, 2020

Description

As an User, I want to sync Schlesi.

Steps to Reproduce (Bug)

I have a hunch that this might be similar to sigp/lighthouse#949

Because around 5 am my local IP is changed by my ISP.

Expected behavior: [What you expect to happen]
It should behave. It should not stop receiving new blocks. It should maybe do some checks on the connections, I have no clue what it could be.

Actual behavior: [What actually happens]
Block processing silently stalls:

05:17:04.205 INFO  - Slot Event *** Slot: 77185, Block: 0x53c4..68a6, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:17:08.139 INFO  - New AggregateAndProof with block root:  0x53c4581ca1d52e0bc31ebae190dc6a66d756fb2fb9c170af0e45b4a3a38e68a6 detected.
05:17:08.199 INFO  - New AggregateAndProof with block root:  0x53c4581ca1d52e0bc31ebae190dc6a66d756fb2fb9c170af0e45b4a3a38e68a6 detected.
05:17:16.205 INFO  - Slot Event *** Slot: 77186, Block: 0xcc13..4213, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:17:20.137 INFO  - New AggregateAndProof with block root:  0xcc1396855b0763d56403ea33d7a219faf84af8119d4e8490c5e89b79ad894213 detected.
05:17:28.206 INFO  - Slot Event *** Slot: 77187, Block: 0xdf75..f4b5, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:17:32.127 INFO  - New AggregateAndProof with block root:  0xdf7539104ae2cb5968ad15a0c13489f147bf42a7ad8973831d9dff9d2ac8f4b5 detected.
05:17:32.187 INFO  - New AggregateAndProof with block root:  0xdf7539104ae2cb5968ad15a0c13489f147bf42a7ad8973831d9dff9d2ac8f4b5 detected.
05:17:40.205 INFO  - Slot Event *** Slot: 77188, Block: 0xbbf4..908c, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:17:44.158 INFO  - New AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.
05:17:44.219 INFO  - New AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.
05:17:52.205 INFO  - Slot Event *** Slot: 77189, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:17:56.133 INFO  - New AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.
05:17:56.192 INFO  - New AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.
05:18:04.205 INFO  - Slot Event *** Slot: 77190, Block: 0xf7c7..47f9, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:18:08.149 INFO  - New AggregateAndProof with block root:  0xf7c7a7cb42a838a6016fb9ecf643c3fbe37e00281c30c416def73eae758b47f9 detected.
05:18:08.210 INFO  - New AggregateAndProof with block root:  0xf7c7a7cb42a838a6016fb9ecf643c3fbe37e00281c30c416def73eae758b47f9 detected.
05:18:16.205 INFO  - Slot Event *** Slot: 77191, Block: 0xa935..b676, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:18:20.131 INFO  - New AggregateAndProof with block root:  0xa93578f7fa37de590aabf0ad0816c993ad8f3f03e04304f350629879a061b676 detected.
05:18:20.335 INFO  - New AggregateAndProof with block root:  0xa93578f7fa37de590aabf0ad0816c993ad8f3f03e04304f350629879a061b676 detected.
05:18:28.205 INFO  - Slot Event *** Slot: 77192, Block: 0x3771..bb7c, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:18:32.160 INFO  - New AggregateAndProof with block root:  0x377127b2fa24d2133a0cd49412e49510af53ba0d3f68f6d2ece18d53aa34bb7c detected.
05:18:32.267 INFO  - New AggregateAndProof with block root:  0x377127b2fa24d2133a0cd49412e49510af53ba0d3f68f6d2ece18d53aa34bb7c detected.
05:18:40.205 INFO  - Slot Event *** Slot: 77193, Block: 0x1d8f..42e9, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:18:44.152 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:18:44.215 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:18:44.355 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:18:44.417 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:18:52.206 INFO  - Slot Event *** Slot: 77194, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:18:56.127 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:18:56.187 INFO  - New AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.
05:19:04.206 INFO  - Slot Event *** Slot: 77195, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:19:16.206 INFO  - Slot Event *** Slot: 77196, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:19:28.205 INFO  - Slot Event *** Slot: 77197, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:19:40.205 INFO  - Slot Event *** Slot: 77198, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:19:52.205 INFO  - Slot Event *** Slot: 77199, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:20:04.205 INFO  - Slot Event *** Slot: 77200, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:20:16.205 INFO  - Slot Event *** Slot: 77201, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:20:28.205 INFO  - Slot Event *** Slot: 77202, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:20:40.205 INFO  - Slot Event *** Slot: 77203, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:20:52.206 INFO  - Slot Event *** Slot: 77204, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:21:04.205 INFO  - Slot Event *** Slot: 77205, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:21:16.205 INFO  - Slot Event *** Slot: 77206, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:21:28.205 INFO  - Slot Event *** Slot: 77207, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:21:40.205 INFO  - Slot Event *** Slot: 77208, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:21:52.205 INFO  - Slot Event *** Slot: 77209, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:22:04.206 INFO  - Slot Event *** Slot: 77210, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:22:16.206 INFO  - Slot Event *** Slot: 77211, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:22:28.205 INFO  - Slot Event *** Slot: 77212, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:22:40.205 INFO  - Slot Event *** Slot: 77213, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:22:52.205 INFO  - Slot Event *** Slot: 77214, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:23:04.205 INFO  - Slot Event *** Slot: 77215, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:23:12.205 INFO  - Epoch Event *** Current epoch: 2413, Justified epoch: 2411, Finalized epoch: 2410, Finalized root: 0x08a2..5e03
05:23:16.206 INFO  - Slot Event *** Slot: 77216, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:23:28.205 INFO  - Slot Event *** Slot: 77217, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:23:40.205 INFO  - Slot Event *** Slot: 77218, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:23:52.205 INFO  - Slot Event *** Slot: 77219, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:24:04.206 INFO  - Slot Event *** Slot: 77220, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:24:16.206 INFO  - Slot Event *** Slot: 77221, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:24:28.205 INFO  - Slot Event *** Slot: 77222, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:24:40.205 INFO  - Slot Event *** Slot: 77223, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:24:52.206 INFO  - Slot Event *** Slot: 77224, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:25:04.205 INFO  - Slot Event *** Slot: 77225, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:25:16.205 INFO  - Slot Event *** Slot: 77226, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:25:28.205 INFO  - Slot Event *** Slot: 77227, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:25:40.206 INFO  - Slot Event *** Slot: 77228, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:25:52.206 INFO  - Slot Event *** Slot: 77229, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:26:04.205 INFO  - Slot Event *** Slot: 77230, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:26:16.205 INFO  - Slot Event *** Slot: 77231, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:26:28.205 INFO  - Slot Event *** Slot: 77232, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:26:40.205 INFO  - Slot Event *** Slot: 77233, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:26:52.205 INFO  - Slot Event *** Slot: 77234, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:27:04.205 INFO  - Slot Event *** Slot: 77235, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:27:16.206 INFO  - Slot Event *** Slot: 77236, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:27:28.205 INFO  - Slot Event *** Slot: 77237, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:27:40.205 INFO  - Slot Event *** Slot: 77238, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4
05:27:52.205 INFO  - Slot Event *** Slot: 77239, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4

Frequency: [What percentage of the time does it occur?]
Seen this once

Versions (Add all that apply)

  • Software version: master @ ef496fe
  • Java version: 13.0.2
  • OS Name & Version: Arch Linux
  • Kernel Version: 5.6.5-arch3-1
@ajsutton
Copy link
Contributor

ajsutton commented May 8, 2020

Could you please check the teku.log file and see if there's an OutOfMemoryException logged? It could well be network related and we stopped receiving gossip but I'm also hunting down a memory leak and in the cases I've seen it hit, we don't sync anymore blocks - not sure what if it could also cause that when the node is in sync.

@q9f
Copy link
Contributor Author

q9f commented May 8, 2020

There is nothing in the log either.

2020-05-08 05:17:49.038+02:00 | nioEventLoopGroup-3-40 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:17:52.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77189, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:17:56.133+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.ESC[0m
2020-05-08 05:17:56.192+02:00 | events-29 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xbbf47b0de224c250863ba031c3b958dc917b593b4ce1d521e100da8623c9908c detected.ESC[0m
2020-05-08 05:18:04.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77190, Block: 0xf7c7..47f9, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:18:08.149+02:00 | events-29 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xf7c7a7cb42a838a6016fb9ecf643c3fbe37e00281c30c416def73eae758b47f9 detected.ESC[0m
2020-05-08 05:18:08.210+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xf7c7a7cb42a838a6016fb9ecf643c3fbe37e00281c30c416def73eae758b47f9 detected.ESC[0m
2020-05-08 05:18:09.046+02:00 | nioEventLoopGroup-3-41 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:18:16.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77191, Block: 0xa935..b676, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:18:20.131+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xa93578f7fa37de590aabf0ad0816c993ad8f3f03e04304f350629879a061b676 detected.ESC[0m
2020-05-08 05:18:20.335+02:00 | events-28 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0xa93578f7fa37de590aabf0ad0816c993ad8f3f03e04304f350629879a061b676 detected.ESC[0m
2020-05-08 05:18:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77192, Block: 0x3771..bb7c, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:18:29.053+02:00 | nioEventLoopGroup-3-42 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:18:32.160+02:00 | events-28 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x377127b2fa24d2133a0cd49412e49510af53ba0d3f68f6d2ece18d53aa34bb7c detected.ESC[0m
2020-05-08 05:18:32.267+02:00 | events-29 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x377127b2fa24d2133a0cd49412e49510af53ba0d3f68f6d2ece18d53aa34bb7c detected.ESC[0m
2020-05-08 05:18:40.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77193, Block: 0x1d8f..42e9, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:18:44.152+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:18:44.215+02:00 | events-28 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:18:44.355+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:18:44.417+02:00 | events-28 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:18:49.059+02:00 | nioEventLoopGroup-3-43 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:18:52.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77194, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:18:56.127+02:00 | events-27 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:18:56.187+02:00 | events-28 | INFO  | teku-event-log | ESC[34mNew AggregateAndProof with block root:  0x1d8fde3613c26d581444ac609fbb00c039be8e3bab34bb908a3e7b74de2442e9 detected.ESC[0m
2020-05-08 05:19:04.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77195, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:19:09.066+02:00 | nioEventLoopGroup-3-44 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:19:16.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77196, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:19:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77197, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:19:29.074+02:00 | nioEventLoopGroup-3-45 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:19:40.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77198, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:19:49.081+02:00 | nioEventLoopGroup-3-46 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:19:52.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77199, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:20:04.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77200, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:20:09.088+02:00 | nioEventLoopGroup-3-47 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:20:16.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77201, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:20:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77202, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:20:29.095+02:00 | nioEventLoopGroup-3-48 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:20:40.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77203, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:20:49.103+02:00 | nioEventLoopGroup-3-49 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:20:52.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77204, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:21:04.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77205, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:21:09.110+02:00 | nioEventLoopGroup-3-50 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:21:16.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77206, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:21:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77207, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:21:29.117+02:00 | nioEventLoopGroup-3-51 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:21:40.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77208, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:21:49.124+02:00 | nioEventLoopGroup-3-52 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:21:52.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77209, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:22:04.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77210, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:22:09.131+02:00 | nioEventLoopGroup-3-53 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:22:16.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77211, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:22:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77212, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:22:29.138+02:00 | nioEventLoopGroup-3-54 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:22:40.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77213, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:22:49.145+02:00 | nioEventLoopGroup-3-55 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:22:52.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77214, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:23:04.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77215, Block:  x ... empty, Epoch: 2412, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:23:09.153+02:00 | nioEventLoopGroup-3-56 | ERROR | NoiseXXSecureChannel | Protocol message contained an invalid tag (zero).
2020-05-08 05:23:12.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[33mEpoch Event *** Current epoch: 2413, Justified epoch: 2411, Finalized epoch: 2410, Finalized root: 0x08a2..5e03ESC[0m
2020-05-08 05:23:16.206+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77216, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m
2020-05-08 05:23:28.205+02:00 | TimeTickChannel-0 | INFO  | teku-event-log | ESC[37mSlot Event *** Slot: 77217, Block:  x ... empty, Epoch: 2413, Finalized Epoch: 2410, Finalized Root: 0x08a2..5e03, Peers: 4ESC[0m

Notably, a restart allowed the node to recover and sync.

@ajsutton
Copy link
Contributor

ajsutton commented May 9, 2020

I believe I've reproduced this. Teku was running inside docker on my Mac when I disconnected the ethernet cable so networking had to switch to wifi. In this case there were a number of error messages in the log like:

2020-05-09 22:51:56.429+00:00 | nioEventLoopGroup-3-2 | ERROR | SafeFuture | Unhandled exception
io.netty.channel.ExtendedClosedChannelException: null
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown Source) ~[netty-all-4.1.36.Final
.jar:4.1.36.Final]

@Nashatyrev any chance you could look into this from the jvm-libp2p side and see if it's not detecting peers disconnecting because of external network changes?

@Nashatyrev
Copy link
Contributor

@ajsutton Ok, will try to check that on libp2p level. 👍

@Nashatyrev
Copy link
Contributor

One potential reason is that in similar cases TCP connection might not be closed on OS level (TCP FIN packet is not received) and hang for some long period. That's generally why on the application level some kind of PING is periodically sent and the connection is terminated when no data was received for some period ('read timeout').

@ajsutton @mbaxter from the first glance I couldn't find any matching code. Is this still to be done?

@mbaxter
Copy link
Contributor

mbaxter commented May 12, 2020

@Nashatyrev - right, this isn't implemented yet. I can add a ticket to implement this functionality using the new PING rpc method.

@ajsutton
Copy link
Contributor

We now periodically send PING (I think) so if we disconnect the peer if we don't get a response that should be all we need (may want to provide some leniency to only disconnect if 2 or 3 responses are missed in a row though)

@q9f
Copy link
Contributor Author

q9f commented May 13, 2020

Now it's getting more complicated, I keep losing peers on Schlesi:

13:46:16.235 INFO  - Slot Event *** Slot: 115731, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:46:28.235 INFO  - Slot Event *** Slot: 115732, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:46:40.236 INFO  - Slot Event *** Slot: 115733, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:46:52.236 INFO  - Slot Event *** Slot: 115734, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:47:04.235 INFO  - Slot Event *** Slot: 115735, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:47:16.235 INFO  - Slot Event *** Slot: 115736, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:47:28.236 INFO  - Slot Event *** Slot: 115737, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:47:40.236 INFO  - Slot Event *** Slot: 115738, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:47:52.234 INFO  - Slot Event *** Slot: 115739, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:48:04.235 INFO  - Slot Event *** Slot: 115740, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:48:16.235 INFO  - Slot Event *** Slot: 115741, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:48:28.236 INFO  - Slot Event *** Slot: 115742, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:48:40.235 INFO  - Slot Event *** Slot: 115743, Block:    ... empty, Epoch: 3616, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 3
13:48:48.234 INFO  - Epoch Event *** Epoch: 3617, Justified checkpoint: 3612, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f
13:48:52.236 INFO  - Slot Event *** Slot: 115744, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:49:04.235 INFO  - Slot Event *** Slot: 115745, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3611, Finalized root: 0x38ff..8e3f, Peers: 0
13:49:12.235 INFO  - Sync Event *** Current slot: 115746, Head slot: 115616, Connected peers: 3
13:49:24.246 INFO  - Sync Event *** Current slot: 115747, Head slot: 115652, Connected peers: 0
13:49:40.236 INFO  - Slot Event *** Slot: 115748, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 3
13:49:52.238 INFO  - Slot Event *** Slot: 115749, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 3
13:50:04.236 INFO  - Slot Event *** Slot: 115750, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 0
13:50:16.237 INFO  - Slot Event *** Slot: 115751, Block: 0x3874..7766, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 3
13:50:28.235 INFO  - Slot Event *** Slot: 115752, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 0
13:50:40.235 INFO  - Slot Event *** Slot: 115753, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 3
13:50:52.234 INFO  - Slot Event *** Slot: 115754, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 3
13:51:04.234 INFO  - Slot Event *** Slot: 115755, Block:    ... empty, Epoch: 3617, Finalized checkpoint: 3615, Finalized root: 0x8101..382b, Peers: 0

Check the peercount. Also, it barely receives blocks in real-time.

Edit: master @ 2c03585

@q9f
Copy link
Contributor Author

q9f commented May 13, 2020

One of my bootnodes can't synchronize anymore.

14:07:45.976 INFO  - Initializing storage                                                                                                                                                                                                                                                                           [36/1946]
14:08:51.937 INFO  - Storage initialization complete                                                                                                          
14:08:57.056 INFO  - Listening for connections on: /ip4/51.15.70.7/tcp/9500/p2p/16Uiu2HAmQ7fRM64ddh1hLfwiXhyB9EWs9RpCpSdoesL9tcHCHGZv                         
14:08:58.433 INFO  - Local ENR: enr:-KG4QO5d0SAxNIKqk5Q2GL0pUT5uZ6lybRxAjeK6RY09H7gNW1AllL844mi21sNIe_8HXqgIcp5gEnahNyqDcgv1mi0ChGV0aDKQmSXv1gAAAAD__________4JpZIJ2NIJpcIQzD0YHiXNlY3AyNTZrMaEDqkXVcXWMknCjh6ynMHNWHHDsBxRS7S8o3k5UmNgbcQGDdGNwgiUcg3VkcIIlHA                                                               
14:09:05.143 INFO  - Slot Event *** Slot: 115844, Block:    ... empty, Epoch: 3620, Finalized checkpoint: 3236, Finalized root: 0xfc0d..3910, Peers: 2        
14:09:09.433 INFO  - Sync Event *** Current slot: 115845, Head slot: 103644, Connected peers: 2                                                               
14:09:12.496 INFO  - Sync Event *** Current slot: 115846, Head slot: 103654, Connected peers: 2                                                               
14:09:24.356 INFO  - Sync Event *** Current slot: 115847, Head slot: 103710, Connected peers: 2
14:09:36.215 INFO  - Sync Event *** Current slot: 115848, Head slot: 103734, Connected peers: 8   
14:09:48.213 INFO  - Sync Event *** Current slot: 115849, Head slot: 103736, Connected peers: 4                                                               
14:10:00.223 INFO  - Sync Event *** Current slot: 115850, Head slot: 103736, Connected peers: 4                                                               
14:10:12.213 INFO  - Sync Event *** Current slot: 115851, Head slot: 103741, Connected peers: 3                                                               
14:10:24.215 INFO  - Sync Event *** Current slot: 115852, Head slot: 103741, Connected peers: 3    

[...]

14:15:00.213 INFO  - Sync Event *** Current slot: 115875, Head slot: 103741, Connected peers: 3
14:15:12.213 INFO  - Sync Event *** Current slot: 115876, Head slot: 103741, Connected peers: 3
14:15:24.220 INFO  - Sync Event *** Current slot: 115877, Head slot: 103741, Connected peers: 3
14:15:36.214 INFO  - Sync Event *** Current slot: 115878, Head slot: 103741, Connected peers: 3
14:15:48.213 INFO  - Sync Event *** Current slot: 115879, Head slot: 103741, Connected peers: 3
14:16:00.214 INFO  - Sync Event *** Current slot: 115880, Head slot: 103741, Connected peers: 3
14:16:12.214 INFO  - Sync Event *** Current slot: 115881, Head slot: 103741, Connected peers: 3
14:16:24.213 INFO  - Sync Event *** Current slot: 115882, Head slot: 103741, Connected peers: 3
14:16:36.213 INFO  - Sync Event *** Current slot: 115883, Head slot: 103741, Connected peers: 3
14:16:48.213 INFO  - Sync Event *** Current slot: 115884, Head slot: 103741, Connected peers: 3
14:17:00.216 INFO  - Sync Event *** Current slot: 115885, Head slot: 103741, Connected peers: 3
14:17:12.216 INFO  - Sync Event *** Current slot: 115886, Head slot: 103741, Connected peers: 3
14:17:24.214 INFO  - Sync Event *** Current slot: 115887, Head slot: 103741, Connected peers: 3

@q9f
Copy link
Contributor Author

q9f commented May 15, 2020

Thanks, my nodes recovered after applying #1779 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants