Fixed reconnection issue in the dev cluster with AWS cluster #1915

xgreenx · 2024-05-29T22:19:55Z

This change improves and simplifies some aspects of the P2P service. It also fixes the failure to reconnect to a reserved node after that node restarts and gets a new IP.

- The change moves the reconnection handling into the `PeerReport` behavior. It stops reserved peers from being passed back and forth between the primary behavior and the `PeerReport` behavior, and encapsulates the logic inside `PeerReport`. It also replaces the timer with a queue of reconnections, which reduces noise in the logs (previously there were many more spurious errors).
- Added logs for cases when the dial fails. They are very helpful for debugging connection issues.
- Simplified initialization of the `ConnectionTracker` and `FuelAuthenticated`. This allows reuse of libp2p's built-in connection builder.
- Removed the usage of Mplex since it doesn't have a backpressure mechanism. Yamux is now the default. This is a breaking change, since nodes running the old codebase can't connect to new nodes.
- Propagated `max_concurrent_streams` for the request-response protocol.
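To illustrate the "queue of reconnections" idea from the first bullet, here is a minimal sketch using only the standard library. It is not the code from this PR: `ReconnectQueue`, its methods, and the `u64` stand-in for a peer ID are hypothetical names chosen for illustration. The only assumption taken from the diff is that a disconnected reserved peer waits in a queue until a fixed interval has elapsed.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a libp2p peer identifier.
type PeerId = u64;

/// Hypothetical queue of reserved peers waiting to be re-dialed.
struct ReconnectQueue {
    /// How long a peer must wait in the queue before it is dialed again.
    delay: Duration,
    /// Peers in disconnection order, each with the time it was queued.
    pending: VecDeque<(PeerId, Instant)>,
}

impl ReconnectQueue {
    fn new(delay: Duration) -> Self {
        Self { delay, pending: VecDeque::new() }
    }

    /// Queue a peer once; repeated disconnect events do not add duplicates.
    fn on_disconnected(&mut self, peer: PeerId, now: Instant) {
        if !self.pending.iter().any(|(queued, _)| *queued == peer) {
            self.pending.push_back((peer, now));
        }
    }

    /// Return the front peer if it has waited longer than `delay`.
    /// Only the front is inspected because entries are queued in time order.
    fn poll(&mut self, now: Instant) -> Option<PeerId> {
        let (peer, queued_at) = *self.pending.front()?;
        if now.saturating_duration_since(queued_at) > self.delay {
            self.pending.pop_front();
            Some(peer)
        } else {
            None
        }
    }
}
```

Compared with a periodic timer that retries every reserved peer, a queue like this produces a dial attempt only for peers that are actually disconnected, which is one plausible reason the logs become quieter.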

Checklist

Breaking changes are clearly marked as such in the PR description and changelog

Before requesting review

I have reviewed the code myself



xgreenx · 2024-05-29T22:21:32Z

crates/services/p2p/src/p2p_service.rs

+            .with_tcp(
+                tcp_config,
+                transport_function,
+                libp2p::yamux::Config::default,


Breaking change: We don't support Mplex anymore.

xgreenx · 2024-05-29T22:30:00Z

crates/services/p2p/src/peer_report.rs

+            if instant.elapsed() > Duration::from_secs(HEALTH_CHECK_INTERVAL_IN_SECONDS) {
+                let peer_id = *peer_id;
+                self.reserved_nodes_to_connect.pop_front();
+                let multiaddrs = self


This is the actual fix for the main reconnection problem that I faced with the dev cluster. The initial DNS address was replaced with the resolved IP, but after the sentry crashed it came back with a different IP. Using the initial multiaddrs here lets the node reconnect and pick up the new IP.

We should probably have a comment somewhere here explaining this.
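As a rough illustration of the point above, the sketch below keeps the configured (DNS-based) addresses separate from the last resolved address and always re-dials with the configured ones. It is not the code from `peer_report.rs`: `ReservedPeers`, its fields and methods, the `u64` peer ID, and the example host name are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a libp2p peer identifier.
type PeerId = u64;

/// Hypothetical bookkeeping for reserved peers.
struct ReservedPeers {
    /// Addresses exactly as configured (for example DNS-based); never overwritten.
    initial: HashMap<PeerId, Vec<String>>,
    /// Last concrete address observed on a live connection.
    last_resolved: HashMap<PeerId, String>,
}

impl ReservedPeers {
    fn new(configured: Vec<(PeerId, Vec<String>)>) -> Self {
        Self {
            initial: configured.into_iter().collect(),
            last_resolved: HashMap::new(),
        }
    }

    /// Record the resolved address without touching the configured ones.
    fn on_connected(&mut self, peer: PeerId, resolved: &str) {
        self.last_resolved.insert(peer, resolved.to_string());
    }

    /// Addresses to use when re-dialing: always the configured ones, so a
    /// DNS name is resolved again and can point at a new IP.
    fn redial_addresses(&self, peer: PeerId) -> &[String] {
        self.initial.get(&peer).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

If the re-dial used `last_resolved` instead, a restarted node with a new IP would never be reached, which matches the failure described in the comment.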


## Version v0.28.0

### Changed

- [#1934](#1934): Updated benchmark for the `aloc` opcode to be `DependentCost`. Updated `vm_initialization` benchmark to exclude growing of memory (it is handled by VM reuse).
- [#1916](#1916): Speed up synchronisation of the blocks for the `fuel-core-sync` service.
- [#1888](#1888): optimization: Reuse VM memory across executions.

#### Breaking

- [#1934](#1934): Changed `GasCosts` endpoint to return `DependentCost` for the `aloc` opcode via `alocDependentCost`.
- [#1934](#1934): Updated default gas costs for the local testnet configuration. All opcodes became cheaper.
- [#1924](#1924): `dry_run_opt` has new `gas_price: Option<u64>` argument
- [#1888](#1888): Upgraded `fuel-vm` to `0.51.0`. See [release](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.51.0) for more information.

### Added

- [#1939](#1939): Added API functions to open a RocksDB in different modes.
- [#1929](#1929): Added support of customization of the state transition version in the `ChainConfig`.

### Removed

- [#1913](#1913): Removed dead code from the project.

### Fixed

- [#1921](#1921): Fixed unstable `gossipsub_broadcast_tx_with_accept` test.
- [#1915](#1915): Fixed reconnection issue in the dev cluster with AWS cluster.
- [#1914](#1914): Fixed halting of the node during synchronization in PoA service.

## What's Changed

* Removed dead code by @xgreenx in #1913
* Added backward and forward compatibility integration tests for forkless upgrades by @xgreenx in #1895
* Fixed halting of the node in rare conditions by @xgreenx in #1914
* Weekly `cargo update` by @github-actions in #1928
* Fixed logging of the WASM executor by @xgreenx in #1930
* Added support of customization of the state transition version in the `ChainConfig` by @xgreenx in #1929
* Document wasm toolchain installation, add rust-toolchain.toml by @Dentosal in #1932
* Add optional `gas_price` argument to `dry_run_opt` by @hal3e in #1924
* Reuse VM memory across executions by @Dentosal in #1888
* Fixed reconnection issue in the dev cluster with AWS cluster by @xgreenx in #1915
* Speeds up synchronisation of the blocks for the `fuel-core-sync` service by @xgreenx in #1916
* Fixed unstable `gossipsub_broadcast_tx_with_accept` test by @xgreenx in #1921
* Added API functions to open a RocksDB in different modes by @xgreenx in #1939
* Use `DependentCost` for `aloc` opcode by @xgreenx in #1934

## New Contributors

* @hal3e made their first contribution in #1924

**Full Changelog**: v0.27.0...v0.28.0

xgreenx added 5 commits May 29, 2024 22:50

Removed dead code

59bd7dc

Updated CHANGELOG.md

5ba7821

Fixed halting of the node in rare conditions

cfc3029

Updated CHANGELOG.md

e27ac53

xgreenx added the `breaking` label (A breaking api change) May 29, 2024

xgreenx requested a review from a team May 29, 2024 22:19

xgreenx self-assigned this May 29, 2024

xgreenx and others added 3 commits May 30, 2024 00:31

Updated CHANGELOG.md

fb1b8fd

Merge branch 'master' into feature/fixed-dead-lock

22acbaa

Merge branch 'feature/fixed-dead-lock' into feature/fixed-p2p-reconnection-issue

54c9852


xgreenx added 2 commits May 30, 2024 00:59

Merge branch 'master' into feature/fixed-dead-lock

a9a1533

Merge branch 'feature/fixed-dead-lock' into feature/fixed-p2p-reconnection-issue

7df6235

Base automatically changed from feature/fixed-dead-lock to master May 30, 2024 07:47

xgreenx and others added 6 commits May 30, 2024 09:48

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

47b5c07

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

6b8e671

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

324be3f

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

e9da8b9

Added comments

54d4fd5

Merge remote-tracking branch 'origin/feature/fixed-p2p-reconnection-issue' into feature/fixed-p2p-reconnection-issue

290fe5f

xgreenx requested a review from Dentosal June 3, 2024 11:56

xgreenx and others added 3 commits June 3, 2024 15:18

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

94fc5b3

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

664738c

Merge branch 'master' into feature/fixed-p2p-reconnection-issue

c8047e9

Dentosal approved these changes Jun 4, 2024


xgreenx enabled auto-merge (squash) June 4, 2024 18:49

xgreenx merged commit 0a1d591 into master Jun 4, 2024
28 checks passed

xgreenx deleted the feature/fixed-p2p-reconnection-issue branch June 4, 2024 19:12

xgreenx mentioned this pull request Jun 6, 2024

Release v0.28.0 #1945

Merged
