Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Network debug page assumes that debug RPC port is 3030 #8583

Closed
saketh-are opened this issue Feb 16, 2023 · 3 comments
Closed

Network debug page assumes that debug RPC port is 3030 #8583

saketh-are opened this issue Feb 16, 2023 · 3 comments

Comments

@saketh-are
Copy link
Collaborator

saketh-are commented Feb 16, 2023

The NetworkInfo debug page assumes that the debug RPC port for connected peers is always 3030, and that the RPC address is the same as the node's network address:

href: "http://" + peer_addr.replace(/:.*/, ":3030/debug"),

Localnet nodes spawned using nearup work differently (link):

data['rpc']['addr'] = f'0.0.0.0:{3030 + num_nodes}'
data['network']['addr'] = f'0.0.0.0:{24567 + num_nodes}'

The NetworkInfo page should be modified to handle localnet setups correctly.

This new debug UI will need to be updated as well:

href={"http://localhost:3000/" + peer_addr.replace(/:.*/, "/")}>

@ghost
Copy link

ghost commented Mar 31, 2023

The main issue is that the href (link that appears once you click on it), will always go to 3030/debug instead of the proper values of 3031/debug, 3032/debug etc.

@ghost ghost self-assigned this Apr 1, 2023
@ghost
Copy link

ghost commented Apr 4, 2023

20230404: Debug Page RPC Port

Overview

Github Issue

Code Commit

Google Doc

Problem Statement

Debug pages work by a separate web server making a RPC call to the node it is on. However, it is currently hard-coding the RPC address to 3030, which does not work for localnet. As a result, developers have to manually type the RPC address to view the debug pages from each node’s perspective of debugging information in localnet.

Background: Network Address and RPC Address

There are currently 2 types of addresses

  • Network address
    • These are the socket address used by each node in the network to communicate with each other for Near Protocol to run
  • RPC address
    • These are the socket address used by clients of Near Protocol to use Near’s API

Code assumes each node always uses port numbers:

  • 3030 as the rpc address
  • 24567 as the network address

However, this assumption is only practical if each node runs on a separate computer as they do not share port number space.

For localnet, we run multiple processes on the same computer, hence, the ports cannot all be 3030 or 24567 as it will result in a clash with other processes. Each node on localnet set up their addresses on different ports with a simple increment:

  • RPC address
    • 3030 + nodeNumber, where nodeNumber = {0, 1, 2, ..., number of Nodes - 1)
  • Network address
    • 24567 + nodeNumber, where nodeNumber = {0, 1, 2, ..., number of Nodes - 1)

Technical Challenges

Lack of peer’s RPC address

Detect other node’s RPC: The main issue here is there’s currently no information being sent between nodes on their rpc address as they only share their network addresses

Each node only contains minimal information from other nodes (to reduce network bandwidth)

  • peer identifier (uniqueness guaranteed based on monotonically incrementing counter)
  • network address
  • near account identifier (uniqueness guaranteed during account creation)

Possible Solutions

Feel free to skip and jump straight to the chosen solution: Solution 5

Solution 1: Always Include rpc port in Peer Info

Currently, the PeerInfo does not include rpc port, simply include it in order for other nodes to be aware

  • /// Peer information.
    #[derive(borsh::BorshSerialize, borsh::BorshDeserialize, Clone, Debug, Eq, PartialEq, Hash)]
    pub struct PeerInfo {
    pub id: PeerId,
    pub addr: Option<SocketAddr>,
    pub account_id: Option<AccountId>,
    }

Drawbacks:

  • This will introduce both code and network bandwidth

Solution 2: Optionally Include rpc port in Peer Info

Similar to Solution1, but only include rpc port into PeerInfo as an optional parameter if it’s localnet.

Drawbacks:

  • This will introduce a lot of optional codes that should only run for localnet, and conditional to detect if it’s local net, which isn’t useful for production environments and will complicate the code base.

Solution 3: Expose API for RPC Port

Create a new API to be able to fetch each node’s RPC port in the status code

  • #[derive(serde::Serialize, serde::Deserialize, Debug)]
    pub struct DetailedDebugStatus {
    pub network_info: NetworkInfoView,
    pub sync_status: String,
    pub catchup_status: Vec<CatchupStatusView>,
    pub current_head_status: BlockStatusView,
    pub current_header_head_status: BlockStatusView,
    pub block_production_delay_millis: u64,
    }
    // TODO: add more information to status.
    #[derive(serde::Serialize, serde::Deserialize, Debug)]
    pub struct StatusResponse {
    /// Binary version.
    pub version: Version,
    /// Unique chain id.
    pub chain_id: String,
    /// Currently active protocol version.
    pub protocol_version: u32,
    /// Latest protocol version that this client supports.
    pub latest_protocol_version: u32,
    /// Address for RPC server. None if node doesn’t have RPC endpoint enabled.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub rpc_addr: Option<String>,
    /// Current epoch validators.
    pub validators: Vec<ValidatorInfo>,
    /// Sync status of the node.
    pub sync_info: StatusSyncInfo,
    /// Validator id of the node
    pub validator_account_id: Option<AccountId>,
    /// Public key of the validator.
    pub validator_public_key: Option<PublicKey>,
    /// Public key of the node.
    pub node_public_key: PublicKey,
    /// Deprecated; same as `validator_public_key` which you should use instead.
    pub node_key: Option<PublicKey>,
    /// Uptime of the node.
    pub uptime_sec: i64,
    /// Information about last blocks, network, epoch and chain & chunk info.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub detailed_debug_status: Option<DetailedDebugStatus>,
    }

Also, a new technical challenge would be the chicken-egg problem. The node will need access to another node’s json RPC port in order to call it’s json RPC port.

Drawbacks:

  • This is a lot of effort and not worth it for this task, which is considered low priority.

Solution 4: Hard-code paths to localnet’s configurations

During localnet, all nodes config.json will be available for read on the same computer.

The config.json contains the rpc ports of each node

Drawbacks:

  • This introduces a new technical challenge of figuring out where each node’s config.json are.

Solution 5: Assume default ports are used

Just assume the ports for localnet are always set as:

  • RPC address
    • 3030 + nodeNumber, where nodeNumber = {0, 1, 2, ..., number of Nodes - 1)
  • Network address
    • 24567 + nodeNumber, where nodeNumber = {0, 1, 2, ..., number of Nodes - 1)

Then, perform simple arithmetic to infer a node’s RPC address from it’s network address

// Simple Arithmetic Code
default_network_port = 24567
default_rpc_port = 3030
inferred_peer_rpc_port = default_rpc_port + (peer_network_port - default_network_port)

// Full code
// To mitigate the case where mainnet node could use a different rpc port while maintaining the default port, we set a conditional to only perform the above arithmetic for localnet 
node_num = 0 // first node is 0
if self.ip_address == peer.ip_address
{
  // To mitigate the case where mainnet node could use a different rpc port while maintaining the default port, set a conditional to only perform the above arithmetic for localnet 
  default_network_port = 24567
  default_rpc_port = 3030
  node_num = peer_network_port - default_network_port
}
inferred_peer_rpc_port = default_rpc_port + node_num

Drawbacks:

  • This requires assuming the rpc address and network address are set as such and would break otherwise

Mitigations for drawback

  • This code is mainly for localnet testing on debug pages and does not affect the actual core near protocol. Hence, even if it breaks, a fix would simply be on the front end and does not require a re-deployment of any kind (new debug pages on localnet are able to debug nodes on mainnet
  • Check that it’s localnet by ensuring the ip address (without the port) matches, which is unique to localnet

Tests

I only performed manual testing as there’s no quick way to write unit tests for our current frontend debug pages code. Also, the old debug pages are going to be deprecate so no point investing time into testing the old debug page.

// Ran linter on new debug page (linter not set-up for old debug page)
npm run lint

// Open both old and new debug pages and manually click on the links on network_info page to verify they work
// Start up localnet
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
// Open old debug page
open http://localhost:3030/debug
// Open new debug page
cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster

ghost pushed a commit that referenced this issue Apr 4, 2023
* Fix debug rpc ports localnet #8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Updated typo from 'will be' to 'will are'

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo

* Updated comments to be more concise

* Fix debug rpc ports localnet #8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo
- only check 127.0.0.1 for localnet to dive into localnet only code paths
nikurt pushed a commit to nikurt/nearcore that referenced this issue Apr 5, 2023
* Fix debug rpc ports localnet near#8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Updated typo from 'will be' to 'will are'

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo

* Updated comments to be more concise

* Fix debug rpc ports localnet near#8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo
- only check 127.0.0.1 for localnet to dive into localnet only code paths
@ghost
Copy link

ghost commented Apr 7, 2023

Closing as code is merged

@ghost ghost closed this as completed Apr 7, 2023
nikurt pushed a commit that referenced this issue Apr 28, 2023
* Fix debug rpc ports localnet #8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Updated typo from 'will be' to 'will are'

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo

* Updated comments to be more concise

* Fix debug rpc ports localnet #8583

<h1> Problem </h1>
This is only an issue on localnet and not production (mainnet, testnet)
Hence, prefer having this fix within the frontend webserver itself.
Alternatively, could have an optional approach for nodes to fetch rpc
address from each peer. However, this introduces both complexity and
additional bandwidth that will only be useful for localnet testing.

<h1> Solution </h1>
The code currently uses the assumption on the ports that the nodes will
be started in the ranges:
- Network address : 24567 + [0, numNodes - 1]
- RPC address: 3030 + [0, numNodes - 1]
- Reference: https://github.com/near/nearup/blob/0b9a7b60236f3164dd32677b6fa58c531a586200/nearuplib/localnet.py#L93-L94
If this assumption is ever violated in future,
only the debug pages would break and can easily be fixed without affecting the actual core network itself.

<h1> Testing </h1>

- Ran linter

```
npm run lint
```

- Open both old and new debug pages and manually click on the links
```
cd nearcore
make debug
nearup run localnet --binary-path ~/Github/near/nearcore/target/debug/
open http://localhost:3030/debug

cd nearcore/tools/debug-ui
npm install
npm start
open http://localhost:3000/localhost:3030/cluster
```

* Address 1st code review comments
- Added conditional to only perform arithmetic for localnet nodes.
- Remove any mention of nearup
- use DEFAULT_UPPER_CASE for constants instead of lower_case_assumption
- - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/const
- Remove link to 24567 as irrelevant (used for PING)
- Include comment on how peer_rpc_address aren't shared in PeerInfo
- only check 127.0.0.1 for localnet to dive into localnet only code paths
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant