Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zebra attempts new peer connections in a fixed, predictable order #1875

Open
6 tasks
Tracked by #7822 ...
teor2345 opened this issue Mar 9, 2021 · 3 comments
Open
6 tasks
Tracked by #7822 ...

Zebra attempts new peer connections in a fixed, predictable order #1875

teor2345 opened this issue Mar 9, 2021 · 3 comments
Labels
A-network Area: Network protocol updates or fixes C-security Category: Security issues I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data I-remote-trigger Remote nodes can make Zebra do something bad I-slow Problems with performance or responsiveness

Comments

@teor2345
Copy link
Contributor

teor2345 commented Mar 9, 2021

Motivation

Zebra attempts new peer connections in a fixed, predictable order, based on the untrusted_last_seen_time gossiped by other peers, and the local last_success_time and last_failure_time.

This design is vulnerable to attacks that take advantage of this specific retry order to starve the local node, replace all the peers of the local node, or target remote nodes.

This design is also vulnerable to issues after a network connection failure, where every peer is in a failed state. It can also get into a failure mode where it cycles through disconnected successful peers, but never tries failed previously successful peers.

This issue is mitigated by:

Due to the large number of mitigations, this is a low-priority issue.

Solution

This fix depends on #1848, #1849 and #1871.

Zebra should choose the next peer to connect to using the "power of two choices" algorithm.

In the following functions:

  • CandidateSet::next()

Use the "power of two choices" algorithm:

  • choose two random addresses that are eligible for connection
    • filter the addresses for eligibility before randomly selecting two of them
  • select the best address from the chosen addresses (the best address has the lower index in the address book)
  • attempt to connect to that address
  • if there is only one eligible address, attempt to connect to that address
  • if there are no eligible addresses, return None

This change risks connecting to slightly worse peers, or skipping some peers that would otherwise have been chosen for reconnection.

Ideally, the algorithm should be implemented by the address book. This improves modularity and performance.

Alternatives

We could do nothing, and trust the other fixes to mitigate the worst impacts of this issue.

Context

zcashd does not have this issue.

@teor2345 teor2345 added C-bug Category: This is a bug A-rust Area: Updates to Rust code S-needs-design Status: Needs a design decision S-needs-triage Status: A bug report needs triage P-Medium C-security Category: Security issues I-hang A Zebra component stops responding to requests I-heavy Problems with excessive memory, disk, or CPU usage I-slow Problems with performance or responsiveness I-usability Zebra is hard to understand or use I-remote-node-overload Zebra can overload other nodes on the network I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data labels Mar 9, 2021
@teor2345 teor2345 added this to No Estimate in Effort Affinity Grouping via automation Mar 9, 2021
@teor2345 teor2345 added this to To Do in 🦓 via automation Mar 9, 2021
@teor2345 teor2345 removed the S-needs-design Status: Needs a design decision label Mar 9, 2021
@teor2345 teor2345 moved this from No Estimate to S - 3 in Effort Affinity Grouping Mar 9, 2021
@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label Mar 10, 2021
@teor2345 teor2345 added P-Low and removed C-bug Category: This is a bug P-Medium I-hang A Zebra component stops responding to requests I-heavy Problems with excessive memory, disk, or CPU usage I-remote-node-overload Zebra can overload other nodes on the network labels Nov 26, 2021
@teor2345
Copy link
Contributor Author

We did this for the initial peers, so the rest of the task is a low priority.

@mpguerra mpguerra added the A-network Area: Network protocol updates or fixes label Dec 16, 2021
@teor2345
Copy link
Contributor Author

teor2345 commented Jun 2, 2022

This is covered by other tickets.

@teor2345 teor2345 closed this as not planned Won't fix, can't repro, duplicate, stale Jun 2, 2022
@teor2345
Copy link
Contributor Author

This could be relevant to some of our current network security bugs.

@teor2345 teor2345 reopened this Oct 26, 2023
🦓 automation moved this from To Do to In progress Oct 26, 2023
@teor2345 teor2345 added I-remote-trigger Remote nodes can make Zebra do something bad and removed A-rust Area: Updates to Rust code I-usability Zebra is hard to understand or use labels Oct 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-network Area: Network protocol updates or fixes C-security Category: Security issues I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data I-remote-trigger Remote nodes can make Zebra do something bad I-slow Problems with performance or responsiveness
Projects
Status: New
🦓
  
In progress
Development

No branches or pull requests

2 participants