Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Security: Stop disconnecting from nodes that send unexpected messages, to prevent disconnection attacks, Credit: Equilibrium #2107

Closed
3 tasks done
teor2345 opened this issue May 4, 2021 · 2 comments · Fixed by #3120
Assignees
Labels
A-network Area: Network protocol updates or fixes A-rust Area: Updates to Rust code C-security Category: Security issues I-hang A Zebra component stops responding to requests I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data I-slow Problems with performance or responsiveness S-needs-investigation Status: Needs further investigation

Comments

@teor2345
Copy link
Contributor

teor2345 commented May 4, 2021

Motivation

When Zebra receives an unexpected network message from a peer, it disconnects from that peer. This resets a remote zcashd node's internal connection state for Zebra. So Zebra gets the messages it expects once it reconnects.

There are a few issues with this approach:

  1. This is a denial of service risk, because Zcash network protocol messages are unauthenticated.

  2. Disconnecting due to protocol differences could be stopping Zebra from being listed by block explorers and DNS seeders.

Priority

This is a medium-priority issue, because:

Solution

Unrecognised Messages

  • Zcash nodes should drop messages they don't recognise.
    • for Zebra, this includes NODE_BLOOM messages.

Protocol Differences

When Zebra can parse a message that a peer has sent, but doesn't expect that message:

  • stop immediately disconnecting from peers
  • use or drop the message

Be strict with the handshake

  • reject port scanners and new connections speaking the wrong protocol

Be permissive after a successful handshake

  • ignore malformed messages on established connections - the IP header might be fake
  • when is a connection established? After the Version message? After the whole handshake?

Alternatives

Work out how to restore unusable peers without disconnecting:

  • only disconnect due to a specific list of errors? Or allow a specific list of errors, and disconnect for any other errors?
  • mark peers as "low priority" when they send unexpected messages
    • make these peers timeout and become ready?
    • what should we do if there are too many unusable peers? (randomly disconnect some, with a rate-limit?)
  • rate-limit inbound messages from peers that overload the inbound service
    • impose a lower limit for requests containing blocks or transactions, to avoid memory DoS

Context

We might want to follow what zcashd does here pretty closely.

As zcashd's network behaviour becomes better specified, Zebra will have fewer unexpected network messages.

@teor2345 teor2345 added C-bug Category: This is a bug A-rust Area: Updates to Rust code S-needs-triage Status: A bug report needs triage S-needs-investigation Status: Needs further investigation P-High C-security Category: Security issues I-hang A Zebra component stops responding to requests I-slow Problems with performance or responsiveness I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data A-network Area: Network protocol updates or fixes labels May 4, 2021
@teor2345 teor2345 added this to No Estimate in Effort Affinity Grouping via automation May 4, 2021
@teor2345 teor2345 added this to To Do in 🦓 via automation May 4, 2021
@teor2345 teor2345 moved this from No Estimate to S - 3 in Effort Affinity Grouping May 4, 2021
@mpguerra mpguerra removed the S-needs-triage Status: A bug report needs triage label May 7, 2021
@teor2345 teor2345 changed the title Stop immediately disconnecting from nodes that send unexpected messages Security: Stop immediately disconnecting from nodes that send unexpected messages May 24, 2021
@teor2345 teor2345 changed the title Security: Stop immediately disconnecting from nodes that send unexpected messages Security: Stop immediately disconnecting from nodes that send unexpected messages, Credit: Equilibrium Jun 8, 2021
@teor2345 teor2345 added P-Medium and removed C-bug Category: This is a bug P-High labels Jun 15, 2021
@teor2345 teor2345 changed the title Security: Stop immediately disconnecting from nodes that send unexpected messages, Credit: Equilibrium Security: Stop immediately disconnecting from nodes that send unexpected messages to prevent disconnection attacks, Credit: Equilibrium Oct 25, 2021
@teor2345 teor2345 changed the title Security: Stop immediately disconnecting from nodes that send unexpected messages to prevent disconnection attacks, Credit: Equilibrium Security: Stop immediately disconnecting from nodes that send unexpected messages, to prevent disconnection attacks, Credit: Equilibrium Oct 25, 2021
@teor2345 teor2345 changed the title Security: Stop immediately disconnecting from nodes that send unexpected messages, to prevent disconnection attacks, Credit: Equilibrium Security: Stop disconnecting from nodes that send unexpected messages, to prevent disconnection attacks, Credit: Equilibrium Oct 25, 2021
@teor2345 teor2345 self-assigned this Nov 25, 2021
🦓 automation moved this from To Do to Done Nov 30, 2021
Effort Affinity Grouping automation moved this from S - 3 to Done Nov 30, 2021
@teor2345
Copy link
Contributor Author

teor2345 commented Dec 1, 2021

PR #3120 caused some panics, PR #3124 should fix them.

But we need to implement a permanent fix in this ticket.

@teor2345 teor2345 reopened this Dec 1, 2021
🦓 automation moved this from Done to In progress Dec 1, 2021
@teor2345
Copy link
Contributor Author

teor2345 commented Dec 1, 2021

Actually, the errors re-introduced by #3124 don't close the connection, so this ticket is fixed.

We'll deal with NotFound in #2726.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-network Area: Network protocol updates or fixes A-rust Area: Updates to Rust code C-security Category: Security issues I-hang A Zebra component stops responding to requests I-invalid-data Zebra relies on invalid or untrusted data, or sends invalid data I-slow Problems with performance or responsiveness S-needs-investigation Status: Needs further investigation
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

2 participants