Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validators should not make decisions based on which ports data was received #6672

Open
sagar-solana opened this issue Nov 1, 2019 · 11 comments
Labels
security Pull requests that address a security vulnerability
Milestone

Comments

@sagar-solana
Copy link
Contributor

sagar-solana commented Nov 1, 2019

Problem

Validators make many shred decisions based on which port it was received on.
This is not secure at all. This issue is related to the regular DoS that is possible on any of a validator's ports. However it leaves the window open for an amplification attack.

Proposed Solution

  • For turbine Shreds, validators should do a reverse lookup to see if the Shred should have arrived from where it did (i.e paired node in the layer above). If it did not match up then it shouldn't be sent through Turbine
  • For Repair Shreds, validators should only retransmit them if they came from the layer above. This can be coupled with the regular turbine shred described above.
  • For Repairman, validators should never send these shreds through turbine. No way to detect this right now. Needs a solution.
  • For all shreds received on the window, they should only be retransmitted if they have not already been inserted/recv'd

@aeyakovenko

@aeyakovenko
Copy link
Member

Why isn’t the rule simply:

  • retransmit once and only once if shred is from layer above.

Regardless how it arrived? Where the leader is the layer above per node per shred. It shouldn't matter how it arrived.

@sagar-solana
Copy link
Contributor Author

Yeah that applies to everything except for Repairman. So Turbine doesn't care about where (in terms of slot) your peers are. Repairman is sending really old shreds (from the epoch before) to you and you should definitely not be sending them through turbine. Because your peers might not be as far behind as you are.

@sagar-solana
Copy link
Contributor Author

sagar-solana commented Nov 1, 2019

@aeyakovenko So we need a solution for repairman. Something that isn't port dependent.

Also we send shreds received from neighbors to the children in layers below. So the rules should really be:

  • Retransmit each shred just once. No matter where you received it from. This will involve a blocktree lookup. Layer above or below doesn't matter if you look at blocktree first. But this will have a performance impact (might need a cache)
  • In addition to that, the scope of the retransmit will depend on whether the shred was destined to arrive from the layer above. If it was supposed to come from the layer above (doesn't matter if you repaired it) you should retransmit it neighbors and children. If it wasn't supposed to be delivered from the layer above, only send it to children.

@aeyakovenko
Copy link
Member

@sagar-solana isn’t repairman request/response?

@sagar-solana
Copy link
Contributor Author

@aeyakovenko no its not. It's automatic. Infact I just realized it repairs anyone more than 50 slots behind. Not an epoch anymore. So repairman looks at others' gossip and catches them up. No request. Directly response.

@aeyakovenko
Copy link
Member

@sagar-solana but their request is signed by them. so its a request to be repaired by the network which they can cancel.

@sagar-solana
Copy link
Contributor Author

They can't take something off of gossip once it's put there? The response is not so much a response as much as it is just a flood of shreds on tvu.

Anyway even if we allow nulling the gossipped EpochSlots that doesn't solve the main issue of not being able to differentiate shreds delivered via repairman vs turbine.
A validator never knows how far "behind" it is so it's not okay for it to treat both repairman and turbine shreds as equal. We need to be able to differentiate.

@aeyakovenko
Copy link
Member

how does the flow control work with repairman? does the node keep updating what it needs?

@sagar-solana
Copy link
Contributor Author

@carllin please help clarify if I get this wrong.
@aeyakovenko, validators are always looking at everyone else's gossipped EpochSlots data. Which basically gets updated everytime a new root is made or new slots are completed.
The validators that are able to serve the slots to those nodes that can raise their root divvy up the work amongst each other and catchup the lagging validators.
No flow control really. Just every time a new epochslots shows up, they'll start dealing with it.

@mvines
Copy link
Member

mvines commented Feb 26, 2020

TODO: recover #7774 as a part of fixing this issue 😿

@mvines mvines modified the milestones: v1.1.0, v1.2.0 Mar 30, 2020
@mvines mvines modified the milestones: v1.2.0, v1.3.0 May 21, 2020
@mvines mvines modified the milestones: v1.3.0, v1.4.0 Aug 5, 2020
@mvines mvines modified the milestones: v1.4.0, v1.5.0 Oct 8, 2020
@leoluk leoluk added the security Pull requests that address a security vulnerability label Nov 19, 2020
@mvines mvines modified the milestones: v1.5.0, v1.6.0 Dec 17, 2020
@mvines mvines modified the milestones: v1.6.0, v1.7.0 Mar 11, 2021
@mvines mvines modified the milestones: v1.7.0, v1.8.0 May 10, 2021
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 3, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 4, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 4, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
solana-labs#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 7, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
solana-labs#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 8, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
solana-labs#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 11, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
solana-labs#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
behzadnouri added a commit to behzadnouri/solana that referenced this issue Jun 13, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
solana-labs#6672
solana-labs#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
solana-labs#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
behzadnouri added a commit that referenced this issue Jun 15, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
#6672
#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.
mergify bot pushed a commit that referenced this issue Jun 15, 2021
Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
#6672
#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.

(cherry picked from commit 1618386)
mergify bot added a commit that referenced this issue Jun 15, 2021
…17973)

Turbine retransmit logic is based on which socket it received the packet
from (i.e `packet.meta.forward`):
https://github.com/solana-labs/solana/blob/708bbcb00/core/src/retransmit_stage.rs#L467-L470

This can leave the cluster vulnerable to spoofing and selective
propagation of packets; see
#6672
#7774

This commit identifies if the node is on the "critical path" based on
its index in the shuffled cluster. If so, it forwards the packet to both
neighbors and children; otherwise, the packet is only forwarded to the
children.

The metrics added in
#17351
shows that the number of times the index does not match the port is very
rare, and therefore this change should be safe.

(cherry picked from commit 1618386)

Co-authored-by: behzad nouri <behzadnouri@gmail.com>
@behzadnouri
Copy link
Contributor

Turbine port-based forwarding logic is removed as of #17716 released on v1.7 branch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security Pull requests that address a security vulnerability
Projects
None yet
Development

No branches or pull requests

5 participants