Proposal: allow AutoNAT to dial all IP addresses, without risking amplification attacks #536

Open
marten-seemann opened this issue Apr 3, 2023 · 15 comments

Comments

@marten-seemann
Contributor

Our AutoNAT spec currently says:

In order to prevent attacks like the one described in RFC 3489, Section 12.1.1 (see excerpt below), implementations MUST NOT dial any multiaddress unless it is based on the IP address the requesting node is observed as. This restriction as well implies that implementations MUST NOT accept dial requests via relayed connections as one can not validate the IP address of the requesting node.

RFC 3489 12.1.1 Attack I: DDOS Against a Target

In this case, the attacker provides a large number of clients with the same faked MAPPED-ADDRESS that points to the intended target. This will trick all the STUN clients into thinking that their addresses are equal to that of the target. The clients then hand out that address in order to receive traffic on it (for example, in SIP or H.323 messages). However, all of that traffic becomes focused at the intended target. The attack can provide substantial amplification, especially when used with clients that are using STUN to enable multimedia applications.

The intention is to prevent an amplification attack against a target, and was designed with the properties of the STUN protocol in mind.

This is not the only way to prevent amplification attacks though. We can also just make the attack expensive enough such that it becomes unattractive. For example, we could require the requester to send a non-trivial amount of data, on the order of 10-100 kB. This would make AutoNAT completely uninteresting for an amplification attack, since a libp2p handshake is a lot smaller than 10 kB, and thus no amplification can be achieved.

On the other hand, receiving 10-100 kB is cheap enough for the AutoNAT server that we're not placing too large a burden on it. To further lighten the load on the server, this could be designed as a "retry" mechanism (borrowing QUIC terminology here): only if the IP address doesn't match would the server ask the client to send this additional data.

Obviously, as this is a new design, this would need to be incorporated into AutoNAT v2 (#503).

@sukunrt Thoughts?

@sukunrt
Member

sukunrt commented Apr 3, 2023

What benefits do we get by allowing this?

Some preliminary thoughts,
I think this scheme is optimal in some sybil attack sense. For an attacker to get 10 peers to dial the target the attacker will need to spend 10x resources.

This scheme makes amplification attacks difficult, but it still allows generating a large transient load on a target. An attacker could open streams to many AutoNAT servers, give them the IP of the target, and then slowly send each of them the requested 10-100 kB so that all of them finish receiving it at the same time. However, this load would necessarily be transient, since the attacker would not have the resources to sustain it for long.

@sukunrt
Member

sukunrt commented Apr 3, 2023

Another workaround I can think of is for the attacker to use identify to launch an attack on a target.
In this scenario the attacker is connected to a number of peers. In identify, the attacker tells each peer that its observed IP address is the IP address of the target. These peers would then try to verify that address using AutoNAT. Since the peers are the ones paying the 100 kB cost, the attacker would have executed this attack with far fewer resources of its own.

@mxinden
Member

mxinden commented Apr 5, 2023

What benefits do we get by allowing this?

I have the same question. In other words, in which scenario would the local node want a remote node to probe a different IP than the one the local node is observed as? In the case where the local node has multiple public-facing interfaces? Do we have that use case anywhere today?

In identify the attacker informs the peers that their ip address is the ip address of the target. These peers would now try to verify target's ip address using autonat.

I don't think this is correct. The "peers" don't actively probe a peer's IP addresses based on the peer's identify payload. Instead the peer (here the "attacker") initiates the AutoNAT process, potentially including the additional anti-amplification payload.

@sukunrt
Member

sukunrt commented Apr 5, 2023

I wasn't clear here. I merged two streams of thought: one about having the address pipeline, and one about AutoNAT v2 allowing you to test reachability on a single address.

My assumption here is that with autonat v2, nodes will try to verify reachability on all externally observed addresses. When nodes receive a new observed address in identify, they will try to test its public reachability.

The scenario I describe is this.

  • Attacker A is running a node and is interested in increasing traffic on a target T.
  • Peer B connects to attacker A and in identify A reports to B that B's observed address is the address of target T.
  • Peer B sees this as a new external address and tries to verify its public reachability via AutonatV2, paying the 10-100kb cost that's suggested in this scheme. The autonat server S will dial the target T.
  • So attacker A has gotten autonat server S to dial T without paying the 10kb cost for it.

The attacker can repeat this scheme for other peers that connect to it.

@sukunrt
Member

sukunrt commented Apr 5, 2023

In the case where the local node has multiple public facing interfaces? Do we have that use-case anywhere today?

I do see the benefits now for this case. If a node has multiple interfaces, this decouples testing addresses via AutoNAT from the interface we used to dial the AutoNAT server. Without this feature, we have to ensure that we connect to the AutoNAT server using the same interface that we are interested in testing.

@marten-seemann
Contributor Author

What benefits do we get by allowing this?

I have the same question. In other words, in which scenario would the local node want a remote node to probe a different IP than the one the local node is observed as? In the case where the local node has multiple public-facing interfaces? Do we have that use case anywhere today?

Good question, @sukunrt and @mxinden! Two network interfaces would definitely be a use case, as would NATs that use multiple IP addresses for their mappings.

I'm wondering if we should quantify how many times we're asked to dial addresses that don't match the observed address of the connection we receive the request on. This could probably be added as a simple log statement in the AutoNAT server implementation, logging observed IP => []requested IP.


Another use case for this would be Kubo's AppendAnnounces "feature", where the user can just add IP addresses to the address announcements of the libp2p node. A much cleaner way to solve this (and which the address pipeline would enable) would be to pass these addresses as a hint to the libp2p stack, and have AutoNAT v2 actually try out these addresses.

I don't think this is correct. The "peers" don't actively probe a peer's IP addresses based on the peer's identify payload. Instead the peer (here the "attacker") initiates the AutoNAT process, potentially including the additional anti-amplification payload.

@sukunrt is right, at least for go-libp2p. We have this component called "observed address manager" that lives alongside Identify, and discovers new addresses (once reaching a certain threshold). Don't you have something like that in rust-libp2p, @mxinden? How do you discover your public addresses then?
In practice, this threshold protects us against incorrect reports and symmetric NATs, but not against active attacks. Establishing connections is too cheap to set a threshold that would be meaningful against an active attacker. At the moment, we just trust that we're reachable at a certain address once that threshold is reached, which is obviously not great.

@marten-seemann
Contributor Author

The scenario I describe is this.

  • Attacker A is running a node and is interested in increasing traffic on a target T.
  • Peer B connects to attacker A and in identify A reports to B that B's observed address is the address of target T.
  • Peer B sees this as a new external address and tries to verify its public reachability via AutonatV2, paying the 10-100kb cost that's suggested in this scheme. The autonat server S will dial the target T.
  • So attacker A has gotten autonat server S to dial T without paying the 10kb cost for it.

The attacker can repeat this scheme for other peers that connect to it.

That's possible, but I'd argue that this attack is not very interesting.

First of all, the attacker needs to connect to B to tell it the address of T. This means he has to complete one libp2p handshake, to get S to initiate one handshake to T. So there's no amplification here. At the very best, the attacker can hope to achieve some bunching here, temporarily overloading T (since an Identify message is smaller than a libp2p handshake).
Second, since we have the threshold, one connection to B won't be enough. The threshold is 4, so the attacker will have to connect to B four times.

We could also harden Identify and disregard addresses that we only see on inbound connections (we already keep track of the directionality in the observed address manager today). That would make it even harder to pull off this attack, since you'd somehow need to convince B to connect to A first.
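
To make the hardening idea concrete, here is a small Go sketch. It is not the go-libp2p observed address manager: the `observedAddrs` type and its `Record` method are hypothetical, and counting distinct observers is an assumption of this sketch. Only the threshold of 4 comes from the discussion above.

```go
package main

import "fmt"

// activationThreshold is the threshold of 4 mentioned in the thread.
const activationThreshold = 4

// observedAddrs tracks, per observed address, the set of observers that
// reported it on outbound connections (hypothetical type).
type observedAddrs struct {
	observers map[string]map[string]struct{}
}

func newObservedAddrs() *observedAddrs {
	return &observedAddrs{observers: make(map[string]map[string]struct{})}
}

// Record notes that observer reported addr as our external address and
// reports whether addr has reached the activation threshold. Reports
// received on inbound connections are ignored.
func (o *observedAddrs) Record(addr, observer string, outbound bool) bool {
	if outbound {
		if o.observers[addr] == nil {
			o.observers[addr] = make(map[string]struct{})
		}
		o.observers[addr][observer] = struct{}{}
	}
	return len(o.observers[addr]) >= activationThreshold
}

func main() {
	o := newObservedAddrs()
	// An attacker dialing us repeatedly never activates the target's address.
	for i := 0; i < 10; i++ {
		o.Record("203.0.113.9:4001", fmt.Sprintf("attacker-%d", i), false)
	}
	fmt.Println(o.Record("203.0.113.9:4001", "attacker-x", false))
	// Peers we dialed ourselves count toward the threshold.
	for _, p := range []string{"p1", "p2", "p3"} {
		o.Record("198.51.100.7:4001", p, true)
	}
	fmt.Println(o.Record("198.51.100.7:4001", "p4", true))
}
```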

@sukunrt
Member

sukunrt commented Apr 5, 2023

I think your argument is sound.
There is no amplification here. A makes one handshake for every handshake S makes with T.

I'm sorry if this is a very basic question but what was the reason for disallowing this in the first place?
To make an autonat request you will need to do a handshake. When you provide the autonat server with the ip of the target, you have done one handshake to make the autonat server do one handshake with the target.

I have not read the STUN RFC closely, so I might be missing something here, but it seems to me that https://www.rfc-editor.org/rfc/rfc5389#section-16.2.1 is about the case where a malicious or compromised STUN server replies to STUN requests with the IP address of the target.

In this attack, the attacker provides one or more clients with the
same faked reflexive address that points to the intended target.
This will trick the STUN clients into thinking that their reflexive
addresses are equal to that of the target. If the clients hand out
that reflexive address in order to receive traffic on it (for
example, in SIP messages), the traffic will instead be sent to the
target. This attack can provide substantial amplification,
especially when used with clients that are using STUN to enable
multimedia applications.

This section, https://www.rfc-editor.org/rfc/rfc8489#section-16.1.2, talks about IP spoofing to get the STUN server to send a response to a target.

A rogue client may use a STUN server as a reflector, sending it
requests with a falsified source IP address and port. In such a
case, the response would be delivered to that source IP and port.
There is no amplification of the number of packets with this attack
(the STUN server sends one packet for each packet sent by the
client), though there is a small increase in the amount of data,
since STUN responses are typically larger than requests. This attack
is mitigated by ingress source address filtering.

This seems fine for our case since autonat doesn't send any payload to the address it is verifying.

In fact I think the older version of the STUN RFC allowed sending responses to a different address from the one the request was received on
https://www.rfc-editor.org/rfc/rfc3489#section-11.2.2

@marten-seemann
Contributor Author

I'm sorry if this is a very basic question but what was the reason for disallowing this in the first place?
To make an autonat request you will need to do a handshake. When you provide the autonat server with the ip of the target, you have done one handshake to make the autonat server do one handshake with the target.

All I can think of is that the attacker could do all the handshakes in advance, and then fire off all the AutoNAT requests at once. I'm not convinced this is actually a problem though.

This was introduced in #369. @mxinden, what made you think that this is necessary?

@marten-seemann
Contributor Author

@mxinden, friendly ping.

@sukunrt
Member

sukunrt commented Apr 11, 2023

There is an issue with asking an IPv4 peer to dial an IPv6 address: the peer might not be able to dial it. My ISP is IPv4-only, and when I dial an IPv6 address, the error I receive from the networking stack is no route to host.
It will be difficult for the server to tell whether a dial failed because the client is unreachable or because the server itself is unable to dial such addresses.

One way to fix this is to only ask for IPv4 dials on an IPv4 connection and IPv6 dials on an IPv6 connection.

@mxinden
Member

mxinden commented Apr 12, 2023

Below A is the attacker, T the target and S the AutoNAT server.

I'm sorry if this is a very basic question but what was the reason for disallowing this in the first place?
To make an autonat request you will need to do a handshake. When you provide the autonat server with the ip of the target, you have done one handshake to make the autonat server do one handshake with the target.

All I can think of is that the attacker could do all the handshakes in advance, and then fire off all the AutoNAT requests at once. I'm not convinced this is actually a problem though.

This was introduced in #369. @mxinden, what made you think that this is necessary?

In AutoNATv1 (i.e. current AutoNAT) I see two amplification mechanisms:

  • Under the assumption that opening a stream is cheaper than establishing a connection, A can cause S to establish a new connection to T through a single stream from A to S. This amplification is especially important in the case where A can send many requests to S on a single connection within time X where X depends on the rate limiting implementation of S.
  • Within a single AutoNATv1 request A can include multiple addresses of T, thus one request on a stream from A to S can result in multiple connections from S to T.

Relevant section in the AutoNAT v1 specification:

The AutoNAT Protocol uses the Protocol ID /libp2p/autonat/1.0.0. The node wishing to determine its NAT status opens a stream using this protocol ID, and then sends a Dial message. The Dial message contains a list of multiaddresses. Upon receiving this message, the peer starts to dial these addresses.

https://github.com/libp2p/specs/tree/master/autonat

Does this better explain the reasoning for #369?

@sukunrt
Member

sukunrt commented Apr 13, 2023

In AutoNATv1 (i.e. current AutoNAT) I see two amplification mechanisms:

  • Under the assumption that opening a stream is cheaper than establishing a connection, A can cause S to establish a new connection to T through a single stream from A to S. This amplification is especially important in the case where A can send many requests to S on a single connection within time X where X depends on the rate limiting implementation of S.
  • Within a single AutoNATv1 request A can include multiple addresses of T, thus one request on a stream from A to S can result in multiple connections from S to T.

These are valid cases. Thanks @mxinden. We definitely cannot remove this restriction without introducing the scheme suggested.

The suggested scheme is resistant to both these attacks.

  • Under the assumption that opening a stream is cheaper than establishing a connection, A can cause S to establish a new connection to T through a single stream from A to S. This amplification is especially important in the case where A can send many requests to S on a single connection within time X where X depends on the rate limiting implementation of S.

A will need to send data that is costlier than a handshake

  • Within a single AutoNATv1 request A can include multiple addresses of T, thus one request on a stream from A to S can result in multiple connections from S to T.

In Autonat v2 S will only dial one address.
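
The two cases can be compared with a back-of-the-envelope calculation. The Go sketch below is only illustrative: `amplification` is a hypothetical helper, and all byte counts in `main` are made-up placeholders rather than measurements. The thread only states that a libp2p handshake is a lot smaller than 10 kB.

```go
package main

import "fmt"

// amplification returns how many bytes of traffic the server directs at the
// target for every byte the attacker sends to the server.
func amplification(attackerBytes, dials, bytesPerDial int) float64 {
	return float64(dials*bytesPerDial) / float64(attackerBytes)
}

func main() {
	// All values below are assumed placeholders for illustration.
	const (
		handshakeBytes = 2_000  // assumed cost of one dial from S to T
		requestBytes   = 200    // assumed size of one AutoNAT request
		dialDataBytes  = 30_000 // assumed dial data, within 10-100 kB
	)
	// v1-style: one cheap request on an existing stream lists 5 addresses of T.
	v1 := amplification(requestBytes, 5, handshakeBytes)
	// v2 with the proposed scheme: one address per request, plus dial data.
	v2 := amplification(requestBytes+dialDataBytes, 1, handshakeBytes)
	fmt.Printf("v1-style request: %.2fx\n", v1)
	fmt.Printf("v2 with dial data: %.2fx\n", v2)
}
```

Under these assumed numbers the v1-style request amplifies the attacker's traffic, while the v2 request costs the attacker more than it induces at the target.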

The attack I proposed is also easily handled by a smart client implementation.

The scenario I describe is this.

  • Attacker A is running a node and is interested in increasing traffic on a target T.
  • Peer B connects to attacker A and in identify A reports to B that B's observed address is the address of target T.
  • Peer B sees this as a new external address and tries to verify its public reachability via AutonatV2, paying the 10-100kb cost that's suggested in this scheme. The autonat server S will dial the target T.
  • So attacker A has gotten autonat server S to dial T without paying the 10kb cost for it.

Even if the attacker repeatedly uses identify push to get amplification, the addresses suggested by the attacker A will be lower priority addresses for peer B and B will not pay the cost of verifying these addresses. See this comment for a client implementation resistant to such attacks. #539 (comment)

@dhuseby

dhuseby commented Jun 21, 2023

@marten-seemann talked with @sukunrt and this has been rolled into the larger AutoNAT spec. Are we good to close this?

@marten-seemann
Contributor Author

Should we keep this open until the spec is merged? Not a strong preference, but technically this issue isn’t resolved until that point.
