Connection migration should be indistinguishable from a new connection #203

Open
lucas-clemente opened this Issue Jan 23, 2017 · 22 comments

Comments

@lucas-clemente
Contributor

lucas-clemente commented Jan 23, 2017

One of the major features of QUIC vs TCP is the option of seamless connection migration. One can conceive of a situation where a network middle-man (e.g. an ISP, firewall implementer, etc) might have an interest in blocking this feature, e.g. by detecting connection migrations to their network and dropping those packets. I propose that we think of a way to make a connection migration indistinguishable from a "normal" new connection.

One solution I can think of would be to have the client send a "fake" initial packet with 0-RTT encryption (which would then be discarded by the server as undecryptable). The first few packets sent on the new address would also need to have the version bit set – which the server would have to ignore.

This would also require initial packet number randomization (see #35).

This issue might conflict somewhat with #185.

Thanks to @lcolitti for pointing this issue out.

@ianswett

ianswett Jan 23, 2017

Contributor

Agreed that it should look as much like a new connection as possible.

I've heard middleboxes had to add explicit support for MPTCP, so at the very least, we don't want to be in a situation where middleboxes need to change their behavior to accommodate QUIC connection migration or multipath.

The question becomes: how much does it need to look like a new handshake? I.e., is it sufficient to send a fake CHLO followed by forward-secure data, or do we need to wait an RTT? How realistic does the CHLO need to be?

@mirjak

mirjak Jan 23, 2017

Contributor

I really don't support this solution because faking something to the network is unnecessarily complex and therefore error-prone (I think we can only do this wrong because it's fake) and probably also not really feasible, given that the sender might not know that anything changed in the network.

My solution would be the other way around: Don't make the handshake look special and try to make all packets look the same for the network as much as possible. This goes back to the idea of "don't expose any data that you don't want the network to use because those will be ossified" and I think we should really follow this principle here as much as possible. (As a side note, this also means that there are cases where it is useful to expose information to the network and ossify this information and its representation such that it can be used for e.g. network diagnosability without changing all network devices for each version of QUIC.)

@marten-seemann

marten-seemann Jan 23, 2017

Contributor

I agree with @mirjak in principle, the best solution would be if a handshake is indistinguishable from a connection migration. However, when using TLS, the first packet of a handshake always contains unencrypted data, which is something trivial to detect for a middle-man.
Given this constraint, the solution @lucas-clemente proposed might unfortunately be the best we can do.

@igorlord

igorlord Jan 25, 2017

Contributor

@marten-seemann But @mirjak makes a valid point. The sender may be unaware. It could be a case of a NAT rebinding (new outgoing UDP port) or you are on a WiFi spot provided by a cell-connected router that just migrated to a different operator.

Trying to confuse the network, you are confusing the server as well (unless your server is a single lonely box on a rack somewhere). CDNs are complex networks with multiple machines coordinating to present you with a nice "server" abstraction. If you are successful in confusing the network, you will confuse CDNs and as a result end users will get suboptimal performance and reliability for their connections.

@lucas-clemente

lucas-clemente Jan 25, 2017

Contributor

You're right of course, the client may not know about a network change.

Then I still think we should think more about making the first packet indistinguishable from the rest of the connection. Of course this will not strictly be possible (since you can always trial-decrypt it), but we can at least make it harder.

Maybe one possibility would be to randomly send public headers similar to the initial packet, e.g. set the version flag every Nth packet or so.

@MikeBishop

MikeBishop Nov 14, 2017

Contributor

@lucas-clemente said in #227 that using a fixed AEAD on the handshake packets would help address his concern here. We now do.

If we want to do more here, I would argue that further obfuscation can be a v2 feature. @ianswett has some thoughts.

@MikeBishop MikeBishop closed this Nov 14, 2017

@MikeBishop

MikeBishop Nov 14, 2017

Contributor

@ianswett suggested keeping this open until we can verify that we're not making this difficult in v1.

@MikeBishop MikeBishop reopened this Nov 14, 2017

@martinthomson

martinthomson Jan 15, 2018

Member

Leaving aside the migration that happens without the knowledge of the migrating endpoint (like NAT rebinding), this is tricky, if not impossible. And it's not a property of versions, but of our invariants.

The most obvious problem here is that a handshake uses the long header, which we don't use at any time afterwards. So we would need the migrating endpoint to understand the need to make its packets indistinguishable and to use the long header when it knows that it has migrated.

Firstly, as Lucas says, it would be trivial for a middlebox to apply the static packet protection keys for that version of QUIC to determine that the packet isn't protected with those keys. We know from experience with TLS that middleboxes do exactly that sort of thing.

So what do we gain by doing this? It seems like we just create a bunch of hoops for these middleboxes to jump through, none of which are especially challenging.
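
A minimal sketch of the long/short header distinction described above, seen from an on-path observer. It assumes only that the header form is carried in the most significant bit of the first byte, as in the invariants; the function names and the middlebox policy are hypothetical and purely illustrative.

```python
LONG_HEADER_BIT = 0x80  # most significant bit of the first byte: header form


def header_form(packet: bytes) -> str:
    """Classify a UDP payload as a long- or short-header QUIC packet."""
    if not packet:
        raise ValueError("empty packet")
    return "long" if packet[0] & LONG_HEADER_BIT else "short"


def looks_like_new_connection(first_packet_on_path: bytes) -> bool:
    # Hypothetical middlebox policy: a flow only counts as "new" if the
    # first packet seen on the path carries a long header.
    return header_form(first_packet_on_path) == "long"
```

A migrated connection keeps sending short-header packets, so its first packet on the new path fails this check unless the endpoint deliberately switches back to the long header.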

@ianswett

ianswett Jan 16, 2018

Contributor

I'm happy to close this if others are, but there was a lot of interest in it previously.

I agree that making this work is tricky, but I'm a bit concerned if this becomes impossible due to our invariants. My largest practical concern is that we'll end up in a situation where some middleboxes start doing DPI and end up intentionally or unintentionally breaking connection migration and/or multipath.

@mikkelfj

mikkelfj Jan 16, 2018

Contributor

Would a specialized 0-RTT handshake be suitable for migration?

@mirjak

mirjak Jan 18, 2018

Contributor

So I think for some middleboxes it would actually be beneficial if a migration looked like a new flow/handshake because, as the path may have changed, you might end up at a new middlebox that does not have state yet, and it would be easier to set up new state if you can identify a QUIC flow by its handshake pattern.

However, I agree that this might be hard as you not only would need to send long headers but actually run a whole TLS handshake. Or could it even be beneficial in some cases to re-do the TLS handshake if you are on a new path?

@huitema

huitema Feb 5, 2018

Contributor

Who says that we have to send fake packets?

The plausible solution is to treat the connection migration as a special form of session resume. The resume ticket would need to encode that this really is a continuation of the existing QUIC session, but that seems reasonably OK to engineer. These would have to be passed to the client somehow, maybe as "MIGRATION RESUME TOKEN" frames, replacing the "NEW CONNECTION ID" mechanism. The connection setup would provide an implicit path validation, removing the need for the PATH CHALLENGE and RESPONSE frames. Not sure whether the result would be more or less complex than the present mechanism, probably similar in complexity.

@larseggert

larseggert Feb 5, 2018

Member

For MPTCP, we had to make a new MPTCP subflow look like a new TCP connection, because otherwise middleboxes wouldn't pass it. My guess is that we should do this for QUIC too, so that middleboxes can stay as simple as possible, i.e., don't need to distinguish between a first and additional subflows. (Because some will get that wrong -> ossification.)

@kazuho

kazuho Feb 5, 2018

Contributor

If ossification is the concern, I think we should eliminate the distinction between long and short packet headers.

There is no need for INITIAL and HANDSHAKE to have their packet type and version number sent in cleartext. The fields can be sent as part of the AEAD-encrypted payload. We can always use short header, and when seeing an unidentified CID, do a trial decryption to see if it is a pre-1-RTT packet that contains a type and version number.

If privacy is a concern, we need to generate and emit a new ECDH keyshare every time the connection migrates. I am not sure if we want to do that.
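
The trial-decryption idea above could be sketched roughly as follows. This is only an illustration under loud assumptions: a truncated HMAC tag stands in for a real AEAD (it authenticates but does not encrypt), and `HANDSHAKE_KEY`, `CID_LEN`, the packet layout and every function name are hypothetical, not taken from any draft.

```python
import hashlib
import hmac
from typing import Dict, Optional

TAG_LEN = 16
CID_LEN = 8  # hypothetical fixed connection ID length
HANDSHAKE_KEY = b"hypothetical-static-handshake-key"  # stand-in for the version's fixed key


def seal(key: bytes, header: bytes, payload: bytes) -> bytes:
    """Toy stand-in for AEAD sealing: append a tag over header and payload."""
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()[:TAG_LEN]
    return payload + tag


def open_(key: bytes, header: bytes, sealed: bytes) -> Optional[bytes]:
    """Return the payload if the tag verifies under `key`, else None."""
    if len(sealed) < TAG_LEN:
        return None
    payload, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(key, header + payload, hashlib.sha256).digest()[:TAG_LEN]
    return payload if hmac.compare_digest(tag, expected) else None


def classify(packet: bytes, sessions: Dict[bytes, bytes]) -> str:
    """Every packet uses the same short layout: CID, then protected payload."""
    cid, body = packet[:CID_LEN], packet[CID_LEN:]
    key = sessions.get(cid)
    if key is not None:
        return "1-rtt" if open_(key, cid, body) is not None else "drop"
    # Unidentified CID: trial-decrypt with the static handshake key.
    inner = open_(HANDSHAKE_KEY, cid, body)
    if inner is None or len(inner) < 5:
        return "drop"
    # Packet type (1 byte) and version (4 bytes) travel inside the payload,
    # not in cleartext; a real receiver would parse inner[0] and inner[1:5].
    return "pre-1-rtt"
```

On the wire both kinds of packet share one layout in this sketch; only a receiver that tries the handshake key learns which is which, which is also exactly what an observer holding that static key can do.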

@mikkelfj

mikkelfj Feb 5, 2018

Contributor

I imagine CID can be used to reject or prioritise packets before full AEAD in some scenarios. For this to work, it is probably helpful to have visible headers. Scrambling could be done by extending packet number encryption to cover the entire header.

@kazuho

kazuho Feb 5, 2018

Contributor

@mikkelfj

> Scrambling could be done by extending packet number encryption to cover the entire header.

Yes. IMO that is also a good approach.

Anyways, I think that this might be a good opportunity to consider what properties we want to expose to on-path observers when starting to use a new path (i.e. 5-tuple).

With variants, packet number encryption and spin-bits, etc., it seems to me that we are starting to explicitly determine what should be observable, at the same time trying to scramble all other properties in order to avoid possible ossification. The same approach can be applied to how we use a path.

One option is to expose nothing. If that is the case, pre-1RTT and 1RTT packets should not be easily distinguishable. There could be various approaches in accomplishing that, as @mikkelfj and I have pointed out.

Another option is to have a signal that notifies the start of using a new path. In such a case, we should expose 1 bit that can be used to distinguish start-of-use vs. in-use. I am not sure if we need (or want) to expose the internals of the handshake (i.e. INITIAL, HANDSHAKE, RETRY).

My tendency goes to exposing nothing. The primary reason is that switching to a new path can happen due to NAT rebinding. In such a case, it is impossible for a client to mark the first packet that is sent using a new path differently from other packets.

Therefore, I believe that we should merge long packet header and short packet header so that pre-1RTT packets would not be easily distinguishable from 1-RTT packets, if ossification is a concern.

@mikkelfj

mikkelfj Feb 5, 2018

Contributor

> Another option is to have a signal that notifies the start of using a new path. In such case, we should expose 1 bit that can be used to distinguish start-of-use vs. in-use. I am not sure if we need (or want to) expose the internals of the handshake (i.e. INITIAL, HANDSHAKE, RETRY).

This somewhat relates to the retry concern of asymmetric CIDs, #1089.

@MikeBishop

MikeBishop Jul 10, 2018

Contributor

This proposes a change to the invariants. By definition, then, it can't be v2. My feeling is that we've sufficiently stabilized our choices in the invariants and we should close this with no further action.

@MikeBishop MikeBishop removed the quicv2 label Jul 10, 2018

@kazuho

kazuho Jul 11, 2018

Contributor

> This proposes a change to the invariants. By definition, then, it can't be v2. My feeling is that we've sufficiently stabilized our choices in the invariants and we should close this with no further action.

+1.

FWIW, my understanding is that the Invariants draft does not require long header packets to be used for connection establishment. Therefore, my understanding is that we will still have the chance to fix the issue in v2 or later versions, by starting to send everything in short packets.

@MikeBishop

MikeBishop Jul 11, 2018

Contributor

Actually, that is in the invariants:

> A QUIC endpoint that receives a packet with a long header and a version it either does not understand or does not support might send a Version Negotiation packet in response. Packets with a short header do not trigger version negotiation and are always associated with an existing connection.

It's called out that long headers might be used post-handshake by some versions of QUIC, but explicitly stated that short headers won't be used during connection establishment. So if we want to change this, we need to change it now. And I haven't heard any appetite to swallow that change at this point in the process.
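
As a rough illustration of the quoted invariant, an endpoint's receive path might branch as in the sketch below. It assumes the invariant layout of a long header (header form bit in the first byte, followed by a 32-bit version); `SUPPORTED_VERSIONS`, the return labels and the `known_connection` flag are hypothetical names chosen for this example.

```python
SUPPORTED_VERSIONS = {0x00000001}  # hypothetical set of versions this endpoint speaks


def dispatch(packet: bytes, known_connection: bool) -> str:
    """Decide what to do with an incoming datagram, per the quoted invariant."""
    if not packet:
        return "drop"
    if packet[0] & 0x80:  # long header: a version field follows the first byte
        if len(packet) < 5:
            return "drop"
        version = int.from_bytes(packet[1:5], "big")
        if version == 0:
            # Version 0 marks a Version Negotiation packet; never answer it with another.
            return "received-version-negotiation"
        if version not in SUPPORTED_VERSIONS:
            return "send-version-negotiation"
        return "process-long-header"
    # Short header: never triggers version negotiation and is only
    # meaningful for a connection that already exists.
    return "process-short-header" if known_connection else "drop"
```

Under this rule a connection cannot begin with a short-header packet, which is why sending everything in short packets would require changing the invariants rather than just defining a new version.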

@kazuho

kazuho Jul 11, 2018

Contributor

Thank you for pointing that out. I had missed that. Created PR #1550.

@martinthomson

martinthomson Jul 31, 2018

Member

This is in the really hard bucket right now. Other than genuinely doing a handshake (maybe with a PSK), this is easily detectable, if not trivially so. Given the late stage of the process, unless a serious proposal emerges, this is not likely to change.

It's also questionable as to whether this is something we want to do. The QUIC handshake tends to leak information in ways that a connection migration currently does not.
