Description of the use of Preferred Address is unclear #3353
This text precedes a clear definition of handshake confirmed, so I strongly agree with your first suggested clarification. I can imagine cases when the CID should be based on the path, so I'm less clear on the correct behavior for the second point. By definition, if the CID provided in the transport param has been retired it should not be used to initiate path validation. But I'm unclear if the preferred behavior is don't migrate if you haven't already initiated migration or to use one of the newer CIDs.
@ianswett Thank you for your comments.
First of all, let me state that we do not need to forbid such a design. However, in a design that expects CIDs to be specific to the server address being used, a server cannot issue a new CID until the migration to the preferred address completes. This is because if a server sends a NCID frame from the original server address before the client completes migration to the preferred address, the server cannot tell if the client would use that issued CID on the original path (this happens when the client fails to migrate to the preferred address), or if it would use that CID on the migrated path. Therefore, this issue does not have any effect on such a design. The question at stake is the client behavior we want to recommend when the server sends a new CID (or retires CIDs) before migration to the preferred address completes. My view is that TP.preferred_address is a mechanism for specifying an alternative IP address that "happens" to also carry a new CID, so that the client can always have an unused CID in hand when it initiates migration to the preferred address. It is actually simple to implement it that way. What you would do is this:
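A minimal sketch of that pool-based client approach, assuming a hypothetical `ClientCidPool` class (all names here are invented for illustration, not from any implementation):

```python
# Hypothetical sketch: the client keeps a single pool of unused CIDs and
# treats the CID carried in TP.preferred_address like any other issued CID.
from collections import deque


class ClientCidPool:
    def __init__(self):
        self.active = None
        self.unused = deque()  # CIDs issued by the server, not yet used

    def on_handshake_cid(self, cid):
        # CID chosen by the server during the handshake; already in use.
        self.active = cid

    def on_preferred_address_tp(self, tp_cid):
        # The CID from TP.preferred_address "happens" to arrive alongside
        # an alternative IP address; it simply joins the pool.
        self.unused.append(tp_cid)

    def on_new_connection_id(self, cid):
        self.unused.append(cid)

    def next_cid_for_migration(self):
        # Any unused CID may be used to probe a new path, including the
        # preferred address; no special case for the TP-provided CID.
        return self.unused.popleft()
```

With a server that issues address-specific CIDs, the pool simply never holds more than one unused CID, so the same code covers both deployments.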
As stated above, such a design would work perfectly fine with servers issuing CIDs specific to server addresses, because a client would have only one unused CID to pick from when talking with such a server. The design is also simpler than having special case code that associates preferred_address.IP_address and preferred_address.CID, and handles retirement cleanly.
It's a little trickier for servers. Let's say, hypothetically, that the server knows it's using a different CID pool on the alternative address; it picks one for the public address in the handshake, then gives one from the preferred address's CID pool in the TP. It doesn't send NCID frames right away. That's fine. But let's say the client doesn't migrate successfully. At some point, the server wants to start issuing the client new CIDs for one or the other of the paths, but QUICv1 doesn't have a way to identify path-bound CIDs. So the server has to give up on the preferred address if it can't handle CIDs arriving on either interface. There isn't a defined cut-off for when the server should give up waiting for the migration attempt, so the best way to do this is to put Retire Prior To = 2 in the NCID frames. It seems cleaner initially to special-case the CID from the Preferred Address, and say that you have to use that specific CID to do the migration. But even with that code, the server has to do the same thing: wait to see whether the client migrates or not before issuing new CIDs. I think the piece we're missing is that when a client declares a migration unsuccessful, it MUST/SHOULD retire the CID it used to attempt the migration. That gives the server a clear signal, succeed or fail; it can then proceed to issue CIDs for the surviving server address.
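The server-side gating described here could be sketched roughly as follows. This is an illustration only: the function name, the callback, and the assumption that sequence 0 is the handshake CID and sequence 1 is the TP.preferred_address CID are all invented for the example.

```python
# Hypothetical sketch: the server withholds NEW_CONNECTION_ID frames until
# the client's RETIRE_CONNECTION_ID reveals which server address survived.
def on_retire_connection_id(seq_retired, issue_ncid):
    """seq_retired: sequence number from a RETIRE_CONNECTION_ID frame.
    Assumed numbering: 0 = handshake CID (original address),
    1 = CID carried in TP.preferred_address."""
    if seq_retired == 0:
        surviving = "preferred"  # client migrated successfully
    elif seq_retired == 1:
        surviving = "original"   # client declared the migration failed
    else:
        return None  # later CIDs carry no path signal in this scheme
    # Issue fresh CIDs valid on the surviving address, and set
    # Retire Prior To = 2 so both early CIDs get cleaned up.
    issue_ncid(address=surviving, retire_prior_to=2)
    return surviving
```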
Good suggestion on retiring the CID immediately if the migration was unsuccessful. That does provide a clearer signal than the ambiguity we have today.
That's a keen observation. I personally favor the idea of using the RETIRE_CONNECTION_ID frame to indicate whether the client has finished migration to the preferred address or will continue using the original address. However, such a change would require every client to recognize TP.preferred_address, in the sense that even a client lacking support for intentional migration would be required to signal the retirement of the CID associated with TP.preferred_address. Until now, I think we have assumed that a client that does not implement migration (or migration to the preferred address) would simply ignore this transport parameter. We need to be clear about that.
I think this is only particularly relevant to SPA, since that's the only time that a server is expecting a migration.
Definitely worth being clear about what to do if you also disable migration, but implementations that disable migration still need 99% of the rest of the machinery for other purposes. Requiring them to immediately send a retirement for the TP.preferred_address CID seems okay, and is slightly less convoluted than having some "ignore this if that" statement around the whole thing.
As requested, I'm forwarding this comment regarding:
This implies a semantic to the retirement of connection IDs that is not already defined. It says that in addition to releasing the resource, the server can say definitively that those other network paths won't be used. But this is misleading because retiring CID 1 does not prevent CID 4 from being used on that path. Nothing says that the connection IDs sent in NEW_CONNECTION_ID have to be used on one or other path. Better to keep the requirement where it is: don't migrate back if you use a preferred address. Yes, that means that servers can't be sure of behaviour of clients here, but they can use the destination address to confirm acceptance of the preferred address or not. That should suffice.
I think the point being missed here is that, under a given condition, the server can determine the path being used for CID4. If CID4 is issued before the client chooses a path, it is true that the server would not be able to determine the path on which CID4 will be used. But if the server withholds NCID frames until the client chooses the path and notifies its choice by retiring one of the first two CIDs, then the server can tell for sure: either the first or the second CID will be retired first, and that tells the server which path the client has chosen. As we agree, we already state that a client cannot come back to the original address once it migrates to the preferred address. The benefit of requiring such retirement is that the server can issue CIDs specific to the server address, as pointed out by #3353 (comment). Assuming that such server deployments are within our design scope, I think requiring such retirement is not a bad idea.
If the migration is unsuccessful, the client must never try the preferred address again (MUST continue sending all future packets....). So as @kazuho says, the server can issue address specific (not path-specific) CIDs once it knows whether the client's migration succeeded or failed. But @martinthomson is correct that this is implicitly sending a signal about failed migration. If the server sees a CID retired, apparently never used, it might be able to infer that the CID was used for a failed migration, but this case is unique in that the server needs to take some action for a failed migration and uses this signal to do it.
So far, we haven't tied the server's knowledge of the client's address to anything about a CID, and a lot of the attacks against migration involve an actor on the network changing the path and address from which the server thinks the packets are coming, even if the client does not actually change anything. I haven't thought this all the way through yet, but it seems as though trying to have the server issue "address-specific" CIDs leads us into some potentially tricky territory.
@erickinnear made me realize the nature of the hazard here. It seems fairly natural for a server to bind connection IDs to the current socket address. That is part of why we have new connection IDs attached to the preferred address and the forced retirement. After all, if the server uses a different target address, the hash at a load balancer could change and cause packets to be badly routed. Not including the target address in the hash would work, but it would force connection IDs to be bigger. But this all assumes that you never expect the server to migrate. If the server ever wanted to migrate, then the client is stuck with a bunch of unusable connection IDs for its own migrations. We decided that only clients can initiate migration in this version, which might mean that we don't need to worry about this case. We decided not to allow server migration primarily to avoid having to deal with the complex problems that ICE handles. But that was only a lack of mechanisms, not a structural constraint. Now it seems like we have a structural constraint that would prevent server migration. At a minimum, it seems like we should probably acknowledge this constraint somehow.
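To make the load-balancer concern concrete, here is a toy sketch of a balancer that hashes the CID together with the target server address. Everything here (function, addresses, backend names) is invented for illustration; real balancers use more sophisticated routing.

```python
# Toy illustration: if the routing hash covers (CID, server address),
# the same CID can land on a different backend once the client switches
# to the preferred address, so CIDs become effectively address-bound.
import hashlib


def route(cid: bytes, server_addr: str, backends: list) -> str:
    h = hashlib.sha256(cid + server_addr.encode()).digest()
    return backends[h[0] % len(backends)]


backends = ["backend-a", "backend-b", "backend-c"]
cid = b"\x01\x02\x03\x04"
# route(cid, public_addr, ...) and route(cid, preferred_addr, ...) may
# differ, which is the misrouting hazard described above. Dropping the
# address from the hash avoids this, at the cost of larger CIDs that
# encode the backend directly.
```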
The summary of our discussion over dinner last night is that there are various mechanisms a server could employ, but they all depend on one key point: Both endpoints are under joint control, because they're cooperating to handle the connection. Therefore, it's possible to generate a CID that each endpoint will be able to work with. (There are various approaches to generating such a CID, which are implementation-specific; one or more approaches might be described in the QUIC-LB draft.) That means we don't need to separate CIDs by endpoint; if the migration is successful, the server could choose to issue CIDs that aren't valid on the handshake endpoint, which is now out of the picture.
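One way to picture the "joint control" point: endpoints that share a secret can each mint CIDs the other (and the load balancer) can decode. The format below is entirely invented, loosely in the spirit of the QUIC-LB draft but not taken from it; key, lengths, and field layout are assumptions.

```python
# Hypothetical sketch: cooperating endpoints share a routing key, so any
# of them can mint a CID embedding a server ID plus an HMAC tag, and any
# of them can verify and route on it.
import hashlib
import hmac
import os

ROUTING_KEY = b"shared-between-endpoints"  # assumption: provisioned out of band


def mint_cid(server_id: int, length: int = 8) -> bytes:
    # 1-byte server ID + 2 random bytes, authenticated by a truncated tag.
    body = bytes([server_id]) + os.urandom(2)
    tag = hmac.new(ROUTING_KEY, body, hashlib.sha256).digest()[: length - 3]
    return body + tag


def decode_server_id(cid: bytes):
    body, tag = cid[:3], cid[3:]
    expect = hmac.new(ROUTING_KEY, body, hashlib.sha256).digest()[: len(tag)]
    return body[0] if hmac.compare_digest(tag, expect) else None
```

Because either endpoint can mint valid CIDs, nothing forces CIDs to be separated per endpoint; after a successful migration the server can simply stop minting CIDs that route to the handshake endpoint.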
Seems like a great task for V2: define a policy for handling CIDs. Or we could try to do that in V1 and ship in 2021.
Discussed in ZRH. Proposed resolution is to close with no action. New Editorial issue to be opened to explain the intended use of |
An important point to capture here is that the server is responsible for ensuring that the connection IDs are not bound to one or the other address at this stage (if migration is successful, later ones might be). Thus, the connection ID in the transport parameter is no different from any other connection ID; it only exists here because the new address is unusable without more connection IDs being available.
Please do note that this issue points out other editorial concerns (see the original problem statement). I'd prefer to see them addressed too (we have WIP text in #3354).
My hope was that the changes in #3354 would help with the editorial parts of the resolution we agreed in Zurich. However, that seems to be stalled. I am going to mark this as proposal ready and will open an editorial issue to track the remaining work on clarifying the text. To provide abundant clarity: the proposed resolution to the design issue is to make no substantive changes to the protocol. |
At the moment, section 9.6.1 states:
I think these paragraphs have two issues: