Allow connection reuse for request without credentials when TLS client auth is not in use #341

Open
annevk opened this Issue Jul 25, 2016 · 25 comments

Comments

@annevk
Member

annevk commented Jul 25, 2016

The connection separation we have today is the result of TLS client auth which is a property of the connection, rather than the request.

The argument has been made that we should simply tag connections with respect to whether TLS client auth is used. If it is, a request without credentials cannot use that connection.

If it is not, that connection should be up for reuse by both requests with credentials and without.

We might have to cater for connections being made as a result of a request without credentials. If you later do a request with credentials, it might not be able to reuse that connection, since that would prevent TLS client auth? (Do we know whether the server tried to use TLS client auth even if the client doesn't want it? If so, we might be able to optimize this even more.)
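
To make the tagging idea concrete, here is a minimal sketch in Python. The `Connection` class, its `uses_tls_client_auth` flag, and the `can_reuse` function are hypothetical names chosen for illustration; none of them come from Fetch or from any browser's network stack.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical connection record; field names are illustrative only."""
    origin: str
    uses_tls_client_auth: bool


def can_reuse(conn: Connection, origin: str, with_credentials: bool) -> bool:
    """Return True if a request to `origin` may go over `conn`."""
    if conn.origin != origin:
        return False
    if conn.uses_tls_client_auth:
        # A connection authenticated with TLS client auth must not carry
        # requests without credentials.
        return with_credentials
    # No TLS client auth in use: open to requests with and without credentials.
    return True
```

The sketch deliberately leaves out the open question above: whether a connection first opened for a request without credentials can later serve a request that needs TLS client auth.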

@sleevi

sleevi commented Jul 25, 2016

NTLM and Kerberos are also connection-level auth methods, and Microsoft has expressed repeated interest in exploring a TLS-CCA 'like' HTTP method. So the need for socket independence is not just limited to TLS-CCA.

@annevk

Member

annevk commented Feb 27, 2017

Can the client detect that NTLM and Kerberos are used?

As for TLS, perhaps we can let the server explicitly opt-out of these connection-sharing-preventing difficulties through an extension? That way CDNs can share connections for fonts and HTML resources.

@sleevi

sleevi commented Feb 27, 2017

@annevk Can you clarify what you mean about client? Do you mean JS in the page or the receiving server?

And can you clarify whether you're talking TLS extension, fetch extension, or something else?

I do wonder if these suggestions are perhaps thinking about it inverted, in part because we discussed these pools in context with @igrigorik as part of Resource Hints, and the logic for separation was to ensure that a non-credentialed request is not sent over a credentialed connection. I think the suggestion that was technically sound (just complex to implement) was to treat the pools as a 'common' pool for purposes of preconnect, and then assign connections to 'credentialed' or 'non-credentialed' based on both how the connection was established (e.g. if it sent credentials a priori) and the disposition of the first request received over the connection.

Your remark about CDNs sharing connections for fonts and HTML resources makes me think the priority of constituencies is wrong - we don't share uncredentialed requests with credentialed requests because our privacy team does not want these to be linked (ignoring all the other ways that they can already be linked). So I don't think we would consider allowing a way for the server to say to send non-credentialed requests over credentialed connections, because that puts the server over the user.
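
The 'common pool' suggestion above (preconnect into a shared pool, then bind each connection on first use) could be sketched roughly like this. It is only an illustration under assumptions: `Pool`, `PooledConnection`, `sent_credentials_at_setup`, and `try_dispatch` are hypothetical names, not Chrome's implementation.

```python
from enum import Enum


class Pool(Enum):
    COMMON = "common"
    CREDENTIALED = "credentialed"
    NON_CREDENTIALED = "non-credentialed"


class PooledConnection:
    """Hypothetical preconnected socket that is bound to a pool lazily."""

    def __init__(self, origin: str, sent_credentials_at_setup: bool = False):
        self.origin = origin
        # A connection that sent credentials while being established
        # (for example a TLS client certificate) is credentialed from the start.
        self.pool = Pool.CREDENTIALED if sent_credentials_at_setup else Pool.COMMON

    def try_dispatch(self, with_credentials: bool) -> bool:
        """Bind on first use; afterwards only accept matching requests."""
        wanted = Pool.CREDENTIALED if with_credentials else Pool.NON_CREDENTIALED
        if self.pool is Pool.COMMON:
            # The disposition of the first request decides the pool.
            self.pool = wanted
            return True
        return self.pool is wanted
```

With this shape a preconnect would not need to know in advance which kind of request will use the socket; the cost is the extra state transition in the pool logic.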

@annevk

Member

annevk commented Feb 27, 2017

By client I mean the browser engine. And the extension I was thinking of would be to TLS, though open to other suggestions.

With regards to priority of constituencies, I guess I would only argue for sharing if the same document was responsible for both requests. That's the case that worries folks. Where images from the CDN go over one connection, and fonts from the CDN go over another, just because of different defaults.

@sleevi

sleevi commented Feb 27, 2017

@annevk Right, I can understand why for same-origin, non-credentialed loads, this is not ideal. If I understand your proposal correctly, the idea is that it would be safe to send same-origin, non-credentialed loads on the same underlying transport iff that transport did not bear ambient authority?

If we implemented that, my thought on the risks would be:

  • Cross-origin, non-credentialed loads would be distinguishable
    • We still need cross-origin loads to go over a distinct connection, because of our privacy stance with respect to things like 3P cookie blocking. That is, if you had a same-origin load for google.com, and sent a cookie over that connection, then even though a 3P request for google.com might be both HTTP-credential-less and transport-credential-less, you can associate that cookie with the transport connection, ergo undermining some of the intent of 3P cookie blocking.
  • Between socket late binding (as implemented in Chrome) and H/2's multiple streams, is there a risk of TOCTOU issues in which the socket is assigned because it's 'untainted', but then 'tainted' before the request is sent?
    • I'm not sure how NTLM/Kerberos/Negotiate behave in an H/2 world, and this might already be addressed in H/2
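
One way to picture the TOCTOU concern is a connection whose 'tainted' state is checked and acted on under a single lock, so it cannot change between the check and the send. This is a sketch under assumptions, not a description of any real network stack: `SharedConnection` and its methods are hypothetical, and it only covers ordering among in-flight requests, not the linkability of requests sent earlier on the same connection.

```python
import threading


class SharedConnection:
    """Hypothetical multiplexed connection that can become 'tainted'."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._tainted = False
        self._uncredentialed_streams = 0

    def try_send_uncredentialed(self) -> bool:
        # Check and commit under one lock, so the connection cannot be
        # tainted between the check and the send.
        with self._lock:
            if self._tainted:
                return False
            self._uncredentialed_streams += 1
            return True

    def finish_uncredentialed(self) -> None:
        with self._lock:
            self._uncredentialed_streams -= 1

    def try_taint(self) -> bool:
        # Refuse to add ambient authority while uncredentialed streams
        # are still in flight on this connection.
        with self._lock:
            if self._uncredentialed_streams > 0:
                return False
            self._tainted = True
            return True
```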
@annevk

Member

annevk commented Feb 27, 2017

What I meant was an example.com document loading cdn.example resources, both with (images) and without (fonts) credentials.

I understand better now that this gets harder if you involve multiple browsing contexts. How does that work today with 3P cookie blocking? If I load example.com in one tab, and another tab that is not example.com loads an image from example.com, will those use separate connections if 3P cookie blocking is enabled? What if two tabs that are not example.com each load an image from example.com?

@sleevi

sleevi commented Feb 27, 2017

@annevk OK, so you're talking about H/2 coalescing, not same-origin resources, just to confirm?

The story of 3P cookie blocking is... complicated... and I suspect @mikewest can speak more to it. My most recent examination of the code was that when a request is identified as a 3P one for which cookies should be blocked, then it (effectively) ends up as an uncredentialed request. As a consequence, it goes to our 'uncredentialed' pool for dispatch over the network. However, this is complicated by some of our renderer and memory cache behaviours, so I'm going to narrowly focus on the "Resource wasn't cached" scenario (in memory or on disk) for these examples:

  • example.com loading cdn.example.com resources

    • If the request is no-credentials (fonts), then it will be dispatched to a dedicated socket pool for non-credentialed requests, always.
      • If the socket pool has an already-established connection to cdn.example.com, it'll use that connection. This includes H/2 connections asserting multiple origin identities.
      • Otherwise, it goes off to a new connection.
    • If the request is credentialed, then it will be dispatched to the credentialed socket pool.
      • Because the (current) connection is credentialed, since it loaded example.com, then if the H/2 connection also asserts origin identity for cdn.example.com, the request will be dispatched over the current connection.
      • Otherwise (for example, the connection was terminated after the example.com resource loaded), a new connection to cdn.example.com will be established.
  • example.com loading tracker.example, which has cookies associated but triggers the 3P-cookie-blocking:

    • If it meets our criteria as a 3P load / our 3P-cookie blocker says they're not equivalent, then the underlying request is dispatched to a non-credentialed socket pool.
      • While this means that such 3P loads will also never have ambient authority associated with them (because they go through the dedicated 'non-credentialed' pool), this is the intent/desire from a privacy perspective.
    • If it didn't trigger 3P cookie blocking, then it would be dispatched to the 'credentialed' pool, similar to the cdn.example case above.
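
The dispatch rules in the examples above can be condensed into a small sketch. It assumes a simplified model with exactly two pools keyed by host and ignores H/2 coalescing and caching; `choose_pool` and `SocketPools` are hypothetical names, not Chrome code.

```python
from typing import Set, Tuple


def choose_pool(with_credentials: bool, third_party_cookies_blocked: bool) -> str:
    """Pick the socket pool for a request in this simplified model."""
    if not with_credentials or third_party_cookies_blocked:
        # Requests without credentials, and 3P loads whose cookies are
        # blocked, both end up in the non-credentialed pool.
        return "non-credentialed"
    return "credentialed"


class SocketPools:
    """Tracks which (pool, host) pairs already have an open connection."""

    def __init__(self) -> None:
        self._open: Set[Tuple[str, str]] = set()

    def dispatch(self, host: str, with_credentials: bool,
                 third_party_cookies_blocked: bool = False) -> str:
        pool = choose_pool(with_credentials, third_party_cookies_blocked)
        key = (pool, host)
        if key in self._open:
            return f"reuse {pool}"
        self._open.add(key)
        return f"new {pool}"
```

In this model an image (credentialed) and a font (non-credentialed) from cdn.example.com always open two separate connections, which is the cost this issue is about.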
@annevk

Member

annevk commented Feb 27, 2017

I wasn't necessarily even talking about H/2 connection coalescing. More whether that example could use two rather than three connections and whether that would leak more somehow. But one or two or three is also interesting to consider of course.

As for tracker.example, I don't quite understand the connection reuse for the non-credentialed socket pool. It seems you could then track users across origin visits by carefully keeping track of the connection or maybe TLS session identifiers (assuming you're embedded often enough). I guess the tradeoff is that it's only for a session and not as long as a cookie.

@sleevi

sleevi commented Feb 27, 2017

@annevk Yup. I'm not defending it as a good policy, if only because I'm not sure I agree with it given http://www.chromium.org/Home/chromium-security/client-identification-mechanisms

Basically, any request that explicitly opts to not include cookies goes over a connection guaranteed to have never sent cookies or authentication information (modulo any bugs), while any request that 'could' carry or 'has' cookies or authentication information goes over a different connection.

@mikewest

Member

mikewest commented Apr 20, 2017

Sorry I missed @sleevi's ping earlier. I'm willing to believe that we're making the wrong tradeoff here, and I think there's some justification to considering the implicit correlation of socket connections outside the scope of "credentials" explicitly sent along with requests. There's a bit of a grey area here, since we consider connection-level concepts like channel ID, token binding, TLS session information, etc. to be fairly explicit cookie-like things, but it's possible we're erring too far on the side of caution.

@battre and @msramek from Chrome's privacy team might have more informed opinions.

@jakearchibald

Collaborator

jakearchibald commented Apr 20, 2017

@sleevi

> Because the (current) connection is credentialed, since it loaded example.com, then if the H/2 connection also asserts origin identity for cdn.example.com, the request will be dispatched over the current connection.

Does this involve a DNS lookup for cdn.example.com to ensure it points to the same IP?

@sleevi

sleevi commented Apr 20, 2017

@jakearchibald Depends on whether it's the first connection or not :) Fetch covers the answer to that (which is yes) =)

@annevk

Member

annevk commented May 8, 2017

I finally started a discussion with other folks at Mozilla: https://groups.google.com/d/topic/mozilla.dev.tech.network/glqron0mRko/discussion. Updates from the Google side are still appreciated, of course. I wonder if @johnwilander or @hober could shed light on Apple's thoughts here, and maybe @travisleithead on those of Microsoft?

@annevk

Member

annevk commented May 9, 2017

Firefox might change what it does here: https://bugzilla.mozilla.org/show_bug.cgi?id=1363284.

@mikewest

Member

mikewest commented May 9, 2017

I've pointed our privacy folks at that bug and thread, and asked them to comment. Thanks for following up on this, @annevk!

@yoavweiss

Collaborator

yoavweiss commented May 12, 2017

Just to comment a bit on the pain that this would solve:

  1. No need for crossorigin attributes on <link rel=preconnect>
  2. Enable the use of H2 pushed no-credentials resources (currently often pushed on the "wrong" H2 connection)
  3. Enable improved priority handling of no-credentials resources, preventing their contention with higher-priority, credentialed resources.

While 1 & 2 can potentially be solved by (significantly) changing the implementations' handling of connection pools as well as the H2 push cache, 3 seems inherent to the use of multiple connections.
Such bandwidth contention is already something I see often, and it will become more common as more resources (e.g. ES6 modules) are added as no-credentials.

@annevk

Member

annevk commented May 12, 2017

The one thing I have not seen addressed anywhere yet is whether multiple connections still have advantages over a single connection as long as TCP is used. It seems that there would be less head-of-line blocking, or is the overhead of an additional connection really high somehow?

@annevk

Member

annevk commented May 12, 2017

It also seems that #325 will lead us straight back to more authenticated connections. Which seems problematic?

@igrigorik

Member

igrigorik commented May 12, 2017

> The one thing I have not seen addressed anywhere yet is whether multiple connections still have advantages over a single connection as long as TCP is used. It seems that there would be less head-of-line blocking, or is the overhead of an additional connection really high somehow?

FWIW, I think this is an orthogonal discussion and shouldn't be a factor in this context.

@achristensen07

achristensen07 commented May 23, 2017

We reuse connections basically whenever we consider it advantageous to do so. If someone were to observe our networking stack, they might conclude that connections are dropped seemingly randomly. In reality, complicated heuristics are used to determine when to keep a connection open based on how long it's been unused, how many of various resources we have left, etc. It is unfortunate that this leaks some information across domains. Work is also currently being done to leak less such information. We should work towards a better solution in the long term, but it will be especially hard without compromising performance too much.

@cramforce

cramforce commented Aug 14, 2017

[Screenshot: 2017-08-14 at 4:34:24 pm]

See screenshot for the real-world impact of Chrome's current behavior. The font downloads in rows 8-11 are delayed by that 1 second connection setup (simulated 3G).

@Drawaes

Drawaes commented Aug 26, 2017

My nagging thought here is, with the coming of TLS 1.3, will export material be available to clients in some programmable fashion? If so would this allow some "other" code that is reusing the connection to gain access to material that is potentially used to key/protect secret data? Would it basically make the "connection based" export key material "public" to anything running in the context?

@pmeenan

pmeenan commented Sep 7, 2017

I know we're not explicitly factoring H2 into this discussion but the other problem it causes is that the resources can no longer be prioritized against each other using H2 priorities. The credentialed and anonymous requests end up competing with each other for bandwidth unless the browser holds requests back (even in the case where connections are coalesced and the site is all served from a single origin).

@jeisinger

Member

jeisinger commented Sep 7, 2017

IIRC the reason we introduced this behavior is that we consider part of the connection state (ChannelID) a cookie, and wanted to ensure that blocking cookies (in the broad sense as implemented in Chrome) also blocks those.

Using two socket pools for credentialed and uncredentialed requests was a trade-off between teaching the network stack about per-site cookie settings and privacy requirements.

I assume that the measurements cited here were done in a browser without cookie blocking configured, so maybe we should revisit this decision. @sleevi maybe we should move that discussion to a crbug, as this is really more about Chrome's implementation than the actual spec in the end?

@sleevi

sleevi commented Sep 7, 2017
