Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transport: new stream with actual server name #5748

Merged
merged 10 commits into from
Nov 18, 2022
Merged

transport: new stream with actual server name #5748

merged 10 commits into from
Nov 18, 2022

Conversation

holdno
Copy link
Contributor

@holdno holdno commented Oct 28, 2022

When the resolver-address backend is an HAProxy, we invoked gRPC-methods and got 'code = Internal desc = stream terminated by RST_STREAM with error code: PROTOCOL_ERROR', the reason is header value of ': authority' is not matched in HAProxy.

Example: resolver scheme is 'xxx:///serviceName/methodName', resolver return an HAProxy address like 'grpc-proxy.xxx.com:80', but current transport used to target 'serviceName/methodName' as ':authority' header value, when do invoke, HAProxy is not known which endpoint we need to request.

RELEASE NOTES:

  • transport: ensure value of :authority header matches server name used in TLS handshake when the latter is overridden by the name resolver

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 28, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: holdno / name: wby (40ec788)

@dfawley
Copy link
Member

dfawley commented Oct 28, 2022

Could you take this, please, @easwars?

This seems to implement what was mentioned here (and referenced in a couple follow-up comments).

See also #5360 and #5361, which are related.

@easwars
Copy link
Contributor

easwars commented Nov 3, 2022

I agree with the spirit of these changes, but I'd like to propose an alternate implementation.

A little background:

  • cc.authority contains the authority value for the client channel. This is set via a call to determineAuthority().
  • Once, the authority is determined for the client channel, the only way to override it is by the resolver setting resolver.Address.ServerName.
    • And currently, if this authority value is being overridden by the resolver, we only change the value used for server authentication in the TLS handshake (because the resolver.Address returned by the resolver is used here) and do not change the :authority header during HTTP/2 stream creation (because cc.authority is used here).

My proposed fix is as follows:

  • Add a field to the http2Client struct to store the address passed to it as part of NewClientTransport. This is similar to what you have done already. But rename this field to address, instead of usedAddress.
  • Modify the NewStream method on the http2Client to do the following:
    • if http2Client.address.ServerName is not the same as callHdr.Host:
      • make a copy of the passed in callHdr, overwrite the Host field and pass it to createHeaderFields
  • Add a test for this scenario.

The only reason I want to make a copy of the callHdr before modifying the Host field is because I'm not super sure about any other possible downstream effects. @dfawley : Do you think it might be OK to modify the passed in callHdr.

Making the change in NewStream ensures that all code paths which create a stream will use the correct :authority header. Your change as it stands now only changes the code path from newClientStreamWithParams. It does not handle the code path from newnonRetryClientStream.

@holdno
Copy link
Contributor Author

holdno commented Nov 3, 2022

thx for your advice @easwars , I agree that the field name is better to use address , and I will amend it accordingly. In the terms of copy of callHdr, I would like to know @dfawley 's opinion as well.

…r matches server name used in TLS handshake when the latter is overridden by the name resolver
internal/transport/http2_client.go Outdated Show resolved Hide resolved
internal/transport/http2_client.go Outdated Show resolved Hide resolved
internal/transport/http2_client.go Outdated Show resolved Hide resolved
internal/transport/http2_client.go Outdated Show resolved Hide resolved
internal/transport/transport_test.go Outdated Show resolved Hide resolved
internal/transport/transport_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look mostly good to me. Just some minor nits.

test/authority_test.go Outdated Show resolved Hide resolved
test/authority_test.go Outdated Show resolved Hide resolved
test/authority_test.go Outdated Show resolved Hide resolved
test/authority_test.go Outdated Show resolved Hide resolved
Comment on lines 711 to 719
dupCallHdr := *callHdr
// ServerName field of the resolver returned address takes precedence over
// Host field of CallHdr to determine the :authority header. This is because,
// the ServerName field takes precedence for server authentication during
// TLS handshake, and the :authority header should match the value used
// for server authentication.
if t.address.ServerName != "" {
dupCallHdr.Host = t.address.ServerName
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dfawley : I asked @holdno to make this copy here, because I wasn't completely sure of the downstream effects of changing the CallHdr.Host field. Could you please take a look. Thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's the right call; otherwise if you reuse callHdr across multiple attempts for one RPC, and one transport has an authority set and another doesn't, then you could end up with the wrong behavior.

Can we avoid making this copy, though, if there is no change?

if t.address.ServerName != "" {
	newCallHdr := *callHdr
	newCallHdr.Host = t.address.ServerName
	callHdr = &newCallHdr
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for your advice, it looks better.

Copy link
Member

@dfawley dfawley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution!

@dfawley dfawley merged commit 0238b6e into grpc:master Nov 18, 2022
jronak pushed a commit to jronak/grpc-go that referenced this pull request Nov 21, 2022
ahrtr added a commit to ahrtr/etcd that referenced this pull request Jan 17, 2023
ServerName field of the resolver returned address takes precedence
over Host field of CallHdr to determine the :authority header.
Refer to grpc/grpc-go#5748

Signed-off-by: Benjamin Wang <wachao@vmware.com>
@serathius
Copy link

Hey, this change popped up in etcd e2e testing as upgrading grpc to v1.52 changed authority sent by etcd client https://github.com/etcd-io/etcd/pull/15131/files#r1071931813.

Context: Etcd uses grpc resolver to implement loadbalancing and we had issues in grpc < v1.51 as authority didn't work with resolver. We implemented a hack to fix this #4717.

The difference we noticed is fact that authority header sent in v1.52 doesn't include port even though URI contains it. This is described in https://www.rfc-editor.org/rfc/rfc3986#section-3.2. For http://127.0.0.1:80 I would expect authority to be 127.0.0.1:80.

I'm not an expert in this area, so please advise what should be the correct authority header.

@dfawley
Copy link
Member

dfawley commented Jan 17, 2023

@serathius what are you setting resolver.Address.ServerName to in your custom resolver in this case? We should be using that verbatim now (if it's set), even if it contains a :port. Can you link to your resolver implementation?

@serathius
Copy link

@dfawley
Copy link
Member

dfawley commented Jan 17, 2023

Sorry, I'm not really able to follow everything, here, as there is a lot of conditional logic going into setting the ServerName (including determining what endpoints are fed into it in the first place). What's most likely happening is your resolver is setting ServerName to the host only, not host:port. But this field used to be disregarded when setting the :authority header, and only the target's endpoint portion would be used instead. Please read the docstring on resolver.Address.ServerName and make sure you are properly using it. I believe the change made here is correct, in that it uses a consistent :authority header and server name for the credentials used on that connection. So most likely in your situation, one was right and one was wrong, but now at least they match. (I'm unsure whether the right or wrong one for you is now being used after this change.)

Also, note that these APIs are documented as being experimental and unstable, and I'm worried that any changes we make here in the future will break you and your users and cause problems just like the ones we had in the past. We would strongly recommend against using any of these APIs (resolver & balancer being the primary ones) in any packages that are expected to be stable.

@ahrtr
Copy link

ahrtr commented Jan 18, 2023

Previously etcd was depending on gRPC 1.51, and the authority is endpoint from dial target of the form "scheme://[authority]/endpoint", and usually it contains both host and port, such as localhost:2379,

// - endpoint from dial target of the form "scheme://[authority]/endpoint"

But the resolver.Address.ServerName returned by the etcd customized resolver doesn't contain the port, such as "localhost". After bumping gRPC 1.52, the authority changes (e.g from localhost:2379 to localhost). So some test cases in etcd are broken because the authority isn't expected value.

We (etcd) have two options for now,

  1. change the test cases just I did in dependency: bump google.golang.org/grpc from 1.51.0 to 1.52.0 etcd-io/etcd#15131;
  2. update the customized resolver to make sure resolver.Address.ServerName has the same value as the endpoint used in "scheme://[authority]/endpoint"`. But since the serverName is used for certificate validation, so it might break the some logic related to certificate validation.

I followed option 1 above for now. Please let me know if you have any concerns.

@dfawley
Copy link
Member

dfawley commented Jan 18, 2023

I'm not sure which way is the right one for you, but I am 90% sure that using a different :authority header from what is given to the credentials via ServerName is wrong and can lead to security problems, so having it be consistent is the primary goal.

I'm still concerned about your usage of these APIs. Maybe one day we can chat about your use cases and see if there's some other way to accomplish what you're doing using stable APIs. We will be changing these APIs, and unless something is done ahead of time, or in a coordinated fashion, your users will be broken again.

@ahrtr
Copy link

ahrtr commented Jan 18, 2023

Thanks @dfawley .

I just raised a ticket in etcd to track the task of removing the dependency on the experimental APIs. Could you share more info/doc on this, such as what's the alternative solution?

@dfawley
Copy link
Member

dfawley commented Jan 18, 2023

Could you share more info/doc on this, such as what's the alternative solution?

I would need to know more about what your goals and requirements are in order to suggest a solution. We may be able to provide a higher level API that you could use if we don't have something that meets your needs already.

@ahrtr
Copy link

ahrtr commented Jan 18, 2023

I would need to know more about what your goals and requirements are in order to suggest a solution. We may be able to provide a higher level API that you could use if we don't have something that meets your needs already.

We will think about this and get back to you when needed. Thanks again.

@holdno holdno deleted the feat/NewStreamWithActualServerName branch January 19, 2023 07:03
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 19, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants