Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A29: xDS-Based Security for gRPC Clients and Servers #243

Merged
merged 27 commits into from
Aug 30, 2021

Conversation

sanjaypujare
Copy link
Contributor

ejona86 added a commit to ejona86/proposal that referenced this pull request Jun 11, 2021
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved

##### Server Authorization aka Subject-Alt-Name Checks

If [match_subject_alt_names](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/common.proto#L373)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if it isn't present? Is no verification of the server performed? If so, it seems we have to guarantee we aren't missing other important configuration that might be used instead, like verify_certificate_hash, verify_certificate_spki.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know how I missed this comment earlier.

What if it isn't present? Is no verification of the server performed?

The server cert trust verification is performed of course. But there is no verification (or rather matching) of the ID presented in the certificate because we don't know what to match it to.

... verify_certificate_hash, verify_certificate_spki....

verify_certificate_hash, verify_certificate_spki are purely cryptographic verification of the peer certificate based on out of band values supplied by the control plane and I don't think is a substitute for either hostname verification in Web PKI or the subject-alt-name check here.

However I should probably add here that we should NACK an update that has these fields populated. Thanks for pointing that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. PTAL

[`FilterChain`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/config/listener/v3/listener_components.proto#L193)s.
The `FilterChain` contains the `DownstreamTlsContext` that is applied to an incoming connection
after selecting the best-matching `FilterChain` as described in
[FilterChainMatch](A36-xds-for-servers.md#filterchainmatch). If the server is using `XdsServerCredentials`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The server doesn't know if it is using XdsServerCredentials.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I understand what you are saying and our implementation does incorporate that. But let me know how this should be reworded here.

service mesh requires the security features described in this proposal.

The Certificate Provider Plugin Framework does not depend on an agent (SDS server). The
SDS server/agent based solution compromised security (by sharing the private key with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is still an agent here, in order to dump files on disk, so this seems questionable. The security property here is the same independent of whether it dumps files on disk or a UDS socket. I can understand if we say we want to allow for systems that don't share the private key, although I think we believe the best security is for the workload to not have the private key at all and keep it solely within an agent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, your comment applies to the file_watcher plugin but the "Certificate Provider Plugin Framework" itself does not assume the existence of an agent. For example, a plugin that mints CSRs and gets them signed by a CA. But I will modify the text to address your comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

A29-xds-tls-security.md Outdated Show resolved Hide resolved
Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a great start!

Please let me know if you have any questions. Thanks!

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
The caller of the framework registers a watcher with the plugin to receive
certificate updates as described below.

### Implementing Security in the xDS Flow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we really need to rehash the overall flow here; that should already be covered earlier in the document. Instead, I think this section should be converted into an "Implementation Details" section, which includes things like the internals of the CertificateProvider API (e.g., flow from CertificateProvider to distributor to TLS creds) or how data flows through LB policies in various languages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, good point and I will convert this into an "Implementation Details" section. Regarding your suggestion

...or how data flows through LB policies in various languages.

I thought it would be better to leave out language specific implementation details from this gRFC similar to how other gRFCs have done. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of a gray area. I think that for a gRFC in the A category (as opposed to those in the L category), the overall design and functional specification needs to be described in a language-agnostic way. However, it's also fine to include some language-specific details to help implementors understand how things need to work, as long as those details are not the core of the document.

In this case, I do think it would be useful to include a high-level description of how data flows through the LB policies. The comments you have earlier in the doc about how C-core uses channel args and Java uses an SslContextProvider are appropriate to mention as part of that.

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
the CertificateProvider plugin framework enables gRPC to acquire the certificates necessary for the TLS handshake.

The field [`tls_certificate_certificate_provider_instance`][TCCPI] is used for the local certificate and the private
key. And either [`validation_context_certificate_provider_instance`][VCCPI] or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just support one way of providing validation_context_certificate_provider_instance and that should be the one inside of combined_validation_context. Having multiple way unnecessarily expands the API surface. That being said, I see that @markdroth added this field to the proto in https://github.com/envoyproxy/envoy/pull/12237/files so I might be missing some context here.

(Core only supports validation_context_certificate_provider_instance inside of combined_validation_context at the moment.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I agree with you. I'll change the text to only support validation_context_certificate_provider_instance inside of combined_validation_context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking with @ejona86 about this just last week. I think we misunderstood how the xDS validation config worked when we added those fields. The validation_context_certificate_provider and validation_context_certificate_provider_instance fields don't actually make any sense, because cert providers don't return a complete CertificateValidationContext proto; they just provide the CA certs, which are basically just the trusted_ca field within that proto.

What we probably should have done was simply add the new cert provider fields in CommonTlsContext but not inside of the validation_context_type oneof. The idea would be that the CertificateValidationContext is obtained from one of validation_context, validation_context_sds_secret_config, or combined_validation_context, fully populated except for the trusted_ca field. Then, as a second step, the cert provider is used to populate the trusted_ca field.

I've put together envoyproxy/envoy#17201 for this. We'll need to update all 3 languages and TD.

CC @jaychenatr

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the open issue of potential changes to the CertificateProvider fields in xDS is the biggest open issue in this document. We really need to get this nailed down before we finalize this gRFC.

[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
Core's behavior: If root certificates (validation context) are present, then client certificates are requested. If additionally, require_client_certificate is set, client certificates are mandated. https://github.com/grpc/grpc/blob/69a130c6ecc1353689b903cf5785a9dc296b39cc/src/core/lib/security/credentials/xds/xds_credentials.cc#L211

If the gRFC adopts this behavior, a question arises on whether we NACK a config where require_client_certificate is true, but no validation context is provided.

and the client needs only the root certificate
(the [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)).

gRPC silently ignores the other fields in `CommonTlsContext` pertaining to other ways of fetching
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
Core's behavior:
On the server side, NACK if no tls_certificate_certificate_provider_instance is found.
On the client side, NACK if no validation_context_certificate_provider_instance is found.

[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also mention the behavior if the instance_name is not recognized, i.e., not present in the bootstrap file.
Core's behavior - Don't check at parsing time, so no NACKing. It's checked later when it needs to be used, at which point an error is propagated. I believe that results in a TransientFailurePicker on the client side and closing the connection on the server side.

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved
server := xds.NewGRPCServer(grpc.Creds(creds))
```

### xDS Protocol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late to the party. But the sub-sections underneath and hard to read since they are structured as paragraphs. What do you think about using bullet points instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that because you are trying to view it as a raw text file? How about using the markdown mode of the browser? Also what's the exact issue of "hard to read" due to paragraphs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was viewing the markdown, the issue is not that. I feel that each paragraph is just a stream of words, with references to a lot of protos and fields within them. If we split them into points, and show relevant proto snippets with relevant fields, I feel it would make it easier to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, had the same feeling. It's not a blocker, but bullet points would've been easier to follow.

[A27:CDS][] describes the flow. The field [`transport_socket`][CL-TS] is used to extract the
[`UpstreamTlsContext`][UTC] as described [here][CL-TS-comment].
Note that we don't (currently) support [`transport_socket_matches`][CL-TS-matches].
The `UpstreamTlsContext` thus obtained is passed down to all the child policies and connections.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actual UpstreamTlsContext message is not passed down to the child policies. Instead whatever information is of use to us is extracted and that is what is passed down. Now sure if that distinction is useful to highlight here.

A29-xds-tls-security.md Show resolved Hide resolved
[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gRPC-Go NACKs the resource if require_client_certificate is set and a certificate provider instance name for the root certificate is not provided. I think we discussed this long back and decided to NACK in this case.

[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably also mention the behavior if the instance_name is not recognized, i.e., not present in the bootstrap file.

In gRPC-Go, we do something similar to Core. In fact, we cannot verify at parsing time as to whether the given certificate provider instance name is present in the bootstrap config or not. So, we simply accept it. But when the update it given to the CDS balancer, it is able to verify if the instance name is present in the bootstrap config or not. If the instance name is not found in the bootstrap config, the CDS balancer propagates such errors down to its child policy, which eventually leads to the channel being put in transient failure.

On the server side, since we are not required to put the server in not-serving mode for bad configuration (we are only required to do that when the configuration is deleted), we simply fail the connection at handshake time if the provider instance is not found in the bootstrap file.

and the client needs only the root certificate
(the [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)).

gRPC silently ignores the other fields in `CommonTlsContext` pertaining to other ways of fetching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same behavior in Go as in Core.

A29-xds-tls-security.md Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved
the CertificateProvider plugin framework enables gRPC to acquire the certificates necessary for the TLS handshake.

The field [`tls_certificate_certificate_provider_instance`][TCCPI] is used for the local certificate and the private
key. And either [`validation_context_certificate_provider_instance`][VCCPI] or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking with @ejona86 about this just last week. I think we misunderstood how the xDS validation config worked when we added those fields. The validation_context_certificate_provider and validation_context_certificate_provider_instance fields don't actually make any sense, because cert providers don't return a complete CertificateValidationContext proto; they just provide the CA certs, which are basically just the trusted_ca field within that proto.

What we probably should have done was simply add the new cert provider fields in CommonTlsContext but not inside of the validation_context_type oneof. The idea would be that the CertificateValidationContext is obtained from one of validation_context, validation_context_sds_secret_config, or combined_validation_context, fully populated except for the trusted_ca field. Then, as a second step, the cert provider is used to populate the trusted_ca field.

I've put together envoyproxy/envoy#17201 for this. We'll need to update all 3 languages and TD.

CC @jaychenatr

[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we validate that the certificate provider instance is valid at config parsing time? I think that would be a trivial change in C-core, and it seems like the right thing to do. We basically do the same thing for supported HTTP filters (the only difference being that those are not configured in the bootstrap file, they're set at compile time).

@sanjaypujare
Copy link
Contributor Author

All the reviewers: thanks for your comments. I have addressed all your comments. Pls take a look.

Copy link
Member

@sergiitk sergiitk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

nit: I'd recommend to put it though a spell checker. My Grammarly shows a bunch of complaints on missing commas, etc.

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
provider whereas a client running in cloud may use a cloud-vendor provided implementation to obtain
the certificate.

gRPC provides a `CertificateProvider` plugin API that can support multiple implementations. The plugin API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

editorial nit: tautology

Suggested change
gRPC provides a `CertificateProvider` plugin API that can support multiple implementations. The plugin API
gRPC offers a `CertificateProvider` plugin API that can support multiple implementations. The plugin API

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
Co-authored-by: Sergii Tkachenko <hi@sergii.org>
Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is moving in the right direction!

Please let me know if you have any questions. Thanks!

always validates all security configurations in CDS or LDS regardless of whether an XdsChannelCredentials
or XdsServerCredentials is in force. If validation fails, the update is NACKed. When an update is
accepted the correponding security configuration in CDS or LDS is used if the corresponding
Xds*Credentials is in force.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of "if the corresponding Xds*Credentials is in force", I suggest saying "if the application used Xds*Credentials".

field which is optional. The client certificate is sent to the server if the server requests
or requires it in the TLS handshake. If the server requires the client certificate but is not
present then the TLS handshake fails.
in the TLS mode. As part of validating all CDS updates, if this field is not set in a CDS,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/in a CDS/in a CDS resource/


If any of the following fields are present, gRPC will NACK the CDS update:
As part of validating all CDS updates, if any of the following fields are present, gRPC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashykt Doesn't look like C-core is checking these fields.

What's the rationale for NACKing if these are set, as opposed to just ignoring them, as we do with every other xDS field we don't yet support?

To be clear, I'm not necessarily objecting to NACKing these fields -- there are cases where we do that, usually because not doing so could cause a security problem or a forward-compatibility issue in the future. But our default approach should be to ignore fields unless that kind of situation applies. So if there is a reason to NACK if these fields are set, let's document why that behavior is justified. If there isn't any such reason, then we should ignore these fields instead of NACKing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me that ignoring these fields causes a security concern, so I'll wait for a resolution before making the change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eric and I talked a bit about this: we can ignore a field (instead of NACKint it as unsupported) if that doesn't make the implementation any more insecure. Will re-evaluate the list with that goal.


If any of the following fields are present, gRPC will NACK the LDS update:
If any of the following fields are present, gRPC will NACK an LDS update as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashykt Looks like C-core isn't checking these fields either.

My comments above about needing a justification for NACKing fields applies here as well.

[`combined_validation_context`][CVC] is validated in any CDS/LDS update as follows:
[default_validation_context][] is accepted only on the client side i.e. inside `UpstreamTlsContext`.
And inside that field only [match_subject_alt_names][] is processed as described [above][server-authz].
If any of the other fields (listed below) are present, the update is NACKed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yashykt C-core isn't checking these fields.

My comments above about needing a justification for NACKing fields applies here as well.

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
The caller of the framework registers a watcher with the plugin to receive
certificate updates as described below.

### Implementing Security in the xDS Flow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of a gray area. I think that for a gRFC in the A category (as opposed to those in the L category), the overall design and functional specification needs to be described in a language-agnostic way. However, it's also fine to include some language-specific details to help implementors understand how things need to work, as long as those details are not the core of the document.

In this case, I do think it would be useful to include a high-level description of how data flows through the LB policies. The comments you have earlier in the doc about how C-core uses channel args and Java uses an SslContextProvider are appropriate to mention as part of that.

the CertificateProvider plugin framework enables gRPC to acquire the certificates necessary for the TLS handshake.

The field [`tls_certificate_certificate_provider_instance`][TCCPI] is used for the local certificate and the private
key. And either [`validation_context_certificate_provider_instance`][VCCPI] or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the open issue of potential changes to the CertificateProvider fields in xDS is the biggest open issue in this document. We really need to get this nailed down before we finalize this gRFC.

[`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L259)
or [`validation_context_certificate_provider_instance`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L200)
inside [`combined_validation_context`](https://github.com/envoyproxy/envoy/blob/main/api/envoy/extensions/transport_sockets/tls/v3/tls.proto#L251)
is used for the root certificate-chain for validating peer certificates. For mTLS both the local certificate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanjay, please clarify in the text that we will NACK if the specified certificate provider instance name is not one of the instances configured in the bootstrap file.


A secure client at a minimum requires [`validation_context_certificate_provider_instance`][VCCPI1]
inside [`combined_validation_context`][CVC] to be able to validate the server certificate
in the TLS mode. If this field is not set, gRPC will NACK the CDS update. The client identity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the language could imply that validation_context_certificate_provider_instance is mandatory all the time, so if combined_validation_context is not present that would still mean to NACK. The tweaked language "if this field is not set in a CDS" seems to make that even stronger.

* if there is no leaf server certificate, or there are no SAN entries in the certificate
the check fails.

* each SAN entry of type DNS, URI and IP address is considered for the below match
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@sanjaypujare sanjaypujare Jul 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Envoy being a general purpose proxy and supporting other protocols (http1 etc) I thought they see more use cases for email names etc. I am not aware of use-cases or usage of email names in gRPC (specifically in the context of proxyless service mesh with security) so I thought it's okay to not support it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed separately, I wasn't saying we should support it but think we should either support it or call out that we are purposefully not supporting (or requiring support for) this. It seems very easy to support, so I'd just assume to support it, but I accept it isn't much of a mainstream feature and if we didn't support it we may never hear from a user noticing it isn't available.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's almost identical to URI matching so I don't see any reason to exclude support since the implementation cost is almost zero. Will update the text.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added support for email type

A29-xds-tls-security.md Show resolved Hide resolved
* `trust_chain_verification`
* `custom_validator_config`

Only [`validation_context_certificate_provider_instance`][VCCPI1] is accepted. When that value is present
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...on server-side? default_validation_context is accepted on client-side above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified that the field references are for combined_validation_context

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved

If any of the following fields are present, gRPC will NACK the CDS update:
As part of validating all CDS updates, if any of the following fields are present, gRPC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me that ignoring these fields causes a security concern, so I'll wait for a resolution before making the change.

A29-xds-tls-security.md Outdated Show resolved Hide resolved
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending what I have

A29-xds-tls-security.md Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting closer, but I think we still need to make some of the wording clearer, and I think we need to ditch all of the references to the deprecated fields.

Please let me know if you have any questions. Thanks!

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
@sanjaypujare
Copy link
Contributor Author

This is getting closer, but I think we still need to make some of the wording clearer, and I think we need to ditch all of the references to the deprecated fields.

Please let me know if you have any questions. Thanks!

I have addressed all the comments. PTAL

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor changes remaining here.

Please let me know if you have any questions. Thanks!

A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
A29-xds-tls-security.md Outdated Show resolved Hide resolved
@sanjaypujare
Copy link
Contributor Author

Just a few minor changes remaining here.

Please let me know if you have any questions. Thanks!

Addressed the latest comments

Copy link
Member

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for doing this!

@markdroth
Copy link
Member

I've just announced this on the Envoy slack instance, so I'll let this sit open for another couple of days, in case we get some feedback from there. But otherwise, I think this is ready to go!

@sanjaypujare
Copy link
Contributor Author

I've just announced this on the Envoy slack instance, so I'll let this sit open for another couple of days, in case we get some feedback from there. But otherwise, I think this is ready to go!

Thanks Mark! I'll plan on merging this on Monday (unless there is feedback).

@sanjaypujare
Copy link
Contributor Author

I've just announced this on the Envoy slack instance, so I'll let this sit open for another couple of days, in case we get some feedback from there. But otherwise, I think this is ready to go!

Thanks Mark! I'll plan on merging this on Monday (unless there is feedback).

I'll be merging this today as planned assuming there are no further comments.

@sanjaypujare sanjaypujare merged commit deaf1bc into grpc:master Aug 30, 2021
@sanjaypujare sanjaypujare deleted the xds-security-new branch August 30, 2021 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants