Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdctl: set TLS servername on discovery #6084

Merged
merged 5 commits into from
Aug 4, 2016

Conversation

heyitsanthony
Copy link
Contributor

@gyuho
Copy link
Contributor

gyuho commented Aug 2, 2016

lgtm. /cc @xiang90

@xiang90
Copy link
Contributor

xiang90 commented Aug 2, 2016

I think the srv discovery has two layers of indirections. First we reach the srv domain, then resolve the services domain, and then resolve the actual IP. We should trust the actual service domain, not the srv domain, right?

An example is here:
https://github.com/coreos/etcd/blob/master/Documentation/op-guide/clustering.md#create-dns-srv-records

@heyitsanthony
Copy link
Contributor Author

@xiang90 I believe the problem is if there's a MITM on the dns server then a SRV record, when discovering on example.com, can be forged to point to infra1.evil.com. If TLS is using root CAs, then etcdctl will connect to infra1.evil.com, expecting the cert to match *.evil.com (legitimately granted by the CA) when really it should expect *.example.com (unavailable to the owner of evil.com).

@lclarkmichalek
Copy link
Contributor

The user specifies that they want to connect to the SRV record, so that is the place where the checked servername needs to come from. For each level of indirection, you need to consider if you trust it or not. In this case, you do not trust the SRV lookup, and then you do not trust the lookup of the services domain. As such, you need to use the SRV domain as the servername.

An alternate case might be the indirection of connecting to one etcd node and then discovering other nodes from there. As we trust the first etcd node, we trust our method of indirection, it is OK to simply verify the certs of the other nodes discovered in that way against their IPs.

This is all a nice example of the end to end principle, fwiw.

@heyitsanthony
Copy link
Contributor Author

@bluepeppers can you confirm this patch works as intended?

@lclarkmichalek
Copy link
Contributor

Yeah, let me just rebuild the cluster exhibiting this bug. I'll try and get back to you some time today.

@lclarkmichalek
Copy link
Contributor

Yup, that works

@xiang90
Copy link
Contributor

xiang90 commented Aug 2, 2016

@heyitsanthony @bluepeppers

With this solution, we force users to use the same domain for both discovery service and etcd service when TLS is set. User cannot use a.com for discovery, but *.b.com for the actual clusters.

We can make this better by not setting serverName if a cert is provided explicitly.

@xiang90
Copy link
Contributor

xiang90 commented Aug 2, 2016

Also we probably need to change the clustering doc accordingly.

@heyitsanthony
Copy link
Contributor Author

@xiang90 I think we need to fix the peer discovery path in etcdmain/embed too in case there's no explicit CA?

@xiang90
Copy link
Contributor

xiang90 commented Aug 2, 2016

@heyitsanthony Yes. Peer discovery, gateway discovery. Fix docs for them.

@heyitsanthony heyitsanthony force-pushed the srv-servername branch 4 times, most recently from 53fe7fb to 545e8da Compare August 3, 2016 03:55
@@ -81,6 +85,37 @@ func startGateway(cmd *cobra.Command, args []string) {
os.Exit(1)
}
plog.Infof("discovered the cluster %s from %s", eps, gatewayDNSCluster)

// confirm TLS connections are good
tlsInfo := transport.TLSInfo{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this part to netutil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would tlsutil be better? not sure about how to draw the lines between transport/tlsutil/netutil

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. probably tlsutil is better.

@@ -81,6 +86,21 @@ func startGateway(cmd *cobra.Command, args []string) {
os.Exit(1)
}
plog.Infof("discovered the cluster %s from %s", eps, gatewayDNSCluster)
// confirm TLS connections are good
if !gatewayInsecureDiscovery {
tlsInfo := transport.TLSInfo{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably log that we are verifying discovered endpoints a,b,c here

@xiang90
Copy link
Contributor

xiang90 commented Aug 3, 2016

LGTM after adding logging informations when we skip any endpoints.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants