check SSH host keys before progressing #6857

jameinel · 2017-01-23T11:36:08Z

Rather than just checking whether we can reach port 22 on the hosts we'd like to connect to, when we know the remote host's keys we can do the SSH handshake and assert that the key we see is in our list of acceptable host keys.

This should address LP:#1646329

To test this, you can use an old Juju to bootstrap an lxd controller to work in:

$ juju bootstrap lxd test-lxd
$ juju switch controller

create a device on your host machine that has the same IP address as a device in the container

$ sudo brctl addbr br-xxx
$ sudo ip a add 172.19.0.1 dev br-xxx

go into the container and configure its device in the same way
you must also restart the jujud agent for it to report the new IP address

$ juju ssh 0
$ sudo brctl addbr br-yyy
$ sudo ip a add 172.19.0.1 dev br-yyy
$ sudo brctl addbr br-zzz
$ sudo ip a add 172.18.0.1 dev br-zzz
$ sudo service jujud-machine-0 restart

now from the outside

$ juju show-machine 0

should show 3 IP addresses: usually a 10.* address, and now also 172.19.0.1 and 172.18.0.1.

$ ip a s

from outside the machine should also show a 172.19.0.1 address, which means we have one address we can't talk to (172.18.0.1) and one that is duplicated on our host machine (172.19.0.1).

With an old Juju trying to do

$ juju ssh 0

Should sometimes fail because it sees it can get to 172.19.0.1 but that is not the actual host we're looking for.
With the new Juju you can do:

$ juju ssh --debug 0

And you should see something like:

15:57:21 DEBUG juju.network.ssh reachable.go:140 dialing "172.19.0.1:22" to check host keys
15:57:21 DEBUG juju.network.ssh reachable.go:140 dialing "10.139.15.152:22" to check host keys
15:57:21 DEBUG juju.network.ssh reachable.go:153 connected to "172.19.0.1:22", initiating ssh handshake
15:57:21 DEBUG juju.network.ssh reachable.go:153 connected to "10.139.15.152:22", initiating ssh handshake
15:57:21 DEBUG juju.network.ssh reachable.go:140 dialing "172.18.0.1:22" to check host keys
15:57:21 DEBUG juju.network.ssh reachable.go:99 host key for 172.19.0.1:22 not in our accepted set:  use --debug --log-level=TRACE to see actual key
15:57:21 DEBUG juju.network.ssh reachable.go:86 accepted host key for: 10.139.15.152:22
15:57:21 INFO  juju.network.ssh reachable.go:200 found 10.139.15.152:22 has an acceptable ssh key
15:57:21 DEBUG juju.cmd.juju.commands ssh_common.go:369 using target "0" address "10.139.15.152"
...
15:57:22 DEBUG juju.network.ssh reachable.go:143 dial "172.18.0.1:22" failed with: dial tcp 172.18.0.1:22: i/o timeout

I tried to strike a fair balance in the log strings so that they are useful. I'm not 100% sold on dumping the public key information by default, but I figured if it is failing, it is likely to be the most helpful information we can dump. If you do use:
juju ssh --debug --log-level=TRACE 0
You'll see all of the public keys that we've found, as well as some of the other API call results.

We want to integrate with the golang crypto ssh library so that we not only check that we can get a TCP connection to the port, but also so that we can check that there is a valid SSH that is presenting the right public key on the other side. Also, our code was causing the goroutines to block indefinitely, as they'd never be able to send on the channel once we find one that is correct, so close a done channel to signal they have nothing to do.

adding a front-end script to test the validation against live SSH servers. Messing up something about how we're passing the data around, need to fix that.

The test suites are broken, because now we need an actual SSH server running on the remote side, since we aren't doing a trivial Dial test. However, the underlying 'Reachable' primitive has been tested against real servers and does what we'd like it to do.

We have an SSH service that can run on a port, and actually properly does a key exchange handshake.

We already needed almost all the information on our struct, so by turning it into a method we were able to write a nicer function that just did the host key lookup. Making the logic easier to follow.

Change the Command tests to just use a ReachableChecker that returns a valid host from the list that is supplied. This means we can avoid all of the Dial semantics. We have solid testing around whether Reachable does its job in the reachable tests.
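A test double along these lines might look like the sketch below. `HostPort` is a simplified stand-in for `network.HostPort`, the interface only mirrors the `FindHost` signature from this PR, and `fakeChecker` with its `unreachable` map is a hypothetical name for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// HostPort is a simplified stand-in for network.HostPort.
type HostPort struct {
	Host string
	Port int
}

// ReachableChecker mirrors the shape of the FindHost method in this PR.
type ReachableChecker interface {
	FindHost(hostPorts []HostPort, publicKeys []string) (HostPort, error)
}

// fakeChecker never touches the network. It returns the first supplied
// address that is not listed in unreachable.
type fakeChecker struct {
	unreachable map[string]bool
}

func (f fakeChecker) FindHost(hostPorts []HostPort, publicKeys []string) (HostPort, error) {
	for _, hp := range hostPorts {
		if !f.unreachable[hp.Host] {
			return hp, nil
		}
	}
	return HostPort{}, errors.New("cannot connect to any address")
}

func main() {
	var checker ReachableChecker = fakeChecker{
		unreachable: map[string]bool{"172.19.0.1": true},
	}
	hp, err := checker.FindHost([]HostPort{
		{Host: "172.19.0.1", Port: 22},
		{Host: "10.139.15.152", Port: 22},
	}, nil)
	fmt.Println(hp.Host, hp.Port, err)
}
```

Marking every address as unreachable gives the command tests a way to exercise the failure path without any dialing.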

jameinel · 2017-01-26T07:02:43Z

!!build!!

jameinel · 2017-01-26T18:00:09Z

!!build!!

jameinel · 2017-01-29T11:23:37Z

!!build!!

jameinel · 2017-01-30T03:30:10Z

!!build!!

mjs

Looks good. Just lots of small stuff.

If you haven't already, please review what the logs look like at DEBUG level in real world situations. We don't want a bunch of new log lines for each SSH related command.

mjs · 2017-01-30T07:40:34Z

cmd/juju/commands/ssh_common.go

+	if !c.noHostKeyChecks {
+		publicKeys, err = c.apiClient.PublicKeys(entity)
+		if err != nil {
+			// We ignore NotFound errors, as we may not have finished registering


Would you mind clarifying who/what is the second "we" here? Do you mean the machine agent on the target?

IIRC, the issue was that a test was expecting a particular context around what entity we were missing keys for. I can fix that by just changing the returned error instead of skipping NotFound.

mjs · 2017-01-30T07:41:48Z

cmd/juju/commands/ssh_unix_test.go

-
-	s.testSSHCommandHostAddressRetry(c, true)
-}
+/// XXX(jam): 2017-01-25 do we need these functions anymore? We don't really


I think ditch these

I realized when looking again, only half of them are testing v1. I can mimic what they were doing by changing it to force failing all ssh addresses, which gets them to pass instead. (which I've done)

mjs · 2017-01-30T07:43:44Z

cmd/juju/commands/ssh_unix_test.go

+/// 	s.setForceAPIv1(false)
+///
+/// 	s.testSSHCommandHostAddressRetry(c, true)
+/// }

 func (s *SSHSuite) testSSHCommandHostAddressRetry(c *gc.C, proxy bool) {


If the above are removed then I think this isn't in use either

mjs · 2017-01-30T07:44:09Z

network/ssh/package_test.go

@@ -0,0 +1,14 @@
+// Copyright 2016 Canonical Ltd.


mjs · 2017-01-30T07:44:18Z

network/ssh/reachable.go

@@ -0,0 +1,216 @@
+// Copyright 2014 Canonical Ltd.


This was actually moved from network/reachable.go, so I think the original copyright applies.

mjs · 2017-01-30T07:59:13Z

network/ssh/reachable.go

+		// in hostKeyCallback
+		if !strings.Contains(err.Error(), hostKeyAccepted.Error()) &&
+			!strings.Contains(err.Error(), hostKeyNotInList.Error()) {
+			logger.Debugf("%v", err)


Trace? It's likely there will be a few of these per SSH command right?

So either it got to the HostKeyCheck that we do above, and we get nice messages, or we don't get this far. The other messages are already at DEBUG level, so you don't see any of them unless you explicitly ask for --debug.
I gave an example of what 'juju ssh --debug 0' looks like in the pull request. Feedback on whether that is too verbose or not is more than welcome. It seemed a reasonable amount of information to actually be able to debug, without being so verbose as to be clutter/actually hard to parse.

mjs · 2017-01-30T07:59:29Z

network/ssh/reachable.go

+	timeout time.Duration
+}
+
+func (r *reachableChecker) FindHost(hostPorts []network.HostPort, publicKeys []string) (network.HostPort, error) {


Doc string?

mjs · 2017-01-30T08:00:29Z

network/ssh/reachable_test.go

@@ -0,0 +1,197 @@
+// Copyright 2014 Canonical Ltd.


mjs · 2017-01-30T08:01:22Z

network/ssh/reachable_test.go

+	c.Assert(err, jc.ErrorIsNil)
+	c.Logf("listening on %q", hostPort)
+
+	shutdown := make(chan struct{}, 0)


the 0 is unnecessary

mjs · 2017-01-30T08:07:24Z

network/ssh/testing/sshserver.go

+// do Key exchange to set up the encrypted conversation.
+// We return the address where the SSH service is listening, and a channel
+// callers must close when they want the service to stop.
+func CreateSSHServer(c *gc.C, privateKeys ...string) (string, chan struct{}) {


This is quite similar in structure to testTCPServer. Could this take a flag or something to make it not do any key exchange so that it can be used for both SSH and plain TCP testing? (avoiding some duplicated code)

I had considered sharing the TCP side of the connection, went ahead and finished that per your advice. Turned the TCP version into a callback function, and SSH just uses that callback to negotiate an SSH session.

jameinel · 2017-01-30T08:16:48Z

I actively cleaned up a lot of the "SSH" related dialing statements, and trimmed it down to something nice. You can check with "juju ssh --debug 0" as mentioned earlier. It does list out what we are dialing, and when we are trying SSH handshaking, etc, but the messages are generally much more understandable than they used to be, and I avoided a lot of things like IP addresses being repeated 2x in the same message, etc.

Lots of tweaks suggested by Menno, should clean things up.

jameinel · 2017-01-30T09:18:42Z

I'm pretty sure Menno intended that I should land it as long as I addressed his questions, so $$merge$$

jujubot · 2017-01-30T09:20:14Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2017-01-30T09:23:03Z

Build failed: Generating tarball failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/10145

jameinel · 2017-01-30T09:24:49Z

$$merge$$

jujubot · 2017-01-30T09:26:14Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2017-01-30T09:55:47Z

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/10146

jameinel · 2017-01-30T10:47:53Z

$$merge$$ Ping test failure seems inconsistent

jujubot · 2017-01-30T10:48:15Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2017-01-30T11:15:13Z

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/10147

jameinel · 2017-01-30T11:42:52Z

$$merge$$ the only failure appears to be in 'grant', where it seems we aren't prompting for a password. That looks like a known bug in the representative-tests suite:

    # This scenario is pre-macaroon.
    # See https://bugs.launchpad.net/bugs/1621532
    child.expect('(?i)password')
    child.sendline(user.name + '_password_2')
    # end non-macaroon.

jujubot · 2017-01-30T11:44:15Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jameinel added 13 commits January 23, 2017 00:02

(broken) very close to validating real ssh keys.

76cd938

If we don't know any public keys, we just confirm that it really is an SSH host.

5126956

Write a helper to start an SSH server-side connection.

09e3580

The tests are now running.

c7ab748

Cleanup the function a lot by creating a method.

648530a

fix a typo

7d8bc77

Start looking at implementing more of fakeConn so that we can test the probing

1e0124a

typo

3016ac8

instead of a loose func, turn it into an interface{}

e0f4af6

go fmt

3803664

Get the test suite happy again.

2060ec8

jameinel changed the title ~~WIP: check SSH host keys before progressing~~ check SSH host keys before progressing Jan 25, 2017

clean up the debugging messages so that they are maximally useful.

c0ed836

jameinel added 2 commits January 26, 2017 16:37

leftover import

899a7a6

Merge remote-tracking branch 'upstream/2.1' into 2.1-ssh-keyscan-1646329

63edd7d

jameinel closed this Jan 29, 2017

jameinel deleted the 2.1-ssh-keyscan-1646329 branch January 29, 2017 14:27

jameinel restored the 2.1-ssh-keyscan-1646329 branch January 29, 2017 14:27

jameinel reopened this Jan 30, 2017

remove 0

e5bb962

mjs reviewed Jan 30, 2017

mjs approved these changes Jan 30, 2017

Review feedback.

6a5b1f4

Remove the compatibility helper, since nobody is using it now.

ba42b27

go fmt

79bcb90

jujubot merged commit c0643f8 into juju:2.1 Jan 30, 2017

jameinel deleted the 2.1-ssh-keyscan-1646329 branch April 22, 2017 16:15
