check SSH host keys before progressing #6857
We want to integrate with the golang crypto ssh library so that we check not only that we can open a TCP connection to the port, but also that there is a valid SSH server on the other side presenting the right public key. Also, our code was causing goroutines to block indefinitely: once we had found a correct host, the remaining goroutines could never send on the channel. We now close a done channel to signal that they have nothing left to do.
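The done-channel fix described above can be sketched roughly as follows. This is an illustration only, not the code in this branch: firstReachable and the check callback are hypothetical names, and the real code dials hosts instead of calling a stub.

```go
package main

import (
	"fmt"
	"sync"
)

// firstReachable (hypothetical name) runs check against every candidate
// concurrently and returns the first candidate for which check reports true.
// Once a winner is received, done is closed so that any other goroutine that
// also succeeded stops trying to send instead of blocking forever.
func firstReachable(candidates []string, check func(string) bool) (string, bool) {
	found := make(chan string)
	done := make(chan struct{})
	var wg sync.WaitGroup
	for _, addr := range candidates {
		wg.Add(1)
		go func(addr string) {
			defer wg.Done()
			if !check(addr) {
				return
			}
			select {
			case found <- addr: // we are the winner
			case <-done: // somebody else won, nothing left to do
			}
		}(addr)
	}
	// allDone is closed once every goroutine has returned.
	allDone := make(chan struct{})
	go func() {
		wg.Wait()
		close(allDone)
	}()
	select {
	case addr := <-found:
		close(done)
		return addr, true
	case <-allDone:
		return "", false
	}
}

func main() {
	// The check is a stub standing in for a real dial plus handshake.
	hosts := []string{"172.19.0.1:22", "10.0.0.5:22"}
	addr, ok := firstReachable(hosts, func(a string) bool { return a == "10.0.0.5:22" })
	fmt.Println(addr, ok)
}
```

Without the done case in the select, a second successful goroutine would block on the unbuffered found channel forever, which is the leak described above.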
Adding a front-end script to test the validation against live SSH servers. Something is wrong with how we're passing the data around; need to fix that.
The test suites are broken, because we now need an actual SSH server running on the remote side, since we are no longer doing a trivial Dial test. However, the underlying 'Reachable' primitive has been tested against real servers and does what we'd like it to do.
We have an SSH service that can run on a port, and actually properly does a key exchange handshake.
We already needed almost all the information on our struct, so by turning it into a method we were able to write a nicer function that just does the host key lookup, which makes the logic easier to follow.
Change the Command tests to just use a ReachableChecker that returns a valid host from the list that is supplied. This means we can avoid all of the Dial semantics. We have solid testing around whether Reachable does its job in the reachable tests.
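A test double along those lines might look like the sketch below. The HostPort type here is a simplified, hypothetical stand-in for network.HostPort, and the interface only mirrors the FindHost signature visible in this review; the real types in the branch may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// HostPort is a simplified, hypothetical stand-in for network.HostPort.
type HostPort struct {
	Host string
	Port int
}

// ReachableChecker mirrors the shape of the FindHost method in this review.
type ReachableChecker interface {
	FindHost(hostPorts []HostPort, publicKeys []string) (HostPort, error)
}

// validHostsChecker is a test double (hypothetical name). It never dials:
// it returns the first supplied host that is marked as valid.
type validHostsChecker struct {
	valid map[string]bool
}

func (v validHostsChecker) FindHost(hostPorts []HostPort, publicKeys []string) (HostPort, error) {
	for _, hp := range hostPorts {
		if v.valid[hp.Host] {
			return hp, nil
		}
	}
	return HostPort{}, errors.New("no valid host in the supplied list")
}

func main() {
	var checker ReachableChecker = validHostsChecker{valid: map[string]bool{"10.0.0.5": true}}
	hp, err := checker.FindHost([]HostPort{{"172.19.0.1", 22}, {"10.0.0.5", 22}}, nil)
	fmt.Println(hp, err)
}
```

Because the double ignores publicKeys and the network entirely, the Command tests only exercise how the chosen address is used.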
!!build!!
Looks good. Just lots of small stuff.
If you haven't already, please review what the logs look like at DEBUG level in real world situations. We don't want a bunch of new log lines for each SSH related command.
if !c.noHostKeyChecks {
	publicKeys, err = c.apiClient.PublicKeys(entity)
	if err != nil {
		// We ignore NotFound errors, as we may not have finished registering
Would you mind clarifying who/what is the second "we" here? Do you mean the machine agent on the target?
IIRC, the issue was that a test was expecting a particular context around what entity we were missing keys for. I can fix that by just changing the returned error instead of skipping NotFound.
	s.testSSHCommandHostAddressRetry(c, true)
}
/// XXX(jam): 2017-01-25 do we need these functions anymore? We don't really
I think ditch these
I realized when looking again, only half of them are testing v1. I can mimic what they were doing by changing it to force failing all ssh addresses, which gets them to pass instead. (which I've done)
/// s.setForceAPIv1(false)
///
/// s.testSSHCommandHostAddressRetry(c, true)
/// }

func (s *SSHSuite) testSSHCommandHostAddressRetry(c *gc.C, proxy bool) {
If the above are removed then I think this isn't in use either.
@@ -0,0 +1,14 @@
// Copyright 2016 Canonical Ltd.
2017
👍
@@ -0,0 +1,216 @@
// Copyright 2014 Canonical Ltd.
2017?
This was actually moved from network/reachable.go, so I think the original copyright applies.
// in hostKeyCallback
if !strings.Contains(err.Error(), hostKeyAccepted.Error()) &&
	!strings.Contains(err.Error(), hostKeyNotInList.Error()) {
	logger.Debugf("%v", err)
Trace? It's likely there will be a few of these per SSH command right?
So either it got to the HostKeyCheck that we do above, and we get nice messages, or we don't get this far. The other messages are already at DEBUG level, so you don't see any of them unless you explicitly ask for --debug.
I gave an example of what 'juju ssh --debug 0' looks like in the pull request. Feedback on whether that is too verbose or not is more than welcome. It seemed a reasonable amount of information to actually be able to debug, without being so verbose as to be clutter/actually hard to parse.
	timeout time.Duration
}

func (r *reachableChecker) FindHost(hostPorts []network.HostPort, publicKeys []string) (network.HostPort, error) {
Doc string?
👍
@@ -0,0 +1,197 @@
// Copyright 2014 Canonical Ltd.
2017?
c.Assert(err, jc.ErrorIsNil)
c.Logf("listening on %q", hostPort)

shutdown := make(chan struct{}, 0)
The 0 is unnecessary.
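To illustrate the point: a channel made with an explicit capacity of 0 is unbuffered, exactly like one made with no capacity argument. The stopped helper below is a hypothetical name used only for this sketch.

```go
package main

import "fmt"

// stopped (hypothetical helper) reports whether the shutdown channel has
// been closed, without blocking.
func stopped(shutdown <-chan struct{}) bool {
	select {
	case <-shutdown:
		return true
	default:
		return false
	}
}

func main() {
	explicit := make(chan struct{}, 0)
	implicit := make(chan struct{})
	fmt.Println(cap(explicit), cap(implicit)) // prints "0 0"

	fmt.Println(stopped(implicit)) // false
	close(implicit)
	fmt.Println(stopped(implicit)) // true
}
```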
// do Key exchange to set up the encrypted conversation.
// We return the address where the SSH service is listening, and a channel
// callers must close when they want the service to stop.
func CreateSSHServer(c *gc.C, privateKeys ...string) (string, chan struct{}) {
This is quite similar in structure to testTCPServer. Could this take a flag or something to make it not do any key exchange so that it can be used for both SSH and plain TCP testing? (avoiding some duplicated code)
I had considered sharing the TCP side of the connection, and went ahead and finished that per your advice. I turned the TCP version into a function that takes a callback, and SSH just uses that callback to negotiate an SSH session.
I actively cleaned up a lot of the "SSH" related dialing statements, and trimmed it down to something nice. You can check with "juju ssh --debug 0" as mentioned earlier. It does list out what we are dialing, and when we are trying SSH handshaking, etc, but the messages are generally much more understandable than they used to be, and I avoided a lot of things like IP addresses being repeated 2x in the same message, etc.
Lots of tweaks suggested by Menno, should clean things up.
I'm pretty sure Menno intended that I should land it as long as I addressed his questions, so
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Build failed: Generating tarball failed
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Build failed: Tests failed
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Build failed: Tests failed
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Rather than just checking whether we can get to port 22 on the hosts that we'd like to connect to, when we know the remote host's keys we can do the SSH handshake and assert that the key we see is in our list of acceptable host keys.
This should address LP:#1646329
To test this, you can use an old Juju to
from outside the machine should also show a 172.19.0.1 address which means we have one address we can't talk to and one that is duplicated with our host machine
With an old Juju trying to do
Should sometimes fail because it sees it can get to 172.19.0.1 but that is not the actual host we're looking for.
With the new Juju you can do:
And you should see something like:
I tried to find a fair trade on the strings so that it is useful. I'm not 100% sold on dumping the public key information by default, but I figured if it is failing, it is likely to be the most helpful information we can dump. If you do use:
juju ssh --debug --log-level=TRACE 0
You'll see all of the public keys that we've found, as well as some of the other API call results.