Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
overlord/devicestate: switch to ssh-keygen for device key generation #3130
Conversation
vosst
added some commits
Apr 3, 2017
| + deviceAPIBase = deviceAPIBaseURL() | ||
| + requestIDURL = deviceAPIBase + "request-id" | ||
| + serialRequestURL = deviceAPIBase + "devices" | ||
| + sshKeyFile = "/tmp/snapd.key.tmp" |
pedronis
Apr 3, 2017
Contributor
we need to think where to best put these, though we know ssh-keygen will make sure they are root only accessible
| ) | ||
| +func sshGenerateKey() (*rsa.PrivateKey, error) { |
pedronis
Apr 3, 2017
•
Contributor
an approach we use when we have things that invoke external commands is having:
sshGenerateKeyImpl this function
var sshGenerateKey = sshGenerateKeyImpl
MockSshGenarateKey
so most tests would Mock this function with something that doesn't fork/exec,
and a couple would test the real thing. It's worth only if the whole devicestate suite is much slower than on master
vosst
Apr 3, 2017
Contributor
I did a couple of test runs and both master and this branch take mostly the same time, differences are in the range of milliseconds.
| + | ||
| + _, err := cmd.CombinedOutput() | ||
| + if err != nil { | ||
| + return nil, err |
pedronis
Apr 3, 2017
Contributor
we probably want to use osutil.OutputErr here:
out, err := cmd.CombinedOutput()
if err != nil {
return nil, osutil.OutputErr(out, err)
}
| + } | ||
| + | ||
| + blk, _ := pem.Decode(d) | ||
| + if err != nil { |
pedronis
Apr 3, 2017
Contributor
this is a left over from before noticing that Decode doesn't return an error actually
|
Hello - I was asked to take a look at this and was told that the change was for first-boot performance reasons on rpi2 and others. I'm not a fan of this change since ssh-keygen is an external program and has a configuration file that we're not in control of. I feel like we should enable khwrng since most of these ARM chips have hardware random number generators and then take a better look into why Go's RSA keygen routine is taking so long. I hear that @vosst is working on gathering some performance numbers between Go and OpenSSH keygen routines. |
|
@tyhicks Thanks for your remarks. We are putting this PR up for review early as we treat it as a fallback right now. We already did the work of enabling hwrng and have it participate in the kernel's entropy gathering process. I'm deep diving into Go's rsa key generation algorithm and compare it to the one used by ssh-keygen. |
|
@tyhicks even with the khwrng enabled the key generation in Go takes minutes while ssh-keygen takes seconds ... apparently there are some deep changes in Go needed to fix this as i understand. the ssh-keygen switch is only thought as an interim solution til Go itself can be fixed. |
|
testable images with this change included are at http://people.canonical.com/~ogra/snappy/all-snaps/ssh-keygen-test/ |
setharnold
commented
Apr 3, 2017
|
Is there any reason why 25519 keys aren't being generated instead? IIRC those are just 32 bytes of randomness with a few bits set specifically to ensure that the resulting key is on the curve. No primality testing, no probabilistic generation, etc. Thanks |
vosst
added some commits
Apr 3, 2017
My understanding is that it's just less sophisticated. IIRC (and this maybe wrong), the performance of generating RSA keys is dominated by the performance of running the Miller-Rabin primality test which is dominated by the performance of Montgomery multiplication. openssl has hyper-optimized assembly montgomery multiplication routines, Go doesn't. |
|
@mwhudson yup, I'm taking cpu profiles right now, and Go's BigInt operations that are repeatedly used in the primality tests seem to be the bottleneck here. I will follow up with more data. See for example golang/go#15833. Simple profiling is here: https://gist.github.com/vosst/cb3d16e1f1596baf6ccb989d6b1897a7 |
|
http://paste.ubuntu.com/24312862/ has a set of captured data from 16 installs in a row (freshly flashed SD card each) with an image using this patch, the key generation with this takes between 10 and 20 seconds (compared to 5-15 minutes using the Go routine) |
|
There's a failure in one of the tests. That test is pretty racy and I saw it fail many times recently:
|
|
There's some formatting errors as well:
I suspect you want to try |
|
Applied gofmt -s (did not know about -s). |
zyga
and others
added some commits
Apr 5, 2017
| - out, err := cmd.CombinedOutput() | ||
| - if err != nil { | ||
| - return nil, osutil.OutputErr(out, err) | ||
| + switch C.rsa_generate_key(C.uint64_t(keyLength), C.CString(sshKeyFile), C.CString(sshPublicKeyFile)) { |
chipaca
Apr 7, 2017
Member
does rsa_generate_key take ownership of the strings? otherwise you're leaking them
| + os.Remove(sshKeyFile) | ||
| + os.Remove(sshPublicKeyFile) | ||
| + | ||
| + switch C.rsa_generate_key(C.uint64_t(keyLength), C.CString(sshKeyFile), C.CString(sshPublicKeyFile)) { |
vosst
Apr 7, 2017
Contributor
Mainly because building up an rsa.PrivateKey is non-trivial and I'm using the parsing functions to get the job done. I haven't looked too deeply if I could leverage the openssl PEM encoding functions to place the encoded key into memory.
| + case C.RSA_KEY_GENERATION_IO_FAILURE: | ||
| + return nil, errors.New("Failed to generate RSA key: Could not persist keys.") | ||
| + case C.RSA_KEY_GENERATION_SUCCESS: | ||
| + break |
vosst
Apr 7, 2017
Contributor
I modelled the enumeration as exhaustive as possible, aiming to capture all possible error conditions. For that, I left out the generic error code.
| +#include <stdint.h> | ||
| +#include <stdio.h> | ||
| + | ||
| +RSAKeyGenerationResult rsa_generate_key(uint64_t bits, const char *private_key_file, const char *public_key_file) { |
zyga
Apr 7, 2017
Contributor
How about calling this snapd_... so that various snapd C functions have a common prefix?
| - -- Thomas Voß <thomas.voss@canonical.com> Mon, 03 Apr 2017 16:49:34 +0200 | ||
| - | ||
| -snapd (2.23.6) xenial; urgency=medium | ||
| +napd (2.23.6) xenial; urgency=medium |
| +#include <stdint.h> | ||
| + | ||
| +typedef enum { | ||
| + RSA_KEY_GENERATION_SUCCESS, |
| + goto free_all; | ||
| + } | ||
| + | ||
| + bp_public = BIO_new_file(public_key_file, "w+"); |
vosst
Apr 7, 2017
Contributor
Indeed, I adjusted the code to use it and avoid any sort of out-of-band file operations.
|
thomas asked me to do another round of tests after switching to direct openssl calls from shelling out to ssh-keygen ... http://paste.ubuntu.com/24334125/ has the results of 15 installs from the "Generating device key" step to "Mark system seeded". numbers do not seem much different from the ssh-keygen case, in average the key generation is around 10sec with 4sec being the fastest and about 30sec being the slowest. |
vosst
added some commits
Apr 7, 2017
vosst
changed the title from
overlord/devicestate: switch to keygen for device key generation
to
overlord/devicestate: switch to lib(open)ssl for device key generation
Apr 10, 2017
| ) | ||
| +func rsaGenerateKey() (*rsa.PrivateKey, error) { |
pedronis
Apr 10, 2017
Contributor
I would call this one generateRSAKey(keyLength int) ...
and at this point probably better to split this out also to a generate_rsa_key.go file
| + | ||
| + switch C.snapd_rsa_generate_key(C.uint64_t(keyLength), &privateKey) { | ||
| + case C.SNAPD_RSA_KEY_GENERATION_SEED_FAILURE: | ||
| + return nil, errors.New("Failed to generate RSA key: RNG not seeded.") |
| + case C.SNAPD_RSA_KEY_GENERATION_SEED_FAILURE: | ||
| + return nil, errors.New("Failed to generate RSA key: RNG not seeded.") | ||
| + case C.SNAPD_RSA_KEY_GENERATION_ALLOCATION_FAILURE: | ||
| + return nil, errors.New("Failed to generate RSA key: Could not allocate memory.") |
| + defer C.free(unsafe.Pointer(privateKey.memory)) | ||
| + blk, _ := pem.Decode(C.GoBytes(unsafe.Pointer(privateKey.memory), C.int(privateKey.size))) | ||
| + if blk == nil { | ||
| + return nil, errors.New("Failed to decode PEM block") |
| + BIO_clear_flags(bp_private, BIO_FLAGS_MEM_RDONLY); | ||
| + | ||
| + if (PEM_write_bio_RSAPrivateKey(bp_private, rsa, empty_cipher, empty_passphrase, empty_passphrase_len, NULL, NULL) != 1) { | ||
| + result = SNAPD_RSA_KEY_GENERATION_IO_FAILURE; |
vosst
Apr 10, 2017
Contributor
We cannot really tell. Might fail due to an issue in allocating memory by the underlying BIO memory writer.
| + // allocation functions and just free'ing it might not be correct. | ||
| + BUF_MEM *buf_mem = NULL; | ||
| + BIO_get_mem_ptr(bp_private, &buf_mem); | ||
| + private_key->memory = malloc(buf_mem->length); |
| + } | ||
| + | ||
| + BIO_set_mem_buf(bp_private, BUF_MEM_new(), BIO_CLOSE); | ||
| + BIO_clear_flags(bp_private, BIO_FLAGS_MEM_RDONLY); |
vosst
added some commits
Apr 10, 2017
| oldKeyLength := keyLength | ||
| + if n < 1024 { | ||
| + n = 1024 |
| + "unsafe" | ||
| +) | ||
| + | ||
| +func generateRSAKey(keyLength uint64) (*rsa.PrivateKey, error) { |
| + case C.SNAPD_RSA_KEY_GENERATION_KEY_GENERATION_FAILURE: | ||
| + return nil, errors.New("cannot generate RSA key") | ||
| + case C.SNAPD_RSA_KEY_GENERATION_IO_FAILURE: | ||
| + return nil, errors.New("cannot generate RSA key: could not persist keys") |
pedronis
Apr 10, 2017
Contributor
s/IO/MARSHAL/ s/persist/marshal/
otherwise it sounds this is about disk which is not the case anymore
| @@ -85,7 +82,7 @@ func (m *DeviceManager) doGenerateDeviceKey(t *state.Task, _ *tomb.Tomb) error { | ||
| return nil | ||
| } | ||
| - keyPair, err := rsa.GenerateKey(rand.Reader, keyLength) | ||
| + keyPair, err := generateRSAKey(keyLength) |
pedronis
Apr 10, 2017
Contributor
this is something we should have done before, we can unlock around this:
st.Unlock()
keyPari, err := ...
st.Lock()
| + BIO_set_mem_buf(bp_private, BUF_MEM_new(), BIO_CLOSE); | ||
| + | ||
| + if (PEM_write_bio_RSAPrivateKey(bp_private, rsa, empty_cipher, empty_passphrase, empty_passphrase_len, NULL, NULL) != 1) { | ||
| + result = SNAPD_RSA_KEY_GENERATION_IO_FAILURE; |
vosst
added some commits
Apr 10, 2017
vosst
changed the title from
overlord/devicestate: switch to lib(open)ssl for device key generation
to
overlord/devicestate: switch to ssh-keygen for device key generation
Apr 10, 2017
| +var ( | ||
| + sshKeyFile = "snapd.key.tmp" | ||
| + sshPublicKeyFile = sshKeyFile + ".pub" | ||
| +) |
tyhicks
Apr 10, 2017
Contributor
What directory are we in when creating this? Should an absolute path be specified instead of blindly going with the current working directory?
tyhicks
Apr 10, 2017
Contributor
I don't have a specific suggestion as I'm not sure what directory snapd typically uses for this sort of thing. Ideally, it should only be readable/writable by root.
@niemeyer any suggestions?
tyhicks
Apr 10, 2017
Contributor
/run/snapd/ feels like the right place but I noticed that it isn't always created. For example, a test system of mine with no snaps installed does not have /run/snapd/ but it is created once a snap is installed.
I see it is referenced in dirs/dirs.go with the SnapRunNsDir variable. It is probably best that someone more active in snapd development advises you on if it is appropriate to use for your case.
ogra1
Apr 10, 2017
•
Contributor
this is moot, this function is used during the firstboot process of Ubuntu Core IoT images to generate the device key ... i dont think when using snapd on a desktop this function is used at all, you can be sure that /run/snapd exists and snapd is running at that point.
vosst
added some commits
Apr 11, 2017
pedronis
dismissed
their
stale review
Apr 11, 2017
PR changed again
| + | ||
| + os.Remove(sshKeyFile) | ||
| + os.Remove(sshPublicKeyFile) | ||
| + |
pedronis
Apr 11, 2017
•
Contributor
what ogra said is not fully true, we will use/might use this also on classic at some point, it's safer to create that dir here as well if it doesn't exist
| - return nil, errors.New("cannot generate RSA key: could not marshal key") | ||
| - case C.SNAPD_RSA_KEY_GENERATION_SUCCESS: | ||
| - break | ||
| + cmd := exec.Command("ssh-keygen", "-t", "rsa", "-b", strconv.FormatInt(int64(keyLength), 10), "-N", "", "-f", sshKeyFile) |
vosst
Apr 11, 2017
Contributor
Both runtime and build time dependency list openssh-client (for Ubuntu at least).
|
@morphis FYI: potential new dependency for packaging |
vosst
added some commits
Apr 11, 2017
niemeyer
requested changes
Apr 11, 2017
A few trivial comments below.
Please get a review by @tyhicks on this one as well.
| @@ -28,6 +28,9 @@ import ( | ||
| func MockKeyLength(n int) (restore func()) { | ||
| oldKeyLength := keyLength | ||
| + if n < 1024 { | ||
| + n = 1024 |
niemeyer
Apr 11, 2017
Contributor
That's misleading. If we can't take < 1024, let's panic instead and force the test to read correctly.
| @@ -0,0 +1,77 @@ | ||
| +// -*- Mode: Go; indent-tabs-mode: t -*- |
niemeyer
Apr 11, 2017
Contributor
This file would be better as crypto.go or similar, so it implies a better scope than this one function.
| +) | ||
| + | ||
| +func generateRSAKey(keyLength int) (*rsa.PrivateKey, error) { | ||
| + sshKeyFile := filepath.Join(dirs.SnapRunDir, "snapd.key.tmp") |
niemeyer
Apr 11, 2017
Contributor
It would be better to have this as a dynamically named temporary file instead of a fixed one. Otherwise we can get hard to debug issues because of concurrently running logic.
| + os.Remove(sshKeyFile) | ||
| + os.Remove(sshPublicKeyFile) | ||
| + | ||
| + cmd := exec.Command("ssh-keygen", "-t", "rsa", "-b", strconv.FormatInt(int64(keyLength), 10), "-N", "", "-f", sshKeyFile) |
| + | ||
| + blk, _ := pem.Decode(d) | ||
| + if blk == nil { | ||
| + return nil, errors.New("cannot decode PEM block") |
niemeyer
Apr 11, 2017
Contributor
Is the second parameter there an error?
If so, might be useful to have the hint of what was broken:
if err != nil {
return nil, fmt.Errorf("cannot decode PEM block: %v", err)
}
vosst
Apr 11, 2017
Contributor
The second parameter is the remainder of the input after decoding one block.
vosst
added some commits
Apr 11, 2017
niemeyer
approved these changes
Apr 12, 2017
A couple of trivials, but LGTM anyway.
Thanks for all the work on this fix!
| + | ||
| + defer func() { | ||
| + os.RemoveAll(tempDir) | ||
| + }() |
| @@ -71,7 +72,10 @@ type deviceMgrSuite struct { | ||
| restoreOnClassic func() | ||
| } | ||
| -var _ = Suite(&deviceMgrSuite{}) | ||
| +var ( | ||
| + _ = Suite(&deviceMgrSuite{}) |
niemeyer
Apr 12, 2017
Contributor
This reads slightly odd. Having them in their own var lines might be nicer in this particular case.
vosst
added some commits
Apr 12, 2017
zyga
approved these changes
Apr 13, 2017
Looks good, I'll review remaining discussion and merge if all feedback is addressed.
| @@ -134,7 +135,8 @@ func SetRootDir(rootdir string) { | ||
| SnapMetaDir = filepath.Join(rootdir, snappyDir, "meta") | ||
| SnapBlobDir = filepath.Join(rootdir, snappyDir, "snaps") | ||
| SnapDesktopFilesDir = filepath.Join(rootdir, snappyDir, "desktop", "applications") | ||
| - SnapRunNsDir = filepath.Join(rootdir, "/run/snapd/ns") | ||
| + SnapRunDir = filepath.Join(rootdir, "/run/snapd") | ||
| + SnapRunNsDir = filepath.Join(SnapRunDir, "/ns") |
| @@ -85,7 +82,9 @@ func (m *DeviceManager) doGenerateDeviceKey(t *state.Task, _ *tomb.Tomb) error { | ||
| return nil | ||
| } | ||
| - keyPair, err := rsa.GenerateKey(rand.Reader, keyLength) | ||
| + st.Unlock() |
| @@ -58,6 +59,7 @@ Depends: adduser, | ||
| apparmor (>= 2.10.95-0ubuntu2.2), | ||
| ca-certificates, | ||
| gnupg1 | gnupg, | ||
| + openssh-client, | ||
| squashfs-tools, |
vosst commentedApr 3, 2017
•
Edited 1 time
-
vosst
Apr 3, 2017
Switch to ssh-keygen for device key generation.
Please note that placement of key files in /tmp is temporary and we are awaiting advice on where to put the files.