Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Let's Encrypt certs using DNS challenges #337

Merged
merged 14 commits into from
Jul 20, 2018

Conversation

jvperrin
Copy link
Member

@jvperrin jvperrin commented Jul 9, 2018

This uses dehydrated to request certs with a hook to use a DNS challenge and update bind9. I used these two because after messing around with them I was able to get them to work without too many problems and because they are both already packaged for Debian (need to use stretch-backports to get new enough versions, but they work fine).

This also removes ocf_ssl and refactors it into ocf::ssl instead, where there are both ocf::ssl::default (a bundle using LE certs and a DNS challenge) and ocf::ssl::default_incommon (a bundle using our previous method of using incommon certs), so it's quite easy to switch between the two.

One problem was making sure that when certs are renewed that the services that depend on those certs are restarted properly to re-read the new cert. I believe this is done for all modules, but let me know if you see a place where a subscribe isn't added or something.

Unfortunately I had to add a couple exec resources to actually request certs, but I didn't want it to run every time, so instead I use a bit of a hacky solution with two execs and an unless to make sure that neither exec executes unless the cert is close to expiring or the list of domains has changed. I'd be happy to explain more if you'd like an explanation, but it was a bit of a mess and I'd be open to suggestions if they are cleaner and meet the requirements.

NOTE: One problem is that the packages needed for this aren't available until stretch (meaning that death, and tsunami won't work with this, werewolves doesn't currently use it), so I don't think this should be merged until after we upgrade all hosts to stretch, meaning that ocf.io will probably have to be manually renewed one final time unless we migrate everything to stretch in the next ~10 days.

Copy link
Member

@abizer abizer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! You're on a roll :D

@jvperrin
Copy link
Member Author

Alright, I backported the necessary packages for jessie from stretch and they worked for tsunami and after a few tries I got death working too (the regex for the dehydrated hook wasn't quite correct and the DNS entry for the root domain wasn't present, but those have been fixed now). I think this is ready to be shipped, I'd like some more people to look at the exec stuff to make sure that nobody else has a better way to do it, because I imagine there could be something nicer perhaps.

Copy link
Member

@kpengboy kpengboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this done, it's quite a large undertaking.

Regarding the two execs, I can neither think of a better solution. The other alternative I thought of is to have a fact that gathers expiration dates of certs under /var/lib/lets-encrypt, but that doesn't sound much better.

Or, of course, you could do it all with systemd...


file {
default:
group => 'ssl-cert',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we wish to specify the owner here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good idea

[$::hostname],
delete($::dnsA, '@'),
$::dnsCname,
), ''), '.ocf.io') + $extra_domains
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...would suggest some code deduplication here....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only way this could be really deduplicated is to use puppet functions, I'll try that and see how it goes, but it might be too complicated for our use here, idk

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't assign the result of delete(concat([$::hostname], delete($::dnsA, '@'), $::dnsCname), '') to some variable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that would work too lol


$letsencrypt_ddns_key = assert_type(Stdlib::Base64, hiera('letsencrypt::ddns::key'))

# TODO: Move these somewhere else so this defined resource can be used
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/var/lib/lets-encrypt/$title perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, like the certs directory can't really be moved unless we move the whole base directory, I meant to move this inside puppet to a different manifest so it could be included or something instead of redefined every time. This stuff only needs to be declared once, although I'm not sure how the domains list will work, it'll probably need to use concat or something.

File['/var/lib/lets-encrypt/domains.txt'],
File['/etc/dehydrated/config'],
File['/etc/dehydrated/dehydrated-hook-ddns-tsig.conf'],
File['/etc/ssl/lets-encrypt/le-account.key'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't these four File resource references redundant, because of their notify statements?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, probably

# This exec can be notified to get it to check anyway, for instance, if the
# dehydrated config changes, or the domain list changes then this should
# run again, even if the cert will not expire soon.
exec { $title:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're gonna get unspecific resource titles like Exec[www] at this rate; I would strongly recommend changing the title to be more specific here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be more like Exec[death.ocf.berkeley.edu], but sure, changing it would probably be fine, it should probably mention "cert" or maybe "ssl" somewhere in there, idk

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like Exec[Renew $title cert] would work well IMO

# itself, but this should help with puppet error spam a bit (for instance
# if Let's Encrypt is down for any reason) and mean that services can
# subscribe to this resource to refresh after new certs are obtained.
exec { "${title}_check_expiration":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exec titles can contain spaces.

file {
default:
# the ssl-cert package creates the ssl-cert group
require => Package['ssl-cert'];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems redundant, given that the one file that actually uses this group already has the require set properly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was migrated from ocf_ssl, it's necessary because I don't believe /etc/ssl/certs exists without the ssl-cert package already being installed, but you're right that the file at the bottom of this block can have the require removed, I copy/pasted it from another manifest so that's why the require is duplicated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on this needed updating though, since it didn't explain the directories being created, so I see now why you would have thought it was just needed for the ssl-cert group.

[DEFAULT]
name_server_ip = 169.229.226.22
verbosity = 1
# XXX: Do not put the key name in quotes, or it will not work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were me I would use something more specific like NOTE here. That's my personal preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, the XXX was meant to be more of a warning than NOTE would be, since I've seen it like that before, but I don't mind changing it either

@jvperrin
Copy link
Member Author

@kpengboy I thought about doing this via cron/systemd/whatever, but the issue with using some other solution is that when certs are renewed, daemons don't necessarily read in the new contents, so I wanted to have some kind of way for us to be able to reload/restart existing programs using the certs when they are renewed. This could also be done through cron/systemd, but we'd have to specify the exact program to reload when we already have those services specified in puppet, so it seemed like a natural fit to put this into puppet instead.

@kpengboy
Copy link
Member

Or, of course, you could do it all with systemd...

@kpengboy I thought about doing this via cron/systemd/whatever

Lol that was sarcasm. I was referring to that you could theoretically make systemd path units for your four files, then use those units to trigger a systemd timer that renews the cert. I guess you could then use that to reload the service, although I didn't think that far. Of course, it'd be much easier to do this via Puppet anyway.

@jvperrin
Copy link
Member Author

Ah, fair enough, I just meant this whole PR, not just the four files and the exec, but fair enough, I hadn't thought of using systemd path units lol

@jvperrin
Copy link
Member Author

I just realized I haven't done anything with this for things hosted by Marathon, so I'll try and add that soon, once the DNS changes in ocf/dns@1cb09fa have propagated a bit more.

# If there's a host with hostname 'death' and DNS A records 'dev-vhost' and DNS
# CNAME records 'www', then this function (with the suffix 'ocf.io' would
# return ['death.ocf.io', 'dev-vhost.ocf.io', 'www.ocf.io']
function ocf::get_host_fqdns(String $suffix = 'ocf.berkeley.edu') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is our typename policy these days? Do we care to add a return type too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't add a return type because it doesn't work with older versions of puppet, and overheat doesn't have puppet available past 4.8, although maybe it could get a package from buster or something?

I'd ultimately like to add more types to this repo since they seem reasonably useful, but I'm not sure if we've really talked about it at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, they actually work with 4.8, I thought it was with above 4.8, not 4.8 and above, so adding a return type to this sounds like a good idea.

[$::hostname],
delete($::dnsA, '@'),
$::dnsCname,
), ''), ".${suffix}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to refactor this into a function, we might as well reduce the number of lines it takes up. I honestly think it's easier to read as 1 or 2 lines than as 5.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose so, my hesitation to put it on one line is because it exceeds the line length limit, but with variables that might be ok.

$ssl = false,
$ssl_dir = $title,
$ssl = true,
$ssl_domains = [$server_name],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add by default the server aliases which end in our domains?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would make sense, I'm just waiting on testing this since last time I was rate-limiting when working on it. I'll try again soon though

Copy link
Member

@kpengboy kpengboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this done! It's good to finally start being able to use automated LE to request certs for more of our servers.

@kpengboy
Copy link
Member

@jvperrin bump? our cert's expired again

@jvperrin
Copy link
Member Author

Ah, yeah I meant to merge this before today, sorry about that, I'll merge this now (after resolving conflicts)

@jvperrin jvperrin merged commit 7d46f23 into ocf:master Jul 20, 2018
@jvperrin jvperrin deleted the lets-encrypt-dns branch July 20, 2018 22:37
@@ -42,6 +42,14 @@
include apache::mod::fcgid
include apache::mod::rewrite

$cname = $::host_env ? {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a duplicate declaration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I got some merge conflicts when rebasing this, and it looks like I didn't resolve this correctly. I've fixed it in f3fdd59.

@ethanhs
Copy link
Member

ethanhs commented Jul 21, 2018

Congrats! And thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants