Renewal failure should include instructions to remove lineage #8542

Closed
jsha opened this issue Dec 16, 2020 · 7 comments

Comments

@jsha
Contributor

jsha commented Dec 16, 2020

I logged onto my personal web server, and noticed that I had 1600 keys in /etc/letsencrypt/keys, two every day for the last many days. I checked /var/log/letsencrypt/ and I think the root cause is that I had a lineage listed for a domain I no longer own. I was able to figure that out pretty quickly from the logs:

2020-12-15 14:05:01,278:ERROR:certbot._internal.renewal:All renewals failed. The following certs could not be renewed:
2020-12-15 14:05:01,278:ERROR:certbot._internal.renewal:  /etc/letsencrypt/live/lastbart.at-0002/fullchain.pem (failure)

But it took a little reading to remember how to remove that lineage so it's no longer retried. It would be helpful for the log output to say "To delete this, run XXX".
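For anyone who lands here with the same problem, here is a minimal sketch of what such a hint could look like. It turns the failure lines quoted above into a suggested `certbot delete --cert-name` command and only prints it. The helper name `suggest_delete` is hypothetical, and the sketch assumes the directory name under `live/` is the certificate name, which is how `certbot certificates` reports it. Deleting is only appropriate when the certificate is genuinely no longer wanted.

```shell
#!/bin/sh
# Hypothetical helper (not part of Certbot): read renewal-failure log lines
# on stdin and print a suggested cleanup command for each failed lineage.
# Nothing is deleted; the output is meant to be reviewed by a human.
suggest_delete() {
    # The lineage name is the directory under live/ that holds fullchain.pem.
    sed -n 's|.*/live/\([^/]*\)/fullchain\.pem (failure).*|certbot delete --cert-name \1|p'
}

# Example using the log line from this report.
printf '%s\n' \
    '2020-12-15 14:05:01,278:ERROR:certbot._internal.renewal:  /etc/letsencrypt/live/lastbart.at-0002/fullchain.pem (failure)' \
    | suggest_delete
```

Lines that do not match the failure format produce no output, so the whole log can be piped through the function.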

As an aside: It looks like I'm missing some logs?

# ls -ltr /etc/letsencrypt/keys/
...
-rw------- 1 root root 1704 Dec  7 05:49 1296_key-certbot.pem
-rw------- 1 root root 1704 Dec  7 17:41 1297_key-certbot.pem
-rw------- 1 root root 1704 Dec  7 21:44 1298_key-certbot.pem
-rw------- 1 root root 1704 Dec  8 00:50 1299_key-certbot.pem
-rw------- 1 root root 1704 Dec  8 22:03 1300_key-certbot.pem
-rw------- 1 root root 1704 Dec  9 06:58 1301_key-certbot.pem
-rw------- 1 root root 1704 Dec  9 17:59 1302_key-certbot.pem
-rw------- 1 root root 1704 Dec 10 01:22 1303_key-certbot.pem
-rw------- 1 root root 1704 Dec 10 17:33 1304_key-certbot.pem
-rw------- 1 root root 1704 Dec 11 04:06 1305_key-certbot.pem
-rw------- 1 root root 1704 Dec 11 08:42 1306_key-certbot.pem
-rw------- 1 root root 1704 Dec 11 23:48 1307_key-certbot.pem
-rw------- 1 root root 1704 Dec 12 09:01 1308_key-certbot.pem
-rw------- 1 root root 1704 Dec 12 19:17 1309_key-certbot.pem
-rw------- 1 root root 1708 Dec 13 17:03 1310_key-certbot.pem
-rw------- 1 root root 1704 Dec 14 05:43 1311_key-certbot.pem
-rw------- 1 root root 1704 Dec 14 19:59 1312_key-certbot.pem
-rw------- 1 root root 1704 Dec 15 02:45 1313_key-certbot.pem
-rw------- 1 root root 1704 Dec 15 14:04 1314_key-certbot.pem
# ls -ltr /var/log/letsencrypt/
...
-rw-r--r-- 1 root root   7876 Jun 10  2018 letsencrypt.log.4
-rw-r--r-- 1 root root   7876 Jun 10  2018 letsencrypt.log.3
-rw-r--r-- 1 root root   7876 Jun 11  2018 letsencrypt.log.2
-rw-r--r-- 1 root root  33028 Sep 26 12:08 letsencrypt.log.12.gz
-rw-r--r-- 1 root root  65235 Oct  4 20:59 letsencrypt.log.11.gz
-rw-r--r-- 1 root root  34647 Oct 11 05:06 letsencrypt.log.10.gz
-rw-r--r-- 1 root root  49128 Oct 18 23:07 letsencrypt.log.9.gz
-rw-r--r-- 1 root root  37766 Oct 25 00:27 letsencrypt.log.8.gz
-rw-r--r-- 1 root root  80066 Nov  2 00:57 letsencrypt.log.7.gz
-rw-r--r-- 1 root root  34855 Nov  7 14:23 letsencrypt.log.6.gz
-rw-r--r-- 1 root root  49337 Nov 15 14:14 letsencrypt.log.5.gz
-rw-r--r-- 1 root root  47362 Nov 22 05:03 letsencrypt.log.4.gz
-rw-r--r-- 1 root root  64934 Nov 30 03:54 letsencrypt.log.3.gz
-rw-r--r-- 1 root root  56497 Dec  5 13:26 letsencrypt.log.2.gz
-rw-r--r-- 1 root root  53404 Dec 14 05:43 letsencrypt.log.1.gz
-rw-r--r-- 1 root root  98773 Dec 15 23:50 letsencrypt.log

Version certbot 1.11.0.dev0

@alexzorin
Collaborator

alexzorin commented Dec 16, 2020

> It would be helpful for the log output to say "To delete this, run XXX".

I'm hesitant to do something like this because the only other advice we (will) print is "go ask on the forums, try -v or check in /var/log/letsencrypt".

What I'm afraid of is that a user will reach for certbot delete as a solution to a generic certificate renewal problem, in the mindset of "reset things and start again". That's pretty drastic and deleting certs poses a risk to their running webserver.

If certbot certificates had a small cheat sheet at the bottom of its output, would that have helped you?

> As an aside: It looks like I'm missing some logs?

Do you mean between mid-June and September? Did you change where you installed Certbot from? Maybe there was some change in max-log-backups (cli.ini) + /etc/logrotate.d/certbot.

One more aside: I had no idea Certbot saved keys and CSRs to disk earlier than the order finalization stage. That's a bit unfortunate given there's no automatic cleanup. #4635.

@jsha
Contributor Author

jsha commented Dec 16, 2020

> What I'm afraid of is that a user will reach for certbot delete as a solution to a generic certificate renewal problem, in the mindset of "reset things and start again". That's pretty drastic and deleting certs poses a risk to their running webserver.

This is a good point. Perhaps "if you no longer control this domain, do X." But I recognize there's still a risk people do that unthinking.

> If certbot certificates had a small cheat sheet at the bottom of its output, would that have helped you?

Yep! Though I happened to know I should run certbot certificates. I'm not sure everyone would.

> Do you mean between mid-June and September? Did you change where you installed Certbot from? Maybe there was some change in max-log-backups (cli.ini) + /etc/logrotate.d/certbot.

I mean that from Dec 7 to Dec 15 I have two keys per day in /etc/letsencrypt/keys, but in the logs directory, for the month of December, I only have logs for Dec 5, Dec 14 and Dec 15.

@osirisinferi
Collaborator

Aren't there 1001 ways a renewal can fail? Not owning a domain any longer is just one of many possible reasons.

@jsha
Contributor Author

jsha commented Dec 21, 2020

This is a good point. Perhaps a more useful thing would be for Certbot to try and convey "here are the N lineages you have that have been failing for >50 days; you might want to delete them."

@osirisinferi
Collaborator

A certificate can also fail to renew because just one of its many hostnames is failing, e.g. due to loss of ownership. It would be careless to delete the whole certificate because of that; using --allow-subset-of-names would be more prudent in that case.

Therefore, I believe any generic "hint" might not actually encompass all the possible reasons the certificate might fail.

Also, I'm not seeing many threads on the Community regarding such issues (failing certs because of loss of ownership). I'm inclined to believe this isn't a very big issue for most users. And with that, I don't think it warrants adding warnings or hints to Certbot that might lead users to act the wrong way for an entirely different issue.
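As a rough illustration of the --allow-subset-of-names suggestion above, the sketch below builds a dry-run renewal command for a single lineage and prints it for review instead of running it. The function name `subset_renew_cmd` and the lineage name `example.com` are placeholders; the real name would come from `certbot certificates`.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a dry-run renewal for one lineage
# that tolerates some hostnames failing validation.
subset_renew_cmd() {
    printf 'certbot renew --cert-name %s --allow-subset-of-names --dry-run\n' "$1"
}

# Placeholder lineage name, for illustration only.
subset_renew_cmd example.com
```

Printing the command keeps the example safe to run anywhere; whether dropping the failing hostnames is acceptable still depends on what the web server actually needs to serve.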

@alexzorin
Collaborator

> I mean from Dec 7 to Dec 15, I have two keys per day in /etc/letsencrypt/keys, but in the logs directory, in the month of December, I only have logs for Dec 5, Dec 14 and Dec 15

It looks to me like Certbot's internal log rotation is disabled (maybe because you previously had the Certbot .deb installed) and the OS is doing log rotation instead.

As I understand it, a side-effect of this is that multiple invocations of Certbot will keep appending to /var/log/letsencrypt/letsencrypt.log, until the OS log rotation kicks in. It would also explain why your logfiles are so large.
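One way to check this explanation is to list the distinct days that appear inside a single log file. The sketch below assumes every log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp, as in the excerpt at the top of this issue; the function name `log_days` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each calendar day that has at least one entry
# in the given Certbot log file(s), or in stdin when no file is named.
# Lines without a leading timestamp (e.g. traceback lines) are skipped.
log_days() {
    sed -n 's/^\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}\) .*/\1/p' "$@" | sort -u
}

# Default log path as used in this thread; do nothing if it is not readable.
LOG=/var/log/letsencrypt/letsencrypt.log
if [ -r "$LOG" ]; then
    log_days "$LOG"
fi
```

If one file reports several days, the runs were appended rather than rotated per invocation. For a rotated file, something like `gzip -dc letsencrypt.log.1.gz | log_days` should work the same way.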

@jsha
Contributor Author

jsha commented Dec 21, 2020

Ah, excellent point! I did in fact previously have the .deb installed, and I do have max-log-backups = 0 in /etc/letsencrypt/cli.ini.
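A quick way to check for the same setup on another machine is sketched below. It only looks for a `max-log-backups = 0` line in cli.ini and does not inspect `/etc/logrotate.d/certbot`; the function name `rotation_disabled` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given cli.ini sets max-log-backups to 0,
# which turns off Certbot's built-in log rotation.
rotation_disabled() {
    ini=${1:-/etc/letsencrypt/cli.ini}
    # Allow optional whitespace around "="; commented-out lines do not match.
    grep -Eq '^[[:space:]]*max-log-backups[[:space:]]*=[[:space:]]*0[[:space:]]*$' "$ini" 2>/dev/null
}

if rotation_disabled; then
    echo "built-in rotation disabled: letsencrypt.log grows until the OS rotates it"
else
    echo "max-log-backups = 0 not found in cli.ini"
fi
```

A missing or unreadable cli.ini is treated the same as the setting being absent.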

I'm satisfied on both counts here, so closing.

@jsha jsha closed this as completed Dec 21, 2020