ACME (Lets Encrypt) Can't Validate Certs HTTP (Create domain key error) #1192

Closed
Zewwy opened this Issue Feb 14, 2019 · 17 comments

4 participants

Zewwy commented Feb 14, 2019

I seem to be having an issue validating certificates via HTTP validation on OPNsense 19.1

Log on First Cert creation, and issue attempt: https://pastebin.com/7rWWHRXW

Select Cert and Click Issue Certificate again: https://pastebin.com/xx9pRMpB

My IP address is correct, and the DNS record pointing to my OPNsense public IP is correct.
The WAN has a rule allowing HTTP on port 80.
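
The DNS half of that checklist can be double-checked with a few lines of Python. This is only a minimal sketch: the function names `resolved_addresses` and `points_at` are made up for this example, and it asks the local resolver, which may not return the same answer the Let's Encrypt servers see (for instance with split DNS).

```python
import socket


def resolved_addresses(hostname, resolver=socket.gethostbyname_ex):
    """Return the IPv4 addresses the resolver reports for hostname."""
    # gethostbyname_ex returns (canonical name, alias list, address list).
    _name, _aliases, addresses = resolver(hostname)
    return sorted(addresses)


def points_at(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """True if expected_ip is among the addresses hostname resolves to."""
    # The resolver is injectable so the check can be tried without network access.
    return expected_ip in resolved_addresses(hostname, resolver)
```

Calling `points_at("opn.zewwy.ca", "<your WAN IP>")` from a machine outside the network gives the closest approximation of what the validation servers would resolve.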

Author

Zewwy commented Feb 14, 2019

WELP!

Member

fichtner commented Feb 14, 2019

This question may get a faster answer on the forums. I'm sure @fraenki will have a look, but it won't be quick.

Author

Zewwy commented Feb 14, 2019

Thanks. I'd like to help more directly myself. Maybe I can take a look at the source script, but I'm unsure which one OPNsense is currently using in its active branch.

Also, the logs don't seem to:

  1. Report anything in /var/log/acme.sh.log (I think this was the path, from memory) when the service is enabled/disabled or when an account is created. Pretty much nothing shows up until you actually select a cert and click Issue/Renew.

You'd expect all of these events to show in the log: service enabled/disabled, CSR created but not sent, etc.

  2. Include any script line callbacks, so I'm basically going to have to search the source script for the line "Create domain key error" and then reverse engineer what logic gets there.

Just food for thought from my first couple of attempts using this plugin...

Author

Zewwy commented Feb 17, 2019

I found two other issues in the acme.sh repo describing similar problems, both unresolved as of March 2018:

Neilpang/acme.sh#1356

and

Neilpang/acme.sh#1428

Neilpang commented Feb 18, 2019

@Zewwy Fixed. Please try again.

Author

Zewwy commented Feb 18, 2019

Thanks Neilpang! I'm assuming I simply run a check for updates on my OPNsense server?

I'll give it a try right now. :D

Alright, this does appear to be resolved, as I now get more detailed information in the logs. However, I'm still not sure why it's failing now. Here are the lines I get when it fails:

[Mon Feb 18 22:48:09 UTC 2019] opn.zewwy.ca:Verify error:Fetching (edited to remove hyperlink)h[t]tp://opn.zewwy.ca/.well-known/acme-challenge/Ls0lVWwDMThzC3kHqES05gI2Yo7yW2sODLXJ4XuvQRU: Timeout during connect (likely firewall problem)

Alrighty, attempting to access the OPNsense web server for validation...
https://i.imgur.com/9YQnlGQ.png

ugh.... DNS rebind attack .... what?

OMG... I forgot to change the MGMT UI port!!! changing via System -> Administration...

Trying again...

YEAH!!!! FINALLY!!! WOOOOOOOOOOOOO!

Thank you NEILPANG!
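
For anyone hitting the same "Timeout during connect" message, here is a minimal Python sketch that fetches the challenge URL once and reports what answered on port 80. The names `challenge_url` and `probe` are made up for this example. It is meant to be run from a machine outside the network, and any token works for a reachability test because only the kind of response matters.

```python
import urllib.error
import urllib.request


def challenge_url(domain, token):
    """Build the HTTP validation URL in the form shown in the log above."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def probe(url, timeout=5.0):
    """Fetch url once and describe what answered, if anything."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # geturl() is the final URL, so a redirect to another service shows up here.
            return f"HTTP {resp.status} from {resp.geturl()}"
    except urllib.error.HTTPError as exc:
        # A web server is listening but did not serve this path.
        return f"HTTP {exc.code}: a web server answered, but not with the token"
    except OSError as exc:
        # Covers URLError and timeouts: refused, dropped, or unresolvable.
        return f"request failed: {exc}"
```

A result whose final URL belongs to the management UI would suggest the GUI is still holding port 80, while "request failed" points more toward a firewall or NAT rule, as the log message itself hints.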

Member

fichtner commented Feb 19, 2019

Author

Zewwy commented Feb 19, 2019

Sorry didn't quite follow that. :)

In my testing of a regular OPNsense server with a direct public IP address, I got past the Create Domain Key error. I wasn't sure if this was due to an actual change in the script, or to me trying a different account email address under the accounts area.

I'll be doing a bit more testing with my other OPNsense VM that is behind a NAT. I will report my results.

Author

Zewwy commented Feb 20, 2019

Alright, so again, I don't know why my initial attempts on the OPNsense that has a direct public IP failed, when it now succeeds.

However, the OPNsense that has an HTTP NAT rule pointing to it for Let's Encrypt validation does not seem to work?

https://pastebin.com/HtEC83EG

Author

Zewwy commented Feb 20, 2019

Ugh... so I figured it might have been my firewall, which is doing the NATing, so I quickly created an HAProxy backend server pointing to a very basic IIS server. I was able to access it via the WAN IP of the OPNsense server (before the firewall's NAT), but I was unable to access it from the internet.

So I adjusted my firewall access rule (the security rule, as the NAT rule was fine), and I was able to access the IIS server behind the OPNsense server via the firewall NAT. That meant access from the outside world to my OPNsense was again perfectly fine at this point.

So I figured I had it covered. I reconfigured the OPNsense server (just like I did the one that has a public IP and is working), but it still fails with the exact same vague error:

Can not create domain key... I've tried this 100 times now!

https://pastebin.com/eyHGEsqM

Neilpang commented Feb 20, 2019

@Zewwy

[Wed Feb 20 01:08:28 UTC 2019] Domain key exists, do you want to overwrite the key?
[Wed Feb 20 01:08:28 UTC 2019] Add '--force', and try again.

The domain certificate already exists.

Author

Zewwy commented Feb 20, 2019

Well, I don't have them, or I deleted them (because it kept saying validation failed) and am trying again. How do I get past this?

Just an FYI: I ran Certbot for my WordPress site against my main domain (zewwy.ca), which has a different IP address than the other certs I did above.

Then I attempted my NATed OPNsense, which failed. Then I attempted a non-NATed OPNsense with a direct public IP (opn.zewwy.ca), which was the first one I reported as failing at the beginning of this post. It always did the same thing: on the first click the log went up to "ACCOUNT_THUMBPRINT=", and on the second click it went up to the domain key failure. So even my first attempts there failed the exact same way. It wasn't until you told me to try again that it amazingly worked, and I have no idea what changed... (selecting the valid cert and clicking re-issue works).

Then I tried again with my NATed OPNsense (sync.zewwy.ca), which sits behind my firewall with yet another public IP, again NATed (on port 80) to the OPNsense server. Those were my most recent log posts, where you told me the certificate already exists. What do I have to do? Create a whole new DNS record for this now, like test2.zewwy.ca? When are there certificate collisions? With any records under the same domain? E.g., can test1.zewwy.ca and test2.zewwy.ca not be made if a cert already exists for zewwy.ca?

Author

Zewwy commented Feb 20, 2019

I'm going to create a couple new records on my DNS provider portal right now (all pointing to the Public IP address on my Firewall that has a port 80 NAT rule to send those HTTP requests to the OPNsense's WAN IP), I hope the 3 hours will be enough. This way they should be "all new requests" and there shouldn't be an existing domain certificate?

Author

Zewwy commented Feb 21, 2019

https://pastebin.com/E9DAqs8R

OK... so it turns out it was my firewall!! The one doing the NAT and the security rules.

So basically it turns out:

If you watch the acme script log (by running "tail -f /var/log/acme.sh.log"), you'll notice:

The script hangs at [DATESTAMP] ACCOUNT_THUMBPRINT=
and clicking Issue/Re-issue causes it to continue and then fail at the Create Domain Key part.

If this happens, there's a firewall issue preventing the Let's Encrypt servers from accessing the required web services created by the service.

What I did to resolve this: I had already opened up the firewall rule initially, had it working with an IIS website and the HAProxy plugin, and figured this was good enough for the validations to succeed. At this point my firewall was literally the only thing I could think of as the culprit... and it was. After opening up the security rule (completely open, wayyyyyyyy more than I would have ever wanted), creating and validating another cert finally worked!
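
The pattern described above can be spotted mechanically. The following is only a rough Python sketch: the marker strings are the ones quoted in this thread, the hints are this thread's conclusions rather than anything from the acme.sh documentation, and the names `MARKERS` and `diagnose` are made up for this example.

```python
import re

# Log lines in this thread look like "[Wed Feb 20 01:08:28 UTC 2019] message".
LINE_RE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s*(?P<message>.*)$")

# Marker text quoted in this thread, mapped to what it meant here (assumption:
# other setups may hit the same messages for different reasons).
MARKERS = {
    "Timeout during connect": "validation could not reach port 80; check firewall, NAT, and the web GUI port",
    "Domain key exists": "a key from an earlier attempt is still present; the log itself suggests --force",
    "Create domain key error": "in this thread it followed a stalled first attempt; check the firewall first",
}


def diagnose(lines):
    """Return (marker, hint) pairs for known messages found in the log lines."""
    findings = []
    last_message = ""
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue  # skip anything that is not a timestamped log line
        last_message = match.group("message")
        for marker, hint in MARKERS.items():
            if marker in last_message:
                findings.append((marker, hint))
    # A log that simply ends at the thumbprint line is the "hang" described above.
    if last_message.startswith("ACCOUNT_THUMBPRINT="):
        findings.append(("ACCOUNT_THUMBPRINT=", "log stops here; validation traffic may be blocked"))
    return findings
```

Passing an open file handle for /var/log/acme.sh.log to `diagnose` and printing the returned pairs gives a quick summary without scrolling through the whole log.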

Author

Zewwy commented Feb 21, 2019

This issue can be closed.

Author

Zewwy commented Feb 22, 2019

One last thing I noticed about this... I had my rule created like I usually do (the one that causes the domain key error and cert validation failure).

Then, when I opened the rule to allow the traffic, the certificate validation succeeded while I was monitoring the logs. Which is awesome.

However, the cert in the OPNsense UI still says validation failed? I know the cert actually succeeded per the acme.sh.log file.

Any idea why the UI doesn't correct itself?

Member

fraenki commented Feb 26, 2019

FWIW, the initial error reported by acme.sh was "Create domain key error". I've seen this a lot. In my tests it only happens when using acme.sh 2.7.x. The upcoming OPNsense release includes acme.sh 2.8.0, which is not affected by this issue.
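
Going only by the observation above (2.7.x affected, 2.8.0 not), a quick version check could look like the Python sketch below. The helper names `parse_version` and `likely_affected` are made up for this example, and the rule is nothing more than that one observation turned into code.

```python
def parse_version(text):
    """Turn a version string such as "2.8.0" or "v2.8.0" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def likely_affected(version):
    """True for the 2.7.x series, the only one reported above as showing the error."""
    return parse_version(version)[:2] == (2, 7)
```

How to obtain the installed acme.sh version on OPNsense is left out here, since the thread does not say where the plugin reports it.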

fraenki closed this Feb 26, 2019
