DNS queries for *.my.local-ip.com fail causing development instances to be unreachable #8100

Closed
mrjones-plip opened this issue Feb 23, 2023 · 11 comments
Labels
Type: Technical issue Improve something that users won't notice

Comments

@mrjones-plip
Contributor

Describe the issue
We have 3 projects that heavily rely on the *.my.local-ip.com TLS certs:

Because TLS requires DNS and because the DNS server for my.local-ip.com often fails, it feels like a developer's instance is broken, when in fact DNS is failing to resolve the my.local-ip.com hostname for their IP.
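
To tell the two failure modes apart, a check along the following lines may help. It is a minimal sketch, not part of any CHT tooling: the function name `classify_failure` and its return values are hypothetical, and it only separates "the name did not resolve" from "the name resolved but nothing answered on the port".

```python
import socket


def classify_failure(hostname: str, port: int = 443, timeout: float = 3.0) -> str:
    """Return 'dns', 'connect', or 'ok' for a hostname/port pair (hypothetical helper)."""
    try:
        # Step 1: name resolution only; this is where a flaky DNS zone fails.
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"
    # Step 2: the name resolved, so try to reach the service itself.
    for family, socktype, proto, _canon, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect"
```

A result of `"dns"` points at the resolver or the zone rather than at the development instance; `"connect"` means the name resolved and the instance itself is worth checking.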

Describe the improvement you'd like
Provide a new solution that offers the same conveniences as my.local-ip.com but has stable DNS. This could be either a commercial offering or a self-hosted option based on open source.

Describe alternatives you've considered

  • Hope that my.local-ip.com resolves their DNS issue
  • Stop offering a valid TLS cert from a real CA and revert to only https://localhost with self-signed certs
@mrjones-plip mrjones-plip added the Type: Technical issue Improve something that users won't notice label Feb 23, 2023
@mrjones-plip mrjones-plip changed the title from "DNS queries for *.my.local-ip.com fail causing development instances to unacceptable" to "DNS queries for *.my.local-ip.com fail causing development instances to be unreachable" Feb 23, 2023
@mrjones-plip
Contributor Author

mrjones-plip commented Feb 23, 2023

Research findings

Commercial

I did not find any commercial offerings :(

OSS/Free services

  • local-ip.co - has everything we want, what we're currently using, massive DNS failures
  • xip.io - no longer exists
  • localtls GitHub repo - one-liner Python to do everything we want - promising
  • https://nip.io - public wildcard IP DNS resolution, but no TLS cert
  • NIP.IO GitHub repo - OSS wildcard IP DNS resolution, but no TLS solution
  • sslip.io - public wildcard IP DNS resolution, but no TLS cert
  • sslip.io GitHub repo - OSS wildcard IP DNS resolution, but no TLS solution. Does have nice Docker image though! Good backup candidate after localtls
  • local-ip.sh GitHub repo - OSS wildcard IP DNS resolution, but no TLS solution
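
For readers unfamiliar with "wildcard IP DNS resolution", the idea is that the IP address is encoded in the hostname itself and the DNS server simply echoes it back. The sketch below is a simplified model of that mapping, not the code of any of the services listed; the function name `ip_from_hostname` is hypothetical, and it assumes the dash-separated IPv4 form seen in names such as 8-8-8-8.my.local-ip.co.

```python
import ipaddress
from typing import Optional


def ip_from_hostname(hostname: str, zone: str) -> Optional[str]:
    """Extract the IPv4 address embedded in a name like 10-0-0-1.<zone> (hypothetical helper)."""
    hostname = hostname.rstrip(".").lower()
    zone = zone.strip(".").lower()
    suffix = "." + zone
    if not hostname.endswith(suffix):
        return None  # not a name inside the wildcard zone
    label = hostname[: -len(suffix)]
    candidate = label.replace("-", ".")
    try:
        # IPv4Address validates octet count and range for us.
        return str(ipaddress.IPv4Address(candidate))
    except ipaddress.AddressValueError:
        return None
```

With this model, `ip_from_hostname("8-8-8-8.my.local-ip.co", "my.local-ip.co")` yields `"8.8.8.8"`, while a name that does not encode an address yields `None`. A valid TLS certificate is a separate concern: it requires a wildcard cert for the zone, which is the part several of the services above do not provide.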

Related articles

@mrjones-plip
Contributor Author

Two months ago I was in touch with Quad9 thinking that the failure to resolve local-ip.co was with their servers. This is what I got back:

Hello,

This is a bit of a strange one.

When I query the authoritative nameservers for the NS record of my.local-ip.co, it returns:
ns1.local-ip.co

When I query ns1.local-ip.co for the NS record of my.local-ip.co, it returns:
ns-1.my.local-ip.co

When I query either ns-1.my.local-ip.co or ns1.local-ip.co for the A record of itself or the other delegated nameserver, I get a REFUSED response.

DNSViz sort of shows the issue

I'm not entirely sure why it resolves sometimes but not others; I am unable to replicate the issue when querying our Palo Alto or Miami servers directly, though I'm not surprised by the inability to resolve the FQDN. It's possible that other open DNS resolvers have built-in behavior to avoid failing due to this zone delegation issue, but Quad9 does not.

This seems to be a zone delegation issue which will need to be resolved by the DNS administrator. I'd recommend forwarding this information to them so they can take a look.

  $ whois local-ip.co | grep "Name Server"
  Name Server: dns18.ovh.net
  Name Server: ns18.ovh.net

  $ dig ns my.local-ip.co @ns18.ovh.net

  ; <<>> DiG 9.18.9 <<>> ns my.local-ip.co @ns18.ovh.net
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4247
  ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 2
  ;; WARNING: recursion requested but not available

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 4096
  ;; QUESTION SECTION:
  ;my.local-ip.co.			IN	NS

  ;; AUTHORITY SECTION:
  my.local-ip.co.		60	IN	NS	ns1.local-ip.co.

  ;; ADDITIONAL SECTION:
  ns1.local-ip.co.	60	IN	A	54.37.130.80

  ;; Query time: 16 msec
  ;; SERVER: 2001:41d0:1:198a::1#53(ns18.ovh.net) (UDP)
  ;; WHEN: Sun Dec 18 15:07:26 CET 2022
  ;; MSG SIZE  rcvd: 77

  $ dig ns1.local-ip.co @ns1.local-ip.co

  ; <<>> DiG 9.18.9 <<>> ns1.local-ip.co @ns1.local-ip.co
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 63297
  ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
  ;; WARNING: recursion requested but not available

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 1680
  ;; QUESTION SECTION:
  ;ns1.local-ip.co.		IN	A

  ;; Query time: 13 msec
  ;; SERVER: 54.37.130.80#53(ns1.local-ip.co) (UDP)
  ;; WHEN: Sun Dec 18 15:18:27 CET 2022
  ;; MSG SIZE  rcvd: 44

  $ dig a ns1-my.local-ip.co @ns-1.my.local-ip.co

  ; <<>> DiG 9.18.9 <<>> a ns1-my.local-ip.co @ns-1.my.local-ip.co
  ;; global options: +cmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 13262
  ;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1
  ;; WARNING: recursion requested but not available

  ;; OPT PSEUDOSECTION:
  ; EDNS: version: 0, flags:; udp: 1680
  ;; QUESTION SECTION:
  ;ns1-my.local-ip.co.		IN	A

  ;; Query time: 10 msec
  ;; SERVER: 54.37.130.80#53(ns-1.my.local-ip.co) (UDP)
  ;; WHEN: Sun Dec 18 15:19:14 CET 2022
  ;; MSG SIZE  rcvd: 47

@mrjones-plip
Contributor Author

mrjones-plip commented Feb 25, 2023

Try out MVP

If you want to test the current MVP without doing all the setup in "Install self hosted DNS, Web and Cert generator servers", follow these 3 simple steps:

  1. Check out the custom branch in cht-core: git checkout 8100-local-ip-co-replace
  2. Run CHT 4.x Docker Helper ./cht-docker-compose.sh to resume an old project or create a new one
    OR
    Run ./add-local-ip-certs-to-docker-4.x.sh on a currently running project
  3. Enjoy your project at YOUR-IP.local-ip.medicmobile.org!
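
As a small illustration of the last step, the hostname can be derived from the machine's IP. This sketch assumes the service accepts the dash-separated form used by local-ip.co (for example 192-168-1-5); the function name `dev_url` and the `port` parameter are hypothetical and not part of the scripts above.

```python
import ipaddress


def dev_url(ip: str, zone: str = "local-ip.medicmobile.org", port: int = 443) -> str:
    """Build the HTTPS URL for a development instance (hypothetical helper).

    Assumes the dash-separated IPv4 convention, e.g. 192-168-1-5.<zone>.
    """
    addr = ipaddress.IPv4Address(ip)  # raises AddressValueError on bad input
    host = f"{str(addr).replace('.', '-')}.{zone}"
    # Omit the port when it is the HTTPS default.
    return f"https://{host}" if port == 443 else f"https://{host}:{port}"
```

For a machine at 192.168.1.5 this gives `https://192-168-1-5.local-ip.medicmobile.org`.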

Steps for self hosted MVP

Content of this comment moved here.

@dianabarsan
Member

dianabarsan commented Feb 27, 2023

I have some insight into why the local-ip.co DNS failures happened:

First, reproducing a failed request:

andrei@DESKTOP-V5IHRLP:~$ dig @8.8.8.8 8-8-8-8.my.local-ip.co

; <<>> DiG 9.16.1-Ubuntu <<>> @8.8.8.8 8-8-8-8.my.local-ip.co
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 25973
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;8-8-8-8.my.local-ip.co.                IN      A

;; Query time: 1120 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sun Feb 26 21:34:40 EET 2023
;; MSG SIZE  rcvd: 51

After 1120 msec (1.1 seconds), the query returns an error (SERVFAIL status) and doesn't return any results.

Extrapolating from how nip.io uses a custom PipeBackend written in Python, there's a high chance that local-ip could also be doing the same, and this custom backend is failing somehow, producing the SERVFAIL.

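Custom backend failures, coupled with the very low TTL (10 seconds), produce the failures: with such a short TTL, resolvers must go back to the backend almost constantly, so any intermittent backend error quickly reaches users.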
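
To make the TTL effect concrete, here is a rough back-of-the-envelope model. It is a sketch under loud assumptions: one upstream lookup per TTL expiry, each lookup failing independently with the same probability, and no caching of failures. The failure rate used in the example is made up for illustration; the real rate for local-ip.co is not known.

```python
import math


def upstream_lookups(session_seconds: float, ttl_seconds: float) -> int:
    """Number of times a resolver must re-ask the backend during a session."""
    return max(1, math.ceil(session_seconds / ttl_seconds))


def p_at_least_one_failure(p_fail: float, session_seconds: float, ttl_seconds: float) -> float:
    """Probability that at least one lookup fails, assuming independent failures."""
    n = upstream_lookups(session_seconds, ttl_seconds)
    return 1.0 - (1.0 - p_fail) ** n
```

Under these assumptions, a one-hour session with a 10 second TTL needs 360 upstream lookups. With a hypothetical 1% failure rate per lookup, the chance of hitting at least one failure is about 0.97, versus 0.01 if the TTL were one hour.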

Example of a successful query:

[diana@doina ~]$ dig @8.8.8.8 8-8-8-8.my.local-ip.co

; <<>> DiG 9.18.11 <<>> @8.8.8.8 8-8-8-8.my.local-ip.co
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42000
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;8-8-8-8.my.local-ip.co.                IN      A

;; ANSWER SECTION:
8-8-8-8.my.local-ip.co. 10      IN      A       8.8.8.8

;; Query time: 129 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Mon Feb 27 09:35:34 EET 2023
;; MSG SIZE  rcvd: 67
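
When collecting many samples like the two outputs above, it can help to reduce each one to its status and answer records. The following is a minimal sketch that parses the text format shown in this thread; the function name `summarize_dig` is hypothetical, and it only handles A records in the ANSWER section.

```python
import re
from typing import List, Optional, Tuple

STATUS_RE = re.compile(r"status:\s*([A-Z]+)")
ANSWER_RE = re.compile(r"^(\S+)\s+(\d+)\s+IN\s+A\s+(\S+)\s*$")


def summarize_dig(output: str) -> Tuple[Optional[str], List[Tuple[str, int, str]]]:
    """Return (status, [(name, ttl, address), ...]) from dig's text output."""
    status: Optional[str] = None
    answers: List[Tuple[str, int, str]] = []
    in_answer = False
    for raw in output.splitlines():
        line = raw.strip()
        if status is None:
            match = STATUS_RE.search(line)
            if match:
                status = match.group(1)
        if line.startswith(";; ANSWER SECTION"):
            in_answer = True
            continue
        if in_answer:
            # A blank line or a new comment line ends the ANSWER section.
            if not line or line.startswith(";"):
                in_answer = False
                continue
            match = ANSWER_RE.match(line)
            if match:
                answers.append((match.group(1), int(match.group(2)), match.group(3)))
    return status, answers
```

Applied to the outputs above, the failed query reduces to a `SERVFAIL` status with no answers, and the successful one to `NOERROR` with a single A record carrying the 10 second TTL.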

Thank you @andreibacos for your assistance in debugging this.

@mrjones-plip
Contributor Author

That's very interesting - thanks @dianabarsan (and also thanks @andrablaj !) for the details on this. Given this ticket uses sslip.io's Go-based resolver, we should be safe. If either of you wants to give it a try, I welcome any feedback (see "Try out MVP" above).

@andrablaj
Member

Correct me if I am wrong @mrjones-plip, but I didn't contribute to this ticket 😄. Did you mean to tag someone else? I wouldn't like for them to miss your thank you note. 😊

@mrjones-plip
Contributor Author

aha - you're right @andrablaj 😅 ! I meant to thank @andreibacos instead.

Thanks @andreibacos ;)

@mrjones-plip
Contributor Author

mrjones-plip commented Apr 16, 2023

Content of this comment moved here

mrjones-plip added a commit that referenced this issue Apr 21, 2023
* update install certs script to use new local-ip service per #8100

* edit nginx config to work around ssl_error_no_cypher_overlap error

* cut over to using medic hosted local-ip, stop munging nginx conf

* update add cert 4.x script to work w/ medicmobile

* use correct URL, use more gentle nginx app restart instead of whole container restart

* use correct URL for fullchain per feedback
@latin-panda latin-panda modified the milestone: 4.2.0 May 12, 2023
@latin-panda
Contributor

@mrjones-plip is this still in progress? I see a commit already merged

@mrjones-plip
Contributor Author

I think as far as CHT Core is concerned we have:

we can close this! Thanks for the ping


4 participants