DNS SRV lookup fails with <nil> is not a valid SRV record #632

Closed
jagregory opened this Issue Apr 9, 2015 · 3 comments

jagregory commented Apr 9, 2015

I'm trying to use DNS service discovery (SD) for jobs, as per a quick chat with @juliusv on Twitter, but I'm not getting very far with it. All I see printed in the log is:

target_provider.go:117] <nil> is not a valid SRV record

I've been testing it against a local dnsmasq setup, but I've also tried it against a real DNS server on one of my public domains too.

My job looks like this:

job: {
  name: "example-random"
  scrape_interval: "5s"
  sd_name: "_sd._tcp.sd.calendars.io"
  metrics_path: "/metrics"
}

Running dig SRV _sd._tcp.sd.calendars.io responds with:

; <<>> DiG 9.8.3-P1 <<>> SRV _sd._tcp.sd.calendars.io
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 14933
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_sd._tcp.sd.calendars.io.  IN  SRV

;; ANSWER SECTION:
_sd._tcp.sd.calendars.io. 59  IN  SRV 1 1 8080 test.host.

;; Query time: 203 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Fri Apr 10 07:49:24 2015
;; MSG SIZE  rcvd: 71

Tailing my DNS logs shows that something (I assume Prometheus) queries for an A record for _sd._tcp.sd.calendars.io but never seems to query for an SRV record.

Any ideas? This has got me pretty stumped.

Member

fabxc commented Apr 10, 2015

Prometheus definitely queries for an SRV record - it actually boils down to https://gist.github.com/fabxc/25bc3af953d13338388e, which works fine for me.

It also works on my local Prometheus.
Unfortunately, this leaves me without an answer, except that the issue probably does not lie within Prometheus.

You could run the minimal example above (with the DNS server you are using) and report back whether that works.

Author

jagregory commented Apr 10, 2015

Thanks for the reply. I tried your snippet locally and it worked as expected. I've figured it out: as you suspected, it's not Prometheus. Explanation below.

I wondered what was different between running your code myself and how Prometheus runs the same code. The difference is that Prometheus is in Docker, and Docker is in VirtualBox (because I'm using boot2docker).

I SSH'd onto the boot2docker VM and ran the code you gave again and, boom, it blew up in much the same way as Prometheus. Interestingly, no matter what query you sent to the DNS, it always responded with an A record, which is why the Prometheus code was logging the <nil>. It was receiving a Result, but it wasn't a *dns.SRV; it was a *dns.A.
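One plausible way a `<nil>` ends up in such a log line is a failed type assertion whose zero-value result gets formatted. The sketch below is an assumption for illustration, not the actual Prometheus source: `RR`, `A` and `SRV` are hypothetical stand-ins for the DNS library's record types, and `uncheckedMessage` and `srvTarget` are made-up names.

```go
package main

import "fmt"

// RR, A and SRV are hypothetical stand-ins for a DNS library's record types.
type RR interface{ Name() string }

type A struct{ Host, IP string }

type SRV struct {
	Host, Target string
	Port         uint16
}

func (a *A) Name() string   { return a.Host }
func (s *SRV) Name() string { return s.Host }

// uncheckedMessage formats the result of the failed assertion rather than
// the record itself. When rr is not a *SRV, srv is a nil *SRV, and a nil
// pointer formats as "<nil>".
func uncheckedMessage(rr RR) string {
	srv, ok := rr.(*SRV)
	if !ok {
		return fmt.Sprintf("%v is not a valid SRV record", srv)
	}
	return ""
}

// srvTarget reports the record's concrete type on failure, which would
// have pointed straight at the unexpected A record.
func srvTarget(rr RR) (string, error) {
	srv, ok := rr.(*SRV)
	if !ok {
		return "", fmt.Errorf("%s: got %T, want *SRV", rr.Name(), rr)
	}
	return fmt.Sprintf("%s:%d", srv.Target, srv.Port), nil
}

func main() {
	// An A record arriving where an SRV record was expected.
	answer := &A{Host: "_sd._tcp.sd.calendars.io.", IP: "10.0.0.1"}
	fmt.Println(uncheckedMessage(answer))
	_, err := srvTarget(answer)
	fmt.Println(err)
}
```

Including the concrete type (`%T`) in the error makes this kind of resolver misbehaviour much quicker to spot than a bare `<nil>`.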

I dug a bit deeper and found that my VM had --natdnshostresolver1 on, which was there to forward DNS resolution to my host, but for whatever reason that seems to assume A record queries only. Switching natdnshostresolver1 off and setting my host's DNS to always route via dnsmasq seems to have worked, but it's a little weird.

Either way, it's not a Prometheus issue. Thanks for the pointers.

lock bot commented Mar 24, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Mar 24, 2019
