Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added dns sd to query and rule #575

Merged
merged 16 commits into from
Nov 27, 2018
Merged

Added dns sd to query and rule #575

merged 16 commits into from
Nov 27, 2018

Conversation

ivan-valkov
Copy link

@ivan-valkov ivan-valkov commented Oct 15, 2018

ref #492

TODO:

  • Extract dns sd to separate package
  • Write tests for dns sd
  • Add dns sd to ruler and query
  • Resolve dns periodically and not on every query (add provider/cache)
  • Add a flag for the dns resolution interval
  • Clarify assumptions (default port/failure behaviour)
  • Update docs explaining how dns sd can be used in static flags and file sd PR here - Documented DNS SD and added to the changelog #613

Changes

Extracted the DNS SD that was used only for alertmanagers inside ruler into a separate package. Added it to query and rule. This allows for any address provided via static flag or file sd to be a domain name that can be resolved. You can add dns+ or dnssrv+ in front of the address to get it resolved as a A/AAAA or SRV DNS lookup respectively.

Verification

Added unit tests to the DNS Resolver and Provider.

Any ideas how to do an e2e test using some sort of actual dns configuration?

@ivan-valkov ivan-valkov self-assigned this Oct 15, 2018
@ivan-valkov ivan-valkov changed the title Added dns sd to query and rule [WIP] Added dns sd to query and rule Oct 15, 2018
pkg/dns/dns.go Outdated
hosts = append(hosts, net.JoinHostPort(ip.String(), port))
}
case "dnssrv":
_, recs, err := s.resolver.LookupSRV(ctx, "", proto, host)
Copy link
Member

@jojohappy jojohappy Oct 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious why did we left empty the service for SRV record? It will be looked up as _._proto.name. The standard form for the SRV record is _service._proto.name.

Copy link
Member

@jojohappy jojohappy Oct 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the Kubernetes, we can create a headless service to using the SRV record with the form _my-port-name._my-port-protocol.my-svc.my-namespace.svc.cluster.local, in this case, we can not use the SRV record to discover the hosts in Thanos. Of course, we should use dns+ to look up the hosts for headless service in Kubernetes. Just curious.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed on slack, this seems to be broken. Will fix it in future commits.

return runutil.Repeat(dnsSDInterval, ctx.Done(), func() error {
addresses := append(fileSDCache.Addresses(), storeAddrs...)
// TODO(ivan): default port... Use a flag maybe?
if err := dnsProvider.Resolve(ctx, addresses, 9090); err != nil {
Copy link
Author

@ivan-valkov ivan-valkov Oct 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should be the defaultPort here? Should it be exposed as a flag?
The defaultPort is used when a non SRV lookup is made if the port is not specified. Same question for rule.go

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just use a variable to represent the default port of store address. User can be easy to set the --grpc-address for the store components.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no default. We should fail if port is not specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And potentially have metric for failed resolutions and incorrect filesd entries.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The default port is only used for alertmanagers right now. The resolver shouldn't care about it. I am moving this logic to the alertmanagerSet update function. I will also add metrics.

@ivan-valkov
Copy link
Author

@bwplotka @domgreen @jojohappy This PR is ready for review

@ivan-valkov ivan-valkov changed the title [WIP] Added dns sd to query and rule Added dns sd to query and rule Oct 23, 2018
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this looks great. Some suggestions after first review.

cmd/thanos/query.go Show resolved Hide resolved
cmd/thanos/query.go Show resolved Hide resolved
return runutil.Repeat(dnsSDInterval, ctx.Done(), func() error {
addresses := append(fileSDCache.Addresses(), storeAddrs...)
// TODO(ivan): default port... Use a flag maybe?
if err := dnsProvider.Resolve(ctx, addresses, 9090); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no default. We should fail if port is not specified.

return runutil.Repeat(dnsSDInterval, ctx.Done(), func() error {
addresses := append(fileSDCache.Addresses(), storeAddrs...)
// TODO(ivan): default port... Use a flag maybe?
if err := dnsProvider.Resolve(ctx, addresses, 9090); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And potentially have metric for failed resolutions and incorrect filesd entries.

cmd/thanos/rule.go Outdated Show resolved Hide resolved
cmd/thanos/rule.go Outdated Show resolved Hide resolved
pkg/discovery/dns/provider.go Show resolved Hide resolved
pkg/discovery/dns/provider_test.go Show resolved Hide resolved
pkg/discovery/dns/provider.go Outdated Show resolved Hide resolved
pkg/discovery/dns/provider.go Outdated Show resolved Hide resolved
cmd/thanos/query.go Show resolved Hide resolved
cmd/thanos/query.go Outdated Show resolved Hide resolved
cmd/thanos/query.go Outdated Show resolved Hide resolved
cmd/thanos/query.go Show resolved Hide resolved
cmd/thanos/rule.go Outdated Show resolved Hide resolved
pkg/discovery/dns/provider.go Outdated Show resolved Hide resolved
pkg/discovery/dns/provider.go Outdated Show resolved Hide resolved
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, small things only.

pkg/discovery/dns/resolver_test.go Outdated Show resolved Hide resolved
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks legit. Last touches I hope!

pkg/discovery/dns/provider.go Outdated Show resolved Hide resolved
pkg/discovery/dns/resolver.go Outdated Show resolved Hide resolved
pkg/discovery/dns/resolver.go Show resolved Hide resolved
@ivan-valkov ivan-valkov merged commit f02c096 into master Nov 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants