[WIP] dns: add support for SRV records in DNS lookup #6379

Closed
venilnoronha wants to merge 23 commits

venilnoronha
Member

@venilnoronha venilnoronha commented Mar 26, 2019

Description: This adds support for SRV records in DNS lookup by introducing a new SrvInstance type which holds a regular Address::Instance object along with priority and weight information.
Risk Level: Med
Testing: Pending
Docs Changes: Pending
Release Notes: Pending
Fixes #125
Related #517

This PR is currently a WIP. Wiring, configuration, tests, documentation, etc. will be added once the implementation looks okay.
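
As a rough illustration of the type described above, a SrvInstance-like wrapper could look like the sketch below. The `Instance` struct is a simplified, hypothetical stand-in for Address::Instance, and the accessor names are assumptions; the class in this PR may differ. The 16-bit widths match the priority and weight fields of an SRV record.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for Envoy's Address::Instance.
struct Instance {
  std::string ip;
  uint32_t port;
};
using InstanceConstSharedPtr = std::shared_ptr<const Instance>;

// Sketch of an SRV-aware wrapper: a resolved address plus the priority and
// weight carried by the SRV record (both are 16-bit fields).
class SrvInstance {
public:
  SrvInstance(InstanceConstSharedPtr address, uint16_t priority, uint16_t weight)
      : address_(std::move(address)), priority_(priority), weight_(weight) {}

  const InstanceConstSharedPtr& address() const { return address_; }
  uint16_t priority() const { return priority_; }
  uint16_t weight() const { return weight_; }

private:
  InstanceConstSharedPtr address_;
  uint16_t priority_;
  uint16_t weight_;
};
```

Holding the address by shared pointer keeps the wrapper cheap to copy, so the same resolved address can be shared by several records.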

/cc @mattklein123 @htuch

Member

@htuch htuch left a comment

At a high level, this PR and its approach look good, thanks for the contribution. What I'd like to see is some more refactoring in dns_impl.cc to avoid duplicating some of the admittedly sketchy code that is already there, e.g. the exception handling in cares, the self deletion, etc. The work on markForDestruction() that you've done already is a good example of this :)

/wait

include/envoy/network/address.h (outdated; resolved)
@stale

stale bot commented Apr 4, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the "stale" label Apr 4, 2019
@venilnoronha
Member Author

WIP

@stale stale bot removed the "stale" label Apr 4, 2019
@stale

stale bot commented Apr 11, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the "stale" label Apr 11, 2019
@venilnoronha
Member Author

Still WIP

@stale stale bot removed the "stale" label Apr 12, 2019
@stale

stale bot commented Apr 19, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the "stale" label Apr 19, 2019
@venilnoronha
Member Author

.

@stale stale bot removed the "stale" label Apr 24, 2019
@venilnoronha
Member Author

/wait

@venilnoronha
Member Author

/wait

@stale

stale bot commented May 17, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the "stale" label May 17, 2019
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
This implements a DNS SRV resolver named envoy.srv and handles its
dynamic registration on server startup.

Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
@stale stale bot removed the "stale" label Jul 17, 2019
Signed-off-by: Venil Noronha <veniln@vmware.com>
@htuch htuch added the "no stalebot" label Jul 18, 2019
Signed-off-by: Venil Noronha <veniln@vmware.com>
Signed-off-by: Venil Noronha <veniln@vmware.com>
@mattklein123 mattklein123 removed the "no stalebot" label Jul 28, 2019
Member

@htuch htuch left a comment

A few more comments on the current PR. What's the status of this? Is it still WIP?
/wait

});
dns_timeout->enableTimer(std::chrono::seconds(5));

latch.Wait();
Member

I think what you're encountering here is a limitation of Envoy resolvers as they exist today: they are synchronous. If you want, a separate PR to add async resolver plugin support would be welcome.

* @return if non-null, a handle that can be used to cancel the resolution.
* This is only valid until the invocation of callback or ~DnsResolver().
*/
virtual ActiveDnsQuery* resolveSrv(const std::string& dns_name, DnsLookupFamily dns_lookup_family,
Member

I know the existing API does it this way, but I'd be interested if we could make ActiveDnsQuery RAII.

Would you be willing to explain what you mean here more? The ActiveDnsQuery abstract class only has one method and no data members. The PendingResolution struct is derived from ActiveDnsQuery and contains data, is that what you want to be RAII? Do you know why a struct was used instead of a class? Was it just to avoid another class definition?

Member

I think I was suggesting that the returned ActiveDnsQuery be a std::unique_ptr. The idea is that if this returned object is then destructed, it gives you automagic cancellation of the request.

Contributor

@htuch Are you saying it's better to return a std::unique_ptr<ActiveDnsQuery> instead of ActiveDnsQuery*? I'm not quite sure how it actually cancels the pending dns request. Mind elaborating a bit more? (Sorry if the answer is obvious as I'm pretty new to C++ :) )

Member

Right, return std::unique_ptr<ActiveDnsQuery>. The destructor for ActiveDnsQuery can then perform the cancellation.
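
A minimal sketch of the RAII idea discussed in this thread, under stated assumptions: the `ActiveDnsQuery` class, `markCompleted()`, and `FakeResolver` below are hypothetical stand-ins written for illustration, not Envoy's actual interfaces. The point is only that a handle returned as a std::unique_ptr can cancel the pending query from its destructor.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for ActiveDnsQuery; the real interface differs.
class ActiveDnsQuery {
public:
  explicit ActiveDnsQuery(std::function<void()> cancel_fn)
      : cancel_fn_(std::move(cancel_fn)) {}
  ActiveDnsQuery(const ActiveDnsQuery&) = delete;
  ActiveDnsQuery& operator=(const ActiveDnsQuery&) = delete;

  // Destroying the handle cancels the query unless it already completed.
  ~ActiveDnsQuery() {
    if (!completed_ && cancel_fn_) {
      cancel_fn_();
    }
  }

  // Assumed hook: the resolver calls this once the callback has fired.
  void markCompleted() { completed_ = true; }

private:
  std::function<void()> cancel_fn_;
  bool completed_{false};
};

using ActiveDnsQueryPtr = std::unique_ptr<ActiveDnsQuery>;

// Hypothetical resolver that only counts cancellations. It must outlive the
// handles it returns, because the cancel hook captures `this`.
class FakeResolver {
public:
  ActiveDnsQueryPtr resolveSrv(const std::string& /*dns_name*/) {
    return std::make_unique<ActiveDnsQuery>([this]() { ++cancelled_; });
  }
  int cancelled() const { return cancelled_; }

private:
  int cancelled_{0};
};
```

With this shape, a caller that lets the returned pointer go out of scope before the callback fires cancels the query implicitly, and a handle marked completed does nothing on destruction.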

void DnsResolverImpl::PendingSrvResolution::onAresSrvFinishCallback(
std::list<DnsResponse>&& srv_records) {
if (!srv_records.empty()) {
completed_ = true;
Member

Isn't this true regardless of the length of the srv_records? I.e. the query is over, we're not waiting any longer?
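
One way to read this suggestion, sketched with hypothetical stand-in types (the real DnsResponse and PendingSrvResolution are different, and whether the callback should also fire for an empty result is an assumption here):

```cpp
#include <functional>
#include <list>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for DnsResponse.
struct DnsResponse {
  std::string address;
};

// Hypothetical, simplified stand-in for PendingSrvResolution.
struct PendingSrvResolution {
  std::function<void(std::list<DnsResponse>&&)> callback_;
  bool completed_{false};

  void onAresSrvFinishCallback(std::list<DnsResponse>&& srv_records) {
    // The query is over whether or not any records came back, so the flag is
    // set unconditionally instead of only when the list is non-empty.
    completed_ = true;
    // Assumption: the caller is notified even when the result is empty.
    callback_(std::move(srv_records));
  }
};
```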

for (auto instance = response.begin(); instance != response.end(); ++instance) {
Address::InstanceConstSharedPtr inst_with_port(
Utility::getAddressWithPort(*instance->address_, current_reply->port));
mutex->lock();
Member

Why are we taking locks? Isn't everything thread local on this dispatcher?

Are threads and dispatchers always coupled? I think so, but I'm not 100% sure. So any operations happening through a particular dispatcher will operate on the same thread?

Member

For the purpose of DNS resolution, this should be true (that they are coupled).

}
});
}
replies_parsed = true;
Member

Do we need an explicit replies_parsed variable here, or can the control logic be made more explicit?

@stale

stale bot commented Aug 7, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the "stale" label Aug 7, 2019
@stale

stale bot commented Aug 15, 2019

This pull request has been automatically closed because it has not had activity in the last 14 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot closed this Aug 15, 2019
@cottonbeckfield

Is this still actually a WIP, or has it really been closed out? It seems like activity just dropped off.

@venilnoronha
Member Author

I've been a little busy, so I couldn't finish it. I may pick it up later once I have more time.

Labels
stale stalebot believes this issue/PR has not been touched recently waiting
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SRV service discovery support
7 participants