
Generate SRV records from Services to respect K8s DNS spec #1330

Closed

Conversation


@weihanglo weihanglo commented Dec 24, 2019

What

Generates SRV records from Services to respect the Kubernetes DNS specification.

Thanks to #559, ExternalDNS already has a basic implementation that generates SRV records from NodePort services. However, it is still far from the Kubernetes DNS specification. This pull request aims to implement the full portion of the spec that concerns SRV records.

How

Simply follow the spec. There are two common rules for all service types:

  • Unnamed ports are ignored.
  • Each named port generates one corresponding SRV record.

Below are the type-specific rules (a sketch combining all of the rules follows this list):

  • NodePort/LoadBalancer
    • Use Service.spec.ports[].nodePort as the port on the target host.
  • ClusterIP with a valid cluster IP
    • Use Service.spec.ports[].port as the port on the target host.
  • Headless Service
    • The headless Service itself does not generate SRV records.
    • Each endpoint of a headless Service, typically a Pod, generates its own SRV record. That is to say, M named ports combined with N selected Pods yield M × N SRV records.
    • The domain name of each target host is the corresponding endpoint hostname returned when querying the A records of the headless Service.
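
For illustration, here is a minimal sketch of how these rules might combine. It is not the actual code in this PR: the SRVRecord struct, the srvRecordsForService function, and the endpointHosts argument are all assumptions made for the sketch, and the cluster domain is hard-coded to cluster.local.

```go
package source

import (
	"fmt"
	"strings"

	v1 "k8s.io/api/core/v1"
)

// SRVRecord is a hypothetical record shape used only for this sketch.
type SRVRecord struct {
	Name   string // e.g. _http._tcp.my-svc.my-ns.svc.cluster.local
	Port   int32  // port on the target host
	Target string // target hostname
}

// srvRecordsForService applies the rules above: skip unnamed ports, pick the
// exposed port by Service type, and fan out per endpoint for headless Services.
func srvRecordsForService(svc *v1.Service, endpointHosts []string) []SRVRecord {
	var records []SRVRecord
	for _, port := range svc.Spec.Ports {
		if port.Name == "" {
			continue // common rule: unnamed ports are ignored
		}
		// SRV owner name per the K8s DNS spec: _port._protocol.service.namespace...
		name := fmt.Sprintf("_%s._%s.%s.%s.svc.cluster.local",
			port.Name, strings.ToLower(string(port.Protocol)), svc.Name, svc.Namespace)
		svcHost := fmt.Sprintf("%s.%s.svc.cluster.local", svc.Name, svc.Namespace)

		switch {
		case svc.Spec.ClusterIP == v1.ClusterIPNone:
			// Headless: one record per selected endpoint, i.e. M named ports x N Pods.
			for _, host := range endpointHosts {
				records = append(records, SRVRecord{Name: name, Port: port.Port, Target: host})
			}
		case svc.Spec.Type == v1.ServiceTypeNodePort,
			svc.Spec.Type == v1.ServiceTypeLoadBalancer:
			records = append(records, SRVRecord{Name: name, Port: port.NodePort, Target: svcHost})
		default:
			// ClusterIP with a valid cluster IP.
			records = append(records, SRVRecord{Name: name, Port: port.Port, Target: svcHost})
		}
	}
	return records
}
```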

Future works

If all of these changes look valid, I will take care of more tasks in follow-up PRs (or push more commits here if we are comfortable with bloating this PR 😂).

  • Update plan/plan.go#filterRecordsForPlan to filter in SRV records.
  • Add more tests to plan/plan_test.go for SRV records.
  • Update the headless Service documentation for generating SRV records.
  • Make sure each provider can generate correct SRV records.
    • A known issue is that Google Cloud DNS returns 400 Bad Request if the target hostname in an SRV record is not an FQDN.
    • ⚠️ [Discussion needed] I've already written a workaround that ensures the target hostname of supported record types is an FQDN (sketched below). However, the current logic applies ensureTrailingDot on the provider side. I suppose it would be more appropriate to do that at the end of each source's Endpoints() function.
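
A minimal sketch of that workaround, assuming a plain string hostname (the function name mirrors the existing ensureTrailingDot logic, but this is illustrative and not necessarily the exact helper in this PR):

```go
package provider

import (
	"net"
	"strings"
)

// ensureTrailingDot appends a trailing dot so the hostname is an FQDN, while
// leaving IP addresses untouched (an IP must not receive a trailing dot).
func ensureTrailingDot(hostname string) string {
	if net.ParseIP(hostname) != nil {
		return hostname
	}
	return strings.TrimSuffix(hostname, ".") + "."
}
```

The open question is only where to call it: once per provider, or once at the end of each source's Endpoints().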

References

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 24, 2019
@k8s-ci-robot
Contributor

Welcome @weihanglo!

It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 24, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: weihanglo
To complete the pull request process, please assign njuettner
You can assign the PR to them by writing /assign @njuettner in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@weihanglo
Author

/assign @njuettner

@weihanglo
Author

@Raffo Sorry for bothering you. I just want to make sure this PR is queued for review. Thanks.

@JoaoBraveCoding
Contributor

@weihanglo @Raffo I'm also interested in this pull request since I was about to open a pull request to add support for SRV records with provider rfc2136 and source CRD.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 10, 2020
@weihanglo
Author

Hi, sorry for pinging you again. Will this PR be reviewed after I resolve the conflicts?
@njuettner @linki

@weihanglo
Author

@Raffo @hjacobs would you have time to take a look at this? Let me know whether this PR should move forward or not. Thanks!

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 4, 2020
@Raffo
Contributor

Raffo commented Jul 8, 2020

Sorry @weihanglo we totally missed that, apologies. Can you please resolve the conflict so that I can proceed to a review?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2020
@weihanglo
Author

> Sorry @weihanglo we totally missed that, apologies. Can you please resolve the conflict so that I can proceed to a review?

@Raffo Ok. I'll resolve this ASAP.

Signed-off-by: Weihang Lo <me@weihanglo.tw>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2020
@weihanglo
Author

@Raffo
Updated. Please take a look. Thanks 😀

@seanmalloy
Member

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 19, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2020
@weihanglo
Author

@seanmalloy
Thanks for adding the label. This PR has been updated to follow the current master.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 18, 2020
@k8s-ci-robot
Contributor

@weihanglo: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@seanmalloy
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2020
@arianvp

arianvp commented Nov 26, 2020

One thing I noticed that external-dns does differently from the Kubernetes internal DNS: with a headless Service + StatefulSet in k8s, when a Pod goes NotReady it is ONLY removed from the SRV record, but the A record for the Pod remains. This is important because the A record is the stable identity used in e.g. distributed databases.

However, external-dns removes my A record my-pod-0.myservice.mydomain.example as soon as pod-0 goes NotReady, which is very different behavior. It is also very unwanted: databases often go unhealthy precisely because they can't reach their peers, and if that in turn makes their stable identity disappear, the problem only worsens.

When you gracefully terminate a pod in a distributed database, you don't want its identity to disappear before the pod is gone. So the A record should stick around until the pod is properly gone, not merely NotReady.

It wasn't very clear to me whether this PR addresses this issue as well, but it would be great if it did.

Comment on lines +561 to +571
if port.Name == "" {
	// Unnamed ports do not have a SRV record.
	continue
}

var exposedPort int32
switch svc.Spec.Type {
case v1.ServiceTypeLoadBalancer:
	fallthrough
case v1.ServiceTypeNodePort:
	exposedPort = port.NodePort
Contributor

I am afraid this won't work for the NodePort type: K8s uses unnamed ports when exposing NodePort services.

> If you set the type field to NodePort, the Kubernetes control plane allocates a port from a range specified by --service-node-port-range flag (default: 30000-32767).

https://kubernetes.io/docs/concepts/services-networking/service/#nodeport
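
For illustration, a single-port NodePort Service commonly leaves its port unnamed (this example Service is hypothetical, not taken from the PR), so the port.Name == "" guard above would skip it entirely:

```go
package source

import v1 "k8s.io/api/core/v1"

// A single-port NodePort Service with an unnamed port: the guard in the
// diff above skips it, so no SRV record would be generated for this Service.
var nodePortSvc = &v1.Service{
	Spec: v1.ServiceSpec{
		Type: v1.ServiceTypeNodePort,
		Ports: []v1.ServicePort{
			{Port: 80, NodePort: 30080}, // Name is empty
		},
	},
}
```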

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 9, 2021
@seanmalloy
Member

@weihanglo I know this PR has been open for a while now, but are you still interested in getting this merged? If yes then could you please resolve the conflicts?

We can remove the stale label too.

Thanks!

@weihanglo
Author

I would love to resolve the conflicts if this has a chance of getting merged. If the maintainers think this PR is reasonable to merge, please tell me. I am open to discussion 😃

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 10, 2021
@mindw

mindw commented May 10, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 10, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2021
@mindw

mindw commented Aug 8, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 6, 2021
@mindw

mindw commented Nov 8, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 8, 2021
@weihanglo
Author

Thanks for all your comments so far!
I am going to close this since I no longer use this feature. If someone wants to push it forward, feel free to take it over.

@weihanglo weihanglo closed this Dec 15, 2021