Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to use internal node IPs for NodePort services #10278

Merged
merged 13 commits into from
Feb 27, 2024

Conversation

jorisvergeer
Copy link
Contributor

@jorisvergeer jorisvergeer commented Dec 3, 2023

What does this PR do?

This PR causes traefik to use the ip adresses of the node insead of the pods when kubernetes services are of type NodePort. In this case it also uses the services nodePort instead of the port.

Motivation

I have a mixed cloud environment with some micro services running in k8s and some not. I run traefik externally in a separate instance. Using the kubernetesCRD provider, treafik is able to detect the services. The only problem is that the IP addresses that are detected are the internal pod ip adresses. And those are not accessible from the instance running traefik.

The services are accessible from the node ip when the service type is NodePort. In this case traefik should be able to access the services using the node ip and the nodeport of the service.

This feature would help me to assist in a gradual migration from native services to traefik and k8s.

More

  • Added/updated tests
  • Added/updated documentation

Additional Notes

This might have impact on users that accidently (or purposely) use nodeport as service type.

Is this a feature that will even be considered?

Can this feature also be merged into the 2.x branch?

@rtribotte
Copy link
Member

Hello @jorisvergeer,

Thanks for opening this!

We took a look at the changes and discussed the feature and we have some feedback.
The PR introduces a breaking behavior for nodePort Kubernetes services, as you mentioned, and we think it is needed to control it with an option. We are thinking about something like a nodePortLB option that would be added to the IngressRoute or as an annotation on the Kubernetes service, like the nativeLB option (which would be exclusive to this one).

We may need to discuss further the initial behavior or say the handling of NodePort services by Traefik in a proposal issue.

We also noticed that the new behavior is implemented only for the Kubernetes CRD provider, but it should also be available for the Kubernetes Ingress provider, could you please address that in your PR?

As a first-glance, review comment, we also noticed that in the Kubernetes client, it uses the service namespace to fetch node resources, but as nodes are not namespaced, this should not be the case.

@jorisvergeer
Copy link
Contributor Author

jorisvergeer commented Jan 31, 2024

Hi @rtribotte ,

I am trying to implement your suggestions. I think I got the "nodePortLB" working in the CRD in the code. But if I search in the repo for "nativeLB" it comes up at hundreds of places, and also in other route types, like UDP and TCP route kinds.

Also there are lots of copies of the CRD definition schema's where nativeLB is mentioned as well.

I'd like some guidance to where I can limit the scope of this PR.

@rtribotte
Copy link
Member

Hello @jorisvergeer,

Sure, for CRD manifests:

  • docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
  • docs/content/reference/dynamic-configuration/traefik.io_ingressroutes.yaml
  • docs/content/reference/dynamic-configuration/traefik.io_ingressroutetcps.yaml
  • docs/content/reference/dynamic-configuration/traefik.io_ingressrouteudps.yaml
  • docs/content/reference/dynamic-configuration/traefik.io_middlewares.yaml
  • docs/content/reference/dynamic-configuration/traefik.io_traefikservices.yaml
  • integration/fixtures/k8s/01-traefik-crd.yml

you need to run make generate-crd target that will update those for you.

For the documentation (kubernetes-crd.md and kubernetes-ingress.md), this is a manual step, you should document the new nodePortLB option, like it has been done for the nativeLB option.

@ldez ldez changed the base branch from v3.0 to v2.11 February 5, 2024 12:58
@ldez
Copy link
Member

ldez commented Feb 5, 2024

Before applying changes, the PR should be rebased on the branch v3.0 because the PR is based on v2.10 branch.

$ git remote -v         
origin  git@github.com:jorisvergeer/traefik.git (fetch)
origin  git@github.com:jorisvergeer/traefik.git (push)
upstream        git@github.com:traefik/traefik.git (fetch)
upstream        git@github.com:traefik/traefik.git (push)

$ fetch --multiple upstream origin
...
$ git switch feature_node_port_ip
...
$ git rebase --onto=upstream/v3.0 dae0491b6
...
# The conflicts should be fixed.

@ldez ldez changed the base branch from v2.11 to v2.10 February 5, 2024 13:06
@jorisvergeer jorisvergeer changed the base branch from v2.10 to v2.11 February 5, 2024 15:28
@jorisvergeer jorisvergeer changed the base branch from v2.11 to v3.0 February 5, 2024 15:30
@jorisvergeer
Copy link
Contributor Author

Sorry, for the changing branches a bit. I was testing changes based on 2.11 and 3.0 locally.

I merged my changes, originally build on 2.10 with 3.0 now.

I think the code is finished, currently testing it in my test environment. Expecting it to pass in the next hour or so.

@jorisvergeer
Copy link
Contributor Author

jorisvergeer commented Feb 5, 2024

@rtribotte @ldez
Works fine in my test environment, both ingress and ingressroute

@ldez
Copy link
Member

ldez commented Feb 5, 2024

Due to the multiple branch changes the CI is lost, I should close and re-open the PR.

Note: I recommend using rebase instead of merge to handle PR because it's easier to change the targeted branch and avoid mixing non-related commits.

@ldez ldez closed this Feb 5, 2024
@ldez ldez reopened this Feb 5, 2024
@jorisvergeer
Copy link
Contributor Author

jorisvergeer commented Feb 5, 2024

I see that I have to work on the unit tests.
They were OK when i was still on the 2.X branch.

@ldez
Copy link
Member

ldez commented Feb 5, 2024

I see that I have to work on the unit tests.

the problem is not unit tests but the code because the code of v2 is not the same as the v3

https://github.com/traefik/traefik/actions/runs/7788029566/job/21236555014?pr=10278

@jorisvergeer
Copy link
Contributor Author

I see that I have to work on the unit tests.

the problem is not unit tests but the code because the code of v2 is not the same as the v3

https://github.com/traefik/traefik/actions/runs/7788029566/job/21236555014?pr=10278

That's what i meant. The mocked GetNodes got lost i think.

@jorisvergeer
Copy link
Contributor Author

jorisvergeer commented Feb 5, 2024

Fixed locally, and took the liberty to add unit tests for NodePortLB as well.

Some unrelated docker unit tests failed locally. Not sure if that's a problem with my code.

@kevinpollet kevinpollet changed the base branch from v3.0 to master February 27, 2024 09:39
@kevinpollet kevinpollet removed their assignment Feb 27, 2024
@traefiker traefiker merged commit c1ef742 into traefik:master Feb 27, 2024
22 checks passed
@jorisvergeer jorisvergeer deleted the feature_node_port_ip branch February 27, 2024 12:42
@nmengin nmengin modified the milestones: 3.0, next May 2, 2024
@rtribotte rtribotte modified the milestones: next, 3.0 May 2, 2024
@deveshk0
Copy link

deveshk0 commented May 3, 2024

Hi Guys

this is not included in 3.0, any plans to add this to any version?

@jorisvergeer
Copy link
Contributor Author

@deveshk0 It's merged into master so it should be available in 3.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/provider/k8s/crd area/provider/k8s kind/enhancement a new or improved feature. priority/P2 need to be fixed in the future size/S
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

8 participants