Create new LoadBalancer svc instead of modifying the Juju ClusterIP svc #319
Comments
Renaming/rewording this bug with new information.
FYI, we're affected by this issue on our COS deployment. Currently, more or less every week we lose our Octavia LB and have to manually patch the service (remove the LB ID annotation), make a DNS PR to change the IP, and sometimes a FW PR as well.
Waiting for follow-up from the Juju team. @wallyworld
Hi everyone, thanks for providing the information. From the Juju team's perspective, I agree with the current description of this issue: the charm should not be modifying any resource made by Juju in the Kubernetes cluster. Juju operates in a reconciliation loop where it constantly drives the desired state back into the external system; by modifying our own resources we end up in a situation where we ping-pong the change around. The better approach would be for the charm to provision its own service. Juju will detect this and also clean up the service on behalf of the charm when the charm is removed from a controller. In an ideal world, we very much understand that we need to model ingress and load balancers in Juju, and that would give everyone the best of both worlds.
Juju used to allow an application to be deployed such that the type of k8s service created for it could be specified. Unfortunately, the transition to sidecar charms saw that capability go away. We have at various times internally discussed ideas around allowing resources created by Juju to be patched but, as Tom says, this is dangerous and would be a last resort that should be avoided if there's a viable alternative. We really do want to properly model ingress and other missing aspects of the network model offered by Juju, but that's a way off.
Bug Description
This bug report doesn't necessarily need a fix in Traefik, but there is some behaviour which could be changed to improve an underlying Juju bug.

Currently (seen in Juju 3.1.6), when a Traefik unit is unable to connect to a Juju controller, the `LoadBalancer` svc that Traefik created reverts to a `ClusterIP`, as mentioned in this PR. In one observed cloud (Openstack/PS6) this triggers the existing load balancer on the cloud to be deleted, and when the service is patched, returning the svc type from `ClusterIP` back to `LoadBalancer`, the LB in the cloud no longer exists.

The current fix for this in the above scenario is to manually run

```shell
kubectl patch svc cos-ingress -p '{"metadata":{"annotations":{"loadbalancer.openstack.org/load-balancer-id":null}}}'
```

and have Traefik request a new LB. Some relevant information on why this happens is available in this discussion, but quoting the relevant bits:

It seems, in light of the above, that the Traefik operator shouldn't be modifying the existing `ClusterIP` service that Juju creates, but rather create a separate `LoadBalancer` resource with the same selector as the one currently used. This would prevent the svc from being updated when Juju "resets" things.

To avoid the above manual operation, the operator could detect when the cloud is returning a 404 for the requested LB and clear the `load-balancer-id` annotation so that a new load balancer is automatically requested. Alternatively, the existing behaviour may be desirable, as it makes it easier to debug these issues on the cloud, which are out of the operator's control.

To Reproduce
WIP
Environment
Juju: v3.1.6
traefik-k8s: latest/stable 166
Relevant log output
Additional context
Relevant Juju bug https://bugs.launchpad.net/juju/+bug/2059411
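The two suggestions in the description could be sketched roughly as below. This is a minimal illustration, not the charm's actual code: the service name, namespace, selector, and ports are made-up assumptions, and since the cloud's 404 is not directly visible through the Kubernetes API, the "LB no longer exists" condition is approximated by the annotation being set while the service never received an ingress address.

```python
# Illustrative sketch only; names and labels are assumptions, not taken
# from the real deployment.
LB_ID_ANNOTATION = "loadbalancer.openstack.org/load-balancer-id"


def lb_service_manifest(name, namespace, selector, port, target_port):
    """Manifest for a separate LoadBalancer Service that reuses the selector
    of the Juju-created ClusterIP service instead of mutating that service."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "LoadBalancer",
            "selector": dict(selector),  # targets the same pods as the ClusterIP svc
            "ports": [{"port": port, "targetPort": target_port}],
        },
    }


def should_clear_lb_annotation(annotations, ingress):
    """Heuristic: the LB ID annotation is set but the cloud never assigned an
    ingress address, suggesting the recorded load balancer no longer exists."""
    return LB_ID_ANNOTATION in (annotations or {}) and not ingress


# Hypothetical usage: a recorded LB ID with an empty ingress list means the
# annotation should be cleared so a fresh load balancer is requested.
manifest = lb_service_manifest(
    "traefik-lb", "cos", {"app.kubernetes.io/name": "traefik"}, 80, 8080
)
stale = should_clear_lb_annotation({LB_ID_ANNOTATION: "abc123"}, [])
```

The manifest dict could be serialized to YAML and applied with `kubectl apply`, or created from the charm through a Kubernetes client library.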