Using different network for loadbalancer #1558

Closed
ayetkin opened this issue May 18, 2023 · 5 comments

Labels
kind/bug Categorizes issue or PR as related to a bug.
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ayetkin

ayetkin commented May 18, 2023

/kind bug

What steps did you take and what happened:
In our OpenStack environment, networks assigned to projects are announced via BGP, so we have direct access to the network we define for load balancers and instances. Therefore, in the OpenStackMachineTemplate.spec.template.spec.networks section we select the BGP-announced network with a network filter, so that we can reach the nodes directly without a bastion host or floating IPs. We give the same network in the OpenStackCluster.spec.network section. With this setup, however, the VIP that the load balancer receives is unfortunately not announced via BGP. We reported this to Octavia and Neutron by opening the following issues:

Octavia: https://storyboard.openstack.org/#!/story/2010758
Neutron: https://bugs.launchpad.net/neutron/+bug/2020001

To work around this on the Cluster API side, we give the load balancer a FIP network that we announce from L3 to the edge and set disableAPIServerFloatingIP: true. At this point the load balancer is created, but when the first control plane instance is added as a member, we get the following error:

Reconciling load balancer member failed: error create lbmember: Misssing input for argument [Address]

In further experiments we saw that whenever OpenStackCluster.spec.network and the network filter in OpenStackMachineTemplate.spec.template.spec.networks are not the same, we get this error.

We also tried to work around the BGP announcement problem we reported to Octavia and Neutron by having the load balancer VIP receive a FIP from the external network, but at that stage we got the following error:

External network 47906ae8-fb7f-4817-91db-7272174296ac is not reachable from subnet 3fcf3df4-0884-4af5-be8b-e38627afd3f5. Therefore, cannot associate Port 1eea4f1e-83da-4f56-bc1e-869e0ca09f08 with a Floating IP. Neutron server returns request_ids: ['req-ce9bd175-7740-4d1f-94ad-651898a0decd']

This error clearly says that, on the OpenStack side, the BGP network we gave to the instances and the external network we gave to the load balancer are not reachable from each other. Even if we connect the two networks through a router, we think the FIP assigned to the load balancer would still be unreachable because of the asymmetric route.

What did you expect to happen:
When two different networks are given to the instances and the load balancer on the CAPI OpenStack provider side, we think that the resulting Misssing input for argument [Address] error is a coding-related bug.
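
For reference, a minimal sketch of where a message of this shape can originate, assuming it comes from gophercloud's required-field validation firing because the member address ends up empty; the member name and port below are invented for illustration and this is not the actual CAPO code:

package main

import (
    "fmt"

    "github.com/gophercloud/gophercloud/openstack/loadbalancer/v2/pools"
)

func main() {
    // Hypothetical member options with an empty Address, i.e. what would be
    // sent if no instance IP was found on the expected cluster network.
    opts := pools.CreateMemberOpts{
        Name:         "devops-k8s-test-control-plane-0", // made-up name
        Address:      "",                                // no IP found
        ProtocolPort: 6443,
    }

    // Building the request body validates required fields, so an empty
    // Address is rejected before any API call is made.
    _, err := opts.ToMemberCreateMap()
    fmt.Println(err) // prints: Missing input for argument [Address]
}

If that is indeed the code path, the real problem would be that an empty address is passed through in the first place.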

OpenStackCluster

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackCluster
metadata:
  name: devops-k8s-test
spec:
  apiServerLoadBalancer:
    enabled: true
  cloudName: openstack
  dnsNameservers:
    - x.x.x.x
    - y.y.y.y
  externalNetworkId: xxxxxxxx-xxxxx-xxxxx-xxxxxx-xxxxx  # admin-fip-provider-net-01
  identityRef:
    kind: Secret
    name: "devops-k8s-test-cloud-config"
  managedSecurityGroups: false
  disableAPIServerFloatingIP: true
  network:
    name: admin-fip-provider-net-01

OpenStackMachineTemplate

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: devops-k8s-test-control-plane
spec:
  template:
    spec:
      cloudName: openstack
      flavor: capi-controlplane-default
      identityRef:
        kind: Secret
        name: devops-k8s-test-cloud-config
      image: ubuntu-2004-kube-v1.25.8
      sshKeyName: iacops
      securityGroups:
        - name: default
      networks:
        - filter:
            name: devopstest-k8s-net
      rootVolume:
        diskSize: 60
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha6
kind: OpenStackMachineTemplate
metadata:
  name: devops-k8s-test-worker-pool
spec:
  template:
    spec:
      cloudName: openstack
      flavor: capi-worker-small
      identityRef:
        kind: Secret
        name: devops-k8s-test-cloud-config
      image: ubuntu-2004-kube-v1.25.8
      sshKeyName: iacops
      securityGroups:
        - name: default
      networks:
        - filter:
            name: devopstest-k8s-net
      rootVolume:
        diskSize: 60

Environment:

  • Cluster API Provider OpenStack version: v0.7.1

  • Cluster-API version: v1.4.2

  • OpenStack version: Wallaby (cluster installed via kolla-ansible)

  • Kubernetes version (use kubectl version): v1.25.6

  • OS (e.g. from /etc/os-release): Ubuntu 20.04

  • OS Version: Ubuntu 20.04.2 LTS hosts (kernel: 5.4.0-90-generic)

  • Octavia Version: Wallaby - 8.0.1.dev35

  • Neutron Version: 18.1.2.dev118 [“neutron-server”, “neutron-dhcp-agent”, “neutron-openvswitch-agent”, “neutron-l3-agent”, “neutron-bgp-dragent”, “neutron-metadata-agent”]

  • There are 5 controller+network nodes.

  • Open vSwitch is used in DVR mode and router HA is disabled (l3_ha = false).

  • We are using a single centralized Neutron router to connect all tenant networks to the provider network.

  • We are using bgp_dragent to announce unique tenant networks.

  • Tenant network type: vxlan

  • External network type: vlan

@k8s-ci-robot k8s-ci-robot added the kind/bug label May 18, 2023
@mdbooth
Contributor

mdbooth commented May 18, 2023

A quick look at the code suggests that

Reconciling load balancer member failed: error create lbmember: Misssing input for argument [Address]

is due to

ip := instanceNS.IP(openStackCluster.Status.Network.Name)
loadbalancerService, err := loadbalancer.NewService(scope)

Here we are assuming that:

  • The server has an IP address on the cluster network
  • This IP address is the one we will use for the LB member

It looks like your workers don't have an interface on this network, though, which is why this is breaking. Can you think of a way to do what you need within these constraints?
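
For illustration, a minimal standalone sketch of that assumption (simplified, with invented names and an invented IP; not the actual CAPO code):

package main

import "fmt"

// firstIPOnNetwork mimics looking up an instance address by network name:
// if the machine has no port on that network, there is nothing to return.
func firstIPOnNetwork(addressesByNetwork map[string][]string, networkName string) string {
    if ips := addressesByNetwork[networkName]; len(ips) > 0 {
        return ips[0]
    }
    return "" // no interface on the cluster network
}

func main() {
    // This machine only has a port on the network selected by the template filter.
    instance := map[string][]string{
        "devopstest-k8s-net": {"10.0.0.12"}, // invented address
    }

    // The cluster network in status is admin-fip-provider-net-01, so the
    // lookup comes back empty and the LB member would be created with an
    // empty Address.
    fmt.Printf("%q\n", firstIPOnNetwork(instance, "admin-fip-provider-net-01")) // ""
}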

I would like to improve this situation, btw, but it's going to require architecture changes, possibly even outside the scope of just CAPO (i.e. a platform-independent API loadbalancer provider). This is an awesome write-up which I will try to ensure is re-used when we're looking at use cases, but at first glance I don't think we're going to be able to fix this quickly.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jan 20, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned (won't fix, can't repro, duplicate, stale) Mar 20, 2024
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
