
CCM fails to initialize nodes when node IP does not match primary VNIC IP #448

Open
adriengentil opened this issue Jan 22, 2024 · 1 comment

Comments

@adriengentil

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

CCM Version: 1.25.0

Environment:

  • Kubernetes version (use kubectl version): 1.28 / OCP 4.15.0-rc.1
  • OS (e.g. from /etc/os-release): Red Hat Enterprise Linux CoreOS 415.92.202312250243-0
  • Kernel (e.g. uname -a): 5.14.0-284.45.1.el9_2.x86_64
  • Others:

What happened?

I have an instance with 2 VNICs (the primary one and a secondary one that I created), and I configured the kubelet to use the IP of the secondary VNIC (via the --node-ip option). The CCM refused to initialize the node with this log message:

I0122 11:18:24.802391       1 node_controller.go:415] Initializing node test-infra-cluster-d8f6aff5-master-1 with cloud provider
E0122 11:18:24.987927       1 node_controller.go:229] error syncing 'test-infra-cluster-d8f6aff5-master-1': failed to get node modifiers from cloud provider: provided node ip for node "test-infra-cluster-d8f6aff5-master-1" is not valid: failed to get node address from cloud provider that matches ip: 10.0.1.54, requeuing

What did you expect to happen?

I would expect the CCM to initialize the node as long as the node IP matches the IP of one of the VNICs attached to the instance.

How to reproduce it (as minimally and precisely as possible)?

Create a Kubernetes cluster and start the kubelet with the --node-ip option set to the IP address of a secondary VNIC attached to the instance. The CCM will fail to initialize the node with the error pasted above.

Anything else we need to know?

We need a secondary VNIC because the primary one handles the iSCSI traffic for the boot volume. Doing so allows us to make changes on the secondary interface without risking the loss of connectivity to the iSCSI boot volume.

@mhrivnak

After just a quick look, there's a good chance this function is to blame; it only collects IP addresses from the primary VNIC: https://github.com/oracle/oci-cloud-controller-manager/blob/a95563b/pkg/cloudprovider/providers/oci/instances.go#L68-L121
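
For context, here is a minimal sketch of what an "all VNICs" address lookup could look like against the public oci-go-sdk. This is not the repo's actual implementation: the CCM uses its own client wrapper, the v65 SDK import path is an assumption (not necessarily the vendored version), and the function name nodeAddressesFromAllVNICs is hypothetical. Pagination and IPv6 are omitted for brevity.

```go
package ccmsketch

import (
	"context"
	"fmt"

	"github.com/oracle/oci-go-sdk/v65/common"
	"github.com/oracle/oci-go-sdk/v65/core"
	v1 "k8s.io/api/core/v1"
)

// nodeAddressesFromAllVNICs lists every VNIC attachment on the instance and
// returns the private (and public, when present) IP of each attached VNIC as
// a Kubernetes NodeAddress, so a --node-ip that points at a secondary VNIC
// can still be matched. Sketch only; the real fix would likely live in the
// extractNodeAddresses path linked above.
func nodeAddressesFromAllVNICs(ctx context.Context, compute core.ComputeClient, vcn core.VirtualNetworkClient, compartmentID, instanceID string) ([]v1.NodeAddress, error) {
	attachments, err := compute.ListVnicAttachments(ctx, core.ListVnicAttachmentsRequest{
		CompartmentId: common.String(compartmentID),
		InstanceId:    common.String(instanceID),
	})
	if err != nil {
		return nil, fmt.Errorf("listing VNIC attachments: %w", err)
	}

	var addrs []v1.NodeAddress
	for _, att := range attachments.Items {
		// Skip attachments that are not attached yet (or are being detached).
		if att.LifecycleState != core.VnicAttachmentLifecycleStateAttached || att.VnicId == nil {
			continue
		}
		resp, err := vcn.GetVnic(ctx, core.GetVnicRequest{VnicId: att.VnicId})
		if err != nil {
			return nil, fmt.Errorf("getting VNIC %s: %w", *att.VnicId, err)
		}
		if ip := resp.Vnic.PrivateIp; ip != nil && *ip != "" {
			addrs = append(addrs, v1.NodeAddress{Type: v1.NodeInternalIP, Address: *ip})
		}
		if ip := resp.Vnic.PublicIp; ip != nil && *ip != "" {
			addrs = append(addrs, v1.NodeAddress{Type: v1.NodeExternalIP, Address: *ip})
		}
	}
	return addrs, nil
}
```

With something along these lines, the node-ip validation would see the secondary VNIC's 10.0.1.54 among the reported addresses instead of only the primary VNIC's IP.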
