clusterapi: `HasInstance` function tries to find machine with wrong key #6774

mweibel · 2024-04-29T09:11:57Z

Which component are you using?:
cluster-autoscaler

What version of the component are you using?:
latest master

What k8s version are you using (kubectl version)?:
1.28.5

What environment is this in?:
clusterapi provider running in a CAPZ cluster

What did you expect to happen?:
The HasInstance method finds a Machine that exists, instead of reporting "machine not found for node: X".

What happened instead?:
The Machine lookup uses the wrong key.

The store contains items using {{cluster-namespace}}/{{machine-name}} but the lookup uses only {{machine-name}} as the key.
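
To illustrate the mismatch, here is a minimal sketch that uses a plain Go map in place of the real informer store. The `machine` type, the helper names, and the `capz-system`/`worker-0` values in the usage note are hypothetical and are not taken from the autoscaler code. It only shows why a name-only key misses an entry stored under {{cluster-namespace}}/{{machine-name}}.

```go
package main

import "fmt"

// machine is a hypothetical stand-in for a Cluster API Machine object.
type machine struct {
	Namespace string
	Name      string
}

// storeKey builds the "<namespace>/<name>" key format described above.
func storeKey(namespace, name string) string {
	return fmt.Sprintf("%s/%s", namespace, name)
}

// newStore indexes machines by "<namespace>/<name>", mirroring how the
// issue describes the keys that are present in the store.
func newStore(machines ...machine) map[string]machine {
	store := make(map[string]machine, len(machines))
	for _, m := range machines {
		store[storeKey(m.Namespace, m.Name)] = m
	}
	return store
}

// findByName mimics the failing lookup: only the machine name is used as key.
func findByName(store map[string]machine, name string) (machine, bool) {
	m, ok := store[name]
	return m, ok
}

// findByNamespacedName uses the same key format the store was populated with.
func findByNamespacedName(store map[string]machine, namespace, name string) (machine, bool) {
	m, ok := store[storeKey(namespace, name)]
	return m, ok
}
```

With a store built from `machine{Namespace: "capz-system", Name: "worker-0"}`, `findByName(store, "worker-0")` misses, while `findByNamespacedName(store, "capz-system", "worker-0")` returns the entry.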

How to reproduce it (as minimally and precisely as possible):

Set up a CAPZ cluster
Start cluster-autoscaler
Set a breakpoint in clusterapi_controller.go#findMachine()
Step into the store lookup and compare the lookup key with the existing keys in the cache/informer

Anything else we need to know?:
The autoscaler was run in debug mode with the following flags:

--cloud-provider=clusterapi 
--namespace=kube-system 
--node-group-auto-discovery=clusterapi:clusterName={{clusterName}},namespace={{managementClusterNamespace}}
--kubeconfig=~/.kube/{{clusterName}}.yaml 
--cloud-config=~/.kube/{{management-cluster}} 
--ignore-daemonsets-utilization=true 
--logtostderr=true 
--max-node-provision-time=0m30s 
--scale-down-delay-after-add=2m
--scale-down-unneeded-time=2m
--scan-interval=10s
--stderrthreshold=info
--v=4
--leader-elect=false

I'm not sure whether specific flags cause the store keys to differ. I'm going to look into this.
Ping @MaxFedotov - the HasInstance method was introduced in #6708 and maybe we could figure out what is different in our envs?

elmiko · 2024-05-13T13:08:18Z

/area provider/cluster-api

mweibel added the kind/bug Categorizes issue or PR as related to a bug. label Apr 29, 2024

mweibel added a commit to helio/autoscaler that referenced this issue Apr 29, 2024

fix(clusterapi): HasInstance with namespace prefix

c616556

fixes kubernetes#6774

mweibel linked a pull request Apr 29, 2024 that will close this issue

fix(clusterapi): HasInstance with namespace prefix #6776

Open

k8s-ci-robot added the area/provider/cluster-api Issues or PRs related to Cluster API provider label May 13, 2024
