NAP is creating NodeClaims and VMs but not registering Nodes to the cluster #248
Labels
area/bootstrap
Issues or PRs related to bootstrap
area/provisioning
Issues or PRs related to provisioning (instance provider)
area/security
Issues or PRs related to security
triage/duplicate
Indicates an issue is a duplicate of other open issue.
Version
Karpenter Overlay Version: N/A - we are using managed NAP
Kubernetes Version: v1.28.5
Expected Behavior
Karpenter should create NodeClaims and provision VMs which then register to the cluster and are capable of scheduling Pods.
Actual Behavior
Pods go into the Pending state and Karpenter creates a NodeClaim for them, e.g.:
![image](https://private-user-images.githubusercontent.com/43652033/319137311-51a3da71-bd5c-4e9d-aec9-694d2177e6b0.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAzNzc0OTAsIm5iZiI6MTcyMDM3NzE5MCwicGF0aCI6Ii80MzY1MjAzMy8zMTkxMzczMTEtNTFhM2RhNzEtYmQ1Yy00ZTlkLWFlYzktNjk0ZDIxNzdlNmIwLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA3VDE4MzMxMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWVkZTUyMzRiN2NhZjAxMzc2OTA0ZWY2N2ZjNzc1ZDQ5NDEwYzcwMGIxMTAzMWMzMGIyOTdkNzVmOTY2MTg1MTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.ROSAe1bLO65lqRYfpICj6MNQ7k9MLTZLGCWozXX9ZvQ)
A VM is then provisioned for each NodeClaim, but the NodeClaims never reach the Ready state:
![image](https://private-user-images.githubusercontent.com/43652033/319137604-b1023aa1-ed6c-465e-9cc6-3f07969b252c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAzNzc0OTAsIm5iZiI6MTcyMDM3NzE5MCwicGF0aCI6Ii80MzY1MjAzMy8zMTkxMzc2MDQtYjEwMjNhYTEtZWQ2Yy00NjVlLTljYzYtM2YwNzk2OWIyNTJjLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA3VDE4MzMxMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTVlNmE3MTA3ODkyMzhiMzE1YWU3Njg5OTc3NWRhNTUxOTc4N2YyODNhODEzODkxMDg3MGQ3YmM3MDQxNDZiODAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.cLjpFUUkD_q1ZeYOxO2i6_EeAk-aruuP5E434B1V5hk)
The reason given in the Claim is NodeNotFound:
![image](https://private-user-images.githubusercontent.com/43652033/319137816-2444cfb0-1983-4a6a-9858-f4be1ff990fe.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAzNzc0OTAsIm5iZiI6MTcyMDM3NzE5MCwicGF0aCI6Ii80MzY1MjAzMy8zMTkxMzc4MTYtMjQ0NGNmYjAtMTk4My00YTZhLTk4NTgtZjRiZTFmZjk5MGZlLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA3VDE4MzMxMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTY5ZDQ4MDA0YjIwOTk3ZGU1MjhhZDJkZTg5YTZlODQ3M2FhMzc5MDYzNDc1NThhZGMzYzRlYjcxMzQ0NjE1MTgmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.umDo1bsaW7SnuiQkNF2qvswt9wp7wSuQ9FNp-Tntgfg)
We can see the VMs are up and running in the account:
![image](https://private-user-images.githubusercontent.com/43652033/319138795-febc48a8-f03d-4905-892a-2a800da71511.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjAzNzc0OTAsIm5iZiI6MTcyMDM3NzE5MCwicGF0aCI6Ii80MzY1MjAzMy8zMTkxMzg3OTUtZmViYzQ4YTgtZjAzZC00OTA1LTg5MmEtMmE4MDBkYTcxNTExLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzA3VDE4MzMxMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTdjNTEyNWM4MTNhYTVhMGJmNTFmZTgxMDRmMjk5NzZiNTVmZjk4NDcwMTBjNzNkZTE2OTE4Yzk0NDM2MTEwOTUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.0sdLk5dYiGZs15buSfGfO2oIltV4AHGFEy12vW5PYOs)
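Since the details above are only available as screenshots, here is a minimal sketch of how the unmet NodeClaim conditions could be pulled out as text for triage. It assumes the NodeClaim has been exported with `kubectl get nodeclaim <name> -o json`. The sample document, the NodeClaim name, the condition types, and the helper `unmet_conditions` are hypothetical and only illustrate the shape; the only value taken from this report is the `NodeNotFound` reason.

```python
import json

# Hypothetical sample shaped like `kubectl get nodeclaim <name> -o json` output.
# The name and condition types are illustrative assumptions; only the
# NodeNotFound reason comes from the report above.
SAMPLE = """
{
  "metadata": {"name": "example-nodeclaim"},
  "status": {
    "conditions": [
      {"type": "Launched", "status": "True"},
      {"type": "Registered", "status": "False", "reason": "NodeNotFound"},
      {"type": "Ready", "status": "False", "reason": "NodeNotFound"}
    ]
  }
}
"""


def unmet_conditions(nodeclaim):
    """Return (type, reason) for every condition whose status is not "True"."""
    conditions = nodeclaim.get("status", {}).get("conditions", [])
    return [
        (c.get("type", ""), c.get("reason", ""))
        for c in conditions
        if c.get("status") != "True"
    ]


if __name__ == "__main__":
    claim = json.loads(SAMPLE)
    name = claim["metadata"]["name"]
    for ctype, reason in unmet_conditions(claim):
        print(f"{name}: {ctype} is not True (reason: {reason})")
```

Replacing `SAMPLE` with the real JSON for one of the stuck NodeClaims would give a text version of the conditions that can be pasted into the issue instead of a screenshot.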
Steps to Reproduce the Problem
N/A - this is a managed service. In our cluster this was working until yesterday afternoon, when Karpenter stopped registering Nodes with the cluster altogether.
Resource Specs and Logs
N/A - the issue is with managed NAP not registering nodes to the cluster. We have no logs available beyond what is shown under "Actual Behavior" above.