Support for node power management with cluster create and delete #146
Comments
#147 was held up on workflow approval. That has been pushed through, and the PR will likely need review once the workflow completes.
chrisdoherty4 added the kind/feature label (categorizes issue or PR as related to a new feature) on May 3, 2022
#147 is still held up, awaiting a response from the author.
We're not integrating PBnJ into CAPT because we're integrating Rufio instead. Will get this issue closed. /close
mergify bot added a commit that referenced this issue on Jun 15, 2022
## Description

The PR enables automated power on/off of nodes that are made part of the cluster using [Rufio BMCJobs](https://github.com/tinkerbell/rufio/blob/main/api/v1alpha1/bmcjob_types.go).

1. When a hardware is selected to be part of a cluster, a `BMCJob` is created to get it to a state ready for provisioning with Tinkerbell.
2. When a cluster is deleted, a `BMCJob` is created that powers off the hardware once it has been released (owner labels removed).
3. The Rufio controller looks for these jobs and executes the listed Tasks.

## Why is this needed

Tries to address the issues listed on #146. The issue mentions PBnJ, but Rufio is a k8s controller and fits well with CAPT.

## How Has This Been Tested?

1. Created a `KinD` cluster and installed `Rufio` and `CAPT`.
2. Applied a cluster manifest as described in [QUICK-START.md](https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/main/docs/QUICK-START.md#create-your-first-workload-cluster).
3. Monitored the nodes to check power on and PXE booting.
4. Ran `kubectl delete cluster <cluster-name>`.
5. Monitored the nodes to check power off.

## How are existing users impacted? What migration steps/scripts do we need?

No impact on existing users. `BMCJob` creation is skipped if the Tinkerbell hardware object on the cluster does not have a `.Spec.BMCRef`.

## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade
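
The decision logic described in the PR can be sketched roughly as follows. This is a language-neutral illustration in Python, not the CAPT or Rufio code: `Hardware`, `PowerJob`, `provision_job`, and `release_job` are hypothetical names, and the task sequence used for provisioning (power off, set PXE boot, power on) is an assumption, since the PR only says the job gets the hardware "to a state ready for provisioning".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Hardware:
    """Hypothetical stand-in for a Tinkerbell Hardware object."""
    name: str
    bmc_ref: Optional[str] = None  # mirrors .Spec.BMCRef; None when unset
    owner_labels: dict = field(default_factory=dict)


@dataclass
class PowerJob:
    """Hypothetical stand-in for a Rufio BMCJob: an ordered list of tasks."""
    name: str
    bmc_ref: str
    tasks: list


def provision_job(hw: Hardware) -> Optional[PowerJob]:
    """Build the job for hardware selected into a cluster."""
    if hw.bmc_ref is None:
        # No BMC reference: skip job creation, so existing users are unaffected.
        return None
    # Assumed task order; the real job's tasks may differ.
    return PowerJob(f"{hw.name}-provision", hw.bmc_ref,
                    ["power-off", "boot-pxe", "power-on"])


def release_job(hw: Hardware) -> Optional[PowerJob]:
    """Build the power-off job once the hardware has been released."""
    if hw.bmc_ref is None or hw.owner_labels:
        # Either there is no BMC to talk to, or the hardware is still owned.
        return None
    return PowerJob(f"{hw.name}-release", hw.bmc_ref, ["power-off"])
```

In this sketch a separate controller (Rufio, in the PR) would pick up the returned jobs and execute their tasks; returning `None` models the case where job creation is skipped.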
When clusters are created or deleted with CAPT, the bare metal nodes have to be manually powered on and set to PXE boot for cluster creation, and manually powered off after cluster deletion to fully remove the cluster.
Expected Behaviour
Current Behaviour
Possible Solution
An integration with a BMC power management service such as PBnJ would help automate power on/off for CAPT.