BMH with good BMC credentials and registration error stuck deleting #1385
Labels
kind/bug
Categorizes issue or PR as related to a bug.
triage/accepted
Indicates an issue is ready to be actively worked on.
What steps did you take and what happened:
Seen in #1382
Still need to verify the reproduction steps, but this is what I believe happened:
1. A BMH (first) in the available state was deleted without waiting for it to go away. (The important part is that it is still there, not that it was deleting. Just mentioning it for completeness' sake.)
2. Another BMH (second) was created with identical BMC details (it is the same machine, just "re-creating" the BMH).
3. second hits a registration error since there is already a node with an identical MAC in Ironic.
4. second is marked for deletion (cleaning up after the failed test).
5. first eventually disappears when its deletion completes.
6. second remains stuck indefinitely, trying to power off before deletion.
I believe this is due to a simple logic error here:
baremetal-operator/controllers/metal3.io/host_state_machine.go
Lines 588 to 592 in c493623
The situation I described above has a BMH with NeedsRegistration=true (because registration failed due to a conflict) and haveCreds=true. We get true && false, so we do not skipToDelete. It is impossible to power off the BMH since it was never registered, but that is what is attempted. I think the proper check would be NeedsRegistration || !haveCreds. In either of these cases we cannot power off the BMH and should just skip to deletion.
What did you expect to happen:
The BMH with registration error should eventually be deleted.
Anything else you would like to add:
I'm planning to open a PR with the suggested fix.
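To make the difference between the two conditions concrete, here is a minimal standalone sketch. It is not the operator's actual code: the function names skipToDeleteCurrent and skipToDeleteProposed are hypothetical, and the "current" condition is only what I infer from the description above (NeedsRegistration && !haveCreds). It just prints the truth table for both checks.

```go
package main

import "fmt"

// skipToDeleteCurrent is a hypothetical stand-in for the existing check as
// described in this issue: skip the power-off step only when the host needs
// registration AND we have no credentials.
func skipToDeleteCurrent(needsRegistration, haveCreds bool) bool {
	return needsRegistration && !haveCreds
}

// skipToDeleteProposed is a hypothetical stand-in for the suggested check:
// skip the power-off step when EITHER the host needs registration OR we have
// no credentials, since power-off is impossible in both cases.
func skipToDeleteProposed(needsRegistration, haveCreds bool) bool {
	return needsRegistration || !haveCreds
}

func main() {
	// Enumerate all four combinations of the two inputs.
	for _, needsRegistration := range []bool{false, true} {
		for _, haveCreds := range []bool{false, true} {
			fmt.Printf("NeedsRegistration=%t haveCreds=%t current=%t proposed=%t\n",
				needsRegistration, haveCreds,
				skipToDeleteCurrent(needsRegistration, haveCreds),
				skipToDeleteProposed(needsRegistration, haveCreds))
		}
	}
}
```

The case from this report is NeedsRegistration=true with haveCreds=true: the inferred current check does not skip to deletion there, while the proposed check does. The two checks also differ when NeedsRegistration=false and haveCreds=false, where only the proposed check skips the power-off.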
Environment:
/kind bug