Add bootstate endpoint to registry server#424
Conversation
bd4e68c to
9a2c29f
Compare
|
Adding a |
This would introduce a gap in which the bootstate can be lost:
Restarting the registry in the second step would loose the bootstate event. |
9a2c29f to
f32a257
Compare
hardikdr
left a comment
There was a problem hiding this comment.
Thanks for the PR @Nuckal777, dropped some nit inline.
Thinking about it a little more, I’d propose a following holistic way of handling the first-boot flow. Instead of tying it to a single signal, we could make it more flexible, for example, let the boot-operator set conditions like IPXEScriptFetched and IgnitionDataFetched, and have the /bootstate POST update a BootStateReceived condition on the ServerBootConfig.
Then, the metal-operator could decide which of these conditions to treat as the actual boot completion using a boot-completion-condition flag. This would also make it easier to support things like NetBootOnce and NetBootAlways policies on the ServerClaim side later even when /bootstate POST call is not configured in the Ignition. Wdyt?
internal/registry/server.go
Outdated
| conditionutils.UpdateStatus(metav1.ConditionTrue), | ||
| conditionutils.UpdateReason("BootStatePosted"), | ||
| conditionutils.UpdateMessage("Server successfully posted boot state"), | ||
| conditionutils.UpdateObserved(&server), |
There was a problem hiding this comment.
It would be interesting to explore the trade offs between having this condition on Server vs ServerBootConfig CR. I am somehow leaning more towards ServerBootConfig.
There was a problem hiding this comment.
Hmm, ServerBootConfiguration might be more fitting as their lifetimes are bound to a ServerClaim and the discovery boot. 🤔
There was a problem hiding this comment.
Perfect, I would say let's already move it to ServerBootConfig, and we can also discuss this in our PR review roulette soon.
Agree, iirc it was part of the initial discussion on the topic that the owner of a |
Sure, having a knob on the ServerClaim would allow configuring it per Claims then, I would also consider tradeoffs against having a common single flag in the metal-operator. |
2118833 to
4c36b38
Compare
hardikdr
left a comment
There was a problem hiding this comment.
Tests seem to be failing.
Looks good otherwise.
4c36b38 to
6f37a36
Compare
defo89
left a comment
There was a problem hiding this comment.
I think it looks good, also find the suggestions from @hardikdr valuable.
For the future we could consider adding BootStateReceivedTimeout for cases registry fails to receive the boot state from server (issues with registry or server failed to boot), to retry redeploying a server.
Proposed Changes
/bootstatecall back endpoint inmanagerto track ifServersuccessfully booted #399 the systemUUID is moved into the JSON payload, similar to the/registerendpoint.Fixes #399.