Validate HBMH before using it #1166

Open
guettli opened this issue Feb 15, 2024 · 1 comment

guettli commented Feb 15, 2024

/kind feature

Describe the solution you'd like

We have a convenient feature: you can create a HetznerBareMetalHost (hbmh) without specifying RootDeviceHints (WWNs).

However, it is currently hard to understand how this works.

Currently, an hbmh without RootDeviceHints can be chosen for a HetznerMachine object.

When the controller starts the provisioning process, it detects that RootDeviceHints are nil and interrupts provisioning with ErrorMessageMissingRootDeviceHints.

At that point the hbmh has HardwareDetails set in the Status.

The user can then set the root device hints by copying values from HardwareDetails to RootDeviceHints.
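
The current behavior can be sketched roughly as follows. This is a minimal illustration, not the controller's actual code: the types `Host`, `Storage` and `RootDeviceHints`, the functions `checkBeforeProvisioning` and `availableWWNs`, and the error value `errMissingRootDeviceHints` are all simplified, hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Storage is a hypothetical, simplified stand-in for one disk in HardwareDetails.
type Storage struct {
	Name string
	WWN  string
}

// RootDeviceHints is a hypothetical, simplified stand-in: just a list of WWNs.
type RootDeviceHints struct {
	WWNs []string
}

// Host models only the parts of an hbmh that matter for this sketch.
type Host struct {
	ServerID        int
	RootDeviceHints *RootDeviceHints // nil when the user did not set any hints
	Storage         []Storage        // assumed to come from HardwareDetails in the status
}

// errMissingRootDeviceHints is a hypothetical error used only in this sketch.
var errMissingRootDeviceHints = errors.New("root device hints are missing")

// checkBeforeProvisioning stops provisioning when no root device hints are set.
func checkBeforeProvisioning(h Host) error {
	if h.RootDeviceHints == nil || len(h.RootDeviceHints.WWNs) == 0 {
		return errMissingRootDeviceHints
	}
	return nil
}

// availableWWNs lists the WWNs the user could copy into RootDeviceHints.
func availableWWNs(h Host) []string {
	wwns := make([]string, 0, len(h.Storage))
	for _, s := range h.Storage {
		wwns = append(wwns, s.WWN)
	}
	return wwns
}

func main() {
	h := Host{ServerID: 1, Storage: []Storage{{Name: "nvme0n1", WWN: "eui.0001"}}}

	// Without hints, provisioning is interrupted and the user has to act.
	if err := checkBeforeProvisioning(h); err != nil {
		fmt.Println("provisioning stopped:", err)
		fmt.Println("candidate WWNs:", availableWWNs(h))
	}

	// The user copies a WWN from the hardware details into the hints.
	h.RootDeviceHints = &RootDeviceHints{WWNs: availableWWNs(h)[:1]}
	fmt.Println("error after setting hints:", checkBeforeProvisioning(h))
}
```

The point of the sketch is the ordering problem: the host has already been chosen for a machine by the time the missing hints are detected.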


This process works, but could be improved.

Goal: Add the HardwareDetails before adding the hbmh to a cluster. This means the process would look like this:

  1. The user creates a hbmh with just the serverId.
  2. The controller checks which clusters the hbmh could belong to. If no cluster is found, a warning condition is set.
  3. If several clusters were found, the controller picks one randomly and sets a label on the hbmh, so that it is visible that the hbmh will get the HardwareDetails from that cluster. It is important to associate this with a cluster, because the robot-api-key is needed to boot the hbmh into rescue system.
  4. The controller takes the hbmh, creates a temporary ssh-key, and boots the hbmh into the rescue system.
  5. The controller then fetches the HardwareDetails and sets the hbmh into an invalid state.
  6. The hbmh stays in the invalid state because RootDeviceHints are still missing. The controller does not take the hbmh for a cluster.
  7. The user can edit the hbmh and add RootDeviceHints.
  8. The controller validates the RootDeviceHints and, if they are valid, removes the invalid status of the hbmh.
  9. If a cluster needs a new bare-metal machine, the hbmh could be taken because it is valid.
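
Steps 5 to 9 could be sketched like this. It is only an illustration under assumptions: the state names `StateInvalid` and `StateValid`, the functions `unknownWWNs`, `hostState` and `selectable`, and the rule "every hinted WWN must match a detected disk" are hypothetical and not taken from the existing controller.

```go
package main

import "fmt"

// HostState is a hypothetical state used only in this sketch.
type HostState string

const (
	StateInvalid HostState = "invalid"
	StateValid   HostState = "valid"
)

// unknownWWNs returns the hinted WWNs that match none of the detected disks.
func unknownWWNs(hints, detected []string) []string {
	known := make(map[string]bool, len(detected))
	for _, w := range detected {
		known[w] = true
	}
	var unknown []string
	for _, w := range hints {
		if !known[w] {
			unknown = append(unknown, w)
		}
	}
	return unknown
}

// hostState marks a host invalid while hints are missing or refer to
// disks that do not appear in the fetched hardware details.
func hostState(hints, detected []string) HostState {
	if len(hints) == 0 || len(unknownWWNs(hints, detected)) > 0 {
		return StateInvalid
	}
	return StateValid
}

// selectable reports whether a cluster may take the host.
func selectable(s HostState) bool {
	return s == StateValid
}

func main() {
	detected := []string{"wwn-a", "wwn-b"} // assumed to come from HardwareDetails

	for _, hints := range [][]string{nil, {"wwn-x"}, {"wwn-a"}} {
		s := hostState(hints, detected)
		fmt.Printf("hints=%v state=%s selectable=%t\n", hints, s, selectable(s))
	}
}
```

With a check like this, a host only becomes eligible for a cluster after the user has supplied hints that match the detected hardware.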

guettli commented Aug 15, 2024

@batistein what do you think: Is it worth the effort? Do our customers struggle here?

We could do it via an annotation. The user could set this:

hetznerbaremetalhost.infrastructure.cluster.x-k8s.io/get-hardwaredetails-via-wl-cluster: foo

Then the controller knows which cluster to use (from the same namespace).
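
A minimal sketch of how the controller might read this annotation, assuming the annotation key proposed above. The constant name `hardwareDetailsClusterAnnotation` and the function `clusterForHardwareDetails` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// hardwareDetailsClusterAnnotation is the annotation key proposed in this issue.
const hardwareDetailsClusterAnnotation = "hetznerbaremetalhost.infrastructure.cluster.x-k8s.io/get-hardwaredetails-via-wl-cluster"

// clusterForHardwareDetails returns the cluster name from the annotation.
// The second return value is false when the annotation is missing or empty.
func clusterForHardwareDetails(annotations map[string]string) (string, bool) {
	name := strings.TrimSpace(annotations[hardwareDetailsClusterAnnotation])
	return name, name != ""
}

func main() {
	annotations := map[string]string{hardwareDetailsClusterAnnotation: "foo"}

	if name, ok := clusterForHardwareDetails(annotations); ok {
		// The cluster would then be looked up in the namespace of the hbmh.
		fmt.Println("fetch hardware details via cluster:", name)
	}
}
```

An explicit annotation would avoid the random cluster choice from the original proposal, because the user states which cluster's credentials to use.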
