[RFE] Change validate_env to include a one time boot to Foreman nic if provision fails #394

Open
QuantumPosix opened this issue Sep 8, 2021 · 2 comments

Comments

@QuantumPosix
Contributor

With some OSTree-based operating systems, disk metadata is not properly wiped during provisioning. The host then needs a second reboot (via Badfish or the GUI) and a second one-time boot to the PXE interface (Foreman), after which it provisions properly through Anaconda.

I suggest changing validate_env.py to allow for this:
if the host is still in build state and has no SSH connection, issue the reboot / PXE boot to Foreman as part of the process.

This should also work for other use cases where the install / provision hangs and requires manual intervention.
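
To make the proposed rule concrete, here is a minimal sketch of the decision only. The names `Action` and `next_action` are hypothetical and are not taken from validate_env.py; how the build state and SSH reachability are actually obtained is left out.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of one check on a host."""
    NONE = "none"
    REBOOT_TO_PXE = "reboot_to_pxe"


def next_action(in_build_state: bool, ssh_reachable: bool) -> Action:
    """Proposed rule: a host still in build state with no SSH
    connection gets a reboot plus one-time PXE boot to Foreman."""
    if in_build_state and not ssh_reachable:
        return Action.REBOOT_TO_PXE
    return Action.NONE
```

A real implementation would probably also need some limit on how often a host is rebooted, so that a host in the middle of a slow install is not interrupted; that part is not covered here.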

@sadsfae
Member

sadsfae commented Sep 14, 2021

@QuantumPosix @grafuls we seem to already be doing this here in validate_env.py

https://github.com/redhat-performance/quads/blob/master/quads/tools/validate_env.py#L142

For ALIAS systems, I wonder what's keeping this from being attempted. Should we be clearing jobs first?

@grafuls grafuls self-assigned this Sep 15, 2021
@grafuls
Contributor

grafuls commented Oct 1, 2021

VE (validate_env) works by first collecting all hosts that are still marked for build in Foreman and running a netcat command against port 22 on each of them. If the port is not open, we assume the host is either booting or stuck in provisioning, so we skip it and let the next cron run of VE handle it.
If port 22 is open, we assume the host is already running the OS, so we go ahead and send the boot-to-Foreman command via Badfish, if supported.
We can check whether the is_supported method is working correctly, but if it is not, this ticket should be moved from RFE to BUG.
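
For reference, the flow described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the code in validate_env.py: the port check uses Python's `socket` module in place of netcat, and `validate_pass`, `port_open`, and the `is_supported` / `boot_to_foreman` callables are hypothetical stand-ins for the real Foreman and Badfish calls.

```python
import socket
from typing import Callable, Iterable, List


def port_open(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds.

    Stand-in for the netcat check; not the actual VE implementation.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def validate_pass(
    build_hosts: Iterable[str],
    is_supported: Callable[[str], bool],
    boot_to_foreman: Callable[[str], None],
    probe: Callable[[str], bool] = port_open,
) -> List[str]:
    """Run one VE pass over hosts still marked for build.

    Returns the hosts that were sent the boot-to-Foreman command.
    """
    sent = []
    for host in build_hosts:
        if not probe(host):
            # Port 22 closed: host is booting or stuck in provisioning.
            # Skip it and leave it to the next cron run.
            continue
        # Port 22 open: host is assumed to be running an OS already.
        if is_supported(host):
            boot_to_foreman(host)
            sent.append(host)
    return sent
```

Note that in this flow a host with port 22 closed is never rebooted, which is the case the original request is about.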

Projects
Status: To do
Development

No branches or pull requests

3 participants