You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
So the install_start watchdog appears to be set to 3 hours when the install starts. Which seems kind of long for the use cases I do.
Are installs really taking longer than 30 minutes?
The watchdog gets set to 3 hours when the install process tells the beaker lab controller that it is starting the install. So the install has started at that point.
The reason I am asking is we are sometimes seeing issues where our install finishes reboots and then something goes wrong in our firmware and it doesn't boot from the disk. So then it takes about 3 hours for the job to get aborted.
Looking at the watchdog flow it appears to be:
30 minutes when the provision process starts
3 hours when the install kickstart hits the "/install_start/<recipe_id>" on the lab controller
40 minutes (2400 seconds) when Restraint starts
Then task specific watchdog time once task starts.
Long one. All values were precisely picked based on systems used in Red Hat and issues with NFS / HTTP / FTP services servicing all the content. It may look like overkill but we bridged some of the values in the past.
If your system is connected to console, then make sure that the Install detector + Panic detector are turned on. It will abort installation in case you will meet criterium.
So the install_start watchdog appears to be set to 3 hours when the install starts. Which seems kind of long for the use cases I do.
Are installs really taking longer than 30 minutes?
The watchdog gets set to 3 hours when the install process tells the beaker lab controller that it is starting the install. So the install has started at that point.
The reason I am asking is we are sometimes seeing issues where our install finishes reboots and then something goes wrong in our firmware and it doesn't boot from the disk. So then it takes about 3 hours for the job to get aborted.
Looking at the watchdog flow it appears to be:
install_start code:
beaker/LabController/src/bkr/labcontroller/proxy.py
Lines 634 to 640 in 87a0f32
beaker/Server/bkr/server/recipes.py
Lines 217 to 235 in 87a0f32
The text was updated successfully, but these errors were encountered: