agent: improve maintenance scheduling, various improvements
- New requests can now be merged with existing ones if their activities have the same type. Significant changes to activities cause the updated request to be postponed.
- Requests can be cancelled by requesting an activity which nullifies the original activity (for example, reset channel back to current system channel => planned update will be cancelled).
- UpdateActivity with metadata and better comment generation replaces dumb shell scripts for planned system updates.
- VMChangeActivity with metadata replaces RebootActivity for mem and core changes.
- All activities can request a reboot, which will be done after all due requests have been executed.
- Continuously scheduled requests will be executed in one go if at least the first request is due, avoiding repeated switching to maintenance mode in a short time frame and possibly unnecessary reboots.
- Overdue requests (more than 30 minutes after scheduled start time) will be postponed to avoid overrunning the planned maintenance window or interfering with other machines going into maintenance mode.
- Maintenance preparation time and request execution time are different concepts now. Execution of requests is typically quite fast, but there may be commands delaying the execution of all requests. Directory doesn't support this yet, so we just report the sum of preparation time and estimated execution time (but at least 15min).
- Untangled maintenance code and manage.py: all maintenance requests are now generated in maintenance.py.
- Fix handling of postponed requests and clean up state updates in the process. tempfail and retrylimit don't exist anymore as dedicated states.
- Time-saving update shortcut: if the new channel of an UpdateActivity results in the same system, just set the system channel and forget about the update.
- Explicitly exit after calling the reboot command.
- Reduce number of channel URL resolve calls (which impact Hydra); UpdateActivity expects a resolved URL now.

PL-129777
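The two timing rules in the message (postponing requests that are more than 30 minutes overdue, and reporting at least 15 minutes as the sum of preparation and execution time) can be sketched as follows. This is only an illustration under stated assumptions: `is_overdue`, `reported_duration`, and both constants are hypothetical names that do not come from the commit, and only the numbers are taken from the message.

```python
from datetime import datetime, timedelta

# Thresholds quoted in the commit message; the constant names are hypothetical.
OVERDUE_AFTER = timedelta(minutes=30)
MIN_REPORTED_DURATION = timedelta(minutes=15)


def is_overdue(scheduled_start: datetime, now: datetime) -> bool:
    """True if the request is more than 30 minutes past its scheduled start."""
    return now - scheduled_start > OVERDUE_AFTER


def reported_duration(preparation: timedelta, execution: timedelta) -> timedelta:
    """Sum of preparation and estimated execution time, but at least 15 minutes."""
    return max(preparation + execution, MIN_REPORTED_DURATION)


start = datetime(2024, 1, 1, 3, 0)
late = is_overdue(start, start + timedelta(minutes=45))
short = reported_duration(timedelta(minutes=5), timedelta(minutes=5))
```

How the agent actually postpones an overdue request (new start time, state changes) is not described in the message, so the sketch stops at the decision itself.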
Showing 44 changed files with 2,585 additions and 1,318 deletions.
@@ -0,0 +1,75 @@
"""Scheduled machine reboot.
This activity does nothing if the machine has been booted for another reason in
the time between creation and execution.
"""

from typing import Union

import structlog

from ..estimate import Estimate
from . import Activity, ActivityMergeResult, RebootType

_log = structlog.get_logger()


class RebootActivity(Activity):
    estimate = Estimate("5m")

    def __init__(
        self, action: Union[str, RebootType] = RebootType.WARM, log=_log
    ):
        super().__init__()
        self.set_up_logging(log)
        self.reboot_needed = RebootType(action)

    @property
    def comment(self):
        return "Scheduled {}".format(
            "cold boot" if self.reboot_needed == RebootType.COLD else "reboot"
        )

    def merge(self, other):
        if not isinstance(other, RebootActivity):
            self.log.debug(
                "merge-incompatible-skip",
                self_type=type(self),
                other_type=type(other),
            )
            return ActivityMergeResult()

        if self.reboot_needed == other.reboot_needed:
            self.log.debug("merge-reboot-identical")
            return ActivityMergeResult(self, is_effective=True)

        if (
            self.reboot_needed == RebootType.COLD
            and other.reboot_needed == RebootType.WARM
        ):
            self.log.debug(
                "merge-reboot-cold-warm",
                help=(
                    "merging a warm reboot into a cold reboot results in a "
                    "cold reboot."
                ),
            )
            return ActivityMergeResult(self, is_effective=True)

        if (
            self.reboot_needed == RebootType.WARM
            and other.reboot_needed == RebootType.COLD
        ):
            self.log.debug(
                "merge-reboot-warm-to-cold",
                help=(
                    "merging a cold reboot into a warm reboot results in a "
                    "cold reboot. This is a significant change."
                ),
            )
            return ActivityMergeResult(
                self,
                is_effective=True,
                is_significant=True,
                changes={"before": RebootType.WARM, "after": RebootType.COLD},
            )
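
The merge rule in the diff above boils down to "a cold boot always wins, and only an upgrade from warm to cold counts as significant". A minimal standalone sketch of that decision table follows. `RebootKind`, `MergeOutcome`, and `merge_reboots` are hypothetical stand-ins, not the project's `RebootType` or `ActivityMergeResult`, and the enum values are assumptions. Unlike the method in the diff, the sketch returns the resulting reboot kind explicitly instead of returning the activity object.

```python
from dataclasses import dataclass, field
from enum import Enum


class RebootKind(Enum):
    # Hypothetical stand-in for RebootType; the values are assumptions.
    WARM = "warm"
    COLD = "cold"


@dataclass
class MergeOutcome:
    # Hypothetical, simplified stand-in for ActivityMergeResult.
    merged: RebootKind
    is_effective: bool = True
    is_significant: bool = False
    changes: dict = field(default_factory=dict)


def merge_reboots(current: RebootKind, other: RebootKind) -> MergeOutcome:
    """Combine two requested reboots; a cold boot always wins."""
    if current == other or current == RebootKind.COLD:
        # Identical requests, or warm merged into cold: nothing changes.
        return MergeOutcome(merged=current)
    # Remaining case: a warm reboot is upgraded to a cold boot.
    return MergeOutcome(
        merged=RebootKind.COLD,
        is_significant=True,
        changes={"before": current, "after": RebootKind.COLD},
    )


upgrade = merge_reboots(RebootKind.WARM, RebootKind.COLD)
```

According to the commit message, a significant change causes the updated request to be postponed, which is why only the warm-to-cold case sets `is_significant`.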