Design and Implementation of Maintenance Windows #1309
Closed
heitor-lassarote
started this conversation in
Proposals
We have an implementation of this feature already out for review and testing:
Hello. While reading the Gonka Community Network Roadmap, we noticed project 2 in track 3, entitled “Maintenance windows for hosts”, which as of 2026-03-06 has the following description (copied verbatim):
The project should give a host a way to declare a maintenance window in advance, check whether the maintenance window is allowed, temporarily step out of part of its duties, and return to service without separate coordination and without being penalized for planned downtime.
Metrics:
What this gives the network:
The network gets a formal maintenance-window process that separates planned downtime from unplanned failures and reduces avoidable misses, penalties, and disputes.
A possible high-level design outline for this project may look like the following:
- […] MsgSetScheduledMaintenance, allowing the host to schedule expected maintenance downtime and broadcast it to the mainnet. The exact fields in this message are up for debate, but it should at least contain a timestamp for when the maintenance begins (e.g. maintenance_start_timestamp).
- […] DRAINING prior to the maintenance (the host will finalize its ongoing sessions but won’t participate in new ones) and to MAINTENANCE (the host is temporarily offline due to scheduled maintenance). More statuses and transitions may be necessary in this state machine, which should be researched.
- The DRAINING time may be tuned, but a good starting point may be, for example, at least an entire epoch, as a devshard session currently can’t cross the epoch boundary.
- […] MsgSetScheduledMaintenance, to prevent abuse.
- […] MsgFinishScheduledMaintenance.
- […] MsgCancelScheduledMaintenance.
- MsgFinishScheduledMaintenance should be sent during the scheduled maintenance window to let the participant close the window and resume its ordinary activities, while MsgCancelScheduledMaintenance should be sent before the scheduled maintenance window to prevent it from ever beginning.

There are still some open questions and considerations that need research with this design:
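As an aside on the design outline above, the status transitions and the three messages could be prototyped along the following lines. This is only a minimal sketch in Go, not the actual Gonka implementation: the Host type, the ACTIVE and SCHEDULED statuses, the method names, and the use of seconds instead of epochs for the draining period are all assumptions made for illustration. Only DRAINING, MAINTENANCE, maintenance_start_timestamp, and the three message names come from the proposal.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a participant status. Only DRAINING and MAINTENANCE are named in
// the proposal; ACTIVE and SCHEDULED are assumptions made for this sketch.
type Status string

const (
	Active      Status = "ACTIVE"
	Scheduled   Status = "SCHEDULED"
	Draining    Status = "DRAINING"
	Maintenance Status = "MAINTENANCE"
)

// Host holds the maintenance state of one participant (hypothetical type).
type Host struct {
	Status Status
	Start  int64 // maintenance_start_timestamp, in Unix seconds
}

// Schedule models MsgSetScheduledMaintenance. It rejects a window that starts
// sooner than drainSecs from now, so a full draining period always fits.
func (h *Host) Schedule(now, start, drainSecs int64) error {
	if h.Status != Active {
		return errors.New("a maintenance window is already pending or open")
	}
	if start-now < drainSecs {
		return errors.New("maintenance starts too soon to drain first")
	}
	h.Status, h.Start = Scheduled, start
	return nil
}

// Advance applies the time-driven transitions, for example once per block.
func (h *Host) Advance(now, drainSecs int64) {
	if h.Status == Scheduled && now >= h.Start-drainSecs {
		h.Status = Draining
	}
	if h.Status == Draining && now >= h.Start {
		h.Status = Maintenance
	}
}

// Cancel models MsgCancelScheduledMaintenance: valid only before the window
// begins. Allowing it while DRAINING is an assumption of this sketch.
func (h *Host) Cancel(now int64) error {
	if (h.Status != Scheduled && h.Status != Draining) || now >= h.Start {
		return errors.New("no upcoming maintenance window to cancel")
	}
	h.Status, h.Start = Active, 0
	return nil
}

// Finish models MsgFinishScheduledMaintenance: valid only during the window.
func (h *Host) Finish() error {
	if h.Status != Maintenance {
		return errors.New("host is not in maintenance")
	}
	h.Status, h.Start = Active, 0
	return nil
}

func main() {
	const drain = 3600 // assumed draining period, in seconds
	h := &Host{Status: Active}
	fmt.Println(h.Schedule(0, 7200, drain), h.Status)
	h.Advance(3600, drain)
	fmt.Println(h.Status)
	h.Advance(7200, drain)
	fmt.Println(h.Status)
	fmt.Println(h.Finish(), h.Status)
}
```

In this sketch the host walks through SCHEDULED, DRAINING, and MAINTENANCE before returning to ACTIVE. Keeping Cancel and Finish as separate methods with disjoint preconditions mirrors the split between the two messages: one is only accepted before the window begins, the other only while it is open.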
We would like to offer a team to refine the idea and design and begin work on this project. A tentative plan for the team should look like the following: