Set the deletion grace period to 1 on forced VM restarts
In order to allow a swift and safe restart, set the deletion grace
period to 1 on the pod delete when a force restart is requested.

One second is the shortest safe grace period that does not risk a
split-brain scenario for VMs, since zero removes the pod from the API
right away, without waiting for confirmation that it has stopped.

This leads to the following outcome:
 * If a guest OS is stuck, users get a swift restart.
 * Users don't risk data corruption.
 * A forced pod delete stays reserved for situations where the
   cluster is unhealthy.

Signed-off-by: Roman Mohr <rmohr@google.com>
rmohr committed Jun 6, 2023
1 parent 6e776e3 commit 44b6856
Showing 2 changed files with 4 additions and 2 deletions.
1 change: 1 addition & 0 deletions pkg/virt-api/rest/BUILD.bazel
@@ -53,6 +53,7 @@ go_library(
"//vendor/k8s.io/client-go/kubernetes/typed/authorization/v1:go_default_library",
"//vendor/k8s.io/client-go/rest:go_default_library",
"//vendor/k8s.io/client-go/util/flowcontrol:go_default_library",
+"//vendor/k8s.io/utils/pointer:go_default_library",
"//vendor/kubevirt.io/containerized-data-importer-api/pkg/apis/core/v1beta1:go_default_library",
],
)
5 changes: 3 additions & 2 deletions pkg/virt-api/rest/subresource.go
@@ -41,6 +41,7 @@ import (
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/json"
"k8s.io/apimachinery/pkg/util/yaml"
+"k8s.io/utils/pointer"

"kubevirt.io/kubevirt/pkg/util/status"

@@ -439,8 +440,8 @@ func (app *SubresourceAPIApp) RestartVMRequestHandler(request *restful.Request,
response.WriteHeader(http.StatusAccepted)
return
}
-// set terminationGracePeriod and delete the VMI pod to trigger a forced restart
-err = app.virtCli.CoreV1().Pods(namespace).Delete(context.Background(), vmiPodname, k8smetav1.DeleteOptions{GracePeriodSeconds: bodyStruct.GracePeriodSeconds})
+// set terminationGracePeriod to 1 (which is the shortest safe restart period) and delete the VMI pod to trigger a swift restart.
+err = app.virtCli.CoreV1().Pods(namespace).Delete(context.Background(), vmiPodname, k8smetav1.DeleteOptions{GracePeriodSeconds: pointer.Int64(1)})
if err != nil {
if !errors.IsNotFound(err) {
writeError(errors.NewInternalError(err), response)
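
The behavioral change in RestartVMRequestHandler can be illustrated with a minimal, self-contained sketch. It is not the KubeVirt code: DeleteOptions is a local stand-in that mimics only the GracePeriodSeconds field of k8smetav1.DeleteOptions, int64Ptr plays the role of pointer.Int64, and forcedRestartDeleteOptions is a hypothetical helper introduced here for illustration. The sketch assumes that, after this commit, the grace period sent in the request body no longer influences the pod delete.

```go
package main

import "fmt"

// DeleteOptions is a local stand-in (assumption) that mirrors only the
// GracePeriodSeconds field of k8smetav1.DeleteOptions.
type DeleteOptions struct {
	GracePeriodSeconds *int64
}

// int64Ptr returns a pointer to v, playing the role of pointer.Int64.
func int64Ptr(v int64) *int64 { return &v }

// forcedRestartDeleteOptions is a hypothetical helper. It ignores the
// grace period requested by the client and always uses 1 second, so a
// requested value of 0 can no longer trigger an immediate pod delete.
func forcedRestartDeleteOptions(requested *int64) DeleteOptions {
	_ = requested // intentionally unused after the change
	return DeleteOptions{GracePeriodSeconds: int64Ptr(1)}
}

func main() {
	// Try an unset value, an immediate delete request, and a long grace period.
	for _, requested := range []*int64{nil, int64Ptr(0), int64Ptr(30)} {
		opts := forcedRestartDeleteOptions(requested)
		fmt.Println(*opts.GracePeriodSeconds)
	}
}
```

In this sketch every request yields a grace period of 1, which matches the intent stated in the commit message: a swift restart for a stuck guest, with an immediate delete left to operators handling an unhealthy cluster.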
