Closed
Description
When the common PVC cleanup job runs, if the initial cleanup pod fails, up to 3 more cleanup pods will be created.
If all 4 cleanup pods fail, DWO will continuously log an error about being unable to clean up the workspace storage.
Restarting DWO does not fix the issue; it seems the only way to fix it is to uninstall and reinstall DWO on the cluster.
How To Reproduce
Steps to reproduce the behaviour:
- Modify the common PVC cleanup job spec (in pkg/provision/storage/cleanup.go) so that the created pods will fail (change the container args):
Args: []string{
"-c",
- fmt.Sprintf(cleanupCommandFmt, path.Join(pvcClaimMountPath, workspaceId)),
+ "exit 1",
},
- Start up DWO
- Create 2 workspaces that use the common PVC storage-class strategy
- Delete one of the workspaces so that the common PVC cleanup job will be run
- Wait for all the PVC cleanup job-related pods to fail
- DWO will now continuously log an error similar to the following:
{"level":"error","ts":1653690734.265899,"logger":"controllers.DevWorkspace","msg":"Failed to clean up DevWorkspace storage","Request.Namespace":"devworkspace-controller","Request.Name":"theia-next","devworkspace_id":"workspace542919afbaf744fa","error":"DevWorkspace PVC cleanup job failed: see logs for job \"cleanup-workspace542919afbaf744fa\" for details","stacktrace":"github.com/devfile/devworkspace-operator/controllers/workspace.(*DevWorkspaceReconciler).finalize\n\t/home/aobuchow/git/devworkspace-operator/controllers/workspace/finalize.go:63\ngithub.com/devfile/devworkspace-operator/controllers/workspace.(*DevWorkspaceReconciler).Reconcile\n\t/home/aobuchow/git/devworkspace-operator/controllers/workspace/devworkspace_controller.go:130\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/aobuchow/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.5/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/aobuchow/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.5/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/aobuchow/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.5/pkg/internal/controller/controller.go:214"}
Expected behaviour
DWO should stop trying to reconcile the workspace after a certain number of PVC cleanup job failures (or perhaps after the first failure?). The workspace should be marked as failed, and the above-mentioned error should stop being logged.
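A minimal sketch of the expected behaviour, not DWO's actual code: once the cleanup job is in a terminal failed state, the finalizer marks the workspace as failed and stops requeueing instead of returning an error on every reconcile. The `jobState`, `finalizeAction`, and `decideFinalize` names are hypothetical and only illustrate the decision; the real fix would need to read the Job's status from the cluster.

```go
package main

import "fmt"

// jobState is a simplified, hypothetical stand-in for the cleanup Job's status.
type jobState int

const (
	jobRunning   jobState = iota // cleanup pods are still running or being retried
	jobSucceeded                 // a cleanup pod completed successfully
	jobFailed                    // every cleanup pod failed; the Job will not retry
)

// finalizeAction is a hypothetical result type describing what the
// finalizer should do after inspecting the cleanup Job.
type finalizeAction struct {
	Requeue    bool   // reconcile again later
	MarkFailed bool   // set the workspace status to failed
	Message    string // message to record once, not on every reconcile
}

// decideFinalize requeues only while the Job is still in progress.
// A failed Job is terminal: the workspace is marked failed and the
// reconcile loop stops, so the error is not logged continuously.
func decideFinalize(state jobState, jobName string) finalizeAction {
	switch state {
	case jobSucceeded:
		return finalizeAction{Message: "storage cleaned up"}
	case jobFailed:
		return finalizeAction{
			MarkFailed: true,
			Message: fmt.Sprintf(
				"DevWorkspace PVC cleanup job failed: see logs for job %q for details", jobName),
		}
	default:
		return finalizeAction{Requeue: true, Message: "waiting for cleanup job to finish"}
	}
}

func main() {
	for _, s := range []jobState{jobRunning, jobSucceeded, jobFailed} {
		a := decideFinalize(s, "cleanup-workspace542919afbaf744fa")
		fmt.Printf("requeue=%t markFailed=%t msg=%s\n", a.Requeue, a.MarkFailed, a.Message)
	}
}
```

With this shape, the failure is surfaced on the workspace once and further reconciles are triggered only by an actual change (for example, the job being deleted or recreated), rather than by the error return itself.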
amisevsk
Metadata
Assignees
Labels
bug (Something isn't working)