Closed
Labels
bug (Something isn't working)
Description
Describe the Bug
Create an AppWrapper around a resource whose CRD is not installed.
As expected, creation fails and the AppWrapper enters a terminal failed state.
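Unfortunately, deleting the AppWrapper then gets stuck: it stays in the Terminating phase because the attempt to delete the nonexistent resource fails with an unexpected error (no matches for kind "PyTorchJob" in version "kubeflow.org/v1"), and the controller treats that error as a deletion failure.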
yaml
                        cpu: 1
  status:
    componentStatus:
    - apiVersion: kubeflow.org/v1
      conditions:
      - lastTransitionTime: "2024-12-12T02:07:02Z"
        message: ""
        reason: ComponentCreationInitiated
        status: Unknown
        type: ResourcesDeployed
      kind: PyTorchJob
      name: pytorch-simple
      podSets:
      - path: template.spec.pytorchReplicaSpecs.Master.template
        replicas: 1
      - path: template.spec.pytorchReplicaSpecs.Worker.template
        replicas: 1
    conditions:
    - lastTransitionTime: "2024-12-12T02:07:02Z"
      message: Suspend is false
      reason: Resuming
      status: "True"
      type: QuotaReserved
    - lastTransitionTime: "2024-12-12T02:07:02Z"
      message: Suspend is false
      reason: Resuming
      status: "True"
      type: ResourcesDeployed
    - lastTransitionTime: "2024-12-12T02:07:02Z"
      message: Suspend is false
      reason: Resuming
      status: "False"
      type: PodsReady
    - lastTransitionTime: "2024-12-12T02:07:02Z"
      message: 'error creating components: no matches for kind "PyTorchJob" in version
        "kubeflow.org/v1"'
      reason: CreateFailed
      status: "True"
      type: Unhealthy
    - lastTransitionTime: "2024-12-12T02:07:02Z"
      message: ""
      reason: DeletionInitiated
      status: "True"
      type: DeletingResources
    phase: Terminating
kind: List
metadata:
  resourceVersion: ""
2024-12-12T02:10:17.473541557Z ERROR logr@v1.4.2/logr.go:301 Deletion error {"controller": "AppWrapper", "controllerGroup": "workload.codeflare.dev", "controllerKind": "AppWrapper", "AppWrapper": {"name":"sample-pytorch-job","namespace":"default"}, "namespace": "default", "name": "sample-pytorch-job", "reconcileID": "936970f7-f7db-4a0c-b561-bad27b1dd2fe", "error": "no matches for kind \"PyTorchJob\" in version \"kubeflow.org/v1\""}
github.com/go-logr/logr.Logger.Error
/go/pkg/mod/github.com/go-logr/logr@v1.4.2/logr.go:301
github.com/project-codeflare/appwrapper/internal/controller/appwrapper.(*AppWrapperReconciler).deleteComponents.func1
/workspace/internal/controller/appwrapper/resource_management.go:371
github.com/project-codeflare/appwrapper/internal/controller/appwrapper.(*AppWrapperReconciler).deleteComponents
/workspace/internal/controller/appwrapper/resource_management.go:386
github.com/project-codeflare/appwrapper/internal/controller/appwrapper.(*AppWrapperReconciler).Reconcile
/workspace/internal/controller/appwrapper/appwrapper_controller.go:120
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:316
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.5/pkg/internal/controller/controller.go:227
(base) dgrove@Dave's IBM Mac kueue % kubectl get appwrapper -o yaml