This repository was archived by the owner on Nov 16, 2023. It is now read-only.
Failed to put CRD error, frameworkcontroller frequently restarts #66
Closed
Description
I set up a Kubernetes cluster with kubeadm.
It has one master node and one worker node.
I use Calico as the pod network.
I ran into a problem when setting up frameworkcontroller.
frameworkcontroller restarts frequently (about every minute), and running kubectl logs frameworkcontroller-0 shows the following:
I1028 02:01:51.888234 10 controller.go:207] Initializing frameworkcontroller
I1028 02:01:51.888637 10 controller.go:210] With Config:
kubeApiServerAddress: https://localhost:40443
kubeConfigFilePath: ""
kubeClientQps: 200
kubeClientBurst: 300
workerNumber: 500
largeFrameworkCompression: true
crdEstablishedCheckIntervalSec: 1
crdEstablishedCheckTimeoutSec: 60
objectLocalCacheCreationTimeoutSec: 300
frameworkCompletedRetainSec: 2592000
frameworkMinRetryDelaySecForTransientConflictFailed: 60
frameworkMaxRetryDelaySecForTransientConflictFailed: 900
logObjectSnapshot:
  framework:
    onFrameworkRetry: true
    onFrameworkDeletion: true
  task:
    onTaskRetry: true
    onTaskDeletion: true
  pod:
    onPodDeletion: true
podFailureSpec: []
I1028 02:01:51.889430 10 controller.go:427] Recovering frameworkcontroller
E1028 02:02:21.890073 10 runtime.go:69] Observed a panic: &errors.errorString{s:"Failed to put CRD: Get https://localhost:40443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/frameworks.frameworkcontroller.microsoft.com: dial tcp: i/o timeout"} (Failed to put CRD: Get https://localhost:40443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/frameworks.frameworkcontroller.microsoft.com: dial tcp: i/o timeout)
/go/src/github.com/microsoft/frameworkcontroller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/github.com/microsoft/frameworkcontroller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/microsoft/frameworkcontroller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/panic.go:522
/go/src/github.com/microsoft/frameworkcontroller/pkg/internal/utils.go:66
/go/src/github.com/microsoft/frameworkcontroller/pkg/controller/controller.go:428
/go/src/github.com/microsoft/frameworkcontroller/cmd/frameworkcontroller/main.go:35
/usr/local/go/src/runtime/proc.go:200
/usr/local/go/src/runtime/asm_amd64.s:1337
E1028 02:02:21.890143 10 panic.go:522] Stopping frameworkcontroller
panic: Failed to put CRD: Get https://localhost:40443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/frameworks.frameworkcontroller.microsoft.com: dial tcp: i/o timeout [recovered]
panic: Failed to put CRD: Get https://localhost:40443/apis/apiextensions.k8s.io/v1beta1/customresourcedefinitions/frameworks.frameworkcontroller.microsoft.com: dial tcp: i/o timeout
goroutine 1 [running]:
github.com/microsoft/frameworkcontroller/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/src/github.com/microsoft/frameworkcontroller/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x105
panic(0x11f84e0, 0xc0003f8110)
/usr/local/go/src/runtime/panic.go:522 +0x1b5
github.com/microsoft/frameworkcontroller/pkg/internal.PutCRD(0xc000338960, 0xc0003438c0, 0xc000047590, 0xc000047598)
/go/src/github.com/microsoft/frameworkcontroller/pkg/internal/utils.go:66 +0x173
github.com/microsoft/frameworkcontroller/pkg/controller.(*FrameworkController).Run(0xc000107290, 0xc0000e4960)
/go/src/github.com/microsoft/frameworkcontroller/pkg/controller/controller.go:428 +0x157
main.main()
/go/src/github.com/microsoft/frameworkcontroller/cmd/frameworkcontroller/main.go:35 +0x47
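The panic above is a dial timeout against https://localhost:40443. Inside a pod that does not use hostNetwork, localhost refers to the pod's own network namespace, so this address may simply not reach the API server. The following is only a minimal sketch for spotting that case; the helper name is_loopback_address is hypothetical and is not part of frameworkcontroller or kubectl.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the URL's host is a loopback
# name/address, 1 otherwise. Handles hostnames and IPv4 only
# (bracketed IPv6 literals are not parsed).
is_loopback_address() {
  local url="$1"
  local hostport="${url#*://}"   # strip the scheme, if any
  hostport="${hostport%%/*}"     # strip the path, if any
  local host="${hostport%%:*}"   # strip the port, if any
  case "$host" in
    localhost|127.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Value taken from the StatefulSet in this report.
addr="https://localhost:40443"
if is_loopback_address "$addr"; then
  echo "WARNING: $addr points at the pod itself unless hostNetwork is enabled"
fi
```

If the warning fires, it may be worth trying an address that is reachable from inside the pod, or leaving the address unset so the in-cluster configuration is used.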
I used the example YAML from the frameworkcontroller guide: https://github.com/Microsoft/frameworkcontroller/tree/master/example/run
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: frameworkcontroller
  namespace: default
spec:
  serviceName: frameworkcontroller
  selector:
    matchLabels:
      app: frameworkcontroller
  replicas: 1
  template:
    metadata:
      labels:
        app: frameworkcontroller
    spec:
      # Using the ServiceAccount with granted permission
      # if the k8s cluster enforces authorization.
      serviceAccountName: frameworkcontroller
      containers:
      - name: frameworkcontroller
        image: frameworkcontroller/frameworkcontroller
        # Using k8s inClusterConfig, so usually, no need to specify
        # KUBE_APISERVER_ADDRESS or KUBECONFIG
        env:
        - name: KUBE_APISERVER_ADDRESS
          value: https://localhost:40443 #{http[s]://host:port}
        #- name: KUBECONFIG
        #  value: {Pod Local KubeConfig File Path}
        command: [
          "bash", "-c",
          "cp /frameworkcontroller-config/frameworkcontroller.yaml . &&
          ./start.sh"]
        volumeMounts:
        - name: frameworkcontroller-config
          mountPath: /frameworkcontroller-config
      volumes:
      - name: frameworkcontroller-config
        configMap:
          name: frameworkcontroller-config
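The comment in the manifest says that the in-cluster config usually makes KUBE_APISERVER_ADDRESS unnecessary. As a rough illustration of where that in-cluster address comes from, Kubernetes injects KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT into every pod. The sketch below builds a URL from them. The function name in_cluster_apiserver_url is hypothetical, and the sketch assumes an IPv4 or hostname value for the host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the API server URL derived from the
# service env vars that Kubernetes sets inside pods.
# Assumes the host is an IPv4 address or hostname (no IPv6 brackets).
in_cluster_apiserver_url() {
  local host="${KUBERNETES_SERVICE_HOST:-}"
  local port="${KUBERNETES_SERVICE_PORT:-}"
  if [ -z "$host" ] || [ -z "$port" ]; then
    echo "service env vars unset; probably not running inside a pod" >&2
    return 1
  fi
  echo "https://${host}:${port}"
}

# Prints the URL inside a pod; prints a notice to stderr elsewhere.
in_cluster_apiserver_url || true
```

Running this inside the frameworkcontroller pod (for example via kubectl exec) and comparing the result with the configured https://localhost:40443 could show whether the two addresses differ.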
Additionally, frameworkbarrier cannot find the custom resource frameworks. My guess is that frameworkcontroller is not working properly, so it never creates the custom resource frameworks.
Here is the entire script I use to launch frameworkcontroller:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta4/nvidia-device-plugin.yml
sleep 10
kubectl create serviceaccount frameworkcontroller --namespace default
kubectl create clusterrolebinding frameworkcontroller \
  --clusterrole=cluster-admin \
  --user=system:serviceaccount:default:frameworkcontroller
sleep 5
#kubectl create -f frameworkcontroller-with-default-config.yaml
# custom config
kubectl create -f frameworkcontroller-customized-config.yaml
kubectl create -f frameworkcontroller-with-customized-config.yaml
sleep 15
kubectl create serviceaccount frameworkbarrier --namespace default
kubectl create clusterrole frameworkbarrier --verb=get,list,watch --resource=frameworks
kubectl create clusterrolebinding frameworkbarrier --clusterrole=frameworkbarrier --user=system:serviceaccount:default:frameworkbarrier