I'm trying to deploy the Operator on top of a GKE cluster, but I'm running into issues when I'm trying to deploy the sample. I have a fairly standard cluster atm, without too many features enabled. I've deployed both cert manager and the operator and both are up and running without a problem.
Flink Operator System Namespace
k get po,svc -n flink-operator-system
NAME                                                     READY   STATUS    RESTARTS   AGE
pod/flink-operator-controller-manager-5b4f96ddc5-dhlv5   2/2     Running   0          4h29m

NAME                                                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/flink-operator-controller-manager-metrics-service   ClusterIP   10.150.117.28    <none>        8443/TCP   4h29m
service/flink-operator-webhook-service                      ClusterIP   10.150.234.114   <none>        443/TCP    4h29m
Cert Manager Namespace
k get po,svc -n cert-manager
NAME                                          READY   STATUS    RESTARTS   AGE
pod/cert-manager-848f547974-fbtfd             1/1     Running   0          4h42m
pod/cert-manager-cainjector-54f4cc6b5-49p58   1/1     Running   0          4h42m
pod/cert-manager-webhook-58fb868868-4w4pr     1/1     Running   0          4h42m

NAME                           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/cert-manager           ClusterIP   10.150.202.12   <none>        9402/TCP   4h42m
service/cert-manager-webhook   ClusterIP   10.150.136.39   <none>        443/TCP    4h42m
However, when I try to deploy the sample session cluster, I get the following error message:
Error from server (InternalError): error when creating "./samples/flinkoperator_v1beta1_flinksessioncluster.yaml": Internal error occurred: failed calling webhook "mflinkcluster.flinkoperator.k8s.io": Post "https://flink-operator-webhook-service.flink-operator-system.svc:443/mutate-flinkoperator-k8s-io-v1beta1-flinkcluster?timeout=10s": dial tcp 10.100.0.10:9443: i/o timeout
Does anyone have any pointers? I've checked the services and they point to the correct endpoints. The selectors look fine, but I'm a bit stuck on how I can troubleshoot this efficiently.
I was a muppet who forgot to add the necessary firewall rule to allow the master to communicate with the nodes. So if someone runs into the same problem, you can add this rule to your network:
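For anyone landing here later, a rough sketch of what such a rule can look like, assuming a private GKE cluster where the control plane reaches the nodes from its own CIDR range. The port 9443 comes from the `dial tcp ...:9443` error above; the rule name, network, CIDR and node tag are hypothetical placeholders, not the values from my setup, so swap in your own. The script only prints the `gcloud` command so you can review it before running it.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
RULE_NAME="allow-master-to-flink-webhook"   # hypothetical rule name
NETWORK="my-gke-network"                    # hypothetical VPC network of the cluster
MASTER_CIDR="172.16.0.0/28"                 # hypothetical control-plane (master) CIDR
NODE_TAG="my-gke-node-tag"                  # hypothetical network tag on the nodes
WEBHOOK_PORT="9443"                         # port seen in the "dial tcp ...:9443" error

# Ingress rule: let the control plane reach the webhook port on the nodes.
CMD="gcloud compute firewall-rules create ${RULE_NAME} \
  --network=${NETWORK} \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:${WEBHOOK_PORT} \
  --source-ranges=${MASTER_CIDR} \
  --target-tags=${NODE_TAG}"

# Print the command for review; run it yourself once the values match your cluster.
echo "${CMD}"
```

The control-plane CIDR is the range chosen when the private cluster was created, and the node tag is whatever network tag your node pool instances carry. After the rule is in place, re-applying the sample should no longer time out on the webhook call, assuming the firewall was the only thing blocking it.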
This is my cluster config: