
Found two or more with same ip #7897

Closed
bvboca opened this issue Jul 24, 2021 · 17 comments
bvboca commented Jul 24, 2021

  1. Upgraded my cluster from 1.18 to 1.19
  2. Run jx admin to install JenkinsX
  3. Wait until all-set
  4. JX Dashboard can be accessed successfully, but some of the pods keep recreating.
  5. I log most of the failure pods. It seems something wrong with the health check mechanism.

[Screenshot 2021-07-24, 11:25:06 PM]

[Screenshot 2021-07-24, 11:25:58 PM]

Logs from the deployment check pod in the kuberhealthy namespace:

time="2021-07-24T15:01:05Z" level=info msg="Successfully hit service endpoint."
time="2021-07-24T15:01:05Z" level=info msg="Rolling update option is enabled. Performing roll."
time="2021-07-24T15:01:05Z" level=info msg="Creating deployment resource with 4 replica(s) in kuberhealthy namespace using image [nginxinc/nginx-unprivileged:1.17.9] with environment variables: map[]"
time="2021-07-24T15:01:05Z" level=info msg="Creating container using image [nginxinc/nginx-unprivileged:1.17.9] with environment variables: map[]"
time="2021-07-24T15:01:05Z" level=info msg="Created rolling-update deployment resource."
time="2021-07-24T15:01:05Z" level=info msg="Performing rolling-update on deployment deployment-deployment to [nginxinc/nginx-unprivileged:1.17.9]"
time="2021-07-24T15:01:26Z" level=info msg="Rolled deployment in kuberhealthy namespace: deployment-deployment"
time="2021-07-24T15:01:26Z" level=info msg="Looking for a response from the endpoint."
time="2021-07-24T15:01:26Z" level=info msg="Beginning backoff loop for HTTP GET request."
time="2021-07-24T15:01:26Z" level=info msg="Successfully made an HTTP request on attempt: 1"
time="2021-07-24T15:01:26Z" level=info msg="Got a 200 with a GET to http://10.108.6.162"
time="2021-07-24T15:01:26Z" level=info msg="Got a result from GET request backoff: 200 OK"
time="2021-07-24T15:01:26Z" level=info msg="Successfully hit service endpoint after rolling-update."
time="2021-07-24T15:01:26Z" level=info msg="Cleaning up deployment and service."
time="2021-07-24T15:01:26Z" level=info msg="Attempting to delete service deployment-svc in kuberhealthy namespace."
time="2021-07-24T15:01:31Z" level=info msg="Attempting to delete deployment in kuberhealthy namespace."
time="2021-07-24T15:01:36Z" level=info msg="Attempting to delete deployment in kuberhealthy namespace."
time="2021-07-24T15:01:41Z" level=info msg="Finished clean up process."
time="2021-07-24T15:01:41Z" level=info msg="Reporting success to Kuberhealthy."
time="2021-07-24T15:02:42Z" level=fatal msg="error reporting to kuberhealthy: bad status code from kuberhealthy status reporting url: [400] 400 Bad Request"
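The fatal line is the checker's final step: after the check itself passes, the pod POSTs a JSON status report back to Kuberhealthy, and it is that POST which gets the 400. A minimal sketch of the report body, assuming the documented external-check contract (an OK flag plus an Errors list, sent to the URL provided in the KH_REPORTING_URL environment variable):

```python
import json

def build_status_report(ok: bool, errors: list) -> bytes:
    """Build the JSON body an external check POSTs back to Kuberhealthy.

    The real check client reads the target URL from the KH_REPORTING_URL
    environment variable; in this cluster it resolves to the kuberhealthy
    service's /externalCheckStatus endpoint.
    """
    return json.dumps({"OK": ok, "Errors": errors}).encode("utf-8")

# The deployment check above finished cleanly, so its report would carry
# an empty error list:
report = build_status_report(True, [])
```

Kuberhealthy then matches the incoming report to a running check by the caller's pod IP, which is relevant to the warnings that surface later in this thread.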


bvboca commented Jul 28, 2021

I found some relevant logs from kuberhealthy:
time="2021-07-28T13:34:11Z" level=info msg="047fe940-68df-4da6-b01f-d8ed6d21968e jx/jx-webhook-events: [Last report time was: 2021-07-28 13:30:08.885051131 +0000 UTC vs 2021-07-28 13:30:08.885051131 +0000 UTC]"
time="2021-07-28T13:34:11Z" level=info msg="047fe940-68df-4da6-b01f-d8ed6d21968e jx/jx-webhook-events: [have not yet seen pod update since 2021-07-28 13:30:08.885051131 +0000 UTC]"
time="2021-07-28T13:34:11Z" level=info msg="499d2931-7418-493d-ac76-b095cea1183c kuberhealthy/jx-pod-status: [waiting for external checker pod to report in...]"
time="2021-07-28T13:34:11Z" level=info msg="499d2931-7418-493d-ac76-b095cea1183c kuberhealthy/jx-pod-status: [Last report time was: 2021-07-28 13:29:21.391488049 +0000 UTC vs 2021-07-28 13:29:21.391488049 +0000 UTC]"
time="2021-07-28T13:34:11Z" level=info msg="499d2931-7418-493d-ac76-b095cea1183c kuberhealthy/jx-pod-status: [have not yet seen pod update since 2021-07-28 13:29:21.391488049 +0000 UTC]"
time="2021-07-28T13:34:11Z" level=warning msg="was unable to find calling pod with remote IP 192.168.31.123 while watching for duration. Error: failed to fetch pod with remote ip 192.168.31.123 - found two or more with same ip"
time="2021-07-28T13:34:12Z" level=warning msg="was unable to find calling pod with remote IP 192.168.31.123 while watching for duration. Error: failed to fetch pod with remote ip 192.168.31.123 - found two or more with same ip"
time="2021-07-28T13:34:12Z" level=warning msg="was unable to find calling pod with remote IP 192.168.31.123 while watching for duration. Error: failed to fetch pod with remote ip 192.168.31.123 - found two or more with same ip"
time="2021-07-28T13:34:13Z" level=warning msg="was unable to find calling pod with remote IP 192.168.31.123 while watching for duration. Error: failed to fetch pod with remote ip 192.168.31.123 - found two or more with same ip"

It seems to be related to this issue:
https://github.com/kuberhealthy/kuberhealthy/issues/870
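The warnings suggest Kuberhealthy resolves the reporting pod from the request's source IP and gives up when that IP is ambiguous. A simplified sketch of that failure mode (illustrative only, not Kuberhealthy's actual code); two pods can appear to share an IP when, for example, a finished pod's recorded podIP has been reused by a running one:

```python
def find_pod_by_ip(pods, remote_ip):
    """Return the single pod whose recorded podIP matches remote_ip.

    Mirrors the warning in the log above: when two or more pods match,
    the lookup raises instead of guessing, so the caller's status report
    cannot be attributed to a check and is rejected.
    """
    matches = [p for p in pods if p.get("podIP") == remote_ip]
    if len(matches) > 1:
        raise LookupError(
            f"failed to fetch pod with remote ip {remote_ip} - "
            "found two or more with same ip")
    if not matches:
        raise LookupError(f"no pod found with remote ip {remote_ip}")
    return matches[0]
```

One way to check for this condition on the cluster is to run kubectl get pods --all-namespaces -o wide and look for multiple pods listing the IP from the warning.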

bvboca changed the title from "Error reporting to kuberhealthy" to "Found two or more with same ip" on Jul 28, 2021
jstrachan (Member) commented:

Did the last boot job succeed? (See the end of the most recent jx admin log output.)

jstrachan (Member) commented:

Which install instructions are you following? How did you install kuberhealthy?


bvboca commented Jul 28, 2021

Hi James,

I installed the cluster, including kuberhealthy, by following the steps at https://jenkins-x.io/v3/admin/setup/operator/ and running:
jx admin operator --url=...

It seems my boot job succeeded, though with some errors:
[Screenshot 2021-07-28, 11:15:29 PM]

jstrachan (Member) commented:

Could you make a dummy commit in your git repository (e.g. modify the README.md), run jx admin log -w, and post the last page or so, please?


bvboca commented Jul 29, 2021

James,

Here's my log:

[Screenshot 2021-07-29, 11:57:34 AM]

jstrachan (Member) commented:

Which git repository template did you start from? The on-premises one, right? https://github.com/jx3-gitops-repositories/jx3-kubernetes

What's the output of:

kubectl get pod --all-namespaces

I don't really understand why kuberhealthy is not working in your cluster. The daemonset and deployment checks (the first two health checks) are pure vanilla kuberhealthy + Kubernetes checks and have nothing to do with Jenkins X at all; they verify core Kubernetes behavior.

jstrachan (Member) commented:

Also try:

kubectl top node

Could it be that your cluster is out of capacity?


bvboca commented Jul 29, 2021

Yes, I'm using the on-premises one: https://github.com/bvboca/jx3-kubernetes
Here's my log:
[root@k8s-slave1 ~]# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default busybox 1/1 Running 165 17d
default eureka-deployment-9456f8d8b-jtcm9 1/1 Running 3 6d18h
default eureka-deployment-9456f8d8b-wmgx4 1/1 Running 3 6d18h
default example-memcached-ff665679d-5vrw9 1/1 Running 101 391d
default example-memcached-ff665679d-79wvl 1/1 Running 102 391d
default example-memcached-ff665679d-jdpv5 1/1 Running 99 390d
default k8sdemo-jenkins-deployment-779c8ddf5c-vlxhr 1/1 Running 62 218d
default myservice-deployment-75996bb45f-kfsq2 1/1 Running 3 6d18h
default myservice-deployment-75996bb45f-tc9h8 1/1 Running 3 6d18h
default volume-test 1/1 Running 35 157d
gitlab gitlab-b59cb885-29pwn 1/1 Running 87 207d
gitlab gitlab-postgresql-75b5f9cc76-hqlsc 1/1 Running 51 207d
gitlab gitlab-redis-5884fd96d4-nbn5w 1/1 Running 51 207d
hyperledger ca-org1-d9c6bf67-r6sjr 1/1 Running 73 228d
hyperledger ca-org2-7756d59595-z7ns4 1/1 Running 120 457d
hyperledger chaincode-marbles-org1-76fcc64b7c-8mjbz 1/1 Running 120 457d
hyperledger chaincode-marbles-org2-54b6f9594d-gwc25 1/1 Running 120 457d
hyperledger cli-org1-6989f44f9b-sswsp 1/1 Running 120 457d
hyperledger cli-org2-7d45c8cc6-6mvmd 1/1 Running 120 457d
hyperledger orderer0-66b898967b-976dh 1/1 Running 72 226d
hyperledger orderer1-8ffb5446f-7ktgf 1/1 Running 72 226d
hyperledger orderer2-dd7597968-r6pqq 1/1 Running 72 226d
hyperledger peer0-org1-fc87994cd-77vls 1/1 Running 120 457d
hyperledger peer0-org2-7446b69d75-tfz7t 1/1 Running 121 457d
ipfs ipfscluster-0 2/2 Running 90 193d
ipfs ipfscluster-1 2/2 Running 90 193d
istio-system cluster-local-gateway-5ccc64698c-pdlpn 1/1 Running 114 430d
istio-system istio-ingressgateway-8546c8686b-djbxv 1/1 Running 113 430d
istio-system istio-pilot-564d5b8f95-mjbld 0/1 Evicted 0 430d
istio-system istio-pilot-564d5b8f95-x7j6r 1/1 Running 74 228d
istio-system zipkin-6cfd88c459-grhjt 1/1 Running 114 428d
jx-git-operator jx-boot-0e28cc89-3ec8-4065-850c-f5341a42ac03-6pgnw 0/1 Error 0 31h
jx-git-operator jx-boot-0e28cc89-3ec8-4065-850c-f5341a42ac03-p2v5w 0/1 Completed 0 31h
jx-git-operator jx-boot-19e5f6a8-b0c2-47ec-ab6a-21ea5fdfcaec-f6qlw 0/1 Completed 0 5h57m
jx-git-operator jx-boot-35ed9c49-0e72-48e5-8e78-5f4106e75fae-dcmlp 0/1 Completed 0 31h
jx-git-operator jx-boot-96bc1637-8e43-4e61-89c6-4cbe07f2f7b2-wrdcd 0/1 Completed 0 5h31m
jx-git-operator jx-boot-edd3191f-543a-4ba5-92db-6ca406ca60df-5f4h9 0/1 Completed 0 5h24m
jx-git-operator jx-boot-f92d9ffe-c972-4cc6-a446-3926f9cf8ecb-9hq2c 0/1 Completed 0 5h48m
jx-git-operator jx-git-operator-7965dbcb55-ncdfb 1/1 Running 3 31h
jx bucketrepo-bucketrepo-58568b75cd-jpwxh 1/1 Running 1 31h
jx jx-bot-token-1627546365 0/1 Error 0 23m
jx jx-bot-token-1627546665 0/1 Error 0 18m
jx jx-bot-token-1627546965 0/1 Error 0 13m
jx jx-bot-token-1627547265 0/1 Error 0 8m39s
jx jx-bot-token-1627547565 0/1 Error 0 3m39s
jx jx-build-controller-647fbc5d57-v8fdz 1/1 Running 2 31h
jx jx-gcactivities-1627538400-k9zlp 0/1 Completed 0 156m
jx jx-gcactivities-1627540200-gg5w2 0/1 Completed 0 126m
jx jx-gcpods-1627538400-pc59x 0/1 Completed 0 156m
jx jx-gcpods-1627540200-lmzb5 0/1 Completed 0 126m
jx jx-gcpods-1627540200-qdlbv 0/1 Error 0 126m
jx jx-pipelines-visualizer-68d8795dcc-7svvx 1/1 Running 1 31h
jx jx-preview-gc-jobs-1627546200-fg7v9 0/1 Completed 0 26m
jx jx-preview-gc-jobs-1627546800-jh274 0/1 Completed 0 16m
jx jx-preview-gc-jobs-1627547400-wdn55 0/1 Completed 0 6m29s
jx jx-webhook-1627547106 0/1 Error 0 11m
jx jx-webhook-1627547227 0/1 Error 0 9m17s
jx jx-webhook-1627547348 0/1 Error 0 7m16s
jx jx-webhook-1627547469 0/1 Error 0 5m15s
jx jx-webhook-1627547590 0/1 Error 0 3m14s
jx jx-webhook-1627547711 0/1 Error 0 73s
jx jx-webhook-events-1627546365 0/1 Error 0 23m
jx jx-webhook-events-1627546665 0/1 Error 0 18m
jx jx-webhook-events-1627546965 0/1 Error 0 13m
jx jx-webhook-events-1627547265 0/1 Error 0 8m39s
jx jx-webhook-events-1627547565 0/1 Error 0 3m39s
jx lighthouse-foghorn-645fdfcbfb-fxhpx 1/1 Running 1 31h
jx lighthouse-gc-jobs-1627543800-cttds 0/1 Completed 0 66m
jx lighthouse-gc-jobs-1627545600-ggc5h 0/1 Completed 0 36m
jx lighthouse-gc-jobs-1627547400-m4pst 0/1 Completed 0 6m29s
jx lighthouse-keeper-5858cfc54f-2ckws 1/1 Running 1 31h
jx lighthouse-tekton-controller-7fd9c75449-lfr59 1/1 Running 1 31h
jx lighthouse-webhooks-8469cb7d6-kp8qb 1/1 Running 0 5h18m
knative-eventing broker-controller-f5bb6b9fd-md529 1/1 Running 114 430d
knative-eventing eventing-controller-7c6c74bccd-62mqr 1/1 Running 113 430d
knative-eventing eventing-webhook-778f4bd59b-2z4d7 1/1 Running 114 430d
knative-eventing imc-controller-5c65dbd444-xjk2r 1/1 Running 114 430d
knative-eventing imc-dispatcher-5f47b446bb-vlsmc 1/1 Running 114 430d
knative-monitoring elasticsearch-logging-0 1/1 Running 519 428d
knative-monitoring elasticsearch-logging-1 1/1 Running 518 428d
knative-monitoring grafana-55d85fcd55-dp9xf 1/1 Running 113 428d
knative-monitoring kibana-logging-5cccb587d-dcn2t 1/1 Running 72 226d
knative-monitoring kube-state-metrics-7bd4d8fcb8-w8pph 1/1 Running 115 428d
knative-monitoring node-exporter-4gjnv 2/2 Running 159 428d
knative-monitoring node-exporter-kqtlp 2/2 Running 232 428d
knative-monitoring prometheus-system-0 0/1 CrashLoopBackOff 1116 226d
knative-monitoring prometheus-system-1 0/1 Running 23279 226d
knative-serving activator-845b77cbb5-8xlq8 1/1 Running 248 430d
knative-serving autoscaler-7fc56894f5-nwkbp 1/1 Running 115 430d
knative-serving controller-7ffb84fd9c-r2w42 1/1 Running 114 430d
knative-serving default-domain-cpxqs 0/1 Completed 0 430d
knative-serving networking-istio-7fc7f66675-8g8gf 1/1 Running 113 430d
knative-serving webhook-8597865965-fwf5t 1/1 Running 114 430d
kube-system calico-kube-controllers-77c5fc8d7f-9rts2 1/1 Running 5 6d19h
kube-system calico-node-r8wh7 1/1 Running 133 457d
kube-system calico-node-xlwst 1/1 Running 74 472d
kube-system coredns-f9fd979d6-54brv 1/1 Running 2 6d17h
kube-system coredns-f9fd979d6-lb2g4 1/1 Running 2 6d18h
kube-system etcd-k8s-master 1/1 Running 3 6d18h
kube-system kube-apiserver-k8s-master 1/1 Running 5 6d18h
kube-system kube-controller-manager-k8s-master 1/1 Running 3 6d18h
kube-system kube-proxy-4hjd8 1/1 Running 2 6d17h
kube-system kube-proxy-7n2qp 1/1 Running 3 6d18h
kube-system kube-scheduler-k8s-master 1/1 Running 3 6d18h
kube-system tiller-deploy-7fbb5bc5d4-6jk2l 1/1 Running 107 413d
kuberhealthy daemonset-1627543965 0/1 Error 0 63m
kuberhealthy daemonset-1627544865 0/1 Error 0 48m
kuberhealthy daemonset-1627545765 0/1 Error 0 33m
kuberhealthy daemonset-1627546665 0/1 Error 0 18m
kuberhealthy daemonset-1627547565 0/1 Error 0 3m39s
kuberhealthy deployment-1627543683 0/1 Error 0 68m
kuberhealthy deployment-1627544584 0/1 Error 0 53m
kuberhealthy deployment-1627545486 0/1 Error 0 38m
kuberhealthy deployment-1627546387 0/1 Error 0 23m
kuberhealthy deployment-1627547288 0/1 Error 0 8m16s
kuberhealthy dns-status-internal-1627543987 0/1 Completed 0 63m
kuberhealthy dns-status-internal-1627544888 0/1 Completed 0 48m
kuberhealthy dns-status-internal-1627545789 0/1 Completed 0 33m
kuberhealthy dns-status-internal-1627546691 0/1 Completed 0 18m
kuberhealthy dns-status-internal-1627547592 0/1 Completed 0 3m12s
kuberhealthy jx-pod-status-1627543383 0/1 Error 0 73m
kuberhealthy jx-pod-status-1627544284 0/1 Error 0 58m
kuberhealthy jx-pod-status-1627545185 0/1 Error 0 43m
kuberhealthy jx-pod-status-1627546086 0/1 Error 0 28m
kuberhealthy jx-pod-status-1627546987 0/1 Error 0 13m
kuberhealthy jx-secrets-1627546377 0/1 Error 0 23m
kuberhealthy jx-secrets-1627546678 0/1 Error 0 18m
kuberhealthy jx-secrets-1627546979 0/1 Error 0 13m
kuberhealthy jx-secrets-1627547280 0/1 Error 0 8m24s
kuberhealthy jx-secrets-1627547581 0/1 Error 0 3m23s
kuberhealthy kuberhealthy-7667c57ff7-7mc9q 1/1 Running 1 31h
kuberhealthy kuberhealthy-7667c57ff7-x92j6 1/1 Running 1 31h
kuberhealthy network-connection-check-1627540365 0/1 Completed 0 123m
kuberhealthy network-connection-check-1627542165 0/1 Completed 0 93m
kuberhealthy network-connection-check-1627543965 0/1 Completed 0 63m
kuberhealthy network-connection-check-1627545765 0/1 Completed 0 33m
kuberhealthy network-connection-check-1627547565 0/1 Completed 0 3m38s
kubernetes-dashboard dashboard-metrics-scraper-6b4884c9d5-sg4s9 1/1 Running 5 6d19h
kubernetes-dashboard kubernetes-dashboard-7b544877d5-v8pjq 1/1 Running 7 6d19h
local-path-storage local-path-provisioner-5b577f66ff-sbhjn 1/1 Running 51 157d
metallb-system controller-57f648cb96-n5r9j 1/1 Running 113 430d
metallb-system speaker-92rjl 1/1 Running 86 228d
metallb-system speaker-w4llv 1/1 Running 71 430d
n1 default-broker-filter-7447bffc4f-kgkk8 1/1 Running 127 429d
n1 default-broker-ingress-8b8779497-ff4xw 1/1 Running 129 429d
nginx ingress-nginx-admission-create-845b7 0/1 Completed 0 31h
nginx ingress-nginx-admission-patch-9rwnj 0/1 Completed 1 31h
nginx ingress-nginx-controller-7c47c6b6dc-gdxnb 1/1 Running 1 31h
nginx ingress-nginx-controller-7c47c6b6dc-hc27f 1/1 Running 1 31h
nginx ingress-nginx-controller-7c47c6b6dc-kfjgt 1/1 Running 1 31h
olm catalog-operator-c8bc7f97c-p6t6q 1/1 Running 103 394d
olm olm-operator-84cfcdbdb8-8bx9g 0/1 Evicted 0 228d
olm olm-operator-84cfcdbdb8-kqgng 0/1 Evicted 0 394d
olm olm-operator-84cfcdbdb8-rmkm8 0/1 Evicted 0 228d
olm olm-operator-84cfcdbdb8-vq2gk 1/1 Running 74 228d
olm olm-operator-84cfcdbdb8-x6fl8 0/1 Evicted 0 228d
olm operatorhubio-catalog-jsskt 1/1 Running 11 6d19h
olm packageserver-7c9b7f4bc8-2cfvp 1/1 Running 0 5h34m
olm packageserver-7c9b7f4bc8-p9nmt 1/1 Running 0 5h32m
tekton-pipelines tekton-pipelines-controller-77578b9fb-q5bmh 1/1 Running 1 31h
tekton-pipelines tekton-pipelines-webhook-59fd68db75-v7xjw 1/1 Running 1 31h


bvboca commented Jul 29, 2021

It seems the resources are fine now.

Resource           Requests          Limits
cpu                5270m (65%)       17320m (216%)
memory             4737096Ki (19%)   15317280Ki (61%)
ephemeral-storage  0 (0%)            0 (0%)
hugepages-1Gi      0 (0%)            0 (0%)
hugepages-2Mi      0 (0%)            0 (0%)


bvboca commented Jul 29, 2021

James,

I checked the failed pods' logs. Most of them are about kuberhealthy status reporting:

Here's the log from jx-bot-token:
time="2021-07-29T08:27:56Z" level=info msg="Found instance namespace: jx"
time="2021-07-29T08:27:56Z" level=info msg="Kuberhealthy is located in the jx namespace."
starting jx-bot-token health checks
received 200
FATAL: failed to report success status bad status code from kuberhealthy status reporting url: [400] 400 Bad Request

Log from jx-webhooks:
time="2021-07-29T08:47:28Z" level=info msg="Found instance namespace: jx"
time="2021-07-29T08:47:28Z" level=info msg="Kuberhealthy is located in the jx namespace."
starting jx-webhooks health checks
FATAL: failed to report success status bad status code from kuberhealthy status reporting url: [400] 400 Bad Request

Log from jx-webhook-events:
time="2021-07-29T08:32:55Z" level=info msg="Found instance namespace: jx"
time="2021-07-29T08:32:55Z" level=info msg="Kuberhealthy is located in the jx namespace."
starting jx-webhook-events health checks
FATAL: failed to report failure status bad status code from kuberhealthy status reporting url: [400] 400 Bad Request

Log from jx-pod-status:
time="2021-07-29T08:23:14Z" level=info msg="Found instance namespace: kuberhealthy"
time="2021-07-29T08:23:14Z" level=info msg="Kuberhealthy is located in the kuberhealthy namespace."
starting jx-install health checks
skipping checks on pod because it is too young: jx/jx-preview-gc-jobs-1627546800-jh274
2021/07/29 08:23:14 checkClient: DEBUG: Reporting SUCCESS
2021/07/29 08:23:14 checkClient: DEBUG: Sending report with error length of:0
2021/07/29 08:23:14 checkClient: DEBUG: Sending report with ok state of:true
2021/07/29 08:23:14 checkClient: INFO: Using kuberhealthy reporting URL:http://kuberhealthy.kuberhealthy.svc.cluster.local/externalCheckStatus
2021/07/29 08:24:15 checkClient: ERROR: got a bad status code from kuberhealthy:400400 Bad Request
FATAL: failed to report success status bad status code from kuberhealthy status reporting url: [400] 400 Bad Request

jstrachan (Member) commented:

There is a kuberhealthy service running in the kuberhealthy namespace, right? Can you try curl http://kuberhealthy.kuberhealthy.svc.cluster.local from inside a pod in the cluster? I wonder if there's an issue with Service DNS in your cluster.

jstrachan (Member) commented:

E.g. run:

kubectl exec -it jx-build-controller-XXXX -- bash

then run:

curl -v http://kuberhealthy.kuberhealthy.svc.cluster.local

jstrachan (Member) commented:

You should get a 200 with JSON output.


bvboca commented Jul 30, 2021

$ curl http://kuberhealthy.kuberhealthy.svc.cluster.local
{
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/deployment: timed out waiting for checker pod to report in",
"Check execution error: kuberhealthy/daemonset: timed out waiting for checker pod to report in",
"Check execution error: kuberhealthy/network-connection-check: timed out waiting for checker pod to report in",
"Check execution error: jx/jx-bot-token: timed out waiting for checker pod to report in",
"Check execution error: kuberhealthy/jx-pod-status: timed out waiting for checker pod to report in",
"Check execution error: jx/jx-webhook-events: timed out waiting for checker pod to report in",
"Check execution error: jx/jx-webhook: timed out waiting for checker pod to report in",
"Check execution error: kuberhealthy/jx-secrets: timed out waiting for checker pod to report in",
"Check execution error: kuberhealthy/dns-status-internal: timed out waiting for checker pod to report in"
],
"CheckDetails": {
"jx/jx-bot-token": {
"OK": false,
"Errors": [
"Check execution error: jx/jx-bot-token: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "jx",
"LastRun": "2021-07-30T01:04:45.32803228Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "034834f2-fc69-43fc-8e71-22193d500004"
},
"jx/jx-webhook": {
"OK": false,
"Errors": [
"Check execution error: jx/jx-webhook: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "jx",
"LastRun": "2021-07-30T01:08:03.229217849Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "0098a96f-a167-460c-b01d-65aa2f6df2d8"
},
"jx/jx-webhook-events": {
"OK": false,
"Errors": [
"Check execution error: jx/jx-webhook-events: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "jx",
"LastRun": "2021-07-30T01:04:45.140197347Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "23b4ce09-b608-46a4-bed5-970a321b59b6"
},
"kuberhealthy/daemonset": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/daemonset: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T00:59:46.134773453Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "99b273bb-b0b9-42a4-b00d-0a7fd01442a4"
},
"kuberhealthy/deployment": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/deployment: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T00:59:19.52312946Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "e7693f4f-6ca5-46a4-baec-55c4c982672a"
},
"kuberhealthy/dns-status-internal": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/dns-status-internal: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T01:04:22.883599436Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "c325572e-51bc-4992-a490-c357a58ff011"
},
"kuberhealthy/jx-pod-status": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/jx-pod-status: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T01:09:19.389733671Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "3fbd8795-6938-4c14-ae21-f9b657c2e0e0"
},
"kuberhealthy/jx-secrets": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/jx-secrets: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T01:06:36.588579586Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "3bd3190e-e1a6-4bcb-8753-08335e728a0e"
},
"kuberhealthy/network-connection-check": {
"OK": false,
"Errors": [
"Check execution error: kuberhealthy/network-connection-check: timed out waiting for checker pod to report in"
],
"RunDuration": "",
"Namespace": "kuberhealthy",
"LastRun": "2021-07-30T00:42:45.501532474Z",
"AuthoritativePod": "kuberhealthy-7667c57ff7-7mc9q",
"uuid": "3dabdfb2-c8cf-49e0-b345-708016631ecd"
}
},
"JobDetails": {},
"CurrentMaster": "kuberhealthy-7667c57ff7-7mc9q"
}


bvboca commented Jul 30, 2021

James,

I think the pod DNS is fine: the comment above shows the response from the kuberhealthy service, and I have other apps that depend on DNS and are running well.

Is it related to kuberhealthy/kuberhealthy#858?
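For reference, the status JSON from the earlier curl can be reduced to just the failing checks with a few lines. A small sketch (the function name is mine, not part of any tool), run here against a trimmed fragment of that payload:

```python
import json

def failing_checks(status_json: str) -> dict:
    """Map each non-OK check in a Kuberhealthy status payload to its errors."""
    status = json.loads(status_json)
    return {name: detail.get("Errors", [])
            for name, detail in status.get("CheckDetails", {}).items()
            if not detail.get("OK", False)}

# Trimmed fragment of the payload returned by the curl above.
sample = """{
  "OK": false,
  "CheckDetails": {
    "jx/jx-webhook": {
      "OK": false,
      "Errors": ["Check execution error: jx/jx-webhook: timed out waiting for checker pod to report in"]
    }
  }
}"""
```

Run over the full payload above, this would list all nine checks, each timing out waiting for its checker pod to report in, consistent with the reports being dropped rather than the checks themselves failing.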

msvticket (Member) commented:

We are now disabling kuberhealthy. It seems to cause more headaches than it solves.
