
Upstream error [503] since sidecar istio-proxy cannot connect to pilot #6085

Closed
johnzheng1975 opened this issue Jun 7, 2018 · 49 comments

Comments

@johnzheng1975
Member

johnzheng1975 commented Jun 7, 2018

istio 0.8.0 (mTLS disabled, also no control plane security)
k8s 1.9.5
cilium 1.1.0

Steps:
Redeploy a service with a new version.

Expected result:
The service can be accessed through istio-ingress.

Actual result:
It shows a 503 upstream error.

The attached sidecar istio-proxy log shows that it cannot connect to istio-pilot.

[2018-06-07 23:03:32.170][16][info][main] external/envoy/source/server/server.cc:396] starting main dispatch loop
[2018-06-07 23:03:37.170][16][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:235] gRPC config stream closed: 14, no healthy upstream
[2018-06-07 23:03:37.170][16][warning][upstream] 

UPSTREQM-istio-proxyerror.log
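
For anyone triaging the same symptom, a quick sanity check is to confirm that pilot is actually up and that its discovery Service exists before digging into the sidecar (a sketch only, assuming the default istio-system namespace and the stock istio=pilot label):

# Is pilot running, and is its discovery service present?
kubectl -n istio-system get pods -l istio=pilot
kubectl -n istio-system get svc istio-pilot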

@ryanyard

Same issue:
istio 0.8.0
gke: 1.10

Deploy using quick start:
kubectl apply -f install/kubernetes/istio-demo-auth.yaml

Deploy sample app:
https://istio.io/docs/guides/bookinfo/

Test:
curl -o /dev/null -s -w "%{http_code}\n" http://${GATEWAY_URL}/productpage

Returns:
upstream connect error or disconnect/reset before headers

@LinAnt

LinAnt commented Jun 12, 2018

I have the same issue; the dotwiz looks quite interesting:
https://www.dropbox.com/s/oba25uv4fnwgls7/Screenshot%202018-06-12%2013.58.48.png?dl=0

I am trying to upgrade pgadmin on a fresh cluster.

@andraxylia
Contributor

@johnzheng1975 is the log you attached from the istio-proxy by the pilot? Did you use ingress or gateway? Could you please send logs from the gateway/ingress pods as well as from pilot, along with the output of kubectl get all?
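
For reference, those logs can usually be collected with something along these lines (a sketch only; it assumes the default istio-system namespace and the stock component labels, so adjust to your install):

# Pilot (discovery) logs
kubectl -n istio-system logs -l istio=pilot -c discovery

# Legacy ingress / ingressgateway logs
kubectl -n istio-system logs -l istio=ingress
kubectl -n istio-system logs -l istio=ingressgateway

# Cluster-wide resource overview
kubectl get all --all-namespaces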

@andraxylia andraxylia added this to the 1.0 milestone Jun 13, 2018
@rlenglet
Contributor

is the log you attached from the istio-proxy by the pilot?

istio-proxy

Did you use ingress or gateway?

ingress

Could you please send logs from the gateway/ingress pods as well as from pilot, along with the output of kubectl get all?

@johnzheng1975
Member Author

johnzheng1975 commented Jun 15, 2018

@andraxylia For the logs you requested, I will send them all the next time I hit this issue.
@rlenglet Thanks for answering for me.

@cizixs
Contributor

cizixs commented Aug 28, 2018

Any update on this issue? Same issue here.

@dwdraju

dwdraju commented Aug 30, 2018

Similar issue here:

[2018-08-30 09:19:04.878][15][info][upstream] external/envoy/source/server/lds_api.cc:80] lds: add/update listener '0.0.0.0_8060'
[2018-08-30 09:19:04.879][15][info][upstream] external/envoy/source/server/lds_api.cc:80] lds: add/update listener 'virtual'
[2018-08-30 09:19:04.900][15][info][config] external/envoy/source/server/listener_manager_impl.cc:908] all dependencies initialized. starting workers
[2018-08-30 09:20:04.894][15][info][main] external/envoy/source/server/drain_manager_impl.cc:63] shutting down parent after drain
[2018-08-30 09:25:36.297][15][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:240] gRPC config stream closed: 13, 
[2018-08-30 09:30:36.594][15][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:240] gRPC config stream closed: 13, 
[2018-08-30 09:35:37.039][15][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:240] gRPC config stream closed: 13,

@jojimt

jojimt commented Oct 16, 2018

I am hitting this issue as well. Is there a workaround? Or an older version that does not exhibit this issue?

@zparnold

zparnold commented Nov 1, 2018

Oh no, sorry to say that I am joining the party here. Seems to only occur when a deployment is in flight. Do I have a messed up policy config? I have the same setup as @johnzheng1975

@costinm
Contributor

costinm commented Nov 2, 2018

The logs seem to show that pilot closes the connection every 5 minutes - but the proxy reconnects immediately afterwards. It's actually a feature (an accidental one) - it lets pilot connections get re-balanced. We're working on a better way to rebalance, and after that we'll fix this 5-minute reconnect.

AFAIK it should not cause any problems.
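
To check whether the periodic disconnects you see are just this harmless rebalancing, you can grep the sidecar log for the two warnings quoted earlier in this thread (a sketch; the pod name and namespace are placeholders):

kubectl -n <app-namespace> logs <app-pod> -c istio-proxy | grep -E "gRPC config stream closed|Unable to establish new stream"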

@jojimt

jojimt commented Nov 2, 2018

@costinm re: "AFAIK it should not cause any problems." I am getting 503 while accessing the bookinfo sample application. If this message is not indicative of the problem, is there something else I can check?

Relevant logs are below:

Gateway:
[2018-10-15T20:55:29.665Z] "GET /productpage HTTP/1.1" 503 UF 0 57 1003 - "1.100.101.13" "curl/7.54.0" "05e952c1-9a46-9f3a-9a88-64b777f7e0ec" "172.28.184.179:33637" "10.2.97.32:9080"

productpage sidecar:
[2018-10-15 21:07:29.445][23][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:240] gRPC config stream closed: 14, no healthy upstream
[2018-10-15 21:07:29.445][23][warning][upstream] external/envoy/source/common/config/grpc_mux_impl.cc:41] Unable to establish new stream

@JAtula

JAtula commented Nov 6, 2018

Just hit the same issue; a workaround would be awesome, if somebody knows one.

@jsw

jsw commented Nov 6, 2018

I just ran into this on Istio 1.0.3. Not sure if I saw this on previous versions. Deleting the istio-pilot pods seems to help, but is probably only a temporary fix. The pods were only a day old (upgraded Istio 1.0.2 -> 1.0.3 yesterday) and I didn't notice anything obviously bad in the pilot dashboard. Perhaps the recent activity in this issue is people running 1.0.3?
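
For anyone looking for the temporary workaround mentioned above, restarting pilot amounts to deleting its pods and letting the Deployment recreate them (a sketch, assuming the default istio-system namespace and istio=pilot label):

kubectl -n istio-system delete pod -l istio=pilot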

@courcelm

courcelm commented Nov 7, 2018

I'm also hitting UNAVAILABLE:upstream connect error or disconnect/reset before headers on Istio 1.0.3

@JAtula

JAtula commented Nov 7, 2018

I just ran into this on Istio 1.0.3. Not sure if I saw this on previous versions. Deleting the istio-pilot pods seems to help, but is probably only a temporary fix. The pods were only a day old (upgraded Istio 1.0.2 -> 1.0.3 yesterday) and I didn't notice anything obviously bad in the pilot dashboard. Perhaps the recent activity in this issue is people running 1.0.3?

Yeah, deleting pilot seems to fix the issue, but it's not really ideal 😄

P.S. Running 1.0.3 as well.

@cchanley2003

cchanley2003 commented Nov 7, 2018

I am seeing this with 1.0.3 as well. My scenario:

An nginx proxy sits in front of Apache httpd. I have two Apaches deployed with different versions of a web app. All pods are within the service mesh and have automatically injected sidecars. I have a VirtualService sitting in front of the Apache. Any time I restart the Apache pods I see the disconnect/reset errors. Restarting pilot or applying new VirtualService routes to the Apache virtual service fixes the issue.

Any logs I should look for?
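
One thing worth grepping for is 503 responses with the UF (upstream connection failure) flag in the istio-proxy access logs of the gateway and of the calling pod, since UF usually points at stale or unreachable endpoints (a sketch; pod names and namespaces are placeholders):

# Gateway access log
kubectl -n istio-system logs <ingressgateway-pod> | grep "503 UF"

# Sidecar access log of the calling pod (nginx in this setup)
kubectl -n <app-namespace> logs <nginx-pod> -c istio-proxy | grep "503 UF"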

@sysC0D

sysC0D commented Nov 8, 2018

Same problem on Istio 1.0.3.
In most cases, reapplying our configuration files fixes this error, but not for long.
Maybe it's the traffic volume? 4000-5000 req/s on the ingress-gateway.

@wansuiye

wansuiye commented Nov 9, 2018

Same problem on Istio 1.0.3; QPS is 40~50.

@frankbu frankbu removed their assignment Nov 9, 2018
@emedina

emedina commented Dec 10, 2018

Possibly related to #10360

@novakov-alexey-zz

Similar behaviour with 1.0.5. Restarting the pilot pod helps.

@gowthamreddyvintha

I am facing a similar issue with 1.0.5. Even restarting the pilot pod didn't solve my problem.

@kish3007

I am facing the same issue, restarting pilot doesn't seem to help.

@m1o1

m1o1 commented Jan 22, 2019

Is the problem reproducible with Minikube? Is it hosted? If it only happens on a hosted platform it might be related to an issue I was having.

@kish3007

My issue is on GKE, running Istio 1.0.5 (mTLS not enabled); I noticed this when trying to access one of my services (HTTPS) through the Gateway.

@gowthamreddyvintha

My issue is on AKS, running Istio 1.0.5 (mTLS not enabled). The sample Bookinfo app runs fine, but when I deploy my own application (a simple web app), the VirtualService does not route to the pod's route path.

@liutanrong

liutanrong commented Jan 26, 2019

I probably found the cause of the problem.

The reason

I found that the cluster config in istio-proxy contains a Kubernetes pod IP that no longer exists.
I can also see from the istio-proxy error log that all the request traffic failing with 503 UF is sent to this non-existent IP, so I think this is the cause of the problem.

How to solve this

In my case, I solved it by applying the DestinationRule again, and Istio then synced the correct cluster.
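
For anyone wondering what "applying the DestinationRule again" looks like in practice, it is just kubectl apply of the existing rule; the YAML below is only a hypothetical sketch with v1/v2 subsets mirroring the cluster names in the Envoy output further down, not the actual rule from this cluster:

kubectl apply -n cloudspider -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: cloudspidergateway
spec:
  host: cloudspidergateway.cloudspider.svc.cluster.local
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
EOF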

More questions

Why does Istio still retain the pod IP of a pod that has been stopped?
This issue may be related to #9480.

The log

In my case, the cloudspidergateway service has 2 pods in Kubernetes, whose IPs are 10.244.25.4 and 10.244.14.34, but istio-proxy thinks there are three pods (10.244.25.4, 10.244.7.51, 10.244.14.34).

The error log of istio-proxy:

docker logs <istio-proxy container> |grep 503 |grep UF

{"log":"[2019-01-23T19:28:44.859Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/ff808081662e858401687bfd7d51256fHTTP/1.1\" 503 UF 394 57 999 - \"-\" \"Java/1.8.0_181\" \"8c75b45d-5b51-98a8-b4f8-d378c675aae4\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41746\n","stream":"stdout","time":"2019-01-23T19:28:53.30835571Z"}
{"log":"[2019-01-23T19:28:55.415Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 464 57 506 - \"-\" \"Java/1.8.0_181\" \"0ae79043-9708-9545-9d8b-76c2482cef33\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41752\n","stream":"stdout","time":"2019-01-23T19:29:03.311469856Z"}
{"log":"[2019-01-23T19:29:09.249Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 466 57 495 - \"-\" \"Java/1.8.0_181\" \"a8592ead-409f-9a2a-a8c7-0ec6a25deb74\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41762\n","stream":"stdout","time":"2019-01-23T19:29:13.31274334Z"}
{"log":"[2019-01-23T19:29:27.814Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 469 57 570 - \"-\" \"Java/1.8.0_181\" \"1f52f1fd-81d8-9329-8fba-9783244bdff6\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41768\n","stream":"stdout","time":"2019-01-23T19:29:33.312063164Z"}
{"log":"[2019-01-23T19:29:38.026Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/ff808081662e858401687bfd7d51256fHTTP/1.1\" 503 UF 394 57 1001 - \"-\" \"Java/1.8.0_181\" \"cc896a0a-4a1a-98ba-8daa-3e5354b31545\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41772\n","stream":"stdout","time":"2019-01-23T19:29:43.310374027Z"}
{"log":"[2019-01-23T19:29:46.382Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 461 57 719 - \"-\" \"Java/1.8.0_181\" \"f5fbcb7b-aa36-9f18-aad5-04e9c3fc9d62\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41778\n","stream":"stdout","time":"2019-01-23T19:29:53.310893339Z"}
{"log":"[2019-01-23T19:29:58.154Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 446 57 723 - \"-\" \"Java/1.8.0_181\" \"87c376c9-ad42-9c70-8e44-cdd4d31c76a5\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41784\n","stream":"stdout","time":"2019-01-23T19:30:03.309634823Z"}
{"log":"[2019-01-23T19:30:10.029Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017726a6dHTTP/1.1\" 503 UF 438 57 692 - \"-\" \"Java/1.8.0_181\" \"139f3e91-adfc-9f7a-a0ab-b64fe20b9127\" \"cloudSpiderGateWay:9000\" \"10.244.7.51:9000\" outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local - 10.96.123.71:9000 10.244.13.6:41792\n","stream":"stdout","time":"2019-01-23T19:30:13.312569876Z"}
{"log":"[2019-01-23T19:30:21.735Z] \"POST /cloudSpiderAccessor/v1/executeReport/taskId/e4e4781d662e89310168770017^C

The Kubernetes Service describe output:

kubectl describe service cloudspidergateway -n cloudspider
Name:              cloudspidergateway
Namespace:         cloudspider
Labels:            app=cloudspidergateway
Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"creationTimestamp":null,"labels":{"app":"cloudspidergateway"},"name":"cloudspidergate...
Selector:          app=cloudspidergateway
Type:              ClusterIP
IP:                10.96.123.71
Port:              http-9000-9000-ztdzg  9000/TCP
TargetPort:        9000/TCP
Endpoints:         10.244.14.34:9000,10.244.25.4:9000
Session Affinity:  None
Events:            <none>

The cluster info found in istio-proxy:

curl 127.0.0.1:15000/clusters |grep cloudspidergateway

outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_connections::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_pending_requests::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_requests::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_retries::3
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_connections::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_pending_requests::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_requests::1024
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_retries::3
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::added_via_api::true
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_active::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_connect_fail::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_total::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_active::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_error::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_success::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_timeout::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_total::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::health_flags::healthy
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::weight::1
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::region::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::zone::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::sub_zone::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::canary::false
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::success_rate::-1
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_active::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_connect_fail::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_total::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_active::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_error::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_success::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_timeout::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_total::0
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::health_flags::healthy
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::weight::1
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::region::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::zone::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::sub_zone::
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::canary::false
outbound|9000|v2|cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::success_rate::-1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_connections::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_pending_requests::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_requests::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_retries::3
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_connections::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_pending_requests::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_requests::1024
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_retries::3
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::added_via_api::true
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_active::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_connect_fail::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_total::12330
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_active::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_error::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_success::78050
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_timeout::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_total::78051
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::health_flags::healthy
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::weight::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::region::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::sub_zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::canary::false
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::success_rate::-1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_active::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_connect_fail::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::cx_total::2990
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_active::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_error::2
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_success::16921
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_timeout::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::rq_total::16923
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::health_flags::healthy
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::weight::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::region::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::sub_zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::canary::false
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.25.4:9000::success_rate::-1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_active::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_connect_fail::16933
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::cx_total::37860
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_active::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_error::16945
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_success::138003
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_timeout::0
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::rq_total::154948
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::health_flags::healthy
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::weight::1
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::region::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::sub_zone::
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::canary::false
outbound|9000||cloudspidergateway.cloudspider.svc.cluster.local::10.244.7.51:9000::success_rate::-1
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_connections::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_pending_requests::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_requests::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::default_priority::max_retries::3
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_connections::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_pending_requests::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_requests::1024
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::high_priority::max_retries::3
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::added_via_api::true
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_active::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_connect_fail::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::cx_total::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_active::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_error::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_success::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_timeout::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::rq_total::0
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::health_flags::healthy
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::weight::1
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::region::
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::zone::
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::sub_zone::
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::canary::false
outbound|9000|v1|cloudspidergateway.cloudspider.svc.cluster.local::10.244.14.34:9000::success_rate::-1


@duderino duderino assigned duderino and unassigned costinm Feb 11, 2019
@duderino
Contributor

I'll try this one, but it won't make it into 1.1, sorry

@duderino
Contributor

@liutanrong thanks for the tip

@duderino duderino assigned howardjohn and unassigned duderino Feb 28, 2019
@duderino
Contributor

@howardjohn please do what you can here. Not sure if we can get a fix into 1.1, but please give it a shot

@duderino
Contributor

duderino commented Mar 5, 2019

This bug has become a collector for many problems that all end up looking similar to the downstream client because of the 503s. I'm going to close this since I think we have a large batch of 503 mitigations going into 1.1. Most are already in 1.1rc2, but a few more will go into 1.1rc3 which we will create later this week.

If you still have issues with 1.1rc3+ please file a fresh bug.

@duderino duderino closed this as completed Mar 5, 2019
@seanclerkin

I'm seeing this on 1.1rc2. After a rolling release of a service, the Ingress returns 503s until a restart of Pilot.

@franpog859

Is there any follow-up issue, @seanclerkin, @duderino? I would like to track the work on it 😉

@jaygorrell
Contributor

Seeing upstream 503 issues still on 1.1.2.

@howardjohn
Member

If you are seeing issues please open a new issue with details @jaygorrell

@AL-Cui

AL-Cui commented Apr 27, 2019

Similar behaviour with 1.0.5. Restarting the pilot pod helps.

Similar behaviour with 1.0.5 here. But how do I restart the pilot pod? I deleted the pilot pod, but it does not start again automatically. How do I restart pilot?
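
Normally pilot is managed by the istio-pilot Deployment, so a deleted pod should come back on its own; if it does not, one way to force a restart is to scale the Deployment down and back up (a sketch, assuming the default istio-system namespace and deployment name):

kubectl -n istio-system scale deployment istio-pilot --replicas=0
kubectl -n istio-system scale deployment istio-pilot --replicas=1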

@baifan

baifan commented May 4, 2019

Similar behaviour with 1.1.4.

[2019-05-04 08:06:37.571][19][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-04 08:06:37.571][19][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-04 08:06:37.571][19][warning][misc] [external/envoy/source/common/protobuf/utility.cc:174] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-05-04 08:06:37.575][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, no healthy upstream
[2019-05-04 08:06:37.575][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:49] Unable to establish new stream
[2019-05-04 08:06:38.105][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
2019-05-04T08:06:38.858463Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[2019-05-04 08:06:40.372][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
2019-05-04T08:06:40.857978Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:42.858037Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[2019-05-04 08:06:43.489][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
2019-05-04T08:06:44.860023Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:46.858293Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:48.858084Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:50.858129Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:52.858128Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[2019-05-04 08:06:53.675][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
2019-05-04T08:06:54.857942Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:56.858256Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:06:58.858245Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:07:00.858143Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
[2019-05-04 08:07:02.448][19][warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure
2019-05-04T08:07:02.858391Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:07:04.858327Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected
2019-05-04T08:07:06.858297Z	info	Envoy proxy is NOT ready: config not received from Pilot (is Pilot running?): cds updates: 0 successful, 0 rejected; lds updates: 0 successful, 0 rejected

@ghost

ghost commented May 11, 2019

Version: 1.1.15

I am encountering the following issue, which stops my application; my pod is not able to launch.
I found the following error in Stackdriver, and the subsequent line says "terminating task executor".

[warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 13,
[warning][config] [bazel-out/k8-opt/bin/external/envoy/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:86] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure

So the application never launches.

As suggested in this thread, I have deleted istio-pilot, but the error persists.

@jamierobert

jamierobert commented May 16, 2019

This solved my issue:

kubectl delete meshpolicy default

This does carry security implications, and I assume a new MeshPolicy will have to be defined - but it does stop the 503s.
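
If you do delete it, a replacement MeshPolicy can be applied afterwards; a minimal sketch in PERMISSIVE mode (adjust to your actual mTLS requirements) would be something like:

kubectl apply -f - <<EOF
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls:
      mode: PERMISSIVE
EOF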

@wdrdres3qew5ts21

I'm also facing the same problem. Sometimes, after upgrading a service version, my workload calls another Kubernetes Service that no longer exists (namespace not found), and in Kiali the graph shows the service calling out to PassthroughCluster (even though it is an internal service in the same cluster).
I don't know the root cause, but from what I see, the Envoy proxy is calling an old pod's IP, which causes the "Service not found".
What is the root cause of this problem? I don't really understand, but in my opinion Pilot may not be fully syncing to all Envoys, which causes the Envoy proxy to keep routing to an old pod IP even though it no longer exists.

@rlenglet
Contributor

rlenglet commented Aug 5, 2020

@wdrdres3qew5ts21 please file a separate issue. It's very likely to be completely unrelated to this issue.
