ingress gateway restarts with tcp_cluster_rewrite #21676
Comments
cc @douglas-reid @PiotrSikora (last two people to touch tcp_cluster_rewrite.cc) |
@rshriram, the gateway proxy is crashing somewhere near the sni_filter and tcp_cluster_rewrite filter. Any ideas? The latter filter hasn't changed in a while. Possibly a bad interaction with other filters? |
@ayj did you use the following command to install the multicluster? I failed to find the file, is this the one |
One thing we can check is whether using a 1.4-based ingress (which is compatible with a 1.5 control plane) is a reasonable workaround until 1.5.1 ships. |
It seems |
Does this not suggest a user could do a denial-of-service on Istio by supplying any .global hostname in a request? Is the single-cluster setup immune? cc @mbanikazemi |
@hzxuzhonghu fwiw in my setup I do see the outbound_.8000_._.httpbin.bar.svc.cluster.local cluster and see envoy still crashing probably at Envoy::Tcp::TcpClusterRewrite::TcpClusterRewriteFilter::onNewConnection() |
We should check if disable the |
There was a PR to disable these by default I think |
This filter is only set if multicluster is enabled:
|
I think you mean the one from @vikaschoudhary16. It disables normal clusters; the clusters here are sni-dnat clusters, which are not affected |
… On Thu, Mar 5, 2020 at 5:42 PM Zhonghu Xu ***@***.***> wrote:
It seems outbound_.8000_._.httpbin.bar.svc.cluster.local does not exist
from the config dump, not sure if this matters
|
But what was tested here is release-1.5, and I searched in my deployment; it is |
|
/cc @incfly |
cc @sdake |
@ayj et al, for context on why this is needed. I think I had put this information in a PR somewhere, or it's in a design doc, but I struggle to find it even with Google search, so I'll do the evil deed of repeating it here :)
Finally: |
- *fixes* ingress gateway restarts with tcp_cluster_rewrite #21676
- *fixes* stackdriver to separate the bucket definitions used for bytes distributions from the definitions used for latency measurements.
In case someone refers to the list of things done by this filter as listed above, note that the filter changes the cluster name from .global (generally to .svc.cluster.local) and not to .global. |
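For anyone following along, the rewrite described above can be sketched in a few lines of Python. This is only an illustration, not the filter's actual C++ code: the function name `rewrite_cluster_name`, the default pattern `\.global$`, and the default replacement `.svc.cluster.local` are assumptions chosen to match the cluster names mentioned in this thread.

```python
import re


def rewrite_cluster_name(
    cluster_name: str,
    pattern: str = r"\.global$",              # assumed pattern, for illustration only
    replacement: str = ".svc.cluster.local",  # assumed replacement, for illustration only
) -> str:
    """Return cluster_name with a trailing .global suffix rewritten.

    Names that do not end in .global are returned unchanged.
    """
    return re.sub(pattern, replacement, cluster_name)


if __name__ == "__main__":
    # SNI-derived cluster name as it might arrive at the remote ingress gateway.
    incoming = "outbound_.8000_._.httpbin.bar.global"
    print(rewrite_cluster_name(incoming))
```

If the rewritten name does not match a cluster the gateway actually knows about, the connection cannot be routed, which is why the config dump is worth checking for the rewritten name.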
@mbanikazemi thanks! I have improved the reasoning with your comments. Cheers |
Any updates on this one? I am still getting 503 while using

kubectl exec --context=$CTX_1 $SLEEP_POD -n foo -c sleep -- curl -I httpbin.bar.global:8000/headers
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 91 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
HTTP/1.1 503 Service Unavailable
content-length: 91
content-type: text/plain
date: Sun, 18 Oct 2020 17:45:31 GMT
server: envoy

I have installed Istio with

Cluster 1

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: |
    {"global": ["$CLUSTER_IP_GKE"]}

Cluster 2

apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
  labels:
    eks.amazonaws.com/component: coredns
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          upstream
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
    global:53 {
        errors
        cache 30
        forward . $CLUSTER_IP_EKS:53
        reload
    }

Thank you |
@SRodi Did you have any luck? I am hitting a similar error, but I am observing 404 errors on the remote cluster gateways:
I am not sure whether the TLD is being translated by the rewrite filter; it might not be able to find the upstream cluster reference to the gateway endpoint |
Steps to reproduce:
Follow the replicated control plane setup through testing the example service (see https://preliminary.istio.io/docs/setup/install/multicluster/gateways/#configure-the-example-services). The curl request fails with a 503. The destination ingress gateway logs show an exception and backtrace. The ready checks then fail and the pod is restarted.
The stack trace points to https://github.com/istio/proxy/blob/release-1.5/src/envoy/tcp/tcp_cluster_rewrite/tcp_cluster_rewrite.cc where the *.global suffix is rewritten to *.svc.cluster.local.
config_dump: https://gist.github.com/ayj/bb456945b45fb450c63c1539d27a72e4
logs: https://gist.github.com/ayj/6658f586b028bdea12991a876a2a1dcf
logs w/level=trace: https://gist.github.com/ayj/5c07b20d651bb6be48a08b653f0ff590