
Invalid Header Field Istio Proxy #39196

Closed
jimid27 opened this issue May 30, 2022 · 12 comments · Fixed by #39365 or #39476

Comments

@jimid27

jimid27 commented May 30, 2022

Bug Description

While deploying an OpenTelemetry Collector to pull Istio metrics and traces, I ran into an issue where the istio-proxy sidecar throws the following error:

[2022-05-30T14:51:42.631Z] "- - HTTP/2" 0 DPE http2.invalid.header.field - "-" 0 0 0 - "-" "-" "-" "-" "-" - - 10.128.0.122:4317 10.128.0.122:50082 outbound_.4317_._.otelcol-deployment-collector.istio-system.svc.cluster.local -

For testing, we have the OpenTelemetry Collector deployed in the istio-system namespace, using the manifest from the Istio samples (note: I changed the service name to match how the collector gets deployed via Helm): https://raw.githubusercontent.com/istio/istio/release-1.13/samples/open-telemetry/otel.yaml

My istio configmap looks as follows:

apiVersion: v1
data:
  mesh: |-
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      tracing:
        zipkin:
          address: zipkin.istio-system.svc.cluster.local:9411
    enablePrometheusMerge: true
    extensionProviders:
    - envoyOtelAls:
        port: 4317
        service: otelcol-deployment-collector.istio-system.svc.cluster.local
      name: otel
    - envoyExtAuthzHttp:
        headersToUpstreamOnAllow:
        - authorization
        - path
        - x-prefect-actor-id
        - x-prefect-actor-permissions
        - Origin
        - Access-Control-Request-Method
        includeRequestHeadersInCheck:
        - authorization
        - Origin
        - Access-Control-Request-Method
        port: "4300"
        service: auth.cloud2.svc.cluster.local
      name: sample-ext-authz-http
    rootNamespace: istio-system
    trustDomain: cluster.local
  meshNetworks: 'networks: {}'
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: istiod
    meta.helm.sh/release-namespace: istio-system
  creationTimestamp: "2022-04-06T18:49:12Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    helm.toolkit.fluxcd.io/name: istiod
    helm.toolkit.fluxcd.io/namespace: istio-system
    install.operator.istio.io/owning-resource: unknown
    istio.io/rev: default
    operator.istio.io/component: Pilot
    release: istiod
  name: istio
  namespace: istio-system
  resourceVersion: "51605218"
  uid: 53020fd7-5322-44ed-9cf3-bbda2add8b8c
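
For reference, a provider declared under extensionProviders (like the otel entry above) still has to be enabled before the proxies use it, e.g. via the standard Istio Telemetry API. A minimal sketch (the resource name is illustrative; the provider name matches the otel entry in the config above):

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: mesh-default
  namespace: istio-system
spec:
  accessLogging:
    - providers:
        - name: otel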

I do not see any errors from the istiod pod in connecting to the collector; I only see errors from the istio-proxy sidecar on the otel collector pod.

When turning on debug logs for that pod (istioctl proxy-config log otel-collector-55cf757858-nnsvp --level=debug), I see the following:

2022-05-27T16:44:02.766198Z     debug   envoy http      [C448][S4627748604952493369] stream reset
2022-05-27T16:44:02.766285Z     debug   envoy connection        [C448] closing data_to_write=0 type=2
2022-05-27T16:44:02.766303Z     debug   envoy connection        [C448] setting delayed close timer with timeout 1000 ms
2022-05-27T16:44:02.768414Z     debug   envoy connection        [C446] triggered delayed close
2022-05-27T16:44:02.768439Z     debug   envoy connection        [C446] closing socket: 1
2022-05-27T16:44:02.768558Z     debug   envoy connection        [C446] SSL shutdown: rc=0
2022-05-27T16:44:02.768627Z     debug   envoy conn_handler      [C446] adding to cleanup list
2022-05-27T16:44:02.768636Z     debug   envoy connection        [C445] remote close
2022-05-27T16:44:02.768649Z     debug   envoy connection        [C445] closing socket: 0
2022-05-27T16:44:02.768720Z     debug   envoy connection        [C445] SSL shutdown: rc=1
2022-05-27T16:44:02.768777Z     debug   envoy client    [C445] disconnect. resetting 1 pending requests
2022-05-27T16:44:02.768787Z     debug   envoy client    [C445] request reset
2022-05-27T16:44:02.768803Z     debug   envoy pool      [C445] destroying stream: 0 remaining
2022-05-27T16:44:02.768818Z     debug   envoy router    [C0][S12665852141700473177] upstream reset: reset reason: connection termination, transport failure reason: 
2022-05-27T16:44:02.768877Z     debug   envoy http      async http request response headers (end_stream=true):
':status', '200'
'content-type', 'application/grpc'
'grpc-status', '14'
'grpc-message', 'upstream connect error or disconnect/reset before headers. reset reason: connection termination'

2022-05-27T16:44:02.768907Z     debug   envoy pool      [C445] client disconnected, failure reason: 
2022-05-27T16:44:02.768918Z     debug   envoy pool      invoking idle callbacks - is_draining_for_deletion_=false
2022-05-27T16:44:03.079432Z     debug   envoy filter    original_dst: new connection accepted
2022-05-27T16:44:03.079509Z     debug   envoy filter    tls inspector: new connection accepted
2022-05-27T16:44:03.079569Z     debug   envoy filter    tls:onServerName(), requestedServerName: outbound_.4317_._.otelcol-deployment-collector.opentelemetry.svc.cluster.local
2022-05-27T16:44:03.079675Z     debug   envoy conn_handler      [C449] new connection from 10.128.2.46:35390
2022-05-27T16:44:03.081089Z     debug   envoy http2     [C449] updating connection-level initial window size to 268435456
2022-05-27T16:44:03.081141Z     debug   envoy http      [C449] new stream
2022-05-27T16:44:03.081190Z     debug   envoy http2     [C449] invalid http2: Invalid HTTP header field was received: frame type: 1, stream: 1, name: [:authority], value: [outbound|4317||otelcol-deployment-collector.opentelemetry.svc.cluster.local]
2022-05-27T16:44:03.081199Z     debug   envoy http2     [C449] invalid frame: Invalid HTTP header field was received on stream 1
2022-05-27T16:44:03.081208Z     debug   envoy http      [C449] dispatch error: The user callback function failed
2022-05-27T16:44:03.081221Z     debug   envoy http      [C449][S4224173591166327055] stream reset
2022-05-27T16:44:03.081308Z     debug   envoy connection        [C449] closing data_to_write=0 type=2
2022-05-27T16:44:03.081320Z     debug   envoy connection        [C449] setting delayed close timer with timeout 1000 ms
2022-05-27T16:44:03.155647Z     debug   envoy conn_handler      [C450] new connection from 10.128.0.1:50632
2022-05-27T16:44:03.155805Z     debug   envoy http      [C450] new stream
2022-05-27T16:44:03.155890Z     debug   envoy http      [C450][S17571925179678838022] request headers complete (end_stream=true):
':authority', '10.128.0.109:15021'
':path', '/healthz/ready'
':method', 'GET'
'user-agent', 'kube-probe/1.21'
'accept', '*/*'
'connection', 'close'

Version

Istio Version

client version: 1.13.3
control plane version: 1.14.0-beta.1
data plane version: 1.14.0-beta.1 (20 proxies), 1.13.4 (8 proxies)

K8s version

Client Version: v1.23.4
Server Version: v1.21.11-gke.1100

Helm Version

version.BuildInfo{Version:"v3.8.0", GitCommit:"d14138609b01886f544b2025f5000351c9eb092e", GitTreeState:"clean", GoVersion:"go1.17.6"}

Additional Information

No response

@zirain
Member

zirain commented May 31, 2022

can you share a config_dump from the pod? you can get it by running istioctl pc all <pod_name>

@jimid27
Author

jimid27 commented May 31, 2022

config_dump.txt

@howardjohn
Member

I think there is some header field or similar we are supposed to set. We ran into this with zipkin as well

@zirain
Member

zirain commented Jun 1, 2022

config_dump.txt

sorry, I forgot the -o flag; can you run it again with istioctl pc all <pod_name> -o yaml?

I cannot reproduce it in my environment.

@jimid27
Author

jimid27 commented Jun 1, 2022

istio-config.txt

GitHub doesn't allow attaching YAML files, but this is the output of that command. Note this is a private GKE cluster; I'm not sure if that has anything to do with this.

@zirain
Member

zirain commented Jun 2, 2022

@jimid27 I just saw this line: istioctl proxy-config log otel-collector-55cf757858-nnsvp --level=debug. Is the otel-collector pod injected?

@rahulanand16nov

rahulanand16nov commented Jun 2, 2022

We have a similar issue, as referenced above by @alexsnaps. What is happening is that the callout from Envoy replaces the :authority header with the cluster name, which is in the format outbound|port|subset|svc.FQDN. This format does not follow the standard for the :authority header, so the request is rejected.

The authority header is defined in https://datatracker.ietf.org/doc/html/rfc3986#section-3.2; under the registered-name section for host:
A registered name intended for lookup in the DNS uses the syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123]. Such a name consists of a sequence of domain labels separated by ".", each domain label starting and ending with an alphanumeric character and possibly also containing "-" characters.

From that text, having a pipe (|) in the authority header is invalid, so HTTP/2 implementations are correct to reject such requests.

I see two possible solutions:

  1. Istio changes the cluster name format to use a dot (.) instead of the pipe. @howardjohn, is it possible to make that change? It looks like it's hardcoded, so it should be possible:

     func BuildSubsetKey(direction TrafficDirection, subsetName string, hostname host.Name, port int) string {
         return string(direction) + "|" + strconv.Itoa(port) + "|" + subsetName + "|" + string(hostname)
     }

  2. Envoy uses a different authority header somehow.
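
The RFC 1123 label rule quoted above can be checked mechanically. A minimal sketch (the regular expression and helper are illustrative, not Istio or Envoy code) showing why the pipe-delimited cluster name from the debug log fails reg-name validation while dot-separated names pass:

```go
package main

import (
	"fmt"
	"regexp"
)

// regName approximates the RFC 3986 reg-name grammar for DNS host names:
// dot-separated labels of alphanumerics and hyphens, with each label
// starting and ending with an alphanumeric character.
var regName = regexp.MustCompile(
	`^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?)*$`)

// validAuthorityHost reports whether host is a syntactically valid
// registered name for the :authority header.
func validAuthorityHost(host string) bool {
	return regName.MatchString(host)
}

func main() {
	// The cluster name Envoy placed into :authority, per the debug log above.
	fmt.Println(validAuthorityHost("outbound|4317||otelcol-deployment-collector.opentelemetry.svc.cluster.local")) // false
	// A plain service FQDN is a valid registered name.
	fmt.Println(validAuthorityHost("otelcol-deployment-collector.opentelemetry.svc.cluster.local")) // true
	// A dot-separated cluster key with a non-empty subset would also pass.
	fmt.Println(validAuthorityHost("outbound.4317.v1.otelcol-deployment-collector.opentelemetry.svc.cluster.local")) // true
}
```

Note that option 1 would still need care for empty subset names: replacing every pipe with a dot in outbound|4317|| produces an empty label (".."), which the grammar also forbids.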

@jimid27
Author

jimid27 commented Jun 2, 2022

@jimid27 just got this line istioctl proxy-config log otel-collector-55cf757858-nnsvp --level=debug, the otel-collector pod is injected?

Yes, the otel-collector pod is injected.

@zirain
Member

zirain commented Jun 6, 2022

@jimid27 just got this line istioctl proxy-config log otel-collector-55cf757858-nnsvp --level=debug, the otel-collector pod is injected?

Yes, the otel-collector pod is injected.

IIRC, you should not have that pod injected. Do you have any special use case?

@howardjohn
Member

I think #39473 fixes it, but I don't have experience/time to test it. Would be great if someone could try it out/take over the PR.

@zirain
Member

zirain commented Jun 16, 2022

I think #39473 fixes it, but I don't have experience/time to test it. Would be great if someone could try it out/take over the PR.

I will take a look at that PR.

@howardjohn
Member

howardjohn commented Oct 11, 2022 via email
