Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffering middleware often suffers from high latency #10337

Open
2 tasks done
kk-kwok opened this issue Jan 8, 2024 · 4 comments
Open
2 tasks done

Buffering middleware often suffers from high latency #10337

kk-kwok opened this issue Jan 8, 2024 · 4 comments
Labels
area/middleware contributor/wanted Participation from an external contributor is highly requested kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed.

Comments

@kk-kwok
Copy link

kk-kwok commented Jan 8, 2024

Welcome!

  • Yes, I've searched similar issues on GitHub and didn't find any.
  • Yes, I've searched similar issues on the Traefik community forum and didn't find any.

What did you do?

Buffering middleware often suffers from high latency。

What did you see instead?

traefik-tracing

What version of Traefik are you using?

traefik v2.10.7

What is your environment & configuration?

Traefik

global:
  checkNewVersion: true
  sendAnonymousUsage: true

api:
  insecure: true

entryPoints:
  web:
    address: :80
    transport:
      respondingTimeouts:
        idleTimeout: 180
  websecure:
    address: :443
    transport:
      respondingTimeouts:
        idleTimeout: 180
  metrics:
    address: :8082

ping: {}

serversTransport:
  maxIdleConnsPerHost: 100
  forwardingTimeouts:
    dialTimeout: 30
    responseHeaderTimeout: 60

log:
  format: json
  level: DEBUG

accessLog:
  format: json
  fields:
    defaultMode: keep
    names:
      StartUTC: drop
    headers:
      defaultMode: keep

metrics:
  prometheus:
    buckets:
      - 0.1
      - 0.3
      - 1.2
      - 5.0
    addEntryPointsLabels: true
    addRoutersLabels: true
    addServicesLabels: true
    entryPoint: metrics

providers:
  kubernetescrd: 
    allowCrossNamespace: true

tracing:
  serviceName: traefik
  jaeger:
    gen128Bit: true
    propagation: b3
    traceContextHeaderName: x-b3-traceid
    samplingType: const
    samplingParam: 1
    samplingServerURL: http://jaeger-agent.jaeger.svc.cluster.local:5778/sampling
    localAgentHostPort: jaeger-agent.jaeger.svc.cluster.local:6831

Middleware

apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: max-request-body
spec:
  buffering:
    maxRequestBodyBytes: 100000000
    memRequestBodyBytes: 500000000
    maxResponseBodyBytes: 100000000
    memResponseBodyBytes: 500000000
    retryExpression: "IsNetworkError() && Attempts() < 2"
 
---
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: gzip-compress
spec:
  compress:
    excludedContentTypes:
      - text/event-stream

IngressRoute

---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: test-xxx-com-tls
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`test.xxx.com`)
      kind: Rule
      middlewares:
        - name: max-request-body
        - name: gzip-compress
      services:
        - name: istio-ingressgateway
          namespace: istio-system
          port: 80
          passHostHeader: true
  tls: {}

If applicable, please paste the log output in DEBUG level

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963836,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:7ac8c3f15ab3bccb:5dc756ce9d1d4057:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963837,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:5dc756ce9d1d4057:20dd49d4cd3738b0:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963838,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:20dd49d4cd3738b0:0b78326c9426de19:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963839,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:0b78326c9426de19:3333a267314a5f4b:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963840,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:3333a267314a5f4b:26688be652528831:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963841,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:26688be652528831:31a2584f96ccbcfd:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963842,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:31a2584f96ccbcfd:0aa58868d6811f7b:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963843,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","level":"debug","msg":"Reporting span 35eda3454220452da54b6e3f7275d478:0aa58868d6811f7b:0000000000000001:1","time":"2024-01-08T17:43:02+08:00","tracingProviderName":"jaeger"}

{"ClientAddr":"163.x.y.z:21356","ClientHost":"163.x.y.z","ClientPort":"21356","ClientUsername":"-","DownstreamContentSize":"4291","DownstreamStatus":"200","Duration":"2236687545","GzipRatio":"9.16942437660219","OriginContentSize":"39346","OriginDuration":"22616940","OriginStatus":"200","Overhead":"2214070605","RequestAddr":"test.xxx.com","RequestContentSize":"77","RequestCount":"758790","RequestHost":"test.xxx.com","RequestMethod":"POST","RequestPath":"/xxxx","RequestPort":"-","RequestProtocol":"HTTP/1.1","RequestScheme":"http","RetryAttempts":"0","RouterName":"traefik-test.xxx.com-e9172593321d688f7157@kubernetescrd","ServiceAddr":"10.40.23.232:80","ServiceName":"traefik-test.xxx.com-e9172593321d688f7157@kubernetescrd","ServiceURL":"{"Scheme":"http","Opaque":"","User":null,"Host":"10.40.23.232:80","Path":"","RawPath":"","OmitHost":false,"ForceQuery":false,"RawQuery":"","Fragment":"","RawFragment":""}","StartLocal":"2024-01-08T17:43:00.630559143+08:00","FILENAME":"/var/log/eks-log-agent/traefik-log/stdout-containerd/traefik-ingress-lb/4.log","HOSTNAME":"eks","INDEX_STATUS":"","PKGID":"298068BF657961CB-88EB6","PKG_LOGID":5963844,"RAWLOG":"","SOURCE":"10.40.27.237","TAG":{"cluster_id":"cls-xxxxxx","container_name":"traefik-ingress-lb","image_name":"hub.xxx.com/public/traefik:v2.10.7.1","namespace":"traefik","pod_ip":"10.40.27.237","pod_label_app":"traefik","pod_label_pod-template-hash":"689cdcd975","pod_label_security-istio-io/tlsMode":"istio","pod_label_service-istio-io/canonical-name":"traefik","pod_label_service-istio-io/canonical-revision":"latest","pod_label_tcm-cloud-tencent-com/managed-by":"mesh-xxxxxx","pod_name":"traefik-689cdcd975-tg2vh","pod_uid":"08c88340-2b61-411a-ad5b-7e24f01ed853"},"TIMESTAMP":"2024-01-08 17:43:03","downstream_Content-Encoding":"gzip","downstream_Content-Type":"application/json; charset=UTF-8","downstream_Date":"Mon, 08 Jan 2024 09:43:02 GMT","downstream_Server":"envoy","downstream_Vary":"Accept-Encoding,Origin","downstream_X-Envoy-Upstream-Service-Time":"21","entryPointName":"web","level":"info","msg":"","origin_Content-Encoding":"gzip","origin_Content-Type":"application/json; charset=UTF-8","origin_Date":"Mon, 08 Jan 2024 09:43:02 GMT","origin_Server":"envoy","origin_Vary":"Accept-Encoding,Origin","origin_X-Envoy-Upstream-Service-Time":"21","request_Accept":"/","request_Accept-Encoding":"br;q=1.0, gzip;q=0.9, deflate;q=0.8","request_Accept-Language":"zh-Hans-HK;q=1.0, en-HK;q=0.9, ja-HK;q=0.8, zh-Hant-HK;q=0.7, yue-Hant-HK;q=0.6","request_Ali-Cdn-Appview-Name":"cdn-tengine","request_Ali-Cdn-Real-Ip":"103.x.y.z","request_Ali-Cdn-Real-Port":"23653","request_Ali-Swift-Force-Ttl-Code":"400=0","request_Ali-Swift-Log-Host":"test.xxx.com","request_Ali-Swift-Origin-Host":"test.xxx.com","request_Ali-Swift-Stat-Host":"test.xxx.com","request_Ali-Swift-Ukeepalive-Timeout":"300","request_Ali-Swift-Urequest-Timeout":"300","request_Ali-Tproxy-Urequest-Timeout":"300","request_Content-Encoding":"gzip","request_Content-Length":"77","request_Content-Type":"application/json","request_Eagleeye-Traceid":"a3b5239617047069806487060e","request_User-Agent":"ios;5.5.20;iPhone12,1;17.1.2;10283437","request_Via":"hk14.l1, l2hk3.l2","request_X-B3-Sampled":"1","request_X-B3-Spanid":"0000000000000001","request_X-B3-Traceid":"35eda3454220452da54b6e3f7275d478","request_X-Client-Scheme":"https","request_X-Forwarded-Host":"test.xxx.com","request_X-Forwarded-Port":"80","request_X-Forwarded-Proto":"http","request_X-Forwarded-Server":"traefik-689cdcd975-tg2vh","request_X-Real-Ip":"163.x.y.z","time":"2024-01-08T17:43:02+08:00"}

@kk-kwok
Copy link
Author

kk-kwok commented Jan 8, 2024

This problem has appeared for a long time. This problem has been discovered since version v2.9.*. This problem also occurs after upgrading to the latest version v2.10.7.

@kevinpollet kevinpollet self-assigned this Jan 8, 2024
@kevinpollet
Copy link
Member

Hello @kk-kwok and thanks for reaching out,

This problem has appeared for a long time. This problem has been discovered since version v2.9.*. This problem also occurs after upgrading to the latest version v2.10.7.

Does it mean that the problem is not here in older versions?

Could you provide a minimal reproducible case (for instance, full Kubernetes manifest to reproduce the issue)?

In the meantime, if any community member can help us find verified steps to reproduce and fix the issue, we would love the help.

@kevinpollet kevinpollet added kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed. area/middleware contributor/wanted Participation from an external contributor is highly requested and removed status/0-needs-triage labels Jan 10, 2024
@kk-kwok
Copy link
Author

kk-kwok commented Jan 10, 2024

Hello @kevinpollet .
Thank you very much for your help.

I mean I was using traefik v2.9 when I first discovered this problem. Then I updated to the latest version v2.10.7 and still have the same problem.

This problem was discovered in our production environment k8s cluster. I just tried to deploy a simple nginx deployment in a new namespace, and then used the wrk command to conduct a simple stress test. This problem did not occur.

@kk-kwok
Copy link
Author

kk-kwok commented Jan 10, 2024

When using high concurrency continuous stress testing, there is an overall high delay.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.24
        ports:
        - containerPort: 80

---
apiVersion: v1
kind: Service

metadata:
  annotations:
    service.cloud.tencent.com/direct-access: "true"
    service.kubernetes.io/topology-aware-hints: auto
  name: nginx

spec:
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  selector:
    app: nginx
  type: ClusterIP
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: test-xxx-com
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`test.xxx.com`)
      kind: Rule
      middlewares:
        - name: max-request-body
        - name: gzip-compress
      services:
        - name: nginx
          namespace: default
          port: 80
          passHostHeader: true
wrk -c 1000 -t 600 -d 60s --timeout=10 -H 'Host: test.xxx.com' http://{ip}

traefik

@kevinpollet kevinpollet removed their assignment Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/middleware contributor/wanted Participation from an external contributor is highly requested kind/bug/possible a possible bug that needs analysis before it is confirmed or fixed.
Projects
None yet
Development

No branches or pull requests

3 participants