
Nginx Ingress Pods Consuming Too Much Memory #8166

Closed

ParagPatil96 opened this issue Jan 19, 2022 · 85 comments
Labels
area/stabilization: Work for increasing stabilization of the ingress-nginx codebase
kind/support: Categorizes issue or PR as a support question.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@ParagPatil96

ParagPatil96 commented Jan 19, 2022

NGINX Ingress controller version

NGINX Ingress controller
Release: v1.1.1
Build: a17181e
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.9


Kubernetes version

version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.14-gke.1900", GitCommit:"abc4e63ae76afef74b341d2dba1892471781604f", GitTreeState:"clean", BuildDate:"2021-09-07T09:21:04Z", GoVersion:"go1.15.15b5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: GCP

  • OS : Container-Optimized OS from Google

  • Kernel : 5.4.129+

  • How was the ingress-nginx-controller installed:

    • We used Helm to install the NGINX Ingress Controller; the following are the values we provided:
    ingress-nginx:
    controller:
      priorityClassName: "high-priority"
      replicaCount: 3
      image:
        registry: gcr.io
        image: fivetran-webhooks/nginx-ingress-controller
        tag: "v1.1.1"
        digest: ""
        pullPolicy: Always
      extraArgs:
        shutdown-grace-period: 60
      labels:
        app.kubernetes.io/part-of: wps
        wps-cloud-provider: gcp
        wps-location: <us/eu/uk>
      podLabels:
        app.kubernetes.io/part-of: wps
        wps-cloud-provider: gcp
        wps-location: <us/eu/uk>
      podAnnotations:
        fivetran.com/scrape-prometheus: "true"
        prometheus.io/port: "10254"
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
        fivetran.com/fivetran-app: "true"
      minReadySeconds: 30
      updateStrategy:
        rollingUpdate:
          maxSurge: 1
          maxUnavailable: 0
        type: RollingUpdate
      resources:
        requests:
          cpu: 2
          memory: 2Gi
      autoscaling:
        enabled: true
        minReplicas: 3
        maxReplicas: 9
        targetCPUUtilizationPercentage: 75
        targetMemoryUtilizationPercentage: 75
      service:
        enabled: true
        loadBalancerIP: null
      admissionWebhooks:
        enabled: false
      config:
        # logging config
        log-format-upstream: '{"logtype":"request_entry","status": $status, "request_id": "$req_id", "host": "$host", "request_proto": "$server_protocol", "path": "$uri", "request_query": "$args", "request_length": $request_length,  "request_time": $request_time, "method": "$request_method", "time_local": "$time_local", "remote_addr": "$remote_addr", "remote_user": "$remote_user", "http_referer": "$http_referer", "http_user_agent": "$http_user_agent", "body_bytes_sent": "$body_bytes_sent", "bytes_sent": "$bytes_sent", "upstream_addr": "$upstream_addr", "upstream_response_length": "$upstream_response_length", "upstream_response_time": "$upstream_response_time", "upstream_status": "$upstream_status"}'
        log-format-escape-json: 'true'
    
        # request contents config
        proxy-body-size: 9m
        client-body-buffer-size: 9m
    
        # request timeout config
        client-body-timeout: '300'
        client-header-timeout: '300'
        proxy-read-timeout: '300'
        proxy-send-timeout: '300'
    
        # upstream pod retry
        proxy-next-upstream: 'error timeout http_500 http_502 http_503 http_504 non_idempotent'
        proxy-next-upstream-timeout: '60'
        proxy-next-upstream-tries: '0'
        ssl-redirect: "false"
    
        # https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#load-balance
        load-balance: ewma
        worker-processes: '3' #As we are using 3 cpu for ingress controller pods
    
        # Recovery Server
        location-snippet: |
          proxy_intercept_errors on;
          error_page 500 501 502 503 = @fivetran_recovery;
        server-snippet: |
          location @fivetran_recovery {
          proxy_pass http://{Recovery Collector's ClusterIP service's IP address};
          }
    
  • Current State of the controller:

    • kubectl describe ingressclasses
    Name:         nginx
    Labels:       app.kubernetes.io/component=controller
                  app.kubernetes.io/instance=wps-us
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=ingress-nginx
                  app.kubernetes.io/part-of=wps
                  app.kubernetes.io/version=1.1.1
                  helm.sh/chart=ingress-nginx-4.0.15
                  wps-cloud-provider=gcp
                  wps-location=us
    Annotations:  meta.helm.sh/release-name: wps-us
                  meta.helm.sh/release-namespace: ingress-nginx
    Controller:   k8s.io/ingress-nginx
    Events:       <none>
    
  • Current state of ingress object, if applicable:

    • kubectl -n <appnamespace> describe ing <ingressname>
      Name:             collector-ingress
      Labels:           app.kubernetes.io/managed-by=Helm
      Namespace:        collector
      Address:          xxxxx
      Default backend:  default-http-backend:80 (10.18.48.196:8080)
      TLS:
        webhooks-ssl-certificate terminates 
        events-ssl-certificate terminates 
      Rules:
        Host                   Path  Backends
        ----                   ----  --------
        webhooks.fivetran.com  
                               /webhooks/|/internal/|/snowplow/|/health$|/api_docs|/swagger*   collector-service:80 (10.18.48.39:8001,10.18.52.54:8001,10.18.54.56:8001)
        events.fivetran.com    
                               /*   collector-service:80 (10.18.48.39:8001,10.18.52.54:8001,10.18.54.56:8001)
      Annotations:             kubernetes.io/ingress.class: nginx
                               meta.helm.sh/release-name: wps-us
                               meta.helm.sh/release-namespace: ingress-nginx
                               nginx.ingress.kubernetes.io/use-regex: true
      Events:                  <none>
      

What happened:
Our Ingress controller is serving ~1500 RPS
Over time the ingress controller's memory usage increases continuously and never goes down; when it crosses the node limit (~15GB), the pods get evicted.

What you expected to happen:
We expect memory usage to stabilize at some point.

profiling heap export:
high_mem.txt

For now, we are manually restarting the worker processes in order to release the memory.
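
For readers who want to reproduce this kind of data gathering, here is a rough sketch of how the per-process memory and a heap profile can be collected, plus a restart workaround (not necessarily the exact commands used here; the pod name is a placeholder, the pprof port depends on the controller version and its --profiling flag, and the in-pod ps call assumes the image's ps supports -o):

    POD=ingress-nginx-controller-xxxxx   # hypothetical pod name
    NS=ingress-nginx
    # per-process resident memory inside the controller pod
    kubectl -n "$NS" exec "$POD" -- ps -o pid,rss,comm
    # Go heap profile from the controller's pprof endpoint (bound to localhost in the pod)
    kubectl -n "$NS" port-forward "$POD" 10245:10245 &
    curl -s http://localhost:10245/debug/pprof/heap > high_mem.pprof
    # last-resort workaround: a rolling restart to release the memory
    kubectl -n "$NS" rollout restart deployment ingress-nginx-controller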

@ParagPatil96 ParagPatil96 added the kind/bug Categorizes issue or PR as related to a bug. label Jan 19, 2022
@k8s-ci-robot
Contributor

@ParagPatil96: The label(s) triage/support cannot be applied, because the repository doesn't have them.

In response to this:

/triage support

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Jan 19, 2022
@longwuyuan
Contributor

/remove-kind bug
/kind support

What other info is available?
You can get some info if you use the documented prometheus+grafana config.
Does it happen on controller version 0.50.X?
What do the logs contain relevant to this?
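
For reference, the documented Prometheus scraping mentioned above can be enabled through the chart; a minimal sketch, with release/namespace names as placeholders (check the value names against your chart version):

    helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
      --namespace ingress-nginx \
      --reuse-values \
      --set controller.metrics.enabled=true \
      --set controller.podAnnotations."prometheus\.io/scrape"="true" \
      --set controller.podAnnotations."prometheus\.io/port"="10254"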

@k8s-ci-robot k8s-ci-robot added kind/support Categorizes issue or PR as a support question. and removed kind/bug Categorizes issue or PR as related to a bug. labels Jan 19, 2022
@rmathagiarun

NGINX Ingress controller
Release:       v1.1.1
Build:         a17181e
Repository:    https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.9
 
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"9e5f344f6cdbf2eaa7e450d5acd8fd0b7f669bf9", GitTreeState:"clean", BuildDate:"2021-05-19T04:34:27Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
 
We are also facing this memory issue. Initially, when we set the memory limit to 2GB, the ingress controller was continuously restarting due to OOM. I have attached the dmesg log for reference.
 
We then increased the memory limit to 6GB, and from the attached Grafana metrics we can see that the pod constantly consumes close to 4GB of memory.
 
We were earlier using the following version, where we noticed the memory consumption stabilised at less than 2GB.
Release:       v0.48.1
Build:         git-1de9a24b2
Repository:    git@github.com:kubernetes/ingress-nginx.git
 nginx version: nginx/1.20.1
 
 
It looks like this version of the ingress controller consumes much more memory than earlier versions.
ingress-controller_pod1_dmesg.log
ingress-controller_pod1_grafana
ingress-controller_pod1.log
ingress-controller_pod2_dmesg.log
ingress-controller_pod2_grafana
ingress-controller_pod2.log

@rmathagiarun

rmathagiarun commented Feb 9, 2022

@longwuyuan I posted the details here as I found the issue similar to ours. Let me know if you would want me to open a new one with all the details.

@longwuyuan
Contributor

Can you upgrade to the latest release and kindly answer the questions I asked earlier? Thank you.

@ramanNarasimhan77

ramanNarasimhan77 commented Feb 18, 2022

@longwuyuan I am a co-worker of @rmathagiarun . The details shared by @rmathagiarun were from the latest released controller version v1.1.1, and we see this issue happening frequently.

@longwuyuan
Contributor

Some questions I asked earlier have not been answered. Basically, some digging is required: determine specifically which process was using the memory, and check the node's resources at that point in time.

@rmathagiarun

@longwuyuan

I can see that you suggested testing with controller version 0.50.x. We have been using v0.48.1 for a long time and have never faced this issue. We had to upgrade to v1.1.1 because v0.48.1 (and even the suggested v0.50.x) is not compatible with Kubernetes 1.22 and 1.23.

The core components of our product have remained the same on both versions (v0.48.1 and v1.1.1), and we are facing this memory issue only with v1.1.1.

Unfortunately, the logs don't have much info on this memory leak. We were able to find the OOM kills only by using the dmesg command inside the pod. I have already shared the ingress logs and Grafana screenshots.

The nodes had sufficient resources all along, and as you can see from the Grafana screenshots, the pod constantly consumes close to 4GB; the usage is not just a spike during certain operations.

@rmathagiarun

@longwuyuan Any update on this? I have answered all the queries you posted; let me know if you need any additional info.

What other info is available?
Every time we add/remove an ingress, the ingress controller reloads (due to configuration changes). During this reload, memory utilisation shoots up.
I0218 13:29:45.249555 8 controller.go:155] "Configuration changes detected, backend reload required"
I0218 13:30:03.025708 8 controller.go:172] "Backend successfully reloaded"
Even after the pod restarts it keeps crashing, as the pod continues to hold the memory. Only a deployment rollout or deleting the pod releases the memory.

You can get some info if you use the documented prometheus+grafana config.
Screenshots already shared.

Does it happen on controller version 0.50.X?
As stated in my previous comment, we have been using v0.48.x for a long time, and we want to upgrade to v1.1.1 to make ourselves compatible with Kubernetes 1.22 and 1.23.

What do the logs contain relevant to this?
I have already shared the logs, but they don't have much info on this memory leak. We were able to find the OOM kills only by using the dmesg command inside the pod.
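
One way to gather the kind of per-process data being asked for: watch pod- and process-level memory while an ingress change triggers a reload. A sketch, with namespace/pod names as placeholders and assuming the image's ps supports -o:

    NS=ingress-nginx
    POD=ingress-controller-xxxxx   # hypothetical pod name
    # container-level memory, sampled every 30s while ingresses are added/removed
    watch -n 30 "kubectl -n $NS top pod --containers"
    # process-level memory inside the pod: controller vs nginx master vs workers
    kubectl -n "$NS" exec "$POD" -- ps -o pid,rss,comm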

@rmathagiarun

rmathagiarun commented Mar 7, 2022

@longwuyuan Any update on this? We have been hitting this issue quite frequently.

@longwuyuan
Contributor

Hi, it's unfortunate that you have a problem. I have made some attempts to drive this to a resolution but failed, so the next thing I can suggest is some information gathering.

This will require some background, or you will have to get someone with a history of performance-related troubleshooting. Basically there are some preparation steps to capture related information on the processes, and then some steps for the actual running processes. It is all rather elaborate, or "unixy" so to speak.

Please look at the history of issues worked on here. We did discover some issues in Lua and nginx, and some significant changes were made to the base image and components like Lua. Please find those issues and check out the procedures described there. They included attaching debuggers to specific processes and regular strace/ptrace of the controller process.

Also, I have no clue how many replicas you are running and/or whether you are using a DaemonSet. Maybe that info is in this issue, but it's not obvious.

In summary: gather info such as strace output and debugger traces, and then relate that with statistics from monitoring. Once you can provide precise step-by-step instructions here, someone can try to reproduce the problem.

@rmathagiarun

Hi @longwuyuan,

The issue can be reproduced with the following steps:

  1. Install ingress-nginx using Helm chart helm-chart-4.0.17. We updated the ingressClass to APP-nginx-ingress-controller before installing.

  2. Execute the following to create multiple ingresses and observe the memory spike in the Prometheus or Grafana GUI each time the backend config is reloaded.

for i in {1..20}; do kubectl -n APP run pod$i --image=nginx:latest; kubectl -n APP expose pod pod$i --port=80 --name=service$i; kubectl create -n APP ingress secure-ingress-$i --class=APP-nginx-ingress-controller --rule="APP.com/service$i=service$i:80"; done

In our case, the memory spiked from around 0.75GB to 2GB. We also noticed the nginx process continued to retain the 2GB of memory without releasing it after successful reloads. Prometheus screenshot attached for reference:
spike_during_reloads

  3. Execute the following commands to restart the ingress deployment and free up the memory:
	kubectl -n APP scale deploy ingress-controller --replicas=0
	kubectl -n APP scale deploy ingress-controller --replicas=1

In our case, the memory was back at 0.75GB. However, nginx required multiple restarts as it wasn't able to load the backend configs. Sample logs and Prometheus screenshot attached for reference:
after_restart
nginx_ingress_after_restart.log

We tried simulating the same scenario with the latest release, helm-chart-4.0.18, and the result was the same.

Notably, the spike was observed only during backend reloads, without any load being sent to the endpoints.
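
For completeness, the test objects from the loop above can be removed the same way (each deletion also triggers a reload, so the memory curve on the way back down can be compared with the spikes seen during creation):

    for i in {1..20}; do
      kubectl -n APP delete ingress secure-ingress-$i
      kubectl -n APP delete service service$i
      kubectl -n APP delete pod pod$i
    done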

@longwuyuan
Contributor

Hi @rmathagiarun ,

A spike in memory usage is expected with the procedure you have described. I don't see a memory leak or a bug in that procedure.

Would you like to make the reproduction steps more elaborate and detailed? Something closer to real use would help the developers.

@rmathagiarun

@longwuyuan

Agreed that a spike in memory usage is expected due to multiple backend reloads. Our concern is why the pod does not release the memory even after a successful reload; it continues to hold the memory until it is restarted.

This is a real-world scenario, as multiple ingresses can be created on a cluster over a period of time. Meanwhile, the pod will keep holding the memory it consumed during each reload, ultimately crashing due to OOM.

@longwuyuan
Contributor

  • I am not capable of discussions that amount to diving down a rabbit hole, so I mostly discuss data and prefer to avoid theories, as the ingress-controller code is clearly capable of some things and incapable of others

  • Based on the reproduction procedure you described, the observed behaviour is expected

  • Why the memory is not released cannot be answered, because there is no data on where the memory is used

  • Your expectation is understood.

  • I would gladly be proved wrong, but I would say there is not a single piece of software out there that will allocate and also release memory at your expected rate while processing an infinite for loop without sleep, at the speed of recent multicore Intel/AMD CPUs, on Linux, malloc'ing the exact same amount of memory for exactly the same functionality as the ingress-nginx controller

  • That means someone needs to compare other ingress controllers with the exact same use case and see what happens

  • I will be happy to contribute or comment if your reproduction procedure can be made more real-world

  • We fixed a performance issue some months ago and I don't see those patterns here. Also, we are not getting this report from users with different use cases. Hence I think we need more precise and detailed reproduction steps

@venkatmarella

@rmathagiarun @longwuyuan we see the same behavior after upgrading the ingress helm chart to 4.0.17.

@longwuyuan
Contributor

I am not certain how to proceed. We need a step-by-step procedure to reproduce the problem and then gather information as to where the memory is getting used.

@bmv126

bmv126 commented Apr 14, 2022

@rmathagiarun
Can you try to disable the metrics by setting
--enable-metrics=false

You can use kubectl top to check pod memory usage
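
If the controller is deployed with the chart, the flag above can be passed through extraArgs; a sketch with placeholder release/namespace names (double-check the value path against your chart version, since newer charts also expose a dedicated metrics toggle):

    helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
      --namespace ingress-nginx \
      --reuse-values \
      --set controller.extraArgs.enable-metrics=false
    # then compare memory over time with metrics disabled
    kubectl -n ingress-nginx top pod --containers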

@rmathagiarun

@bmv126 We already tried this and it didn't help. I can see a few others have also reported this issue, but the response from the community never seems to agree that nginx has a problem during backend reloads.

In the meantime, we are in the process of migrating to Envoy-based ingress controllers.

@longwuyuan
Contributor

@rmathagiarun A backend reload is an event that occurs on vanilla nginx just as it does on the nginx component of the ingress-nginx controller, and the size of the data that needs to be read and reloaded has a direct impact on CPU/memory usage, which ultimately impacts the user experience.

If you add configuration to vanilla nginx, without Kubernetes, in an infinite while-true loop, with each iteration adding a new virtual host with custom directives, I am sure you will experience the same issue (a rough sketch of such a loop is included below).

Hence, a deep dive into a reproduction procedure becomes critically important to make progress here.

The project faced performance issues earlier, and those were addressed by arriving at a clear reproduction procedure. We even got core dumps created during those reproduction efforts.
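
Purely as an illustration of that vanilla-nginx analogy (not a supported test; the paths, port and server names are made up, and it assumes a local nginx that includes /etc/nginx/conf.d/*.conf):

    i=0
    while true; do
      i=$((i+1))
      # add one more virtual host to the config on every iteration
      printf 'server {\n  listen 8080;\n  server_name host-%s.example.com;\n  location / { return 200 "ok"; }\n}\n' "$i" > /etc/nginx/conf.d/vhost-$i.conf
      nginx -s reload               # each reload re-reads the whole, ever-growing config
      ps -C nginx -o rss=,comm=     # watch worker RSS while old workers drain
    done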

@ramanNarasimhan77

ramanNarasimhan77 commented Apr 25, 2022

@longwuyuan would it be possible to add the steps for getting core dumps and attaching debuggers to obtain traces to the Troubleshooting Guide, so that people facing such problems can give the community adequate information to debug and analyze?

@longwuyuan
Contributor

longwuyuan commented Apr 25, 2022

@ramanNarasimhan77 I think that requires some deep-dive work. It would help someone who doesn't know how to get info out of a core file, but I'm not sure one doc here would apply to all scenarios.

This is not a trivial process, though. Someone taking up this task will likely already know how to work with core files and very likely have some dev skills in C/C++ etc.

On a different note, the reproduction procedure described earlier generated ingress objects in a while loop with no sleep, which seems too far from a real use case. If we can come up with a practical, real-use-case test, it could help make progress. One rather obvious observation: if there is a bug, then many people report experiences related to it, and personally I don't see several reports of this high memory usage. That is why I keep commenting on coming up with a practical, real-use test.

I did a search on the issue list and found at least one issue that describes how others worked with core files, for example #7080. There are other issues as well.
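
For anyone who needs a starting point with core files, a rough sketch of pulling one out of a controller pod and getting a backtrace locally (everything here is an assumption: the core location depends on the node's kernel.core_pattern, the binary path on the image version, and useful output depends on debug symbols being available):

    NS=ingress-nginx
    POD=ingress-nginx-controller-xxxxx   # hypothetical pod name
    # list candidate core files (location depends on the node's core_pattern)
    kubectl -n "$NS" exec "$POD" -- sh -c 'ls -lh /tmp/core* 2>/dev/null'
    # copy a core and the nginx binary out of the pod, then inspect with gdb
    kubectl -n "$NS" cp "$POD":/tmp/core.1234 ./core.1234
    kubectl -n "$NS" cp "$POD":/usr/local/nginx/sbin/nginx ./nginx
    gdb ./nginx ./core.1234 -ex 'bt' -ex 'quit'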

@moraesjeremias

Hey guys,
Thank you for all the information provided in this issue!
Currently we're facing the same high memory consumption issue reported by @rmathagiarun, @ParagPatil96 and @pdefreitas (in #8362). There's a clear memory leak pattern in two different Kubernetes clusters, both running on top of:

NGINX Ingress controller
Release: v1.1.1
Build: a17181e
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.9
Helm Chart version: v4.0.17

Cluster 1 Version:

Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.16-eks-25803e", GitCommit:"25803e8d008d5fa99b8a35a77d99b705722c0c8c", GitTreeState:"clean", BuildDate:"2022-02-16T23:37:16Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}

Cluster 2 Version:

Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.9", GitCommit:"9dd794e454ac32d97cde41ae10be801ae98f75df", GitTreeState:"clean", BuildDate:"2021-04-05T13:26:12Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

Below is the average memory consumption over a 4-day period. The peaks in both graphs represent NGINX config reloads, while the growing baseline indicates a memory leak.

Cluster 1 Average Memory Consumption

Cluster 1 Mem Consumption

Cluster 2 Average Memory Consumption

Cluster 2 Mem Consumption

Just wanted to highlight this while we gather more info, as mentioned by @longwuyuan. If anyone has found a reasonable explanation for this, could you please update us?

Thank you for all your support!

@odinsy

odinsy commented Nov 24, 2022

@procinger in my case the memory leak happens when RPS is over 20k (which is not a heavy workload for a node with 16 vCPU/32GB RAM).
I've found two things:

  • the /nginx-ingress-controller process leaks
  • the reason is that metrics can't be exposed in time: the next request from Prometheus arrives before the previous one has been processed, and the count of established connections to prometheus-nginx.socket keeps growing (checked with ss -m / netstat -anp).
    I've tried to reduce the established timeout, but it had no effect.

I found a similar issue, but it was for an old version of the controller:
#7438

I tried the last three versions of the controller (1.5.1, 1.4.0, 1.3.1) and had this problem on each.
And I don't use ModSecurity or other security features.
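
A quick way to check for the same symptom (connections piling up on the metrics socket) from outside the pod; the pod name is a placeholder, the socket name is taken from the description above and may differ in your setup, and it assumes ss exists in the image:

    POD=ingress-nginx-controller-xxxxx   # hypothetical pod name
    # count unix-socket connections to the internal prometheus socket
    kubectl -n ingress-nginx exec "$POD" -- sh -c 'ss -x | grep -c prometheus'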

@bo-tu

bo-tu commented Dec 15, 2022

The information posted in this issue is not extremely productive or helpful, in the context of the problem description. For example, one way to produce this problem is in a for loop. If you send json payload for whatever objective in a for loop, without even a second of sleep, then CPU/Memory usage to handle that kind of train of events is expected.

The developers are already aware of one scenario where the volume of change is unusually high, like thousands of ingress objects. This leads to performance issues during an event of reloading the nginx.conf .

So in this issue, it helps to track that some users have a performance problem but the progress is going to be really slow because the information precision is lacking. The description of the problem to be solved and some sort of a reasonable reproduce procedure is a much needed aspect here.

Hello @longwuyuan ,

We're experiencing the same memory issue. I saw the same oom-kill message when running dmesg in the nginx-ingress-controller pod.

There are 1453 ingress objects in our Kubernetes clusters (v1.22), and we are using ingress-nginx (chart v4.1.0, app version v1.2.0).

You mentioned "The developers are already aware of one scenario where the volume of change is unusually high, like thousands of ingress objects. This leads to performance issues during an event of reloading the nginx.conf". Just wondering, is there an existing GitHub issue to track it? Is there any workaround for this problem before a fix is available? Thanks a lot.

Best regards,
Bo

@longwuyuan
Contributor

There are issues. I have to search for the numbers and find out if they are open or closed.

On one side, there is hope that upstream nginx is working on the reload problem.
In this project, there is work in progress to split the control-plane and data-plane. After that is done, there will be an opportunity to approach the reload problem differently.

@bo-tu

bo-tu commented Dec 16, 2022

There are issues. I have to search for the numbers and find out if they are open or closed.

Maybe this one - #7234 (comment)?
Because the current strategy is to render the whole configuration file, so if there are many ingress resources, the configuration will be very large and frequent changes can cause excessive consumption.

@max-rocket-internet

We are also seeing some very high memory usage with large spikes when the controllers reload.

We have these Ingress resources in our cluster:

  • 11x with 1 host and 81 paths
  • 11x with 1 host and 33 paths
  • About 100 others that are quite simple, with 1 host and 1-4 paths

There is almost no traffic as this is not a live environment, less than 30 RPS.

The controller pods consume about 2-3GB of memory normally, which seems high as it is, but when an Ingress resource is updated and a controller reload is triggered, memory usage will go well over 10GB and then we have OOMKills.

We are using the bitnami Helm chart with docker image docker.io/bitnami/nginx-ingress-controller:1.6.0-debian-11-r11.

Are these memory usage figures normal? Or are we seeing the issue described here?

@longwuyuan
Contributor

  • This project does not test the Bitnami image or the Bitnami chart in CI, so it's not practical to discuss that here. Please re-open the issue if you find a problem or a bug with this project's manifests or images
  • It would be interesting to know if you have multiple ingress objects with the same host field value but different path values. That would trigger a diff limited to the change you make
  • Reloading large configs is a known problem related to upstream nginx. There is work in progress to split the control-plane and data-plane to see if such issues can be addressed

/remove-kind bug
/close

@k8s-ci-robot
Contributor

@longwuyuan: Those labels are not set on the issue: kind/bug


@k8s-ci-robot
Contributor

@longwuyuan: Closing this issue.


@max-rocket-internet

@longwuyuan

This project does not test the bitnami image or the bitnami chart in the CI so its not practical to discuss that here

100% understandable 💙 I will try to test it with main chart and image.

Reloading large configs is a upstream nginx related known problem

Can you give me a link for this known issue?

It would be interesting to know if you had multiple ingress objects with the same host field value but different path values

There are a few like this, about 10.

@longwuyuan
Contributor

/re-open

I will have to search for the issues. But I want to avoid hard quoting; the details differ, which makes it difficult to have practical conversations. The gist, though, is that if you have a vanilla nginx server, without Kubernetes, and a relatively large config file (many megabytes), then reloading that config file, with multiple factors involved (containerization, platforming of some sort, live traffic, established connections, CPU/memory availability for bursts, etc.), you would observe similar delays in reload. And then if you factor in the state maintenance of a platform like Kubernetes, you add another oddity.

At least one issue we dealt with in the past was related to a bug outside Kubernetes. That was worked on by the developer of the related project; I can't recall whether the buggy component was Lua, nginx or something else. A search of the issues will take effort, but it will yield that info.

Hence all such performance issues call for two kinds of diligence. One is to triage until the root cause, whether a bug, a resource crunch or something else, is clear down to the smallest possible detail. On the other hand, some work is being done on splitting the control-plane and data-plane, which will bring design changes to the workflow and hence improve things for sure, at least from the instrumentation angle.

@longwuyuan
Contributor

/reopen

I was talking about the 33 and 88 ingress resources where you have multiple paths in one ingress. That makes for a large data structure to be re-read even if only one path changes, and that event will be different from one small ingress with one small path being re-read.

@k8s-ci-robot
Contributor

@longwuyuan: Reopened this issue.


@k8s-ci-robot k8s-ci-robot reopened this Jan 24, 2023
@max-rocket-internet

@longwuyuan

Thank you very much for taking the time to reply, it's very much appreciated, I know supporting such a popular project must be a lot of work 🙂

But I want to avoid hard quoting

What is "hard quoting"?

and a config file that was relatively large like huge Megs

Ours is 64MB. Is that huge?

you would observe similar delays in reload

Delays are fine. There is free CPU for the controllers, many of them, but each controller pod is not consuming more than 1 CPU core during reload.

And then if you factor in the state maintenance of a platform like Kubernetes, you add another oddity.

Sure. We would just like some pointers on how we can debug it further ourselves.

On the other hand some work is being done on splitting control-plane and data-plane

Do you mean in #8034 ?

I was talking about 33 ingress resources and 88 ingresses resources where you have multiple paths in one ingress. That would make for a large data-structure to be re-read even if one path was changed

OK well perhaps some basic debugging we can do is just delete half of them and see if behaviour changes.

@odinsy

odinsy commented Jan 25, 2023

I've disabled metrics, but still have memory leaks at RPS > 20k.
We have only 4 Ingress resources.
image
our config
image

@Volatus
Contributor

Volatus commented Jan 26, 2023

@odinsy @max-rocket-internet @procinger @bo-tu @moraesjeremias @ramanNarasimhan77 @rmathagiarun @bmv126 @venkatmarella

This issue should largely be resolved in the next release of Ingress NGINX, as #9330 pinned the bundled version of ModSecurity to a commit that contains the memory leak fix but has not been released yet.

The release has been delayed due to a breaking change, but it should be rolling out before long. Hope that helps.

@odinsy

odinsy commented Feb 19, 2023

Switched to v1.6.4, looks better
image

@longwuyuan
Contributor

@odinsy thanks for the update. It seems like an important one.

@Volatus
Contributor

Volatus commented Feb 20, 2023

v1.6.4 should largely fix this issue. Closing.

/close

@k8s-ci-robot
Contributor

@Volatus: Closing this issue.


@mKeRix

mKeRix commented Feb 28, 2023

Just to add some more details for those finding this thread in the future: in our case (see what @max-rocket-internet posted above), the memory issues weren't from ModSecurity, as we don't have that enabled. In our metrics we observed a correlation between the memory spikes and the number of worker processes. Since nginx reloads leave old processes hanging around in graceful shutdown for a little while, this is to be expected. The usage spikes got especially bad when multiple reloads were triggered in short succession (e.g. by deploying multiple ingress changes at short intervals).

We ended up changing two settings that made memory use much less spiky, and they have been running smoothly for about a month now. They are (a sketch of applying both via the Helm chart follows the list):

  1. We explicitly configured config.worker-processes as 8, as some of our nodes would spawn an (for our purposes) overkill amount of processes due to their high CPU core counts.
  2. We tuned extraArgs.sync-rate-limit to leave a larger gap between syncs, to give the old processes some time to bleed out. We are currently running 0.008, which should allow 1 reload per 125 seconds. The meaning of this value is explained here. For us, the delay in applying new paths/configs if they are fired off in short succession was much more preferable than having crashing controller pods.
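
A sketch of how those two settings could be applied with the upstream chart (release and namespace names are placeholders; verify the value names against your chart version):

    helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
      --namespace ingress-nginx \
      --reuse-values \
      --set controller.config.worker-processes="8" \
      --set controller.extraArgs.sync-rate-limit="0.008"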

@JosefWN

JosefWN commented May 26, 2023

We explicitly configured config.worker-processes as 8, as some of our nodes would spawn an (for our purposes) overkill amount of processes due to their high CPU core counts.

worker_processes 96;
kubectl top po -n ingress-nginx
NAME                             CPU(cores)   MEMORY(bytes)   
ingress-nginx-controller-k7nts   11m          1438Mi 

In the same boat, tried 8 workers just like you:

worker_processes 8;
kubectl top po -n ingress-nginx
NAME                             CPU(cores)   MEMORY(bytes)   
ingress-nginx-controller-k7nts   3m           167Mi 

Oh lala! 😏

@fiskhest

fiskhest commented Jun 16, 2023

(quoting @mKeRix's comment above about pinning worker-processes and tuning sync-rate-limit)

Our environment changed such that the physical nodes were rolled with more CPU cores, and since no explicit worker_processes configuration was defined we defaulted to the auto value, which assigns workers based on CPU cores. All of a sudden ingress-nginx was consuming more than double the memory for the same load. This explanation helped us understand and resolve the issue, thanks! 🙏

@cep21
Contributor

cep21 commented Dec 13, 2023

For people that find this in the future: We tried updating sync-rate-limit to reduce syncs and reduce memory usage, but ran into issues where frequently restarting pod deployments would result in 5xx errors during the pod restarts until the sync finishes. It's not just on ingress changes (which aren't frequent for us), but pod restarts as well. We think we narrowed it down to these locations:

The dynamic Lua is maybe also not updating the backends frequently enough with this setting?

@maximumG

maximumG commented Feb 6, 2024

For people that find this in the future: We tried updating sync-rate-limit to reduce syncs and reduce memory usage, but ran into issues where frequently restarting pod deployments would result in 5xx errors during the pod restarts until the sync finishes.

ingress-nginx watches Kubernetes Endpoints to get the real Pod IPs to use as upstream servers, instead of using the Service ClusterIP, for a number of reasons.

If you are using only basic ingress-nginx features, I think you can avoid those 5XX errors by telling ingress-nginx to use the Service ClusterIP as the upstream instead of watching every Endpoints object: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#service-upstream
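
For reference, that behaviour is enabled per Ingress with the service-upstream annotation documented at the link above; a minimal sketch with placeholder namespace/ingress names:

    kubectl -n my-namespace annotate ingress my-ingress \
      nginx.ingress.kubernetes.io/service-upstream="true" --overwrite

Note the trade-off the docs describe: with service-upstream, traffic goes to the Service's ClusterIP, so kube-proxy does the load balancing instead of the controller balancing across individual endpoints.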
