Reduce memory consumption for Kubernetes source configuration #1425

Merged: 3 commits into kubeshop:main from the k8s-src-merge branch, Apr 3, 2024

@pkosiec (Member) commented on Apr 2, 2024

Description

Changes proposed in this pull request:

  • Reduce memory consumption for Kubernetes source configuration

This change groups Kubernetes source configurations by kubeconfig and creates a single dynamicInformerFactory per group instead of one per source. This brings the implementation close to that of the 0.18 Kubernetes source.
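
The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SourceConfig`, `informerFactory`, `kubeConfigKey`, and `groupByKubeConfig` are hypothetical names, the factory is a plain struct standing in for the real dynamic informer factory, and hashing the kubeconfig bytes to build the group key is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// SourceConfig is a hypothetical, simplified stand-in for one configured
// Kubernetes source.
type SourceConfig struct {
	Name       string
	KubeConfig []byte
}

// informerFactory is a placeholder for a shared dynamic informer factory.
// It only records which sources would be served by the same factory.
type informerFactory struct {
	Sources []string
}

// kubeConfigKey derives a group key from the kubeconfig content
// (assumption: identical kubeconfig bytes mean the same group).
func kubeConfigKey(kubeConfig []byte) string {
	sum := sha256.Sum256(kubeConfig)
	return hex.EncodeToString(sum[:])
}

// groupByKubeConfig creates one factory per distinct kubeconfig and
// attaches every source that uses that kubeconfig to it.
func groupByKubeConfig(configs []SourceConfig) map[string]*informerFactory {
	groups := make(map[string]*informerFactory)
	for _, cfg := range configs {
		key := kubeConfigKey(cfg.KubeConfig)
		factory, ok := groups[key]
		if !ok {
			factory = &informerFactory{}
			groups[key] = factory
		}
		factory.Sources = append(factory.Sources, cfg.Name)
	}
	return groups
}

func main() {
	shared := []byte("kubeconfig-a") // placeholder content
	configs := []SourceConfig{
		{Name: "k8s-all-events2", KubeConfig: shared},
		{Name: "k8s-all-events3", KubeConfig: shared},
		{Name: "other-source", KubeConfig: []byte("kubeconfig-b")},
	}

	groups := groupByKubeConfig(configs)
	fmt.Printf("%d sources share %d factories\n", len(configs), len(groups))
}
```

With this shape, adding more sources that reuse the same kubeconfig adds entries to an existing group rather than creating another factory, which is the effect the description attributes to the change.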

Stats

Memory consumption shortly after startup:

Description                                        | v1.9.1   | This PR
No sources enabled                                 | 21.4 MiB | 23.1 MiB
1 source (k8s-all-events2 from the PR instruction) | 78.9 MiB | 86.6 MiB
12 sources (from the PR instruction)               | 504 MB   | 92.9 MiB
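
To put the figures above in perspective, the average memory added by each source beyond the first can be worked out as below. This is only a back-of-the-envelope sketch: `perExtraSource` is a hypothetical helper, and it assumes the 504 MB figure is directly comparable with the MiB values.

```go
package main

import "fmt"

// perExtraSource returns the average memory added by each source beyond the
// first one, given the usage with one source and with n sources.
func perExtraSource(oneSource, nSources float64, n int) float64 {
	return (nSources - oneSource) / float64(n-1)
}

func main() {
	// Figures copied from the stats above. The 504 value is reported in MB;
	// this sketch assumes it can be compared with the MiB values as-is.
	const sources = 12
	before := perExtraSource(78.9, 504, sources)
	after := perExtraSource(86.6, 92.9, sources)

	fmt.Printf("v1.9.1:  ~%.1f per extra source\n", before)
	fmt.Printf("this PR: ~%.1f per extra source\n", after)
	fmt.Printf("12-source total is ~%.1fx lower\n", 504/92.9)
}
```

Under that assumption, the per-source cost drops from tens of megabytes to well under one, which is consistent with the sources sharing one factory rather than each holding its own caches.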

Screenshots

v1.9.1

12 sources:

256Mi request, 384MB limit

Too low, source plugin restarts itself.

[3 screenshots, taken 2024-04-02 at 19:11]

1Gi request, 2Gi limit

The plugin runs without restarting, but multiple DeltaFIFO logs still appear.

[3 screenshots]

1 source (k8s-all-events2 from the testing instruction)

[screenshot]

No sources enabled

[screenshot]

After changes in this PR:

12 Sources (from the testing instruction)

Requests 128Mi, limit 256Mi:

[2 screenshots, taken 2024-04-03 at 10:43]

No DeltaFIFO logs, just a warning about client-side throttling:

{"level":"error","logger":"stderr","msg":"I0402 17:30:32.285424      13 request.go:697] Waited for 1.198321417s due to client-side throttling, not priority and fairness, request: GET:https://10.43.0.1:443/apis/networking.k8s.io/v1/ingresses?limit=500\u0026resourceVersion=0","plugin":"botkube/kubernetes","time":"2024-04-02T17:30:32Z"}

And Slack API rate limits:

{"level":"error","msg":"while sending bot message: 1 error occurred:\n\t* while sending Slack message to channel \"priv-channel\": while posting Slack message: slack rate limit exceeded, retry after 1s","time":"2024-04-03T07:51:21Z"}
{"level":"error","msg":"while sending bot message: 1 error occurred:\n\t* while sending Slack message to channel \"priv-channel\": while posting Slack message: slack rate limit exceeded, retry after 1s","time":"2024-04-03T07:51:21Z"}

1 source (k8s-all-events2 from the testing instruction)

[screenshot, taken 2024-04-03 at 10:37]

No sources enabled

[screenshot]

Testing

Check out this PR:

gh pr checkout 1425

Create a k3d cluster:

k3d cluster create

Build and serve plugins:

make build-plugins
PLUGIN_SERVER_HOST=http://host.k3d.internal go run ./hack/target/serve-plugins/main.go

Install kube-prometheus-stack:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prom prometheus-community/kube-prometheus-stack

Forward Grafana port:

kubectl port-forward svc/prom-grafana 3003:80

Navigate to http://localhost:3003 and log in with admin / prom-operator credentials.

Install ArgoCD + Flux on the cluster:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
kubectl apply -f https://github.com/fluxcd/flux2/releases/latest/download/install.yaml

Install previous Botkube

Install the previous Botkube version with the following configuration (fill in the Slack tokens):

cat <<EOF > /tmp/prev-values.yaml
settings:
  clusterName: "prev"

sources:
  'k8s-all-events2':
    displayName: "All events"
    botkube/kubernetes: &k8s-all-events
      context:
        rbac:
          group:
            type: Static
            prefix: ""
            static:
              values: [ "botkube-plugins-default" ]
      enabled: true
      config:
        namespaces:
          include:
            - ".*"
        event:
          types:
            - error
            - create
            - update
            - delete
        resources:
          - type: apiextensions.k8s.io/v1/customresourcedefinitions
          - type: argoproj.io/v1alpha1/applications
          - type: argoproj.io/v1alpha1/appprojects
          - type: notification.toolkit.fluxcd.io/v1beta1/alerts
          - type: source.toolkit.fluxcd.io/v1beta1/buckets
          - type: source.toolkit.fluxcd.io/v1beta1/gitrepositories
          - type: source.toolkit.fluxcd.io/v1beta1/helmcharts
          - type: helm.toolkit.fluxcd.io/v2beta1/helmreleases
          - type: source.toolkit.fluxcd.io/v1beta1/helmrepositories
          - type: image.toolkit.fluxcd.io/v1beta1/imagepolicies
          - type: image.toolkit.fluxcd.io/v1beta1/imagerepositories
          - type: image.toolkit.fluxcd.io/v1beta1/imageupdateautomations
          - type: kustomize.toolkit.fluxcd.io/v1beta1/kustomizations
          - type: source.toolkit.fluxcd.io/v1beta2/ocirepositories
          - type: notification.toolkit.fluxcd.io/v1beta1/providers
          - type: notification.toolkit.fluxcd.io/v1beta1/receivers
          - type: v1/services
          - type: networking.k8s.io/v1/ingresses
          - type: v1/nodes
            event:
              message:
                exclude:
                  - ".*nf_conntrack_buckets.*" # Ignore node related noisy messages from GKE clusters
          - type: v1/namespaces
          - type: v1/persistentvolumes
          - type: v1/persistentvolumeclaims
          - type: v1/configmaps
          - type: rbac.authorization.k8s.io/v1/roles
          - type: rbac.authorization.k8s.io/v1/rolebindings
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
          - type: rbac.authorization.k8s.io/v1/clusterroles
          - type: apps/v1/daemonsets
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.numberReady
          - type: batch/v1/jobs
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.conditions[*].type
          - type: apps/v1/deployments
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.availableReplicas
          - type: apps/v1/statefulsets
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.readyReplicas   
  'k8s-all-events3':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events4':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events      
  'k8s-all-events5':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events6':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events7':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events8':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events9':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events10':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events11':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events12':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events    
communications:
  'default-group':
    socketSlack:
      enabled: true
      appToken: "xapp-..."
      botToken: "xoxb-..."
      channels:
        'default':
          name: 'priv-channel'
          bindings:
            sources:
              - k8s-err-events
              - k8s-recommendation-events
              - k8s-err-with-logs-events
              - k8s-create-events
              - k8s-all-events
              - k8s-all-events2
              - k8s-all-events3
              - k8s-all-events4
              - k8s-all-events5
              - k8s-all-events6
              - k8s-all-events7
              - k8s-all-events8
              - k8s-all-events9
              - k8s-all-events10
              - k8s-all-events11
              - k8s-all-events12              
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 200m
    memory: 512Mi
EOF

botkube install --version v1.9.1 -f /tmp/prev-values.yaml

In the Grafana dashboard, navigate to the 1. Kubernetes / Compute Resources / Pod dashboard and select the Botkube pod. Observe the memory consumption.

Install current Botkube

Install the current Botkube version with the following configuration (fill in the Slack tokens):

cat <<EOF > /tmp/new-values.yaml 
image:
  repository: kubeshop/pr/botkube
  tag: 1425-PR

settings:
  clusterName: "new"

plugins:
  repositories:
    botkube:
      url: http://host.k3d.internal:3010/botkube.yaml

sources:
  'k8s-all-events2':
    displayName: "All events"
    botkube/kubernetes: &k8s-all-events
      context:
        rbac:
          group:
            type: Static
            prefix: ""
            static:
              values: [ "botkube-plugins-default" ]
      enabled: true
      config:
        namespaces:
          include:
            - ".*"
        event:
          types:
            - error
            - create
            - update
            - delete
        resources:
          - type: apiextensions.k8s.io/v1/customresourcedefinitions
          - type: argoproj.io/v1alpha1/applications
          - type: argoproj.io/v1alpha1/appprojects
          - type: notification.toolkit.fluxcd.io/v1beta1/alerts
          - type: source.toolkit.fluxcd.io/v1beta1/buckets
          - type: source.toolkit.fluxcd.io/v1beta1/gitrepositories
          - type: source.toolkit.fluxcd.io/v1beta1/helmcharts
          - type: helm.toolkit.fluxcd.io/v2beta1/helmreleases
          - type: source.toolkit.fluxcd.io/v1beta1/helmrepositories
          - type: image.toolkit.fluxcd.io/v1beta1/imagepolicies
          - type: image.toolkit.fluxcd.io/v1beta1/imagerepositories
          - type: image.toolkit.fluxcd.io/v1beta1/imageupdateautomations
          - type: kustomize.toolkit.fluxcd.io/v1beta1/kustomizations
          - type: source.toolkit.fluxcd.io/v1beta2/ocirepositories
          - type: notification.toolkit.fluxcd.io/v1beta1/providers
          - type: notification.toolkit.fluxcd.io/v1beta1/receivers
          - type: v1/services
          - type: networking.k8s.io/v1/ingresses
          - type: v1/nodes
            event:
              message:
                exclude:
                  - ".*nf_conntrack_buckets.*" # Ignore node related noisy messages from GKE clusters
          - type: v1/namespaces
          - type: v1/persistentvolumes
          - type: v1/persistentvolumeclaims
          - type: v1/configmaps
          - type: rbac.authorization.k8s.io/v1/roles
          - type: rbac.authorization.k8s.io/v1/rolebindings
          - type: rbac.authorization.k8s.io/v1/clusterrolebindings
          - type: rbac.authorization.k8s.io/v1/clusterroles
          - type: apps/v1/daemonsets
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.numberReady
          - type: batch/v1/jobs
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.conditions[*].type
          - type: apps/v1/deployments
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.availableReplicas
          - type: apps/v1/statefulsets
            event: # Overrides 'source'.kubernetes.event
              types:
                - create
                - update
                - delete
                - error
            updateSetting:
              includeDiff: true
              fields:
                - spec.template.spec.containers[*].image
                - status.readyReplicas   
  'k8s-all-events3':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events4':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events      
  'k8s-all-events5':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events6':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events7':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events8':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events9':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events10':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events11':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events
  'k8s-all-events12':
    displayName: "All events"
    botkube/kubernetes: *k8s-all-events    
communications:
  'default-group':
    socketSlack:
      enabled: true
      appToken: "xapp-..."
      botToken: "xoxb-..."
      channels:
        'default':
          name: 'priv-channel'
          bindings:
            sources:
              - k8s-err-events
              - k8s-recommendation-events
              - k8s-err-with-logs-events
              - k8s-create-events
              - k8s-all-events
              - k8s-all-events2
              - k8s-all-events3
              - k8s-all-events4
              - k8s-all-events5
              - k8s-all-events6
              - k8s-all-events7
              - k8s-all-events8
              - k8s-all-events9
              - k8s-all-events10
              - k8s-all-events11
              - k8s-all-events12                
resources:
  limits:
    cpu: 200m
    memory: 256Mi
  requests:
    cpu: 100m
    memory: 128Mi
EOF

botkube install --version v1.9.1 -f /tmp/new-values.yaml

In the Grafana dashboard, navigate to the 1. Kubernetes / Compute Resources / Pod dashboard and select the Botkube pod. Observe the memory consumption.

Related issue(s)

Resolves #1403

@pkosiec added the enhancement and wip labels on Apr 2, 2024
@pkosiec requested a review from mszostok on Apr 3, 2024, 09:06
@pkosiec marked this pull request as ready for review on Apr 3, 2024, 09:07
@pkosiec requested review from PrasadG193 and a team as code owners on Apr 3, 2024, 09:07
@pkosiec removed the wip label on Apr 3, 2024

@mszostok (Contributor) left a comment

Just code comments. I will approve it once I have run it on my end 👍


Review threads:
internal/source/dispatcher.go (1, resolved)
internal/source/scheduler.go (1, resolved, outdated)
internal/source/kubernetes/configuration_store.go (3, resolved, outdated)
internal/source/kubernetes/source.go (5, resolved; 4 outdated)

@mszostok (Contributor) left a comment

tested e2e 👍 LGTM
[screenshot, taken 2024-04-03 at 14:52]

@pkosiec merged commit 2cb5eac into kubeshop:main on Apr 3, 2024
15 of 16 checks passed
@pkosiec deleted the k8s-src-merge branch on Apr 3, 2024, 13:56
Labels: enhancement
Linked issue: Increased resources consumption configuring some sources
2 participants