Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

馃悰 Updates for the controller manager to watch everything #2000

Merged
merged 1 commit into from
Apr 1, 2024

Conversation

waltforme
Copy link
Contributor

Summary

This PR increases the memory limit of the controller manager, and adjusts the verbosity levels for the logging messages from the event handlers.

Changes in this PR are built into quay.io/junatibm/controller-manager:1711835357. I tested the image, with the increased memory limit, in the same environment as #1990. The controller ran for minutes without OOMKilled:

馃憠 Downloads $ oc cluster-info
Kubernetes control plane is running at https://c115-e.us-south.containers.cloud.ibm.com:30829

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
馃憠 Downloads $ oc -n wds2-system get deploy,po
NAME                                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kubestellar-controller-manager   1/1     1            1           8d
deployment.apps/transport-controller             1/1     1            1           8d

NAME                                                 READY   STATUS    RESTARTS   AGE
pod/kubestellar-controller-manager-56ccb5d76-4l5fz   2/2     Running   0          6m40s
pod/transport-controller-656fcff866-26mgt            1/1     Running   0          8d
馃憠 Downloads $ oc -n wds2-system get po kubestellar-controller-manager-56ccb5d76-4l5fz -oyaml | yq .spec.containers[1]
args:
  - --health-probe-bind-address=:8081
  - --metrics-bind-address=127.0.0.1:8080
  - --leader-elect
  - --wds-name=wds2
  - -v=2
image: quay.io/junatibm/controller-manager:1711835357
imagePullPolicy: IfNotPresent
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /healthz
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 15
  periodSeconds: 20
  successThreshold: 1
  timeoutSeconds: 1
name: manager
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /readyz
    port: 8081
    scheme: HTTP
  initialDelaySeconds: 5
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 1
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 10m
    memory: 64Mi
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
  runAsUser: 1001110000
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: kube-api-access-5zch7
    readOnly: true
馃憠 Downloads $ 

Also trying to hit #2000 by this PR.

Related issue(s)

Fixes #1990

@kcp-ci-bot kcp-ci-bot added dco-signoff: yes Indicates the PR's author has signed the DCO. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 1, 2024
@@ -455,7 +455,7 @@ func (c *Controller) handleObject(obj any, resource string, eventType string) {
obj = typed.Obj
wasDeletedFinalStateUnknown = true
}
c.logger.Info("Got object event", "eventType", eventType,
c.logger.V(3).Info("Got object event", "eventType", eventType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make these V levels even higher. At least 4, maybe 5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to 4.

Copy link
Contributor

@MikeSpreitzer MikeSpreitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 1, 2024
@kcp-ci-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 1b7ea9e682c97e45e4370ee75a65e683b24ea4fb

@ezrasilvera
Copy link
Contributor

/approve

@kcp-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ezrasilvera

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 1, 2024
@kcp-ci-bot kcp-ci-bot merged commit 77f7d2a into kubestellar:main Apr 1, 2024
13 checks passed
@waltforme waltforme deleted the watchall branch April 1, 2024 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has signed the DCO. lgtm Indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

bug: Memory usage and logging issue of the controller manager when 'watch all'
4 participants