AWS LBC performance improvement #4240

Open
@oliviassss

Description

Describe the feature you are requesting

Performance improvements for LBC in large-scale clusters.
Motivation

Running LBC in a large-scale cluster and stress testing its performance.
Describe the proposed solution you'd like

  1. When I tested LBC provisioning 50 LBs with 1k targets each, I observed memory usage spike to as much as 7 GiB. We need to profile memory further and optimize where possible. I suspect the pod cache is responsible.
  2. LBC lists all pods during startup, and the initial list call cannot be paginated due to a known limitation on the Kubernetes API server side. A workaround is to restrict the watched namespace via the --watch-namespace flag, so that the initial list call only lists pods within the specified namespace. However, this flag supports only one namespace. Since upstream controller-runtime supports watching multiple namespaces, we should extend the flag to support multiple namespaces as well.
  3. The service reconciler watches and caches all types of Service (ClusterIP, NodePort, and LoadBalancer). This is unnecessary in a large-scale cluster that has thousands of Service objects of types other than LoadBalancer. We need to investigate changing the service reconciler to watch only the LoadBalancer type.
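
To make items 2 and 3 more concrete, here is a minimal sketch in plain Go with no Kubernetes dependencies. It assumes a comma-separated value for --watch-namespace, which is not the flag's current behavior. `parseWatchNamespaces`, `shouldCache`, and the trimmed-down `Service` struct are hypothetical names used for illustration only and do not exist in LBC. Wiring the result into controller-runtime's cache is left out on purpose.

```go
package main

import (
	"fmt"
	"strings"
)

// ServiceType mirrors the string values of a Service's spec.type field.
type ServiceType string

const (
	ServiceTypeClusterIP    ServiceType = "ClusterIP"
	ServiceTypeNodePort     ServiceType = "NodePort"
	ServiceTypeLoadBalancer ServiceType = "LoadBalancer"
)

// Service is a hypothetical, trimmed-down stand-in for corev1.Service.
type Service struct {
	Namespace string
	Name      string
	Type      ServiceType
}

// parseWatchNamespaces splits a comma-separated flag value into an
// order-preserving, de-duplicated list of namespaces. Blank entries are
// dropped. An empty result is treated as "watch all namespaces".
// Assumption: the flag accepts a comma-separated list.
func parseWatchNamespaces(raw string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, part := range strings.Split(raw, ",") {
		ns := strings.TrimSpace(part)
		if ns == "" {
			continue
		}
		if _, dup := seen[ns]; dup {
			continue
		}
		seen[ns] = struct{}{}
		out = append(out, ns)
	}
	return out
}

// shouldCache reports whether a Service is worth watching and caching:
// it must be of type LoadBalancer and, when namespaces is non-empty,
// live in one of the watched namespaces.
func shouldCache(svc Service, namespaces []string) bool {
	if svc.Type != ServiceTypeLoadBalancer {
		return false
	}
	if len(namespaces) == 0 {
		return true
	}
	for _, ns := range namespaces {
		if ns == svc.Namespace {
			return true
		}
	}
	return false
}

func main() {
	namespaces := parseWatchNamespaces("team-a, team-b,,team-a")
	services := []Service{
		{Namespace: "team-a", Name: "web", Type: ServiceTypeLoadBalancer},
		{Namespace: "team-a", Name: "db", Type: ServiceTypeClusterIP},
		{Namespace: "other", Name: "api", Type: ServiceTypeLoadBalancer},
	}
	for _, svc := range services {
		fmt.Printf("%s/%s cached=%t\n", svc.Namespace, svc.Name, shouldCache(svc, namespaces))
	}
}
```

With the sample input, only `team-a/web` passes both checks. In the controller itself, the same decision would ideally be made before objects reach the cache, so that filtered-out Services never consume memory.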
Describe alternatives you've considered

Contribution Intention (Optional)

- [ ] Yes, I am willing to contribute a PR to implement this feature
- [ ] No, I cannot work on a PR at this time

Monitoring dashboard, for reference:
[Image: monitoring dashboard screenshot]
