Skip to content

Bug: Reconciler error during application pod rollout when using TargetGroupBinding with spec.targetGroupName #4438

@yukin01

Description

@yukin01

Bug Description

Using TargetGroupBinding with spec.targetGroupName works initially, but the controller fails to reconcile target changes during application pod rollouts (e.g., kubectl rollout restart deployment).

This results in a ValidationError: A target group ARN must be specified (likely from DescribeTargetHealth), suggesting the controller fails to resolve the ARN from targetGroupName during this specific reconciliation path.

Steps to Reproduce

  1. Create an AWS TargetGroup (e.g., my-existing-tg) manually via AWS Console or CLI.
  2. Apply a Service (e.g., my-service) and a Deployment (e.g., my-app) for the application.
  3. Apply a TargetGroupBinding (e.g., my-tgb) that references my-service and uses spec.targetGroupName: my-existing-tg.
  4. Observe that the initial binding succeeds. The controller correctly finds the application pods and registers them as targets in my-existing-tg. Traffic flows correctly.
  5. Force a rollout/restart of the application deployment (e.g., kubectl rollout restart deployment/my-app -n my-app).
  6. Observe the logs of the aws-load-balancer-controller. As the controller attempts to deregister old pods and register new pods, it fails.

Expected Behavior

When application pods are rolled out, the controller should correctly identify the TargetGroup (using spec.targetGroupName to find its ARN) and perform RegisterTargets/DeregisterTargets (or update health checks) using the resolved ARN.

Actual Behavior

During application pod rollouts, the controller reconciliation fails. The following ValidationError is logged consistently:

operation error Elastic Load Balancing v2: DescribeTargetHealth, https response error StatusCode: 400, RequestID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, api error ValidationError: A target group ARN must be specified
  • How often does this bug occur? Always (on application pod rollout).

Regression
N/A

Current Workarounds

Using spec.targetGroupARN instead of spec.targetGroupName

Environment

  • AWS Load Balancer controller version: v2.14.1
  • Kubernetes version: v1.31
  • Using EKS (yes/no), if so version?: Yes, v1.31, platform version: eks.21
  • Using Service or Ingress: Service (with TargetGroupBinding)
  • AWS region: ap-northeast-1
  • How was the aws-load-balancer-controller installed: Helm
    • If helm was used... helm get values
    clusterName: my-cluster
    serviceAccount:
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::<my-account-id>:role/aws-load-balancer-controller
      name: aws-load-balancer-controller
    

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions