feat(snd/p2p): SEI_NLB_TARGET_TYPE env var for per-cluster target-type selection #372
…e selection

Per-pod P2P endpoint Services previously hardcoded aws-load-balancer-nlb-target-type=ip. That's correct on VPC-CNI clusters (prod, dev) where pod IPs are VPC-routable, but breaks on harbor, where Cilium cluster-pool uses 100.64.0.0/14: AWS has no route for that CIDR, so an NLB with target-type=ip can never reach the pod.

Adds SEI_NLB_TARGET_TYPE (values: "ip" or "instance") on the controller. Read once in cmd/main.go: defaults to "ip" when unset, validated at startup (fails loudly on typo), then passed to the reconciler. The reconciler stamps the value verbatim into the Service annotation and also gates AllocateLoadBalancerNodePorts on it: ip sets the field to false so the NodePort range stays untouched (no NodePort hop); instance leaves the field nil so kube allocates one (the NLB→node→pod hop needs it).

Harbor's overlay will set value: instance in a separate platform PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
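The startup handling described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `resolveNLBTargetType` and the exact error wording are assumptions, while the constant names follow the ones listed in the PR description.

```go
package main

import (
	"fmt"
	"os"
)

// Constant names follow the PR description; the values come from the
// commit message ("ip" or "instance", defaulting to "ip").
const (
	NLBTargetTypeIP       = "ip"
	NLBTargetTypeInstance = "instance"
	DefaultNLBTargetType  = NLBTargetTypeIP
)

// resolveNLBTargetType is a hypothetical helper: it fills in the default
// for an unset value and rejects anything other than the two known modes.
func resolveNLBTargetType(raw string) (string, error) {
	switch raw {
	case "":
		return DefaultNLBTargetType, nil
	case NLBTargetTypeIP, NLBTargetTypeInstance:
		return raw, nil
	default:
		// Error text is illustrative, not copied from the controller.
		return "", fmt.Errorf("invalid SEI_NLB_TARGET_TYPE %q: want %q or %q",
			raw, NLBTargetTypeIP, NLBTargetTypeInstance)
	}
}

func main() {
	// Read once at startup; a typo stops the process before any reconcile runs.
	targetType, err := resolveNLBTargetType(os.Getenv("SEI_NLB_TARGET_TYPE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("NLB target type:", targetType)
}
```

Resolving the value in one place means the reconciler never has to handle an empty or unknown string itself, which is the invariant the commit message relies on.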
PR Summary: Medium Risk
Reviewed by Cursor Bugbot for commit dfc998a.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
  desiredNames := make(map[string]struct{}, group.Spec.Replicas)
  for i := range int(group.Spec.Replicas) {
- 	desired := generateP2PEndpointService(group, i, r.p2pEndpointHostname(group, i))
+ 	desired := generateP2PEndpointService(group, i, r.p2pEndpointHostname(group, i), r.NLBTargetType)
Envtest reconciler missing NLBTargetType causes test failure
High Severity
The envtest reconciler in suite_test.go (line 202) does not initialize NLBTargetType, so the field defaults to "". When reconcileP2PEndpoints passes r.NLBTargetType to generateP2PEndpointService, the annotation aws-load-balancer-nlb-target-type becomes an empty string, and the AllocateLoadBalancerNodePorts logic skips the == NLBTargetTypeIP branch (producing nil instead of *false). The envtest at p2p_endpoint_test.go:98-99 asserts that the annotation equals "ip"; this assertion will fail, breaking CI.
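The failure mode reported here can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the repository's `generateP2PEndpointService`: `serviceSketch` and `generateServiceSketch` are hypothetical names that model only the annotation and the `AllocateLoadBalancerNodePorts` pointer, using the gating rule described in the PR.

```go
package main

import "fmt"

const (
	NLBTargetTypeIP       = "ip"
	NLBTargetTypeInstance = "instance"
	// Full annotation key as quoted in the PR description.
	targetTypeAnnotation = "service.beta.kubernetes.io/aws-load-balancer-nlb-target-type"
)

// serviceSketch is a hypothetical stand-in for the two Service fields
// the reconciler sets; it is not the real corev1.Service type.
type serviceSketch struct {
	Annotations                   map[string]string
	AllocateLoadBalancerNodePorts *bool
}

// generateServiceSketch stamps the target type verbatim and disables
// NodePort allocation only for the "ip" mode, as the PR describes.
func generateServiceSketch(targetType string) serviceSketch {
	svc := serviceSketch{
		Annotations: map[string]string{targetTypeAnnotation: targetType},
	}
	if targetType == NLBTargetTypeIP {
		disabled := false
		svc.AllocateLoadBalancerNodePorts = &disabled
	}
	return svc
}

// describe renders the optional bool so nil and false are distinguishable.
func describe(p *bool) string {
	if p == nil {
		return "nil"
	}
	return fmt.Sprintf("%t", *p)
}

func main() {
	// The third case is the uninitialized field from the envtest reconciler.
	for _, tt := range []string{NLBTargetTypeIP, NLBTargetTypeInstance, ""} {
		svc := generateServiceSketch(tt)
		fmt.Printf("targetType=%q annotation=%q allocateNodePorts=%s\n",
			tt, svc.Annotations[targetTypeAnnotation], describe(svc.AllocateLoadBalancerNodePorts))
	}
}
```

With an empty target type the annotation is empty and the pointer stays nil, so a test expecting the `ip` shape sees neither the `"ip"` annotation nor `*false`.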
PR #372 moved SEI_NLB_TARGET_TYPE default-resolution to cmd/main.go so the reconciler treats Spec.NLBTargetType as canonical. The envtest suite_test.go is a parallel "main" that constructs its own reconciler and never reads the env var; it was left with NLBTargetType="" after the rebase, which stamped an empty target-type annotation on the generated Service and broke TestP2PEndpointP2P_CreateWithTCP_ChildHasAddressAndServiceExists.

Apply the same default in the envtest suite so it mirrors cmd/main.go's construction-site invariant.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Summary
The per-pod P2P endpoint Service (PR #365) previously hardcoded `service.beta.kubernetes.io/aws-load-balancer-nlb-target-type=ip`. That's the right call on VPC-CNI clusters (prod, dev) where pod IPs are VPC-routable, but it's a hard blocker on harbor, where Cilium cluster-pool puts pods in `100.64.0.0/14` (CGNAT, unreachable from any NLB in a VPC subnet).

This PR adds a controller-level env var (`SEI_NLB_TARGET_TYPE`) so each overlay picks the right mode without baking cluster-specific logic into the controller.

Behaviour
- `SEI_NLB_TARGET_TYPE=ip` (default): pod IPs are registered directly with the target group; `AllocateLoadBalancerNodePorts: false` so the limited 30000-32767 NodePort range is preserved.
- `SEI_NLB_TARGET_TYPE=instance`: the NLB targets the EC2 node at its NodePort; kube-proxy / Cilium socket-LB does the NodePort→pod rewrite locally. `AllocateLoadBalancerNodePorts` is left nil so kube allocates one.
- Any other value: the controller reports `Invalid SEI_NLB_TARGET_TYPE` and exits 1 at startup.

Per-cluster value lives in each overlay's `manager-patch.yaml`, mirroring how `SEI_P2P_ENDPOINT_DOMAIN`, `SEI_GATEWAY_DOMAIN`, etc. are wired today.

Validation + default placement
Validation and the default fill-in happen in `cmd/main.go`. The reconciler's `NLBTargetType` field is canonical: by the time any reconcile runs, the field is always either `ip` or `instance`.

Files
- `cmd/main.go`: env read + validation + default fill-in.
- `internal/controller/nodedeployment/controller.go`: new `NLBTargetType` field on `SeiNodeDeploymentReconciler`.
- `internal/controller/nodedeployment/p2p_endpoint.go`: exported constants `NLBTargetTypeIP`, `NLBTargetTypeInstance`, `DefaultNLBTargetType`. `generateP2PEndpointService` takes target-type as a parameter; `AllocateLoadBalancerNodePorts` is gated on it.
- `internal/controller/nodedeployment/p2p_endpoint_test.go`: existing `ip` test asserts NodePort allocation disabled; new `TestGeneratePublishableService_InstanceTargetType` asserts the `instance` shape (annotation + nil-NodePort-allocation).

Downstream
- A separate platform PR adds `SEI_NLB_TARGET_TYPE: instance` to `clusters/harbor/sei-k8s-controller/manager-patch.yaml`. After that lands and Flux applies, harbor's publishable-P2P Services will emit `target-type: instance` and the per-pod NLB → node:NodePort → Cilium-socket-LB → pod path becomes viable.
- Clusters that leave the variable unset keep `ip` (current behaviour).

One-way doors
- The env var name `SEI_NLB_TARGET_TYPE`. Once any overlay sets it, renaming is painful. Convention-matched to existing `SEI_*` env vars.

Test plan
- … `instance`, NodePort allocated.
- … `ip` → existing behaviour).

🤖 Generated with Claude Code