Nlb service cf #3791

Closed
ntner wants to merge 3 commits into master from nlb-service-cf

ntner commented Apr 18, 2026

What is the feature/update/fix?

Feature: Network Load Balancer Support for V2 Racks

We've added Network Load Balancer (NLB) support to V2 racks. Until now, every app and service on a V2 rack terminated traffic at the shared Application Load Balancer, which is HTTP/HTTPS only. With this release you can opt a rack into a Layer 4 path — public, internal, or both — and expose selected services through per-port NLB listeners declared directly in convox.yml. TLS termination, per-NLB security groups with a CIDR allowlist, cross-zone load balancing, preserve-client-IP, and deletion protection are all supported as rack-wide defaults, with per-port overrides available on individual services.

The feature ships in two layers: rack-level infrastructure you opt into with convox rack params set, and per-service listeners you declare in each app's convox.yml.

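Rack parameters (the Yes/No parameters default to No, NLBAllowCIDR defaults to 0.0.0.0/0, and NLBInternalAllowCIDR defaults to the VPC CIDR; existing racks are unaffected until opted in):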

| Parameter | Default | Purpose |
| --- | --- | --- |
| NLB | No | Provisions a public NLB with per-AZ Elastic IPs |
| NLBInternal | No | Provisions an internal NLB (requires Internal=Yes) |
| NLBAllowCIDR | 0.0.0.0/0 | CIDR allowlist for the public NLB SG (up to 5, comma-separated) |
| NLBInternalAllowCIDR | (VPC CIDR) | CIDR allowlist for the internal NLB SG (up to 5, comma-separated) |
| NLBCrossZone | No | Enables cross-zone load balancing on the public NLB |
| NLBInternalCrossZone | No | Enables cross-zone load balancing on the internal NLB |
| NLBPreserveClientIP | No | Forwards real client IP to public NLB target tasks |
| NLBInternalPreserveClientIP | No | Forwards real client IP to internal NLB target tasks |
| NLBDeletionProtection | No | Prevents accidental deletion of the public NLB |
| NLBInternalDeletionProtection | No | Prevents accidental deletion of the internal NLB |

Per-service fields on services[*].nlb[*] in convox.yml:

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| port | int (1-65535) | yes | The port exposed on the NLB |
| protocol | tcp or tls | no (default tcp) | TLS terminates at the NLB; target group stays TCP |
| containerPort | int (1-65535) | no (default = port) | The container port the NLB forwards to |
| scheme | public or internal | no (default public) | Selects which rack NLB the listener attaches to |
| certificate | ACM or IAM ARN | required when tls | Pre-flight-checked at release promote; must be in the rack's region |
| cross_zone | bool | no (inherits rack) | Per-listener override of NLBCrossZone / NLBInternalCrossZone |
| allow_cidr | list of CIDRs | no (inherits rack) | Per-listener ingress, layered on top of the rack allowlist |
| preserve_client_ip | bool | no (inherits rack) | Per-listener override of NLBPreserveClientIP / internal equivalent |
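
To make the defaults in the table above concrete, here is a minimal Python sketch of how one nlb: entry could be normalized and checked. It is an illustration only, not the rack's actual implementation: the NlbPort class and normalize_nlb_port function are hypothetical names, and only the port, protocol, containerPort, scheme, and certificate fields are covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NlbPort:
    """Hypothetical in-memory form of one services[*].nlb[*] entry."""
    port: int
    protocol: str = "tcp"
    container_port: Optional[int] = None
    scheme: str = "public"
    certificate: Optional[str] = None


def normalize_nlb_port(raw: dict) -> NlbPort:
    """Apply the documented defaults, then run the documented basic checks."""
    entry = NlbPort(
        port=raw["port"],
        protocol=raw.get("protocol", "tcp"),
        # containerPort defaults to the listener port
        container_port=raw.get("containerPort", raw["port"]),
        scheme=raw.get("scheme", "public"),
        certificate=raw.get("certificate"),
    )
    for name, value in (("port", entry.port), ("containerPort", entry.container_port)):
        if not isinstance(value, int) or not 1 <= value <= 65535:
            raise ValueError(f"{name} must be an integer in 1-65535, got {value!r}")
    if entry.protocol not in ("tcp", "tls"):
        raise ValueError(f"protocol must be tcp or tls, got {entry.protocol!r}")
    if entry.scheme not in ("public", "internal"):
        raise ValueError(f"scheme must be public or internal, got {entry.scheme!r}")
    if entry.protocol == "tls" and not entry.certificate:
        raise ValueError("certificate is required when protocol is tls")
    return entry
```

For example, normalize_nlb_port({"port": 50051, "scheme": "internal"}) yields a TCP listener whose container port is also 50051, while a tls entry without a certificate raises ValueError.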

TLS listener details. A tls listener uses the ELBSecurityPolicy-TLS13-1-2-2021-06 policy (TLS 1.2 and 1.3, ECDHE cipher suites). The NLB terminates TLS and forwards plaintext to your container; the target group stays on TCP. Certificate validity is checked at convox deploy — missing, non-ISSUED, cross-region, or cross-account certificates fail within seconds instead of as a stuck CloudFormation update.
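
The region and account parts of that pre-flight check can be illustrated from the ARN alone. The sketch below is an assumption about how such a check might look, not the rack's code: check_certificate_arn is a hypothetical helper, and the missing and non-ISSUED cases are left out because they need a call to AWS.

```python
def check_certificate_arn(arn: str, rack_region: str, rack_account: str) -> None:
    """Raise ValueError if the certificate ARN cannot belong to this rack.

    ARNs have the form arn:partition:service:region:account-id:resource.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn}")
    _, _partition, service, region, account, _resource = parts
    if service not in ("acm", "iam"):
        raise ValueError(f"certificate must be an ACM or IAM ARN, got service {service!r}")
    # IAM is a global service, so IAM certificate ARNs carry an empty region field.
    if service == "acm" and region != rack_region:
        raise ValueError(f"certificate is in {region}, but the rack is in {rack_region}")
    if account != rack_account:
        raise ValueError(f"certificate belongs to account {account}, not {rack_account}")
```

Doing this kind of string-level check before any CloudFormation update starts is what lets a bad ARN fail in seconds.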

Safety interlocks (all errors surface at convox rack params set or convox deploy, before any AWS change):

| Scenario | Outcome |
| --- | --- |
| NLB=Yes with InternalOnly=Yes | Rejected |
| NLBInternal=Yes without Internal=Yes | Rejected |
| Disabling NLB / NLBInternal while apps still declare nlb: ports | Rejected; error lists the blocking app/service pairs |
| NLBDeletionProtection=Yes combined with NLB=No (same call) | Rejected; must disable deletion protection first |
| convox rack uninstall while either deletion protection is enabled | Rejected; must disable deletion protection first |
| NLBPreserveClientIP=Yes with a user-managed InstanceSecurityGroup | Rejected in both directions (silent SG breakage prevented) |
| allow_cidr with an entry that is non-canonical, IPv6, or duplicated, or with more than 5 entries | Rejected |
| Service declares scheme: public on a rack without NLB=Yes | convox deploy fails with remediation command in the error |
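
As a rough illustration of the parameter-level interlocks, the sketch below evaluates three of them against the merged state of a rack params set call. It is one interpretation of the rules listed here, not the rack's implementation: check_rack_params is a hypothetical function, and the deletion-protection rule is read as "the call sets NLB=No while protection would still be Yes afterwards".

```python
def check_rack_params(current: dict, changes: dict) -> list:
    """Return error strings for a proposed change; an empty list means accepted."""
    merged = {**current, **changes}

    def on(key: str) -> bool:
        # Unset Yes/No parameters are treated as their default, No.
        return merged.get(key, "No") == "Yes"

    errors = []
    if on("NLB") and on("InternalOnly"):
        errors.append("NLB=Yes cannot be combined with InternalOnly=Yes")
    if on("NLBInternal") and not on("Internal"):
        errors.append("NLBInternal=Yes requires Internal=Yes")
    if changes.get("NLB") == "No" and on("NLBDeletionProtection"):
        errors.append("disable NLBDeletionProtection before setting NLB=No")
    return errors
```

Because the checks run on the merged state, a single call such as NLBDeletionProtection=No NLB=No passes, while NLB=No on its own is rejected for a rack that still has deletion protection on.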

CLI changes. convox rack gains NLB / NLB Internal rows (hostname + EIPs for the public NLB). convox services gains an NLB PORTS column showing the declared listener ports, scheme, and a bracketed suffix (cz=..., allow=N, pcip=...) when per-port overrides are present. Enabling NLB=Yes or NLBInternal=Yes via convox rack params set prints a one-line hint that provisioning typically takes 5–10 minutes.


How to use it?

Update your rack:

$ convox rack update

Then enable NLB at the rack level. Set all related parameters in a single convox rack params set call; never run them as sequential commands.

For just the public NLB:

$ convox rack params set NLB=Yes

For the internal NLB (requires Internal=Yes):

$ convox rack params set Internal=Yes NLBInternal=Yes

Both at once, with cross-zone on, a production CIDR allowlist, and deletion protection:

$ convox rack params set NLB=Yes NLBInternal=Yes Internal=Yes \
    NLBAllowCIDR="198.51.100.0/24,203.0.113.0/24" \
    NLBCrossZone=Yes NLBInternalCrossZone=Yes \
    NLBDeletionProtection=Yes NLBInternalDeletionProtection=Yes

After provisioning, convox rack shows the NLB hostname and associated EIPs:

$ convox rack
Name      my-rack
Provider  aws
Region    us-east-1
Router    my-rack-router-...elb.amazonaws.com
NLB       my-rack-nlb-...elb.us-east-1.amazonaws.com (203.0.113.10, 203.0.113.11, 203.0.113.12)
Status    running
Version   20260421192651

Next, declare nlb: entries on any service in your convox.yml and deploy:

services:
  api:
    image: example/api
    port: http:3000
    nlb:
      - port: 8443
        protocol: tls
        containerPort: 8443
        scheme: public
        certificate: arn:aws:acm:us-east-1:123456789012:certificate/abc12345-6789-def0-1234-567890abcdef
        cross_zone: true
        allow_cidr:
          - 10.0.0.0/24
          - 10.1.0.0/24
        preserve_client_ip: false
      - port: 50051
        protocol: tcp
        containerPort: 50051
        scheme: internal

$ convox deploy -a myapp

Point a Route 53 A (alias) record at the rack's public NLB DNS name — visible in convox rack — for the hostname you want to serve. convox services summarizes the NLB listeners alongside ALB listeners:

$ convox services -a myapp
SERVICE  DOMAIN                 PORTS             NLB PORTS
api      myapp-api.example.com  80:3000 443:3000  8443:8443(public)[cz=true allow=2 pcip=false] 50051:50051(internal)
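
The NLB PORTS cell format shown above can be reproduced with a few lines of Python. This is a sketch inferred from the sample output only: format_nlb_port is a hypothetical helper, and it assumes the bracketed suffix lists exactly the overrides that are present on the entry, in the order cz, allow, pcip.

```python
def format_nlb_port(entry: dict) -> str:
    """Render one nlb: entry as port:containerPort(scheme)[overrides]."""
    port = entry["port"]
    cell = f'{port}:{entry.get("containerPort", port)}({entry.get("scheme", "public")})'
    overrides = []
    if "cross_zone" in entry:
        overrides.append(f'cz={str(entry["cross_zone"]).lower()}')
    if "allow_cidr" in entry:
        # Only the number of per-port CIDRs is shown, not the CIDRs themselves.
        overrides.append(f'allow={len(entry["allow_cidr"])}')
    if "preserve_client_ip" in entry:
        overrides.append(f'pcip={str(entry["preserve_client_ip"]).lower()}')
    if overrides:
        cell += "[" + " ".join(overrides) + "]"
    return cell
```

Applied to the two entries from the convox.yml example, it produces 8443:8443(public)[cz=true allow=2 pcip=false] and 50051:50051(internal).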

A few common patterns:

  • gRPC over TLS. Use protocol: tls with a TCP target. Clients get TLS all the way to the NLB; your application process does not manage certificates. For plaintext gRPC between services on the same rack, use protocol: tcp, scheme: internal.
  • NLB-only service — omit port: entirely if the service does not need an ALB listener at all. The service runs behind the NLB with no HTTP route.
  • ALB + NLB coexisting on the same port — declare both port: http:3000 and an nlb: entry on the same container port. The rack deduplicates the ECS port mappings so both listeners route correctly.
  • Switching TCP to TLS — change protocol: tcp to protocol: tls and add a certificate: on the next deploy. AWS documents this as an in-place ModifyListener update; clients see a brief protocol-boundary disruption at switchover.
  • Per-service CIDR allowlist — set allow_cidr on a single NLB port to restrict access to one listener without changing the rack-wide NLBAllowCIDR. Up to 5 CIDRs per rack param (5 public + 5 internal), plus additional per-port CIDRs layered on top.
  • Per-service preserve-client-IP — set preserve_client_ip: true on a single listener so your application sees real client IPs. Applies to ip-type target groups (Fargate and isolate: true services) as well as instance-type target groups (standard gen2 services).
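
For the per-port allow_cidr field, the documented rejection cases (non-canonical, IPv6, duplicate, more than 5 entries) map directly onto Python's standard ipaddress module. The sketch below is illustrative only; validate_allow_cidr and MAX_CIDRS are hypothetical names, and the real validation may differ in its details and error wording.

```python
import ipaddress

MAX_CIDRS = 5  # documented per-list limit


def validate_allow_cidr(cidrs: list) -> list:
    """Return the list unchanged if every entry is acceptable, else raise ValueError."""
    if len(cidrs) > MAX_CIDRS:
        raise ValueError(f"allow_cidr accepts at most {MAX_CIDRS} entries, got {len(cidrs)}")
    seen = set()
    for cidr in cidrs:
        try:
            # strict=True rejects host bits, e.g. 10.0.0.1/24
            net = ipaddress.ip_network(cidr, strict=True)
        except ValueError as exc:
            raise ValueError(f"invalid or non-canonical CIDR {cidr!r}: {exc}") from exc
        if net.version != 4:
            raise ValueError(f"IPv6 CIDR {cidr!r} is not supported")
        if str(net) != cidr:
            # e.g. a bare address with no /prefix
            raise ValueError(f"non-canonical CIDR {cidr!r}; expected {net}")
        if net in seen:
            raise ValueError(f"duplicate CIDR {cidr!r}")
        seen.add(net)
    return cidrs
```

The list from the convox.yml example, 10.0.0.0/24 and 10.1.0.0/24, passes unchanged.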

If you run convox deploy against a rack without the matching rack parameter set, or against a rack where a per-port policy conflicts with a rack-level guard, the deploy fails early with a message pointing at the exact remediation:

service api nlb port 8443: rack NLB is not enabled; run 'convox rack params set NLB=Yes' first
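
A deploy-time check that produces a message of this shape could look like the following sketch. The function name and the scheme-to-parameter mapping are hypothetical, and the wording for the internal scheme is an assumption made by analogy with the public message above; the real internal remediation may also mention Internal=Yes.

```python
from typing import Optional

# Which rack parameter must be Yes for each listener scheme.
RACK_PARAM_FOR_SCHEME = {"public": "NLB", "internal": "NLBInternal"}


def check_scheme_enabled(service: str, port: int, scheme: str, rack_params: dict) -> Optional[str]:
    """Return None if the rack NLB for this scheme is enabled, else an error message."""
    param = RACK_PARAM_FOR_SCHEME[scheme]
    if rack_params.get(param, "No") == "Yes":
        return None
    return (
        f"service {service} nlb port {port}: rack {param} is not enabled; "
        f"run 'convox rack params set {param}=Yes' first"
    )
```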

To disable NLB again, first remove every nlb: block from every app's convox.yml and redeploy, then disable deletion protection (if set) and flip the rack parameter:

$ convox rack params set NLBDeletionProtection=No NLB=No

To uninstall a rack entirely, deletion protection must be disabled on both NLBs first; the CLI will refuse convox rack uninstall with an actionable error otherwise.

Operational notes:

  • EIP quota. The public NLB allocates one EIP per Availability Zone (2 for a 2-AZ rack, 3 for a 3-AZ rack with HighAvailability=Yes). The default AWS EIP quota is 5 per region. If your rack already uses Private=Yes, NAT gateways consume 2-3 EIPs on top. Check current usage and request a quota increase before enabling on a Private=Yes + 3-AZ HA rack.
  • Provisioning time. NLB creation typically takes 5-10 minutes after convox rack params set returns; the CLI prints a hint reminding you of this on the Yes transition.
  • Listener quota. AWS imposes a limit of 50 listeners per NLB. The rack's shared NLB carries listeners for every app that opts in. Environments approaching 50 public listeners or 50 internal listeners across all apps should request a quota increase before broader adoption.
  • CIDR allowlist capacity. Rack parameters NLBAllowCIDR and NLBInternalAllowCIDR each accept up to 5 CIDRs (comma-separated). For fine-grained per-service allowlists beyond the rack default, use the per-port allow_cidr: field on individual nlb: entries.
  • TLS policy. The listener uses ELBSecurityPolicy-TLS13-1-2-2021-06 (TLS 1.2 and 1.3, ECDHE). A configurable policy will ship in a follow-up release if demand is there.
  • Certificate region. ACM certificates used on NLB listeners must be in the same AWS region as the rack. The pre-flight check surfaces a clear error if you pass a cross-region ARN.
  • User-managed instance SG. If you have set the rack-level InstanceSecurityGroup parameter to a SG you manage yourself, NLBPreserveClientIP=Yes is blocked — the NLB would forward real client IPs into an SG whose ingress rules you control, and we will not silently modify that posture. Disable preserve-client-IP in the same rack params set call if you need to change both.
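
The EIP quota note is easy to check with arithmetic before enabling the public NLB. The sketch below assumes one EIP per AZ for the NLB (as stated above) and one NAT gateway EIP per AZ on a Private=Yes rack, which matches the "2-3 EIPs" figure; eips_needed is a hypothetical helper and it ignores any other EIPs already allocated in the region.

```python
DEFAULT_EIP_QUOTA = 5  # default AWS Elastic IP quota per region


def eips_needed(az_count: int, public_nlb: bool, private_rack: bool) -> int:
    """Estimate the EIPs a rack consumes for its public NLB and NAT gateways."""
    nlb_eips = az_count if public_nlb else 0
    # Assumption: a Private=Yes rack runs one NAT gateway, with one EIP, per AZ.
    nat_eips = az_count if private_rack else 0
    return nlb_eips + nat_eips
```

A 3-AZ rack with Private=Yes and NLB=Yes needs 3 + 3 = 6 EIPs under these assumptions, which exceeds the default quota of 5, while a 2-AZ public rack needs only 2.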

Does it have a breaking change?

No breaking changes. All ten new rack parameters default to No / 0.0.0.0/0 / blank, so existing racks create zero new NLB resources on upgrade and behave identically until you opt in. Services without nlb: produce byte-identical CloudFormation output, and TCP-only NLB listeners without per-port overrides render identically to their earlier form. The convox rack output adds the NLB rows only when the rack has NLB enabled and exposes the corresponding stack outputs; the convox services output's new NLB PORTS column is suppressed entirely when no service in the app has NLB ports declared.

The new Docker labels added to the rack API task definitions produce a single task definition revision on first update, resulting in a normal rolling restart of the rack API containers — the same pattern used for every previous rack parameter addition. Gen1 apps are unaffected; NLB is gen2-only.

Downgrading to a rack version that predates these parameters requires setting NLB=No and NLBInternal=No first, disabling deletion protection if set, and removing nlb: blocks from any active services. CloudFormation will reject the downgrade cleanly if attempted with non-default values.


Requirements

To use this feature, you must update to rack version 20260421192651 or newer.

  • Check your rack's version with convox rack
  • Update your rack with convox rack update

ntner added a commit that referenced this pull request Apr 18, 2026
ntner mentioned this pull request Apr 18, 2026
ntner closed this in #3792 Apr 20, 2026