Skip to content

certificate: allow ephemeral subdomains of normal routes#807

Merged
evanphx merged 1 commit into
mainfrom
mir-1141-ephemeral-tls-cert-fallback-self-signed
May 14, 2026
Merged

certificate: allow ephemeral subdomains of normal routes#807
evanphx merged 1 commit into
mainfrom
mir-1141-ephemeral-tls-cert-fallback-self-signed

Conversation

@evanphx
Copy link
Copy Markdown
Contributor

@evanphx evanphx commented May 14, 2026

Summary

  • Fixes MIR-1141: ephemeral deploys served the cluster's self-signed fallback cert instead of a real TLS certificate.
  • The autocert controller's isAllowedHost only matched exact hosts and one-level wildcards (*.example.com). The request-routing layer (httpingress.lookupEphemeralRoute) was already stripping the first DNS label of incoming hosts and matching the parent against the route set, so a request to pr-33.app.example.com would route correctly to the app on app.example.com — but the TLS handshake's SNI lookup happened earlier in GetCertificate, missed the allow-list, and short-circuited to the self-signed fallback before autocert could provision anything.
  • Mirror the strip-and-match logic in isAllowedHost so the cert layer agrees with the routing layer on which hostnames belong to an app. Autocert will now attempt HTTP-01 provisioning inline on the first handshake for <label>.<route-host>.

Test plan

  • go test ./controllers/certificate/... — 19 passing, including two new cases covering the ephemeral-subdomain-of-normal-route path (TestAutocertController_IsAllowedHost_EphemeralSubdomain, TestAutocertController_HostPolicy_EphemeralSubdomain).
  • golangci-lint run ./controllers/certificate/... — clean.
  • Manual: deploy with --ephemeral pr-33, confirm the URL returned by the API serves a Let's Encrypt cert rather than the self-signed fallback.

Caveats

  • For the inline ACME provisioning to succeed, the user still needs DNS that resolves <label>.<route-host> to the cluster (e.g. a wildcard A record covering the route's domain). When DNS is missing, the handshake will still fall back to the self-signed cert, but that is now an external misconfiguration rather than a bug on our side.
  • Each new ephemeral label triggers a new Let's Encrypt order. The existing per-domain failure cooldown handles rate-limit backoff, but heavy churn on distinct labels could press up against LE order limits.

The autocert controller's allow-list only matched exact hosts and
one-level wildcards, so when an ephemeral deploy URL like
pr-33.app.example.com hit TLS, the SNI lookup missed and the handshake
fell back to the self-signed cert. The ingress request-routing layer
(httpingress.lookupEphemeralRoute) already strips the first DNS label
and matches the parent against the route set, so the two layers
disagreed on which hostnames belong to an app.

Mirror that strip-and-match logic in isAllowedHost so autocert will
provision a real certificate inline for ephemeral subdomains of
non-wildcard routes.

Fixes MIR-1141.
@evanphx evanphx requested a review from a team as a code owner May 14, 2026 18:45
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR extends the certificate autocert controller's host authorization logic to support ephemeral subdomains. The isAllowedHost function now permits any subdomain beneath an allowed parent domain—for example, pr-33.app.example.com is accepted when app.example.com is registered. The logic extracts the parent domain after the first dot and checks for wildcard coverage or direct parent presence in allowedHosts. Two new test cases verify both the isAllowedHost function and the mgr.HostPolicy integration handle ephemeral subdomains correctly.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@controllers/certificate/autocert_controller.go`:
- Around line 362-373: The current logic in GetCertificate (the block using
allowedHosts.Load("*."+parent) and allowedHosts.Load(parent)) incorrectly
permits any arbitrary subdomain when a parent route exists; change it to stop
deriving eligibility from parent alone and require an explicit ephemeral
allow-list: remove or disable the parent-only check (the call to
c.allowedHosts.Load(parent)) and instead check a dedicated store (e.g.
c.ephemeralAllowedHosts or c.allowedEphemeral.Load(host)) that contains explicit
ephemeral hostnames, or validate the host against a tight predicate/TTL-backed
registry of active ephemeral deploys before returning true; keep the wildcard
check for "*.parent" but ensure ephemeral-subdomain authorization is only
granted via the explicit ephemeral allow-list lookup in GetCertificate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 772437a9-6686-4094-9653-a2e533e01bb5

📥 Commits

Reviewing files that changed from the base of the PR and between e99e71e and 6ca3671.

📒 Files selected for processing (2)
  • controllers/certificate/autocert_controller.go
  • controllers/certificate/autocert_controller_test.go

Comment thread controllers/certificate/autocert_controller.go
@evanphx evanphx merged commit cf3bb70 into main May 14, 2026
17 checks passed
@evanphx evanphx deleted the mir-1141-ephemeral-tls-cert-fallback-self-signed branch May 14, 2026 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants