Skip to content

feat(CC-0065): Add optional HTTPRoute sub-reconciler for API exposure#238

Merged
berendt merged 1 commit intomainfrom
feature/CC-0065
Apr 21, 2026
Merged

feat(CC-0065): Add optional HTTPRoute sub-reconciler for API exposure#238
berendt merged 1 commit intomainfrom
feature/CC-0065

Conversation

@berendt
Copy link
Copy Markdown
Contributor

@berendt berendt commented Apr 17, 2026

GitHub Issue #162: Add optional HTTPRoute sub-reconciler for external Keystone API exposure

Category: feature | Scope: Medium

Description: The Keystone API is currently only exposed via a ClusterIP Service on port 5000. External access (CLI users, Horizon, federation partners) requires manual setup outside the operator. Add a new reconcileHTTPRoute sub-reconciler that optionally creates a Gateway API HTTPRoute (gateway.networking.k8s.io/v1) when spec.gateway is configured.

The Kubernetes Ingress API (networking.k8s.io/v1) is frozen and receives no new features. The ingress-nginx controller has been archived as of March 2026. The Gateway API (GA since October 2023, currently v1.5.x) is the recommended successor and supported by all major implementations (Envoy Gateway, Cilium, Istio, Traefik, NGINX Gateway Fabric).

The operator follows the Gateway API role model: it only creates an HTTPRoute (application-developer role). The GatewayClass and Gateway are infrastructure concerns managed by the platform team.

New spec.gateway section:

spec:
  gateway:
    parentRef:
      name: openstack-gateway        # required, pre-existing Gateway
      namespace: openstack-infra      # optional, cross-namespace via ReferenceGrant
      sectionName: https              # optional, target a specific Gateway listener
    hostname: keystone.example.com    # required, SNI / Host header match
    path: /                           # optional, default "/"
    annotations: {}                   # optional, pass-through to HTTPRoute metadata

Behavior:

  • When spec.gateway is set, create an HTTPRoute targeting the existing {name}-api Service on port 5000
  • When spec.gateway is removed (nil), delete the HTTPRoute resource
  • Derive status.endpoint from the hostname: https://{hostname}/v3 (assume TLS is terminated at the Gateway)
  • The existing spec.publicEndpoint field takes precedence over the gateway-derived URL for --bootstrap-public-url if both are set
  • Add an HTTPRouteReady condition that reflects whether the route has been accepted by the Gateway controller (check status.parents[].conditions for Accepted=True)

Motivation: Keystone is the central authentication endpoint for all OpenStack services and CLI users. Without operator-managed external exposure, every deployment requires manual creation of routing resources, which drifts from the desired state and is error-prone. The spec.publicEndpoint field already exists as a workaround, but the operator cannot ensure that the corresponding network path actually exists. Managing an HTTPRoute declaratively closes this gap, gives the operator a single source of truth for status.endpoint, and uses the portable Gateway API standard instead of the frozen Ingress API or controller-specific annotations.

Affected Areas:

  • operators/keystone/api/v1alpha1/keystone_types.go (new GatewaySpec struct with ParentRef, Hostname, Path, Annotations)
  • operators/keystone/api/v1alpha1/keystone_webhook.go (validation: hostname required, parentRef.name required)
  • operators/keystone/internal/controller/reconcile_httproute.go (new file)
  • operators/keystone/internal/controller/keystone_controller.go (call reconcileHTTPRoute, add to owned resources, aggregate condition)
  • operators/keystone/internal/controller/reconcile_networkpolicy.go (extend ingress rules to allow traffic from the Gateway pod/namespace when both spec.gateway and spec.networkPolicy are set)
  • operators/keystone/config/crd/bases/keystone.openstack.c5c3.io_keystones.yaml (CRD regeneration)
  • operators/keystone/go.mod (add sigs.k8s.io/gateway-api dependency)

Acceptance Criteria:

  • HTTPRoute is created when spec.gateway is set with parentRef.name and hostname
  • HTTPRoute is deleted when spec.gateway is removed
  • HTTPRoute references the existing {name}-api Service on port 5000 as backendRef
  • parentRef supports name, namespace, and sectionName
  • status.endpoint reflects https://{hostname}/v3 when gateway is configured
  • spec.publicEndpoint takes precedence over gateway-derived URL for bootstrap
  • HTTPRouteReady condition tracks the Accepted condition from status.parents[].conditions
  • spec.gateway.annotations are passed through to the HTTPRoute metadata
  • NetworkPolicy is updated to allow traffic from the Gateway when both spec.gateway and spec.networkPolicy are set
  • RBAC markers include gateway.networking.k8s.io/httproutes permissions
  • Webhook validates that hostname and parentRef.name are non-empty
  • Unit tests cover creation, deletion, status derivation, and annotation pass-through
  • Integration test verifies end-to-end HTTPRoute lifecycle

Supersedes: #161 (Ingress-based approach)

Labels: enhancement


Source: #162

berendt added a commit that referenced this pull request Apr 17, 2026
Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • In buildKeystoneNetworkPolicy the inline comment still contains Reviewer: please verify.; consider removing this leftover note now that the behavior is codified and covered by tests.
  • In keystoneStatusEndpoint, when spec.gateway is non-nil but hostname is empty you silently fall back to the cluster-local URL; if this situation should never occur post-admission, it might be safer to log or surface this as a validation/error to avoid misconfigurations being masked.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `buildKeystoneNetworkPolicy` the inline comment still contains `Reviewer: please verify.`; consider removing this leftover note now that the behavior is codified and covered by tests.
- In `keystoneStatusEndpoint`, when `spec.gateway` is non-nil but `hostname` is empty you silently fall back to the cluster-local URL; if this situation should never occur post-admission, it might be safer to log or surface this as a validation/error to avoid misconfigurations being masked.

## Individual Comments

### Comment 1
<location path="operators/keystone/internal/controller/reconcile_httproute.go" line_range="138-140" />
<code_context>
+		parentRef.SectionName = ptr.To(gatewayv1.SectionName(gw.ParentRef.SectionName))
+	}
+
+	path := gw.Path
+	if path == "" {
+		path = defaultHTTPRoutePath
+	}
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Gateway path is used verbatim without basic validation or normalization (e.g. leading slash), which may lead to invalid HTTPRoute specs.

Because `gw.Path` is passed directly into `HTTPPathMatch.Value`, a value without a leading `/` or in an invalid format can cause the generated HTTPRoute to be rejected. Consider enforcing a leading `/` (and other basic format checks) in the webhook or normalizing here (e.g. prepend `/` when missing) and rejecting clearly invalid values so it truly behaves as a URL path prefix.

Suggested implementation:

```golang
	if gw.ParentRef.SectionName != "" {
		parentRef.SectionName = ptr.To(gatewayv1.SectionName(gw.ParentRef.SectionName))
	}

	path := gw.Path
	if path == "" {
		path = defaultHTTPRoutePath
	}
	if !strings.HasPrefix(path, "/") {
		path = "/" + path
	}

package controller

import (
	"context"

```

```golang
	"fmt"

	"strings"

	apiequality "k8s.io/apimachinery/pkg/api/equality"

```

1. Find all usages where `gw.Path` is passed into the HTTPRoute (e.g. in `HTTPPathMatch.Value`) and replace them with the normalized `path` variable introduced above.
   For example, change:
   `Value: ptr.To(gw.Path),`
   to:
   `Value: ptr.To(path),`
2. If there are any other places in this reconcile function where `gw.Path` is used to construct HTTPRoute rules or matches, update those to use the normalized `path` variable as well, to keep behavior consistent.
</issue_to_address>

### Comment 2
<location path="operators/keystone/internal/controller/reconcile_networkpolicy.go" line_range="115-103" />
<code_context>
+	// ParentRef lookup semantics. When spec.networkPolicy is nil we take the
+	// delete path above — no extra rule is needed. Reviewer: please verify.
+	// (CC-0065, REQ-008)
+	if keystone.Spec.Gateway != nil {
+		gatewayNS := keystone.Spec.Gateway.ParentRef.Namespace
+		if gatewayNS == "" {
+			gatewayNS = keystone.Namespace
+		}
+		peers = append(peers, networkingv1.NetworkPolicyPeer{
+			NamespaceSelector: &metav1.LabelSelector{
+				MatchLabels: map[string]string{
+					"kubernetes.io/metadata.name": gatewayNS,
+				},
+			},
+		})
+	}
+
 	port5000 := intstr.FromInt32(5000)
</code_context>
<issue_to_address>
**🚨 question (security):** Automatically allowing all pods in the Gateway namespace may weaken the intended isolation expressed by spec.networkPolicy.

When `spec.gateway` is set, this always adds an ingress peer for the entire Gateway namespace, even if `spec.networkPolicy` is intended to be more restrictive. In shared Gateway namespaces, this can grant broader access than the CR author expects. Consider guarding this with an explicit opt-in (e.g. a field under `spec.networkPolicy` or `spec.gateway`), or only adding this peer when no other ingress peers are defined, so that the CR’s `networkPolicy` remains the complete source of truth for ingress policy.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread operators/keystone/internal/controller/reconcile_httproute.go
@@ -102,6 +102,30 @@ func buildKeystoneNetworkPolicy(keystone *keystonev1alpha1.Keystone) *networking
peers = append(peers, peer)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚨 question (security): Automatically allowing all pods in the Gateway namespace may weaken the intended isolation expressed by spec.networkPolicy.

When spec.gateway is set, this always adds an ingress peer for the entire Gateway namespace, even if spec.networkPolicy is intended to be more restrictive. In shared Gateway namespaces, this can grant broader access than the CR author expects. Consider guarding this with an explicit opt-in (e.g. a field under spec.networkPolicy or spec.gateway), or only adding this peer when no other ingress peers are defined, so that the CR’s networkPolicy remains the complete source of truth for ingress policy.

berendt added a commit that referenced this pull request Apr 17, 2026
Addresses sourcery-ai[bot] review #1 (external review 1) on PR #238.

- operators/keystone/internal/controller/reconcile_httproute.go: Normalize spec.gateway.path in buildKeystoneHTTPRoute to prepend a leading "/" when missing (e.g. "identity" -> "/identity") so generated HTTPRoutes are not rejected by Gateway controllers; add "strings" import. Remove the silent cluster-local fallback in keystoneStatusEndpoint when spec.gateway is set but hostname is empty, so post-admission misconfiguration (e.g. webhook bypass) surfaces loudly as https:///v3 instead of being masked; expand the doc comment to explain the CRD + webhook guarantees and the fail-loud rationale.
- operators/keystone/internal/controller/reconcile_httproute_test.go: Add TestBuildKeystoneHTTPRoute_PathWithoutLeadingSlash_Normalized asserting that "identity" is normalized to "/identity" in the resulting HTTPPathMatch.Value (CC-0065, REQ-001).
- operators/keystone/internal/controller/reconcile_networkpolicy.go: Remove the leftover "Reviewer: please verify." note from the CC-0065 REQ-008 block (behavior is codified and tested). Add a REVIEWER-DISAGREEMENT block documenting why the namespace-wide Gateway-namespace ingress peer remains automatic rather than opt-in: REQ-008 mandates it so Gateway data-plane pods can reach Keystone when both spec.gateway and spec.networkPolicy are set, pod-level selection would leak Gateway implementation (Envoy/Kong/NGINX) into the operator, and an opt-in field would alter REQ-008 semantics — deferred out of this review round.

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>
@berendt berendt requested review from gndrmnn April 17, 2026 19:26
@berendt
Copy link
Copy Markdown
Contributor Author

berendt commented Apr 17, 2026

Reviewed by planwerk-review 342b9d8 with Claude Code

CRITICAL (1)

C-001: Health check probes external URL when spec.gateway is set

File: operators/keystone/internal/controller/reconcile_healthcheck.go:57Fix: ASK — Confidence: verified

Problem: keystoneStatusEndpoint(keystone) returns https://{hostname}/v3 whenever spec.gateway != nil. reconcileHealthCheck reads keystone.Status.Endpoint verbatim and performs http.Client.Do against it. After this PR the operator pod must resolve the public hostname via cluster DNS, have egress to the external VIP, and successfully traverse the entire Gateway → Service → Pod path before KeystoneAPIReady=True. This conflates ingress/DNS/cert/Gateway health with API readiness, requires Internet egress, may fail TLS verification against self-signed Gateway certs, and changes the Status.Endpoint contract for consumers.

Code:

endpoint := keystone.Status.Endpoint
resp, err := r.HTTPClient.Do(req)

Action Required: Keep Status.Endpoint as the externally advertised endpoint (the user-facing contract is good) but have reconcileHealthCheck always probe the cluster-local Service URL — e.g. introduce an unexported internalAPIURL(keystone) and have the health check use that. Add a regression test asserting the health check target is always http://{name}-api.{ns}.svc.cluster.local:5000/v3 regardless of spec.gateway.

Suggested Fix:

Introduce internalAPIURL(keystone) returning http://{name}-api.{ns}.svc.cluster.local:5000/v3 and use it in reconcileHealthCheck instead of keystone.Status.Endpoint. Keep keystoneStatusEndpoint for the public Status.Endpoint contract.

WARNING (3)

W-001: Annotation removal silently ignored on HTTPRoute

File: operators/keystone/internal/controller/reconcile_httproute.go:221-293Fix: ASK — Confidence: verified

Problem: buildKeystoneHTTPRoute only assigns route.Annotations when len(gw.Annotations) > 0. ensureHTTPRoute's merge block is gated on if route.Annotations != nil. Removing every key from spec.gateway.annotations therefore leaves the previously-applied annotations on the live HTTPRoute indefinitely — the additive merge never gets a chance to overwrite them, and there is no removal path. Gateway controllers (Kong, Istio, Envoy Gateway, Cilium) all consume implementation-specific annotations to enable/disable behavior; toggling those off via the CR will appear to silently fail. The same concern applies to labels.

Code:

if len(gw.Annotations) > 0 {
    route.Annotations = gw.Annotations
}
...
if route.Annotations != nil {
    // additive merge
}

Action Required: Apply operator-managed annotations using a known-key set (track with a keystone.openstack.c5c3.io/managed-annotations annotation listing managed keys) and remove keys not present in the desired set, OR explicitly delete the well-known operator-managed keys before re-applying. Add a test that removes a previously set annotation and asserts the live HTTPRoute no longer carries it. Apply the same fix to labels.

Suggested Fix:

Track operator-managed annotation/label keys via a sentinel annotation (e.g. keystone.openstack.c5c3.io/managed-annotations) and reconcile by removing keys present in the previous set but absent from the desired set, then applying the desired set. Add a test for annotation removal.

W-002: sigs.k8s.io/gateway-api v1.1.0 is significantly behind

File: operators/keystone/go.mod:19Fix: ASK — Confidence: verified — Pattern: outdated dependency

Problem: The PR body itself acknowledges current upstream is v1.5.x. v1.1.0 (Aug 2024) predates important fixes in Accepted-condition semantics and the BackendObjectReference.Port typing tightening.

Code:

sigs.k8s.io/gateway-api v1.1.0

Action Required: Pin to the latest minor that matches the controller-runtime/k8s.io versions in use, and bump the version variable in hack/deploy-infra.sh.

Suggested Fix:

Bump sigs.k8s.io/gateway-api in operators/keystone/go.mod to the latest v1.5.x release compatible with the project's controller-runtime/k8s.io versions, and update the corresponding version variable in hack/deploy-infra.sh.

W-003: Missing Chainsaw E2E test for Gateway/HTTPRoute lifecycle

File: tests/e2e/keystone/:1Fix: ASK — Confidence: verified — Pattern: E2E parity

Problem: 29 E2E test directories exist (network-policy, healthcheck, policy-overrides, …) but no gateway/ or httproute/ directory. Per the project pattern that introduced E2E parity for prior CC-XXXX features, a new sub-reconciler exposing a CRD field needs a Chainsaw test.

Code:

tests/e2e/keystone/  # no gateway/ or httproute/ directory present

Action Required: Add a Chainsaw test under tests/e2e/keystone/httproute/ that at minimum: creates a Keystone with spec.gateway, asserts the HTTPRoute is created with the right spec, asserts HTTPRouteReady flips True after a fake/real Accepted condition, and asserts the route is deleted when spec.gateway is unset.

Suggested Fix:

Create tests/e2e/keystone/httproute/ with a Chainsaw scenario covering route creation, HTTPRouteReady transition, and route deletion when spec.gateway is unset.

INFO (2)

I-001: Leftover REVIEWER-DISAGREEMENT and Reviewer: please verify comments

File: operators/keystone/internal/controller/reconcile_networkpolicy.go:115-229Fix: AUTO-FIX — Confidence: verified — Pattern: CC-0065 review-1 cleanup

Problem: Three comments are review-process scaffolding that should not survive into main: L115-120 contains a // REVIEWER-DISAGREEMENT: sourcery-ai[bot] suggested ... block whose rationale is fine but the meta-framing referencing a bot review belongs in the PR thread; L182 and L229 contain leftover // ... Reviewer: please verify. (CC-0039) prompts.

Code:

// REVIEWER-DISAGREEMENT: sourcery-ai[bot] suggested ...  (L115-120)
// ... Reviewer: please verify. (CC-0039)  (L182)
// ... Reviewer: please verify. (CC-0039)  (L229)

Action Required: Remove the meta-prompts to a reviewer while keeping the technical decision text intact, per the project's CC-0065 review-1 cleanup pattern.

Suggested Fix:

Delete the `REVIEWER-DISAGREEMENT:` framing line at L115 (keeping the technical decision rationale that follows it) and remove the trailing `Reviewer: please verify. (CC-0039)` sentences from the comments at L182 and L229.

I-002: Partial r.Client.Get to r.Get migration in test file

File: operators/keystone/internal/controller/reconcile_networkpolicy_test.go:414-679Fix: AUTO-FIX — Confidence: verified — Pattern: CC-0059 partial-refactor

Problem: Mixed style within the same file: lines 414/649/679 use the receiver shortcut r.Get(...); lines 473/547/639 still use r.Client.Get(...). This leaves a half-finished migration in the diff, contrary to the CC-0059 partial-refactor pattern.

Code:

// L414, L649, L679: r.Get(ctx, key, obj)
// L473, L547, L639: r.Client.Get(ctx, key, obj)

Action Required: Migrate all r.Client.Get(...) occurrences in this file (and ideally elsewhere in the package) to r.Get(...), or revert the new ones for consistency with the surrounding code.

Suggested Fix:

r.Get(ctx, key, obj)

Summary

Solid plumbing for the optional HTTPRoute sub-reconciler with good pattern compliance (dual-copy CRD, RBAC sync, webhook+CEL parity, idempotency tests, RESTMapper-based graceful degradation). However, one cross-reconciler interaction was not fully thought through (health check now probes the external URL), an additive label/annotation merge has no removal path, the gateway-api dependency is significantly outdated, and the new sub-reconciler lacks a Chainsaw E2E test required by project pattern.

Severity Count
BLOCKING 0
CRITICAL 1
WARNING 3
INFO 2

Important

Recommendation: Do not merge as-is. Block on C-1 (health check now external) and W-1 (annotation removal). Fix W-2 (gateway-api version bump) and W-3 (add E2E test) before merge, and clean up I-1 and I-2 in the same pass.

- Add spec.gateway (GatewaySpec with ParentRef, Hostname, Path,
Annotations) on Keystone to opt into external API exposure via Gateway
API
- Introduce reconcileHTTPRoute sub-reconciler that creates, updates, or
deletes the {name}-api HTTPRoute based on spec.gateway presence, with
managed-annotation/label tracking so removed keys get pruned
- Detect the gateway.networking.k8s.io/v1 HTTPRoute CRD at
SetupWithManager time and gate the HTTPRoute watch on its availability
so the operator still starts on clusters without Gateway API
- Surface an HTTPRouteReady condition with dedicated reasons
(HTTPRouteAccepted, HTTPRouteNotAccepted, HTTPRouteNotRequired,
GatewayAPINotInstalled) reflecting the parent status written by the
Gateway controller
- Derive status.endpoint from spec.gateway.hostname as
https://{hostname}/v3 when gateway is set, while keeping the internal
cluster-local URL for reconcileHealthCheck so API readiness never
depends on the ingress path
- Append a Gateway-namespace ingress peer on port 5000 to the managed
NetworkPolicy when both spec.gateway and spec.networkPolicy are set
- Add defense-in-depth webhook validation requiring
spec.gateway.hostname and spec.gateway.parentRef.name, alongside CRD
MinLength markers
- Register the gateway-api scheme, add RBAC for httproutes and
httproutes/status (controller + Helm chart templates and unit tests),
and install Gateway API CRDs from hack/deploy-infra.sh before the
operator release
- Add envtest scheme registration, controller unit/integration tests,
reconcile_httproute_test.go, and a Chainsaw E2E covering the full
create, Accepted-flip, and delete lifecycle
- Document spec.gateway, the HTTPRouteReady condition, and
e2e-deployment Gateway API prerequisites in the CRD, reconciler,
quick-start, and infrastructure references

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants