k8s: serve template apps on both webapps.openms.de and .org #372

Merged: t0mdavid-m merged 8 commits into main from feature/dual-host-k8s on Apr 21, 2026

@t0mdavid-m t0mdavid-m commented Apr 21, 2026

Summary

  • Template-derived apps now serve on both <app>.webapps.openms.de and <app>.webapps.openms.org simultaneously (dual-serve, no redirect).
  • Traefik IngressRoute overlay match becomes (Host(`…de`) || Host(`…org`)) && PathPrefix(`/`). Outer parens preserve precedence (&& binds tighter than ||). The nginx Ingress fallback mirrors the pattern with two rules[] entries.
  • CI gains a dual-host curl assertion on the existing nginx kind integration AND a new traefik-integration job that exercises the IngressRoute end-to-end (closes a long-standing gap where the IngressRoute was lint-skipped and apply-stripped).
  • Bonus fix: existing nginx integration silently passed because the cinder-csi PVC kept pods Pending and the kubectl wait was masked by || true. Both deploy steps now sed-patch storageClassName: cinder-csi → standard so the workspace PVC actually binds in kind.
  • Skill configure-k8s-deployment and docs/kubernetes-deployment.md updated for the dual-host pattern, the precedence-paren caveat, and the per-host sticky-cookie behaviour.

The change is additive — existing single-host setups keep working because Host(A) || Host(B) accepts either. Existing forks (quantms-web, umetaflow, FLASHApp) live in their own forks and adopt by pulling the template; the manifest changes are merge-friendly (overlay diff is a single-line value replacement; base ingress is a pure append).
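
The storage-class patch from the bonus fix can be sketched as a small stream filter over the rendered manifests. This is an illustration only, not the workflow's actual step: the helper name `patch_for_kind` is hypothetical, and the commented usage assumes `kubectl` and a running kind cluster.

```shell
# Hypothetical helper: rewrite the production storage class to kind's
# default class ("standard") in a manifest stream read from stdin.
patch_for_kind() {
  sed -e 's/storageClassName: cinder-csi/storageClassName: standard/'
}

# Possible use (left commented out because it needs kubectl and a cluster):
#   kubectl kustomize k8s/overlays/template-app/ | patch_for_kind | kubectl apply -f -
#   kubectl wait --for=condition=available deployment --all --timeout=120s
# There is deliberately no trailing `|| true` on the wait, so a pod that
# stays Pending fails the job instead of being masked.
```

Keeping the substitution in a filter means the committed manifests still name `cinder-csi`; only the stream applied to kind is changed.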

Test plan

  • lint-manifests passes.
  • build matrix (full + simple) passes; the new "Curl both hostnames via nginx ingress" step shows streamlit.openms.example.de -> 200 OK and streamlit.openms.example.org -> 200 OK (these are the nginx Ingress's base hostnames).
  • traefik-integration passes; "Curl both hostnames via Traefik" shows template.webapps.openms.de -> 200 OK and template.webapps.openms.org -> 200 OK (these are the IngressRoute hostnames patched by the overlay).
  • Operationally, before relying on .org: confirm <app>.webapps.openms.org DNS points at the cluster ingress (out of scope for CI, your DNS responsibility).
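
The dual-host curl assertions in the test plan could look roughly like the sketch below. The function names and the local port are hypothetical; the sketch assumes a `kubectl port-forward` to the ingress controller is already listening on that port, and it selects the route by sending an explicit `Host` header rather than relying on DNS.

```shell
# Hypothetical helper: request "/" through a local port-forward while
# presenting the given hostname, and print only the HTTP status code.
status_for_host() {
  local host="$1" port="$2"
  curl -s -o /dev/null -w '%{http_code}' -H "Host: ${host}" "http://127.0.0.1:${port}/"
}

# Return non-zero unless every listed hostname answers 200 on the given port.
assert_hosts_ok() {
  local port="$1"; shift
  local host code
  for host in "$@"; do
    # If curl itself fails (for example, connection refused), report 000.
    code="$(status_for_host "$host" "$port")" || code="000"
    echo "${host} -> ${code}"
    [ "$code" = "200" ] || return 1
  done
}

# Possible use, with 8080 as an assumed local port:
#   assert_hosts_ok 8080 streamlit.openms.example.de streamlit.openms.example.org
```

The same helper would cover the Traefik job by passing the `template.webapps.openms.de` and `template.webapps.openms.org` hostnames instead.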

Summary by CodeRabbit

  • New Features

    • Added support for routing applications through both .de and .org domain variants with automatic sticky-session handling across domains.
  • Tests

    • Added comprehensive health verification for multi-domain ingress configurations.
    • Introduced Traefik ingress controller integration testing.
  • Documentation

    • Updated deployment guides to reflect dual-domain routing setup and cross-domain session handling.

Updates the Traefik IngressRoute match in the template-app overlay to
accept both Host() values, and mirrors the same dual-host pattern in
the nginx Ingress fallback (two rules entries, same backend).

Outer parentheses on the || group are required for correct precedence
against PathPrefix.

Adds a dual-host curl assertion to the existing nginx kind integration
and a new traefik-integration job that brings up Traefik via Helm,
deploys the full overlay (no IngressRoute filter), and curls both
hostnames through the IngressRoute.

The traefik-integration job runs once on Dockerfile_simple — ingress
routing is image-agnostic, and adding the full image variant would
double the runtime without catching new regressions.

The cinder-csi storage class isn't available in kind clusters. Patch
it to 'standard' (kind's default local-path-provisioner) at apply
time, alongside the existing imagePullPolicy substitution. Without
this, the workspace PVC stays unbound, streamlit and rq-worker pods
stay Pending, and the new dual-host curl assertions fail with 503.

The existing 'Verify all deployments are available' step had been
masking this with '|| true' since the integration test was added.

Also wire up a trap-based EXIT cleanup for the kubectl port-forward
processes; the previous trailing 'kill' line was unreachable under
set -e if any curl assertion failed.
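
A minimal sketch of the trap-based cleanup described in this commit message, under the assumption that the port-forwards are started as background jobs. The names `PF_PIDS`, `start_background`, and `cleanup_port_forwards` are hypothetical and not taken from the workflow.

```shell
# Space-separated list of background PIDs to clean up on exit.
PF_PIDS=""

cleanup_port_forwards() {
  # Word-splitting on $PF_PIDS is intentional: it holds a list of PIDs.
  for pid in $PF_PIDS; do
    kill "$pid" 2>/dev/null || true
  done
}

# The EXIT trap also fires when `set -e` aborts the script on a failed
# assertion, which a trailing `kill` line at the end of the script does not.
trap cleanup_port_forwards EXIT

# Start a command in the background and record its PID for cleanup.
start_background() {
  "$@" &
  PF_PIDS="$PF_PIDS $!"
}

# Possible use (the service name is a placeholder, not from the workflow):
#   start_background kubectl port-forward svc/<ingress-service> 8080:80
```
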
Updates the overlay-edit step to require editing both Host() values
(.de and .org) plus the parallel nginx Ingress two-rules pattern.
Updates the verification checklist accordingly.

…ginx patch

CommonMark code spans don't process backslash escapes for backticks,
so `Host(\`…\`)` rendered as broken fragments. Wrap with double
backticks instead — the inner backticks are then literal.

Also clarify the nginx fallback note: 'patch both rules[].host
entries' could be misread as directly editing the shared base file;
'add an overlay patch for both rules[].host entries' is unambiguous.

Updates the architecture diagram, manifest reference, customization
table, and CI/CD section to describe the dual-host (.de + .org)
default. Adds a short subsection on the per-host stroute cookie and
why cross-TLD switches are harmless.

…ches in Job 3

Two factual errors caught in review:

- "both jobs run on pull requests" was true with 2 jobs, but there
  are now 3 (lint-manifests, build, traefik-integration). All three
  run on PRs.
- Job 3's description omitted that the deploy step still patches
  imagePullPolicy and storageClassName for kind compatibility, even
  though it doesn't filter the IngressRoute. Job 2's description
  already mentions both patches; Job 3 should be parallel.

The nginx Ingress is unpatched by the overlay, so it retains its base
hostnames (streamlit.openms.example.de / .org) from k8s/base/ingress.yaml.
The previous curl step used the Traefik IngressRoute hostnames
(template.webapps.openms.*), which the nginx ingress controller does
not match — every request 404'd.

Traefik's curl step is unchanged: the IngressRoute IS patched to the
template.webapps.openms.* hostnames, so those are correct there.

coderabbitai Bot commented Apr 21, 2026

Walkthrough

The changes extend Kubernetes ingress routing to support dual TLDs (.de and .org). Updates include adding a second ingress rule to the base Ingress manifest, modifying the Traefik IngressRoute patch to match either hostname using logical OR, extending CI workflows to validate both TLDs via HTTP health checks, and updating documentation and skill guides accordingly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Kubernetes Manifests: `k8s/base/ingress.yaml`, `k8s/overlays/template-app/kustomization.yaml` | Added second Ingress rule for `.org` TLD; updated Traefik IngressRoute patch to match both `.de` and `.org` hosts using logical OR. |
| CI/Workflow Enhancements: `.github/workflows/build-and-test.yml` | Added storage class replacement (`cinder-csi` → `standard`); introduced HTTP health validation for nginx deployment (dual-host curl checks via port-forward); added new `traefik-integration` job with Helm-based Traefik setup and dual-host verification. |
| Documentation & Guidance: `docs/kubernetes-deployment.md`, `.claude/skills/configure-k8s-deployment.md` | Updated dual TLD routing documentation, sticky-cookie behavior across hosts, nginx overlay patching guidance, and CI pipeline job descriptions; revised skill guide instructions for IngressRoute and nginx Ingress configuration. |

Poem

🐰 Two domains hop along with glee,
.de and .org, both wild and free,
The ingress routes with OR delight,
Each path now glows with double-bright!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title accurately summarizes the main change: adding dual-host (.de and .org) support for serving template apps on the webapps.openms domain. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/configure-k8s-deployment.md:
- Line 39: Update the overlay guidance to note that for nginx-only clusters you
must also remove or exclude the Traefik CRD resource (the IngressRoute) from the
kustomize overlay; patching only the base Ingress's rules[].host entries is
insufficient because kubectl apply -k will fail if the Traefik IngressRoute
resource is present but the CRD is missing. Instruct users to either add a
kustomize transformer/patch that deletes or filters out the IngressRoute
resource (referencing the IngressRoute kind/name) or to install the Traefik CRD
in the target cluster, and clarify this alongside the instructions for patching
the base Ingress rules[].host entries (.de / .org pattern).

In @.github/workflows/build-and-test.yml:
- Around line 172-181: The workflow builds and loads openms-streamlit:test but
kustomize renders the image as ghcr.io/openms/streamlit-template:main-full, so
update the Docker build and kind load steps to build/tag and load the exact
rendered image reference (e.g., docker build -t
ghcr.io/openms/streamlit-template:main-full ...) and then run kind load
docker-image ghcr.io/openms/streamlit-template:main-full --name traefik-test (or
programmatically extract the image from kubectl kustomize
k8s/overlays/template-app/ and use that tag) so the cluster has the exact image
referenced by the deployment (respecting imagePullPolicy: Never).

In `@docs/kubernetes-deployment.md`:
- Line 182: Update the docs to tell nginx-only users how to handle Traefik CRD
artifacts: explain that the base includes an IngressRoute and that if Traefik is
not installed they must remove or filter the IngressRoute resource (or add a
kustomize patch to delete it) and instead patch the nginx Ingress rules[].host
entries (the two host entries using the .de / .org pattern) so the Ingress (not
IngressRoute) is correctly configured; refer explicitly to the IngressRoute
resource name and the Ingress.rules[].host fields when describing the required
overlay changes.
- Around line 89-91: Update the paragraph under "Sticky cookie behaviour across
hosts" to correct the pod-affinity explanation: state that the per-host
`stroute` sticky cookie (not pod affinity) binds a user to a specific Streamlit
pod, and change the sentence referencing "Pod affinity exists to keep the
WebSocket warm and reuse Streamlit's in-process script cache" to explain that
pod affinity is used to co-locate PVC-using pods on the same node (helping
locality and performance), not to preserve user session affinity; keep the note
that session correctness relies on shared Redis and PVC.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f93694c3-18e7-4c89-9573-af1055540c55

📥 Commits

Reviewing files that changed from the base of the PR and between 25ed32c and e840358.

📒 Files selected for processing (5)
  • .claude/skills/configure-k8s-deployment.md
  • .github/workflows/build-and-test.yml
  • docs/kubernetes-deployment.md
  • k8s/base/ingress.yaml
  • k8s/overlays/template-app/kustomization.yaml

- In both Deployment patches (`streamlit` and `rq-worker`), update the Redis URL from `redis://template-app-redis:6379/0` to `redis://<your-app-name>-redis:6379/0`

-The overlay leaves the nginx `Ingress` unpatched because production deployments use Traefik. If you are deploying to an nginx-only cluster, substitute an Ingress host patch for the IngressRoute patch.
+The overlay leaves the nginx `Ingress` unpatched because production deployments use Traefik. If you are deploying to an nginx-only cluster, add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern) instead of the IngressRoute.

⚠️ Potential issue | 🟡 Minor

Mention removing or excluding the Traefik IngressRoute for nginx-only clusters.

Patching the nginx rules[].host entries is not sufficient if the target cluster lacks the Traefik CRD; kubectl apply -k can still fail on the included IngressRoute. Please add that nginx-only overlays must also remove/filter the Traefik resource or install the CRD.
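
One way to filter the Traefik resource at apply time is a stream filter that drops every `IngressRoute` document from the rendered output. The sketch below is an assumption about how that could be done, not the template's own mechanism; the helper name `drop_ingressroute` is hypothetical, and it assumes documents are separated by `---` lines with `kind:` at the top level.

```shell
# Hypothetical helper: read a multi-document YAML stream on stdin and
# print it without any document whose top-level kind is IngressRoute.
drop_ingressroute() {
  awk '
    # Print the buffered document unless it was marked for dropping.
    function emit() { if (!drop) printf "%s", buf; buf = ""; drop = 0 }
    /^---[[:space:]]*$/ { emit(); buf = $0 "\n"; next }
    /^kind:[[:space:]]*IngressRoute[[:space:]]*$/ { drop = 1 }
    { buf = buf $0 "\n" }
    END { emit() }
  '
}

# Possible use on an nginx-only cluster (needs kubectl, so commented out):
#   kubectl kustomize k8s/overlays/template-app/ | drop_ingressroute | kubectl apply -f -
```

The filter only removes the resource; the nginx `Ingress` host entries still need their own overlay patch.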

📝 Proposed wording
-   The overlay leaves the nginx `Ingress` unpatched because production deployments use Traefik. If you are deploying to an nginx-only cluster, add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern) instead of the IngressRoute.
+   The overlay leaves the nginx `Ingress` unpatched because production deployments use Traefik. If you are deploying to an nginx-only cluster, remove or filter the Traefik `IngressRoute` resource unless the CRD is installed, and add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern).

Comment on lines +172 to +181
      - name: Build image (simple variant; routing is image-agnostic)
        run: docker build -t openms-streamlit:test -f Dockerfile_simple .

      - name: Create kind cluster
        uses: helm/kind-action@v1
        with:
          cluster_name: traefik-test

      - name: Load image into kind cluster
        run: kind load docker-image openms-streamlit:test --name traefik-test

⚠️ Potential issue | 🟠 Major

Load the image tag that the rendered overlay actually uses.

The new Traefik job builds/loads openms-streamlit:test, but kubectl kustomize k8s/overlays/template-app/ renders the image as ghcr.io/openms/streamlit-template:main-full from the overlay. Since line 198 forces imagePullPolicy: Never, kind will not pull that rendered tag and the deployment can fail with an image-not-found state.
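
To check whether this finding applies, the image references in the rendered overlay can be listed and compared with the tag loaded into kind. The helper below is a hypothetical sketch; it assumes unquoted `image: <ref>` lines in the rendered YAML.

```shell
# Hypothetical helper: list the distinct image references found in a
# rendered manifest stream read from stdin.
rendered_images() {
  awk '
    $1 == "image:" { print $2 }
    $1 == "-" && $2 == "image:" { print $3 }
  ' | sort -u
}

# Possible use (needs kubectl, docker, and kind, so commented out):
#   kubectl kustomize k8s/overlays/template-app/ | rendered_images
#   docker tag openms-streamlit:test ghcr.io/openms/streamlit-template:main-full
#   kind load docker-image ghcr.io/openms/streamlit-template:main-full --name traefik-test
```

Re-tagging the locally built image is an alternative to changing the `docker build -t` argument; either way the cluster ends up holding the reference the Deployment names.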

🐛 Proposed fix: tag and load the rendered image reference
       - name: Build image (simple variant; routing is image-agnostic)
-        run: docker build -t openms-streamlit:test -f Dockerfile_simple .
+        run: docker build -t ghcr.io/openms/streamlit-template:main-full -f Dockerfile_simple .

@@
       - name: Load image into kind cluster
-        run: kind load docker-image openms-streamlit:test --name traefik-test
+        run: kind load docker-image ghcr.io/openms/streamlit-template:main-full --name traefik-test

Also applies to: 195-199

Comment on lines +89 to +91
#### Sticky cookie behaviour across hosts

Both Traefik and nginx attach a per-host `stroute` sticky cookie to bind a user to a specific Streamlit pod. Because cookies are scoped to the host that set them, a user who switches mid-session from `<app>.webapps.openms.de` to `<app>.webapps.openms.org` will be re-stuck to a (potentially different) pod. This is harmless: workspace and queue state live in Redis and the shared workspace PVC, so the new pod sees the same data. Pod affinity exists to keep the WebSocket warm and reuse Streamlit's in-process script cache, not for correctness.

⚠️ Potential issue | 🟡 Minor

Correct the pod-affinity explanation.

Pod affinity does not keep a user on the same Streamlit pod; the sticky cookie does. Pod affinity is for co-locating PVC-using pods on the same node.

📝 Proposed wording
-Both Traefik and nginx attach a per-host `stroute` sticky cookie to bind a user to a specific Streamlit pod. Because cookies are scoped to the host that set them, a user who switches mid-session from `<app>.webapps.openms.de` to `<app>.webapps.openms.org` will be re-stuck to a (potentially different) pod. This is harmless: workspace and queue state live in Redis and the shared workspace PVC, so the new pod sees the same data. Pod affinity exists to keep the WebSocket warm and reuse Streamlit's in-process script cache, not for correctness.
+Both Traefik and nginx attach a per-host `stroute` sticky cookie to bind a user to a specific Streamlit pod. Because cookies are scoped to the host that set them, a user who switches mid-session from `<app>.webapps.openms.de` to `<app>.webapps.openms.org` will be re-stuck to a (potentially different) pod. This is harmless: workspace and queue state live in Redis and the shared workspace PVC, so the new pod sees the same data. Sticky sessions keep the WebSocket warm and reuse Streamlit's in-process script cache; pod affinity only co-locates PVC-using pods on the same node.

| Redis URL in both Deployment patches (`redis://template-app-redis:6379/0`) | `redis://<your-app-name>-redis:6379/0` |

-The overlay leaves the nginx `Ingress` unpatched because Traefik is the production ingress. If you are deploying to an nginx-only cluster, substitute an Ingress host patch for the IngressRoute patch.
+The overlay leaves the nginx `Ingress` unpatched because Traefik is the production ingress. If you are deploying to an nginx-only cluster, add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern) instead of the IngressRoute patch.

⚠️ Potential issue | 🟡 Minor

Document the required Traefik resource handling for nginx-only clusters.

If the Traefik CRD is not installed, applying this base still includes an IngressRoute, so nginx-only users need guidance to remove/filter that resource in addition to patching the nginx hosts.

📝 Proposed wording
-The overlay leaves the nginx `Ingress` unpatched because Traefik is the production ingress. If you are deploying to an nginx-only cluster, add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern) instead of the IngressRoute patch.
+The overlay leaves the nginx `Ingress` unpatched because Traefik is the production ingress. If you are deploying to an nginx-only cluster, remove or filter the Traefik `IngressRoute` resource unless the CRD is installed, and add an overlay patch for both `rules[].host` entries in the base `Ingress` (same `.de` / `.org` pattern).

@t0mdavid-m t0mdavid-m merged commit 636da9d into main Apr 21, 2026
10 checks passed