Fix volume bind #20

Merged

t0mdavid-m merged 79 commits into main from fix_volume_bind
Apr 27, 2026
Conversation

@t0mdavid-m
Member

No description provided.

t0mdavid-m and others added 30 commits February 20, 2026 15:24
* Add Matomo Tag Manager as third analytics tracking mode

Adds Matomo Tag Manager support alongside existing Google Analytics and
Piwik Pro integrations. Includes settings.json configuration (url + tag),
build-time script injection via hook-analytics.py, Klaro GDPR consent
banner integration, and runtime consent granting via MTM data layer API.

https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h

* Fix Matomo Tag Manager snippet to match official docs

- Accept full container JS URL instead of separate url + tag fields,
  supporting both self-hosted and Matomo Cloud URL patterns
- Match the official snippet: var _mtm alias, _mtm.push shorthand
- Remove redundant type="text/javascript" attribute
- Remove unused "tag" field from settings.json

https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h

* Split Matomo config into base url + tag fields

Separate the Matomo setting into `url` (base URL, e.g.
https://cdn.matomo.cloud/openms.matomo.cloud) and `tag` (container ID,
e.g. yDGK8bfY), consistent with how other providers use a tag field.
The script constructs the full path: {url}/container_{tag}.js

https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h
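The path construction described above can be sketched as a quick shell check (the example values are the ones quoted in the commit message):

```shell
# Build the full Matomo Tag Manager container URL from the two settings fields.
url="https://cdn.matomo.cloud/openms.matomo.cloud"   # base URL ("url" field)
tag="yDGK8bfY"                                       # container ID ("tag" field)

container_js="${url}/container_${tag}.js"
echo "$container_js"
```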

* install matomo tag

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Initial plan

* fix: remove duplicate address entry in config.toml

Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
…til.SameFileError (#349)

* Initial plan

* Fix integration test failures: restore sys.modules mocks, handle SameFileError, update CI workflow

Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>

* Remove unnecessary pyopenms mock from test_topp_workflow_parameter.py, simplify test_parameter_presets.py

Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>

* Fix Windows build: correct site-packages path in cleanup step

Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
…(#351)

On Windows, 0.0.0.0 is not a valid connect address — the browser fails
to open http://0.0.0.0:8501. By removing the address entry from the
bundled .streamlit/config.toml, Streamlit defaults to localhost, which
works correctly for local deployments. Docker deployments are unaffected
as they pass --server.address 0.0.0.0 on the command line.

https://claude.ai/code/session_016amsLCZeFogTksmtk1geb5

Co-authored-by: Claude <noreply@anthropic.com>
* Add CLAUDE.md and Claude Code skills for webapp development

Adds project documentation (CLAUDE.md) and 6 skills to help developers
scaffold and extend OpenMS web applications built from this template:
- /create-page: add a new Streamlit page with proper registration
- /create-workflow: scaffold a full TOPP workflow (class + 4 pages)
- /add-python-tool: add a custom Python analysis script with auto-UI
- /add-presets: add parameter presets for workflows
- /configure-deployment: set up Docker and CI/CD for a new app
- /add-visualization: add pyopenms-viz or OpenMS-Insight visualizations

https://claude.ai/code/session_01WYotmLfqRtB8WJXj1Eosiz

* Strengthen MS domain context in CLAUDE.md and skills

Make it clear to Claude that this is THE framework for building mass
spectrometry web applications for proteomics and metabolomics research.
Add domain-specific context about MS data types, TOPP tool pipelines,
and scientific visualization needs.

https://claude.ai/code/session_01WYotmLfqRtB8WJXj1Eosiz

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Add Kubernetes manifests and CI workflows for de.NBI migration

Decompose the monolithic Docker container into Kubernetes workloads:
- Streamlit Deployment with health probes and session affinity
- Redis Deployment + Service for job queue
- RQ Worker Deployment for background workflows
- CronJob for workspace cleanup
- Ingress with WebSocket support and cookie-based sticky sessions
- Shared PVC (ReadWriteMany) for workspace data
- ConfigMap for runtime configuration (replaces build-time settings)
- Kustomize base + template-app overlay for multi-app deployment

Code changes:
- Remove unsafe enableCORS=false and enableXsrfProtection=false from config.toml
- Make workspace path configurable via WORKSPACES_DIR env var in clean-up-workspaces.py

CI/CD:
- Add build-and-push-image.yml to push Docker images to ghcr.io
- Add k8s-manifests-ci.yml for manifest validation and kind integration tests

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix kubeconform validation to skip kustomization.yaml

kustomization.yaml is a Kustomize config file, not a standard K8s resource,
so kubeconform has no schema for it. Exclude it via -ignore-filename-pattern.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Add matrix strategy to test both Dockerfiles in integration tests

The integration-test job now uses a matrix with Dockerfile_simple and
Dockerfile. Each matrix entry checks if its Dockerfile exists before
running — all steps are guarded with an `if` condition so they skip
gracefully when a Dockerfile is absent. This allows downstream forks
that only have one Dockerfile to pass CI without errors.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
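The existence guard each matrix entry applies can be approximated in shell (the actual workflow expresses this as a YAML `if:` condition; the variable name here is illustrative):

```shell
# Skip gracefully when this matrix entry's Dockerfile is absent, mirroring the
# workflow's `if` guards. DOCKERFILE stands in for the matrix variable.
DOCKERFILE="Dockerfile_simple"
if [ -f "$DOCKERFILE" ]; then
  echo "building with $DOCKERFILE"
  # docker build -f "$DOCKERFILE" -t test-image .
else
  echo "skipping: $DOCKERFILE not present"
fi
```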

* Adapt K8s base manifests for de.NBI Cinder CSI storage

- Switch workspace PVC from ReadWriteMany to ReadWriteOnce with
  cinder-csi storage class (required by de.NBI KKP cluster)
- Increase PVC storage to 500Gi
- Add namespace: openms to kustomization.yaml
- Reduce pod resource requests (1Gi/500m) and limits (8Gi/4 CPU)
  so all workspace-mounting pods fit on a single node

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Add pod affinity rules to co-locate all workspace pods on same node

The workspaces PVC uses ReadWriteOnce (Cinder CSI block storage) which
requires all pods mounting it to run on the same node. Without explicit
affinity rules, the scheduler was failing silently, leaving pods in
Pending state with no events.

Adds a `volume-group: workspaces` label and podAffinity with
requiredDuringSchedulingIgnoredDuringExecution to streamlit deployment,
rq-worker deployment, and cleanup cronjob. This ensures the scheduler
explicitly co-locates all workspace-consuming pods on the same node.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix CI: wait for ingress-nginx admission webhook before deploying

The controller pod being Ready doesn't guarantee the admission webhook
service is accepting connections. Add a polling loop that waits for the
webhook endpoint to have an IP assigned before applying the Ingress
resource, preventing "connection refused" errors during kustomize apply.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix CI: add -n openms namespace to integration test steps

The kustomize overlay deploys into the openms namespace, but the
verification steps (Redis wait, Redis ping, deployment checks) were
querying the default namespace, causing "no matching resources found".

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix CI: retry kustomize deploy for webhook readiness

Replace the unreliable endpoint-IP polling with a retry loop on
kubectl apply (up to 5 attempts with backoff). This handles the race
where the ingress-nginx admission webhook has an endpoint IP but isn't
yet accepting TCP connections.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
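A minimal sketch of such a retry loop (the attempt count comes from the commit message; the backoff schedule here is an assumption):

```shell
# Retry a command up to N times, sleeping a little longer between attempts.
retry() {
  attempts=$1; shift
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    [ "$i" -lt "$attempts" ] && sleep "$i"   # simple linear backoff
    i=$((i + 1))
  done
  return 1
}

# In CI this would wrap the flaky step, e.g.:
#   retry 5 kubectl apply -k k8s/overlays/template-app
```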

* Fix REDIS_URL to use prefixed service name in overlay

Kustomize namePrefix renames the Redis service to template-app-redis,
but the REDIS_URL env var in streamlit and rq-worker deployments still
referenced the unprefixed name "redis", causing the rq-worker to
CrashLoopBackOff with "Name or service not known".

Add JSON patches in the overlay to set the correct prefixed hostname.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Add Traefik IngressRoute for direct LB IP access

The cluster uses Traefik, not nginx, so the nginx Ingress annotations
are ignored. Add a Traefik IngressRoute with PathPrefix(/) catch-all
routing and sticky session cookie for Streamlit session affinity.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix CI: skip Traefik IngressRoute CRD in validation and integration tests

kubeconform doesn't know the Traefik IngressRoute CRD schema, and the
kind cluster in integration tests doesn't have Traefik installed. Skip
the IngressRoute in kubeconform validation and filter it out with yq
before applying to the kind cluster.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Fix IngressRoute service name for kustomize namePrefix

Kustomize namePrefix doesn't rewrite service references inside CRDs,
so the IngressRoute was pointing to 'streamlit' instead of
'template-app-streamlit', causing Traefik to return 404.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* fix: use ConfigMap as settings override instead of full replacement

The ConfigMap was replacing the entire settings.json, losing keys like
"version" and "repository-name" that the app expects (causing KeyError).
Now the ConfigMap only contains deployment-specific overrides, which are
merged into the Docker image's base settings.json at container startup
using jq.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
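The startup merge can be sketched with jq's recursive-merge operator (the file contents and paths here are stand-ins; the real entrypoint may differ):

```shell
set -euo pipefail   # fail fast if the merge goes wrong (see the follow-up commit)

# Stand-ins for the image's base settings.json and the ConfigMap overrides.
base=$(mktemp)
override=$(mktemp)
printf '%s' '{"version": "1.0.0", "repository-name": "streamlit-template", "analytics": {"mode": "none"}}' > "$base"
printf '%s' '{"analytics": {"mode": "matomo"}}' > "$override"

# jq's '*' operator deep-merges objects: override keys win, base-only keys
# such as "version" and "repository-name" survive.
jq -s '.[0] * .[1]' "$base" "$override"
```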

* fix: add set -euo pipefail to fail fast on settings merge error

Addresses CodeRabbit review: if jq merge fails, the container should
not start with unmerged settings.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* fix: change imagePullPolicy to Always for mutable main tag

With IfNotPresent, rollout restarts reuse the cached image even when a
new version has been pushed with the same tag. Always ensures Kubernetes
pulls the latest image on every pod start.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* fix: build full Dockerfile instead of Dockerfile_simple

Switch CI to build the full Docker image with OpenMS and TOPP tools,
not the lightweight pyOpenMS-only image.

https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ

* Scope IngressRoute to hostname and drop unused nginx Ingress

Traefik is the only ingress controller on the cluster; the nginx Ingress
in k8s/base/ingress.yaml was orphaned (no nginx class available) and the
overlay was patching it instead of the active Traefik IngressRoute.

- Add Host() match to the base IngressRoute (placeholder filled by overlays)
- template-app overlay patches the IngressRoute with template.webapps.openms.de
- Remove ingress.yaml from the base kustomization resources list (file kept
  in the repo for nginx-based consumers)

https://claude.ai/code/session_01YNDYJTx1eSKaL9vQe1GQzV

* fix: use PVC mount for workspaces in online mode

In online mode, src/common/common.py hard-coded workspaces_dir to the
literal ".." which, from WORKDIR /app, resolved to /. Workspace UUID
directories were therefore created on each pod's ephemeral local
filesystem instead of the shared PVC mounted at
/workspaces-streamlit-template, so the Streamlit pod and the RQ worker
each saw their own disconnected copy. The worker's params.json load in
tasks.py then hit an empty dict, producing `KeyError: 'mzML-files'` as
soon as Workflow.execution() ran.

- common.py: in the online branch, use WORKSPACES_DIR env var (default
  /workspaces-streamlit-template) so Streamlit, the RQ worker, and the
  cleanup cronjob (which already reads WORKSPACES_DIR) all agree on one
  location.
- k8s streamlit & rq-worker deployments: set WORKSPACES_DIR explicitly so
  the env is overridable and visible at deploy time.
- WorkflowManager.start_workflow: call save_parameters() before dispatch
  so the latest session state is flushed to disk, closing a small race
  where a fragment rerun could leave params.json stale when the worker
  picked up the job.

https://claude.ai/code/session_01TsxtENPpuCZ1Ap3mX2ZpHr
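The path bug and its fix can be demonstrated in shell (the default path is the one named in the commit):

```shell
# The bug: a relative workspaces_dir of ".." is resolved against the container
# WORKDIR /app, landing on the pod-local root filesystem instead of the PVC.
realpath -m /app/..    # resolves to "/"

# The fix: resolve one shared root from the environment, with the PVC mount
# point as the default, so Streamlit, the RQ worker, and the cleanup cronjob
# all agree on the same location.
WORKSPACES_DIR="${WORKSPACES_DIR:-/workspaces-streamlit-template}"
echo "$WORKSPACES_DIR"
```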

---------

Co-authored-by: Claude <noreply@anthropic.com>
* fix(ci): pin OpenMS contrib download to matching release tag

The Windows build step downloaded contrib_build-Windows.tar.gz from
OpenMS/contrib without a --tag, always pulling the latest release.
When the GH Actions cache (7-day eviction) expired, a newer contrib
got pulled that was incompatible with the pinned OpenMS release/3.5.0
source tree, breaking MSVC compilation in DIAPrescoring.cpp.

Pin the download to release/${OPENMS_VERSION} and tie the cache key
to the OpenMS version so contrib stays in lockstep with the source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): pass release tag as positional arg to gh release download

`gh release download` takes the tag as a positional argument, not a
`--tag` flag; the previous invocation silently failed to match on Windows
with the system error "The system cannot find the file specified".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: allow contrib version override via OPENMS_CONTRIB_VERSION

Adds OPENMS_CONTRIB_VERSION env var that falls back to OPENMS_VERSION
when empty. Lets us point OPENMS_VERSION at a non-release branch (e.g.
develop) while keeping the Windows contrib download pinned to a known
release tag, so CI doesn't fail on a missing contrib release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
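The fallback can be sketched with standard parameter expansion (the workflow wiring around it is an assumption):

```shell
# When OPENMS_CONTRIB_VERSION is unset or empty, fall back to OPENMS_VERSION.
OPENMS_VERSION="develop"                # may point at a non-release branch
OPENMS_CONTRIB_VERSION="release/3.5.0"  # pin contrib to a known release tag

CONTRIB_TAG="${OPENMS_CONTRIB_VERSION:-$OPENMS_VERSION}"
echo "$CONTRIB_TAG"   # -> release/3.5.0

# The Windows step then downloads the matching contrib release, e.g.:
#   gh release download "$CONTRIB_TAG" --repo OpenMS/contrib \
#     --pattern 'contrib_build-Windows.tar.gz'
```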

* chore: ignore docs/superpowers/ (local design notes)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Remove stale patches from template-app overlay

The Deployment/streamlit patch with Ingress-shaped path /spec/rules/0/host
never applied and produced a silent no-op. The duplicate IngressRoute
service-name patch was redundant with the first IngressRoute patch block.
This brings the on-disk overlay in line with the production cluster's
running version.

* Rename configure-deployment skill to configure-docker-compose-deployment

First step of splitting the skill into three focused skills
(configure-app-settings, configure-docker-compose-deployment,
configure-k8s-deployment). Rename is in its own commit so
git log --follow traces the docker-compose content cleanly.

* Scope docker-compose skill to docker-compose-only

Removes app-level content (settings.json, Dockerfile choice, production
app examples) that will live in configure-app-settings. Adds a
prerequisite note pointing to configure-app-settings.

* Add configure-app-settings skill

Covers app-level configuration (settings.json, Dockerfile choice,
README, dependencies) shared by every deployment mode. Prerequisite
for configure-docker-compose-deployment and configure-k8s-deployment.

* Fix settings.json key-field list inconsistency

The Key fields prose listed max_threads (not in the JSON sample) and
omitted enable_workspaces (which is in the sample). Align the prose
with the sample and describe max_threads separately since it is a
nested object rather than a flat field.

* Add configure-k8s-deployment skill

New skill walking through Kustomize overlay creation and kubectl apply
for deploying a forked app to Kubernetes. Patch list reflects the
three-patch canonical shape (IngressRoute match + service, streamlit
Redis URL, rq-worker Redis URL).

* Fix inline-code rendering in k8s skill

The Host(`...`) escape syntax produced literal backslashes that
broke the inline-code span when rendered by markdown parsers. Rewrite
as Host(...) without nested backticks so the span renders cleanly.

* Add K8s deployment doc — overview and architecture sections

* Add K8s deployment doc — manifest reference section

* Add K8s deployment doc — fork-and-deploy guide

* Add K8s deployment doc — CI/CD pipeline section

* Clarify PR-blocking behavior depends on branch protection

The workflow does not block merges directly — it produces a check
status that a branch-protection rule can gate on. Make the
preconditions explicit.

* Register Kubernetes Deployment page in Streamlit documentation

* Cross-link docs/deployment.md to Kubernetes deployment page

Adds a preamble listing both deployment paths and introduces a
## Docker Compose heading above the existing content. The existing
docker-compose content is preserved verbatim.

* Add smoke test for Kubernetes Deployment documentation page

Extends the parametrized test_documentation cases to cover the new
Documentation page added by this branch, closing the gap where it
was the only selectbox entry without test coverage.
ci: unified docker workflow (shadow mode)
github.repository preserves the original casing (OpenMS/streamlit-template).
Docker OCI references require lowercase, so cache-from/cache-to fail with
'invalid reference format'. docker/metadata-action handles this internally
for tags, but the cache refs bypass it. Compute IMAGE_NAME_LC once and use
it in both cache refs.
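
A minimal sketch of the lowercasing step (variable names are assumptions):

```shell
# github.repository preserves the original casing; OCI refs must be lowercase
IMAGE_NAME="OpenMS/streamlit-template"
IMAGE_NAME_LC="$(printf '%s' "$IMAGE_NAME" | tr '[:upper:]' '[:lower:]')"
echo "$IMAGE_NAME_LC"
```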
ci: lowercase image name for OCI cache refs
With push: true, docker/build-push-action pushes every tag in its tags
input. A bare name like 'openms-streamlit:simple-test' (no registry
prefix) gets resolved to Docker Hub and fails with 401 unauthorized,
because the workflow's GHCR token has no rights on docker.io.

The local tag was only needed for the kind retag step. Since load: true
already loads the image into the runner's docker daemon, we can create
the stable local alias with a plain 'docker tag' step after build,
picking any tag from docker/metadata-action's output.
ci: don't pass unprefixed local tag to buildx push
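
The retag approach can be sketched as workflow steps (the step id `meta` and the alias name are assumptions):

```yaml
# Sketch: build loads the image locally; only registry-prefixed tags are pushed
- name: Build image
  uses: docker/build-push-action@v5
  with:
    load: true
    push: ${{ github.event_name != 'pull_request' }}
    tags: ${{ steps.meta.outputs.tags }}
# Stable local alias for the kind retag step, created after the build
- name: Tag local alias
  run: |
    FIRST_TAG="$(echo "${{ steps.meta.outputs.tags }}" | head -n1)"
    docker tag "$FIRST_TAG" openms-streamlit:local
```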
ci: cut over from old docker workflows to build-and-test
The @v3 floating tag does not exist on snok/container-retention-policy
(v2 is the latest floating major tag; v3 only has v3.0.0 and v3.0.1
as exact version tags). The workflow fails to resolve the action with
'unable to find version v3'. Pin to v3.0.1 (latest v3 release).
The ENV GH_TOKEN=${GITHUB_TOKEN} at the top baked the per-run token
into an early layer, so every workflow run rebuilt from scratch.
Moved the ARG next to the one RUN that uses it (gh release download)
so earlier layers stay cacheable.
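
A sketch of the cache-friendly layout (base image, packages, and release ref are assumptions, not the actual Dockerfile):

```dockerfile
# Early layers carry no per-run values, so they stay cacheable
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y --no-install-recommends gh ca-certificates

# ARG declared immediately before the single RUN that needs it; only this
# layer rebuilds when the per-run token changes
ARG GITHUB_TOKEN
RUN GH_TOKEN=$GITHUB_TOKEN gh release download --repo OpenMS/OpenMS
```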
t0mdavid-m and others added 27 commits April 24, 2026 11:29
Mirrors the base example with overlay-specific guidance: `namePrefix`
only rewrites Kustomize-managed resources, so imperative Secrets must
still use the literal name `streamlit-secrets`.
k8s: mount admin password from streamlit-secrets Secret
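
The namePrefix interaction can be sketched as (prefix value is an assumption):

```yaml
# overlays/prod/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namePrefix: myfork-       # rewrites names of resources Kustomize manages
resources:
  - ../../base
# A Secret created imperatively, e.g.
#   kubectl create secret generic streamlit-secrets --from-literal=admin-password=...
# is outside Kustomize's view, so it must keep the literal name streamlit-secrets.
```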
Factor node placement and memory sizing out of the base manifests into
reusable Kustomize components (memory-tier-low / memory-tier-high), so
each fork picks its tier with a single line in its overlay.

- base: remove per-pod `resources` from streamlit and rq-worker
  Deployments; sizing now comes from the tier component
- base: promote redis to Guaranteed QoS (requests == limits for both
  cpu and memory) so it bottoms the kernel OOM list
- base: add LimitRange so containers without explicit resources inherit
  safe defaults (512Mi/250m request, 2Gi/2 limit, 64Gi/16 max)
- components/memory-tier-low: nodeSelector=low, streamlit 512Mi/2Gi,
  rq-worker 1Gi/16Gi (Burstable)
- components/memory-tier-high: nodeSelector=high, streamlit 512Mi/4Gi,
  rq-worker 2Gi/180Gi (Burstable — uniform across heavy workers so a
  single active app can burst into the shared pool)
- overlays: rename template-app/ to prod/ (one overlay per repo; the
  repo itself identifies the app) and pull in memory-tier-low
- docs & skill: document the new overlays/prod/ path and the one-line
  tier selector; update CI to kustomize the renamed overlay

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
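
The one-line tier selector in a fork's overlay can be sketched as:

```yaml
# overlays/prod/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
components:
  - ../../components/memory-tier-low   # or ../../components/memory-tier-high
```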
The memory-tier-low component adds nodeSelector
openms.de/memory-tier=low to every Deployment. kind clusters have no
such label, so after the rename to overlays/prod all pods stayed
Pending and 'Wait for Redis to be ready' timed out.

Label --all kind nodes in both the nginx and Traefik integration jobs
before deploying so the nodeSelector matches.

Also raise the LimitRange max.memory from 64Gi to 200Gi. The original
cap was written before memory-tier-high settled on a 180Gi rq-worker
limit; without the bump, a high-tier fork (e.g. OpenDIAKiosk) would be
rejected by admission when deployed into the shared openms namespace
after the template's LimitRange is applied.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
Completes the overlay rename started in 6c61365 now that the branch
has merged main, which added the example file under the old path.

Also rewrite two remaining docs references to overlays/<your-app-name>/
and the CI description to the new prod overlay.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
Spin up a 2-node kind cluster (control-plane labeled memory-tier=low
+ ingress-ready, worker labeled memory-tier=high) so the Build-and-Test
job passes regardless of which memory-tier component a fork's overlay
pulls in. Previously we labeled --all nodes with a single tier after
creation, which broke as soon as a fork flipped memory-tier-low to
memory-tier-high.

- .github/kind-config.yaml: 2-node topology with per-node labels.
- .github/workflows/build-and-test.yml: point both helm/kind-action
  invocations (nginx build + traefik-integration) at the config and
  drop the now-redundant dynamic label step.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
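
The described topology can be sketched as a kind config (label keys taken from the commits above):

```yaml
# .github/kind-config.yaml (sketch): 2-node cluster with per-node labels
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    labels:
      openms.de/memory-tier: "low"
      ingress-ready: "true"
  - role: worker
    labels:
      openms.de/memory-tier: "high"
```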
Previous run (2f28ed9) showed build + traefik-integration jobs still
timing out on 'Wait for Redis'. Root cause: multi-node kind clusters
apply node-role.kubernetes.io/control-plane:NoSchedule to the
control-plane, which untolerated app pods can't land on even though
the nodeSelector matches. The single-node kind used previously had
no such taint, which is why CI worked until we added a second node.

Add a kubeadmConfigPatches stanza setting nodeRegistration.taints to
the empty list so the control-plane is schedulable. Labels and
cluster shape (1 control-plane + 1 worker) stay the same.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
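
The taint-clearing stanza can be sketched as:

```yaml
# Sketch: empty taints list makes the kind control-plane schedulable
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          taints: []
```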
…imization-RoNnJ

Refactor K8s deployment to use memory-tier components
Adds a seed-demos initContainer to the Streamlit Deployment that merges
image-shipped demos into /workspaces-streamlit-template/.demos/ with
cp -rn, so new demos in an image appear after redeploy while admin-saved
demos and edits persist across redeploys.

- Point demo_workspaces.source_dirs at the PV path via the ConfigMap
  override (both streamlit and rq-worker pick this up through the jq
  settings merge at startup).
- Make get_demo_target_dir() settings-driven so "Save as Demo" writes
  to the PV, with backwards-compatible fallbacks for the legacy
  source_dir string and for environments without settings (tests).
- Skip hidden top-level dirs in clean-up-workspaces.py so the nightly
  cron does not garbage-collect .demos/.
- Document the .demos/ layout and the re-seed flow.

https://claude.ai/code/session_01Y87aULHSdyBobPdaD4L6tW
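
The merge semantics of `cp -rn` that the initContainer relies on can be demonstrated locally (directory names are illustrative):

```shell
# image/ stands in for the image-shipped demos, pv/ for the persistent volume
mkdir -p image/.demos pv/.demos
echo new      > image/.demos/new-demo.txt    # demo added in a newer image
echo original > image/.demos/existing.txt    # image copy of an existing demo
echo edited   > pv/.demos/existing.txt       # admin-edited copy on the PV
# -n never overwrites; some coreutils versions exit nonzero on skip, hence || true
cp -rn image/.demos/. pv/.demos/ || true
cat pv/.demos/existing.txt   # admin edit preserved
cat pv/.demos/new-demo.txt   # new demo appears after redeploy
```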
…-azhkG

Support configurable demo workspace source directories
The Secret used to be an out-of-band copy-the-example step, so forgetting
the resources-list edit left the pod booting with an empty admin-secrets
mount and a user-facing "Admin not configured" error for a feature that
was never wired up in the first place.

Now the Secret is committed to the base with an empty admin password and
included in k8s/base/kustomization.yaml, so kubectl apply -k always
creates it. The "Save as Demo" expander is gated on a non-empty password
and is hidden entirely (no error box) when not configured. Operators
enable the feature by patching the live Secret or by editing the file
locally with git update-index --skip-worktree, both documented.
Exception handling in is_admin_configured() is tightened to also catch
StreamlitSecretNotFoundError so a missing secrets file never raises.

https://claude.ai/code/session_01V1noocAR7uXWjWsC9oLGhz
Hide Save-as-Demo UI when admin password is not configured
Split the build+test flow into three stages so the traefik ingress
test no longer rebuilds Dockerfile_simple from scratch:

  build (matrix: full, simple)
    -> uploads each image as a workflow artifact
  test-nginx (matrix: full, simple)
    -> downloads artifact, kind loads, tests nginx ingress
  test-traefik (simple only)
    -> downloads simple artifact, kind loads, tests traefik ingress

Artifacts (not GHCR) are used because the build job only pushes on
non-PR events and fork PRs cannot auth to GHCR at all, so registry
sharing would not work for every PR path.
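
The staged job graph can be sketched as (job bodies elided; step details are assumptions):

```yaml
# Sketch of the three-stage flow
jobs:
  build:
    strategy:
      matrix: { variant: [full, simple] }
    # docker build, then actions/upload-artifact with the image tarball
  test-nginx:
    needs: build
    strategy:
      matrix: { variant: [full, simple] }
    # actions/download-artifact, kind load image-archive, nginx ingress tests
  test-traefik:
    needs: build
    # downloads the simple artifact only, traefik ingress tests
```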
Mirror the build/test-nginx matrix so the traefik ingress test also
covers the full and simple variants instead of just simple.
test-traefik (simple) failed in the combined "Wait for Redis and
deployments to be ready" step because the deployment took longer than
120s to become available, and unlike the test-nginx wait the failure
was not soft. Align test-traefik with test-nginx:

- Split Redis wait (hard, 60s) from deployment wait (soft, `|| true`).
- Bump deployment timeout 120s -> 180s in both jobs.
- Widen the curl warm-up loop from 5x2s to 30x2s in both jobs so a
  marginally late deployment is tolerated; a real failure still
  surfaces via the trailing unconditional curl.
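
The aligned wait/warm-up shape can be sketched as workflow steps (namespace, label selector, and host are assumptions):

```yaml
- name: Wait for Redis (hard failure)
  run: kubectl -n openms rollout status deployment/redis --timeout=60s
- name: Wait for deployments (soft failure)
  run: kubectl -n openms wait --for=condition=Available deploy --all --timeout=180s || true
- name: Warm-up and final check
  run: |
    # 30x2s tolerates a marginally late deployment; the trailing
    # unconditional curl still fails the job on a real outage
    for i in $(seq 1 30); do curl -fsS "http://$HOST/" && break; sleep 2; done
    curl -fsS "http://$HOST/"
```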
The previous skill was a manual find-and-replace checklist that assumed
Claude could run kubectl against the cluster. Restructure it as an
interview-driven file-editing guide with a clear handoff to a human
operator (or CI) for cluster apply.

- Drop kubectl, kubectl kustomize, and rollout-verification steps that
  Claude can't actually execute.
- Drop nginx ingress fallback; production is Traefik-only.
- Add a Step 1 recon over a fixed set of base/overlay/CI files so
  defaults are derived from the repo, and the skill bails on layouts
  it doesn't recognize.
- Replace the manual checklist with six interview questions, each
  paired with what it controls in the running deployment, the proposed
  default, and the reasoning. Slug, GHCR ref, image tag, ingress
  subdomain, memory tier, workspace storage size.
- Make storage a single 1-line edit to k8s/base/workspace-pvc.yaml when
  the user picks a non-default size; keep the PVC base name unchanged
  (namePrefix scopes it per-fork, no collisions).
- Pin the default storage size to 500Gi to match the stock base, so
  the default needs zero file edits.
- Explain that images[0].name is a Kustomize match key and must not
  change.
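
The match-key behavior can be sketched as (fork ref and tag are assumptions):

```yaml
# overlays/prod/kustomization.yaml (sketch)
images:
  - name: ghcr.io/openms/streamlit-template  # match key: must equal the image
                                             # name used in the base manifests
    newName: ghcr.io/openms/my-fork          # what actually gets deployed
    newTag: "1.0.0"
```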
Refactor CI workflow to build images once and reuse across jobs
Refactor k8s deployment skill to interview-driven overlay editing
The shared volume-group: workspaces label and required pod-affinity
attracted every fork's workspace pods onto a single node per memory
tier and deadlocked the first replica of any fork landing on an
otherwise-empty tier (no peer pod for the required affinity to match).

Per-fork RWO PVCs (<slug>-workspaces-pvc) already constrain all of
a fork's workspace-using pods to the node the volume is attached to
via the scheduler's VolumeBinding plugin, so the explicit affinity
adds nothing on top. Removing it scopes co-location naturally to one
fork and lets a fresh tier bootstrap without manual affinity-strip.

NodeSelector continues to pick the memory tier; the RWO mount picks
the specific node within that tier.
The kind integration jobs in build-and-test.yml hardcoded `template-app`
as the slug label and `template.webapps.openms.{de,org}` as the Traefik
hostnames. The configure-k8s-deployment skill rewrites those values when
a fork customizes its overlay, after which `kubectl wait -l app=...`
returns "no matching resources found" and Traefik curl tests hit the
wrong Host header. This broke OpenMS/quantms-web PR #19 on its first
overlay PR (run 24964475081).

Have test-nginx and test-traefik discover SLUG (from `commonLabels.app`)
and TRAEFIK_HOSTS (parsed from the rendered IngressRoute match) right
after deploy, and substitute them into the wait/curl steps. The nginx
hostnames stay hardcoded — they come from `k8s/base/ingress.yaml`, which
the skill never edits and Kustomize doesn't rewrite.

Update the configure-k8s-deployment skill to (a) check during recon that
the workflow uses dynamic discovery, (b) flag forks still on the old
hardcoded shape so the skill applies the patch before editing the
overlay, and (c) note in the handoff that no fork-specific workflow
edits are needed.
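
The discovery step can be sketched in plain shell (the file contents and the Host(...) match shape here are assumptions standing in for the rendered manifests):

```shell
# SLUG from the overlay's commonLabels.app
cat > kustomization.yaml <<'EOF'
commonLabels:
  app: template-app
EOF
SLUG="$(awk '/^commonLabels:/{found=1} found && /app:/{print $2; exit}' kustomization.yaml)"

# Hostnames parsed out of a rendered IngressRoute match expression
MATCH='Host(`template.webapps.openms.de`) || Host(`template.webapps.openms.org`)'
TRAEFIK_HOSTS="$(printf '%s\n' "$MATCH" | grep -o 'Host(`[^`]*`)' | sed 's/Host(`//; s/`)//')"
echo "$SLUG"
echo "$TRAEFIK_HOSTS"
```

Later wait/curl steps then interpolate `$SLUG` and loop over `$TRAEFIK_HOSTS` instead of hardcoding the template's values.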
Remove pod-affinity rules; rely on RWO PVC for co-location
Make CI integration tests discover app slug and hosts dynamically

@t0mdavid-m t0mdavid-m merged commit 915f894 into main Apr 27, 2026
6 checks passed