Fix volume bind #20
* Add Matomo Tag Manager as third analytics tracking mode
  Adds Matomo Tag Manager support alongside the existing Google Analytics and Piwik Pro integrations. Includes settings.json configuration (url + tag), build-time script injection via hook-analytics.py, Klaro GDPR consent banner integration, and runtime consent granting via the MTM data layer API.
  https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h
* Fix Matomo Tag Manager snippet to match official docs
  - Accept the full container JS URL instead of separate url + tag fields, supporting both self-hosted and Matomo Cloud URL patterns
  - Match the official snippet: var _mtm alias, _mtm.push shorthand
  - Remove the redundant type="text/javascript" attribute
  - Remove the unused "tag" field from settings.json
  https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h
* Split Matomo config into base url + tag fields
  Separate the Matomo setting into `url` (base URL, e.g. https://cdn.matomo.cloud/openms.matomo.cloud) and `tag` (container ID, e.g. yDGK8bfY), consistent with how other providers use a tag field. The script constructs the full path: {url}/container_{tag}.js
  https://claude.ai/code/session_0165AXHkmRZ6bx23n7Tbyz8h
* install matomo tag
---------
Co-authored-by: Claude <noreply@anthropic.com>
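The path construction the last commit describes can be sketched in a few lines of Python; the helper name is illustrative, not the template's actual code:

```python
# Sketch of the container-script path construction from the commit
# message ("The script constructs the full path: {url}/container_{tag}.js").
def matomo_container_src(url: str, tag: str) -> str:
    """Join the Matomo base URL and container ID into the script src."""
    return f"{url.rstrip('/')}/container_{tag}.js"

src = matomo_container_src("https://cdn.matomo.cloud/openms.matomo.cloud", "yDGK8bfY")
```

Stripping a trailing slash from the base URL keeps the result valid for both URL styles mentioned in the commits.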
* Initial plan
* fix: remove duplicate address entry in config.toml
  Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
…til.SameFileError (#349)
* Initial plan
* Fix integration test failures: restore sys.modules mocks, handle SameFileError, update CI workflow
  Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
* Remove unnecessary pyopenms mock from test_topp_workflow_parameter.py, simplify test_parameter_presets.py
  Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
* Fix Windows build: correct site-packages path in cleanup step
  Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: t0mdavid-m <57191390+t0mdavid-m@users.noreply.github.com>
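The SameFileError handling mentioned above usually takes this shape; a minimal sketch, assuming the common treat-as-no-op remedy (the repo's actual fix may differ):

```python
import shutil

# shutil.copy raises shutil.SameFileError when src and dst resolve to
# the same file; treating that case as a no-op is the usual remedy.
def copy_ignore_same(src: str, dst: str) -> None:
    try:
        shutil.copy(src, dst)
    except shutil.SameFileError:
        pass  # src and dst are the same file: nothing to do
```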
…(#351) On Windows, 0.0.0.0 is not a valid connect address — the browser fails to open http://0.0.0.0:8501. By removing the address entry from the bundled .streamlit/config.toml, Streamlit defaults to localhost, which works correctly for local deployments. Docker deployments are unaffected as they pass --server.address 0.0.0.0 on the command line. https://claude.ai/code/session_016amsLCZeFogTksmtk1geb5 Co-authored-by: Claude <noreply@anthropic.com>
* Add CLAUDE.md and Claude Code skills for webapp development
  Adds project documentation (CLAUDE.md) and 6 skills to help developers scaffold and extend OpenMS web applications built from this template:
  - /create-page: add a new Streamlit page with proper registration
  - /create-workflow: scaffold a full TOPP workflow (class + 4 pages)
  - /add-python-tool: add a custom Python analysis script with auto-UI
  - /add-presets: add parameter presets for workflows
  - /configure-deployment: set up Docker and CI/CD for a new app
  - /add-visualization: add pyopenms-viz or OpenMS-Insight visualizations
  https://claude.ai/code/session_01WYotmLfqRtB8WJXj1Eosiz
* Strengthen MS domain context in CLAUDE.md and skills
  Make it clear to Claude that this is THE framework for building mass spectrometry web applications for proteomics and metabolomics research. Add domain-specific context about MS data types, TOPP tool pipelines, and scientific visualization needs.
  https://claude.ai/code/session_01WYotmLfqRtB8WJXj1Eosiz
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Add Kubernetes manifests and CI workflows for de.NBI migration
  Decompose the monolithic Docker container into Kubernetes workloads:
  - Streamlit Deployment with health probes and session affinity
  - Redis Deployment + Service for job queue
  - RQ Worker Deployment for background workflows
  - CronJob for workspace cleanup
  - Ingress with WebSocket support and cookie-based sticky sessions
  - Shared PVC (ReadWriteMany) for workspace data
  - ConfigMap for runtime configuration (replaces build-time settings)
  - Kustomize base + template-app overlay for multi-app deployment
  Code changes:
  - Remove unsafe enableCORS=false and enableXsrfProtection=false from config.toml
  - Make workspace path configurable via WORKSPACES_DIR env var in clean-up-workspaces.py
  CI/CD:
  - Add build-and-push-image.yml to push Docker images to ghcr.io
  - Add k8s-manifests-ci.yml for manifest validation and kind integration tests
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix kubeconform validation to skip kustomization.yaml
  kustomization.yaml is a Kustomize config file, not a standard K8s resource, so kubeconform has no schema for it. Exclude it via -ignore-filename-pattern.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Add matrix strategy to test both Dockerfiles in integration tests
  The integration-test job now uses a matrix with Dockerfile_simple and Dockerfile. Each matrix entry checks if its Dockerfile exists before running; all steps are guarded with an `if` condition so they skip gracefully when a Dockerfile is absent. This allows downstream forks that only have one Dockerfile to pass CI without errors.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Adapt K8s base manifests for de.NBI Cinder CSI storage
  - Switch workspace PVC from ReadWriteMany to ReadWriteOnce with cinder-csi storage class (required by de.NBI KKP cluster)
  - Increase PVC storage to 500Gi
  - Add namespace: openms to kustomization.yaml
  - Reduce pod resource requests (1Gi/500m) and limits (8Gi/4 CPU) so all workspace-mounting pods fit on a single node
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Add pod affinity rules to co-locate all workspace pods on same node
  The workspaces PVC uses ReadWriteOnce (Cinder CSI block storage), which requires all pods mounting it to run on the same node. Without explicit affinity rules, the scheduler was failing silently, leaving pods in Pending state with no events. Adds a `volume-group: workspaces` label and podAffinity with requiredDuringSchedulingIgnoredDuringExecution to the streamlit deployment, rq-worker deployment, and cleanup cronjob, so the scheduler explicitly co-locates all workspace-consuming pods on the same node.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix CI: wait for ingress-nginx admission webhook before deploying
  The controller pod being Ready doesn't guarantee the admission webhook service is accepting connections. Add a polling loop that waits for the webhook endpoint to have an IP assigned before applying the Ingress resource, preventing "connection refused" errors during kustomize apply.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix CI: add -n openms namespace to integration test steps
  The kustomize overlay deploys into the openms namespace, but the verification steps (Redis wait, Redis ping, deployment checks) were querying the default namespace, causing "no matching resources found".
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix CI: retry kustomize deploy for webhook readiness
  Replace the unreliable endpoint-IP polling with a retry loop on kubectl apply (up to 5 attempts with backoff). This handles the race where the ingress-nginx admission webhook has an endpoint IP but isn't yet accepting TCP connections.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
---------
Co-authored-by: Claude <noreply@anthropic.com>
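The retry loop the last commit describes has this general shape; a sketch in Python rather than the workflow's shell step, with the command, attempt count, and backoff all illustrative:

```python
import subprocess
import time

def run_with_retry(cmd: list, attempts: int = 5, base_delay: float = 2.0) -> bool:
    """Run `cmd`, retrying with linear backoff; True once it exits 0."""
    for i in range(attempts):
        if subprocess.run(cmd).returncode == 0:
            return True
        if i < attempts - 1:
            time.sleep(base_delay * (i + 1))  # back off before the next try
    return False

# e.g. run_with_retry(["kubectl", "apply", "-k", "k8s/overlays/prod"])
```

Retrying the apply itself, instead of polling webhook endpoints, sidesteps the gap between "endpoint has an IP" and "webhook accepts TCP connections".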
* Fix REDIS_URL to use prefixed service name in overlay
  Kustomize namePrefix renames the Redis service to template-app-redis, but the REDIS_URL env var in streamlit and rq-worker deployments still referenced the unprefixed name "redis", causing the rq-worker to CrashLoopBackOff with "Name or service not known". Add JSON patches in the overlay to set the correct prefixed hostname.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Add Traefik IngressRoute for direct LB IP access
  The cluster uses Traefik, not nginx, so the nginx Ingress annotations are ignored. Add a Traefik IngressRoute with PathPrefix(/) catch-all routing and sticky session cookie for Streamlit session affinity.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix CI: skip Traefik IngressRoute CRD in validation and integration tests
  kubeconform doesn't know the Traefik IngressRoute CRD schema, and the kind cluster in integration tests doesn't have Traefik installed. Skip the IngressRoute in kubeconform validation and filter it out with yq before applying to the kind cluster.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* Fix IngressRoute service name for kustomize namePrefix
  Kustomize namePrefix doesn't rewrite service references inside CRDs, so the IngressRoute was pointing to 'streamlit' instead of 'template-app-streamlit', causing Traefik to return 404.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* fix: use ConfigMap as settings override instead of full replacement
  The ConfigMap was replacing the entire settings.json, losing keys like "version" and "repository-name" that the app expects (causing KeyError). Now the ConfigMap only contains deployment-specific overrides, which are merged into the Docker image's base settings.json at container startup using jq.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* fix: add set -euo pipefail to fail fast on settings merge error
  Addresses CodeRabbit review: if jq merge fails, the container should not start with unmerged settings.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
---------
Co-authored-by: Claude <noreply@anthropic.com>
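The ConfigMap-as-override commit relies on jq's `*` operator, which merges objects recursively. A Python model of that merge, assuming illustrative key names apart from the "version" and "repository-name" keys named in the commit:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursive merge like jq's `.[0] * .[1]`: nested dicts merge
    key-by-key, values in `override` win, keys missing from `override`
    survive from `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Base settings baked into the image keep the keys the ConfigMap omits.
image_settings = {"version": "1.2.0", "repository-name": "streamlit-template",
                  "analytics": {"enabled": False}}
configmap_overrides = {"analytics": {"enabled": True}}
settings = deep_merge(image_settings, configmap_overrides)
```

Because only overrides live in the ConfigMap, app-level keys such as "version" can no longer be lost at startup.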
* fix: change imagePullPolicy to Always for mutable main tag
  With IfNotPresent, rollout restarts reuse the cached image even when a new version has been pushed with the same tag. Always ensures Kubernetes pulls the latest image on every pod start.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
* fix: build full Dockerfile instead of Dockerfile_simple
  Switch CI to build the full Docker image with OpenMS and TOPP tools, not the lightweight pyOpenMS-only image.
  https://claude.ai/code/session_01RNJ3dVjV1VTHcC9ugE3FQJ
---------
Co-authored-by: Claude <noreply@anthropic.com>
* Scope IngressRoute to hostname and drop unused nginx Ingress
  Traefik is the only ingress controller on the cluster; the nginx Ingress in k8s/base/ingress.yaml was orphaned (no nginx class available) and the overlay was patching it instead of the active Traefik IngressRoute.
  - Add Host() match to the base IngressRoute (placeholder filled by overlays)
  - template-app overlay patches the IngressRoute with template.webapps.openms.de
  - Remove ingress.yaml from the base kustomization resources list (file kept in the repo for nginx-based consumers)
  https://claude.ai/code/session_01YNDYJTx1eSKaL9vQe1GQzV
* fix: use PVC mount for workspaces in online mode
  In online mode, src/common/common.py hard-coded workspaces_dir to the literal ".." which, from WORKDIR /app, resolved to /. Workspace UUID directories were therefore created on each pod's ephemeral local filesystem instead of the shared PVC mounted at /workspaces-streamlit-template, so the Streamlit pod and the RQ worker each saw their own disconnected copy. The worker's params.json load in tasks.py then hit an empty dict, producing `KeyError: 'mzML-files'` as soon as Workflow.execution() ran.
  - common.py: in the online branch, use the WORKSPACES_DIR env var (default /workspaces-streamlit-template) so Streamlit, the RQ worker, and the cleanup cronjob (which already reads WORKSPACES_DIR) all agree on one location.
  - k8s streamlit & rq-worker deployments: set WORKSPACES_DIR explicitly so the env is overridable and visible at deploy time.
  - WorkflowManager.start_workflow: call save_parameters() before dispatch so the latest session state is flushed to disk, closing a small race where a fragment rerun could leave params.json stale when the worker picked up the job.
  https://claude.ai/code/session_01TsxtENPpuCZ1Ap3mX2ZpHr
---------
Co-authored-by: Claude <noreply@anthropic.com>
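The path resolution the workspaces fix describes can be sketched as follows; the function name is illustrative (the real logic lives in src/common/common.py):

```python
import os

def resolve_workspaces_dir(online: bool) -> str:
    """Online mode reads WORKSPACES_DIR (default matches the shared PVC
    mount), so Streamlit, the RQ worker, and the cleanup cronjob all
    agree on one location; local mode keeps the relative path."""
    if online:
        return os.environ.get("WORKSPACES_DIR", "/workspaces-streamlit-template")
    return ".."
```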
* fix(ci): pin OpenMS contrib download to matching release tag
The Windows build step downloaded contrib_build-Windows.tar.gz from
OpenMS/contrib without a --tag, always pulling the latest release.
When the GH Actions cache (7-day eviction) expired, a newer contrib
got pulled that was incompatible with the pinned OpenMS release/3.5.0
source tree, breaking MSVC compilation in DIAPrescoring.cpp.
Pin the download to release/${OPENMS_VERSION} and tie the cache key
to the OpenMS version so contrib stays in lockstep with the source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): pass release tag as positional arg to gh release download
`gh release download` takes the tag as a positional argument, not a
`--tag` flag. The download silently failed to match on Windows, with the
system error "The system cannot find the file specified".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: allow contrib version override via OPENMS_CONTRIB_VERSION
Adds OPENMS_CONTRIB_VERSION env var that falls back to OPENMS_VERSION
when empty. Lets us point OPENMS_VERSION at a non-release branch (e.g.
develop) while keeping the Windows contrib download pinned to a known
release tag, so CI doesn't fail on a missing contrib release.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
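The fallback described above hinges on treating an empty variable the same as an unset one, which shell's `${VAR:-default}` does; in Python the `or` operator gives the same behavior:

```python
import os

def contrib_version() -> str:
    """OPENMS_CONTRIB_VERSION wins when set and non-empty; otherwise
    fall back to OPENMS_VERSION. `or` treats "" as unset, mirroring
    shell's ${VAR:-default} and the commit's "when empty" wording."""
    return os.environ.get("OPENMS_CONTRIB_VERSION") or os.environ["OPENMS_VERSION"]
```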
* chore: ignore docs/superpowers/ (local design notes)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Remove stale patches from template-app overlay
  The Deployment/streamlit patch with Ingress-shaped path /spec/rules/0/host never applied and produced a silent no-op. The duplicate IngressRoute service-name patch was redundant with the first IngressRoute patch block. This brings the on-disk overlay in line with the production cluster's running version.
* Rename configure-deployment skill to configure-docker-compose-deployment
  First step of splitting the skill into three focused skills (configure-app-settings, configure-docker-compose-deployment, configure-k8s-deployment). The rename is in its own commit so git log --follow traces the docker-compose content cleanly.
* Scope docker-compose skill to docker-compose-only
  Removes app-level content (settings.json, Dockerfile choice, production app examples) that will live in configure-app-settings. Adds a prerequisite note pointing to configure-app-settings.
* Add configure-app-settings skill
  Covers app-level configuration (settings.json, Dockerfile choice, README, dependencies) shared by every deployment mode. Prerequisite for configure-docker-compose-deployment and configure-k8s-deployment.
* Fix settings.json key-field list inconsistency
  The Key fields prose listed max_threads (not in the JSON sample) and omitted enable_workspaces (which is in the sample). Align the prose with the sample and describe max_threads separately, since it is a nested object rather than a flat field.
* Add configure-k8s-deployment skill
  New skill walking through Kustomize overlay creation and kubectl apply for deploying a forked app to Kubernetes. The patch list reflects the three-patch canonical shape (IngressRoute match + service, streamlit Redis URL, rq-worker Redis URL).
* Fix inline-code rendering in k8s skill
  The Host(`...`) escape syntax produced literal backslashes that broke the inline-code span when rendered by markdown parsers. Rewrite as Host(...) without nested backticks so the span renders cleanly.
* Add K8s deployment doc — overview and architecture sections
* Add K8s deployment doc — manifest reference section
* Add K8s deployment doc — fork-and-deploy guide
* Add K8s deployment doc — CI/CD pipeline section
* Clarify PR-blocking behavior depends on branch protection
  The workflow does not block merges directly; it produces a check status that a branch-protection rule can gate on. Make the preconditions explicit.
* Register Kubernetes Deployment page in Streamlit documentation
* Cross-link docs/deployment.md to Kubernetes deployment page
  Adds a preamble listing both deployment paths and introduces a ## Docker Compose heading above the existing content. The existing docker-compose content is preserved verbatim.
* Add smoke test for Kubernetes Deployment documentation page
  Extends the parametrized test_documentation cases to cover the new Documentation page added by this branch, closing the gap where it was the only selectbox entry without test coverage.
ci: unified docker workflow (shadow mode)
github.repository preserves the original casing (OpenMS/streamlit-template). Docker OCI references require lowercase, so cache-from/cache-to fail with 'invalid reference format'. docker/metadata-action handles this internally for tags, but the cache refs bypass it. Compute IMAGE_NAME_LC once and use it in both cache refs.
ci: lowercase image name for OCI cache refs
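The lowercasing fix described above can be sketched in pure bash (the variable names follow the commit message; the `$GITHUB_ENV` export is the usual Actions idiom for sharing a value across steps):

```shell
# github.repository / GITHUB_REPOSITORY preserves the original casing
# (e.g. OpenMS/streamlit-template), but OCI references must be lowercase.
# Bash 4+ supports ${var,,} to lowercase the whole string.
GITHUB_REPOSITORY="OpenMS/streamlit-template"   # provided by Actions at runtime
IMAGE_NAME_LC="${GITHUB_REPOSITORY,,}"
echo "IMAGE_NAME_LC=${IMAGE_NAME_LC}"
# In a workflow step this would typically be exported for later steps:
# echo "IMAGE_NAME_LC=${IMAGE_NAME_LC}" >> "$GITHUB_ENV"
```

Both `cache-from` and `cache-to` refs would then interpolate `${{ env.IMAGE_NAME_LC }}` instead of `${{ github.repository }}`.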
With push: true, docker/build-push-action pushes every tag in its tags input. A bare name like 'openms-streamlit:simple-test' (no registry prefix) gets resolved to Docker Hub and fails with 401 unauthorized, because the workflow's GHCR token has no rights on docker.io. The local tag was only needed for the kind retag step. Since load: true already loads the image into the runner's docker daemon, we can create the stable local alias with a plain 'docker tag' step after build, picking any tag from docker/metadata-action's output.
ci: don't pass unprefixed local tag to buildx push
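A sketch of the post-build alias step described above (the metadata-action step id `meta` and the local tag name are illustrative):

```yaml
# Hypothetical workflow excerpt: the bare local alias is created after the
# build, from whichever tag docker/metadata-action emitted first.
- name: Create stable local alias for kind retag
  run: |
    # 'docker tag' is purely local and never contacts a registry, so the
    # unprefixed name cannot trigger a Docker Hub 401.
    FIRST_TAG="$(echo "${{ steps.meta.outputs.tags }}" | head -n1)"
    docker tag "$FIRST_TAG" openms-streamlit:simple-test
```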
ci: cut over from old docker workflows to build-and-test
The `@v3` floating tag does not exist on snok/container-retention-policy (`v2` is the latest floating major tag; v3 only has v3.0.0 and v3.0.1 as exact version tags), so the workflow fails to resolve the action with 'unable to find version v3'. Pin to v3.0.1, the latest v3 release.
ci: pin container-retention-policy to v3.0.1
The top-of-file ENV GH_TOKEN=${GITHUB_TOKEN} baked the per-run token
into an early layer, so every workflow run rebuilt from scratch.
Replaced it with an ARG declared next to the one RUN that uses it
(gh release download), so earlier layers stay cacheable.
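The cache-friendly shape might look like this (base image, package, and release path are illustrative, not the repo's actual Dockerfile):

```dockerfile
# Before (cache-hostile): ENV GH_TOKEN=${GITHUB_TOKEN} near the top of the
# file changes on every workflow run, invalidating all layers below it.

# After (sketch): declare the ARG immediately before the single RUN that
# needs it, so everything above remains cacheable across runs.
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y gh   # layers above here stay cached

ARG GITHUB_TOKEN                              # hypothetical build arg
RUN GH_TOKEN=${GITHUB_TOKEN} gh release download some/repo --pattern '*.tar.gz'
```

Note that an ARG consumed inline in a RUN still perturbs that layer's cache key, but only that layer and later ones; the point of the move is to push the invalidation boundary to the bottom of the file.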
Mirrors the base example with overlay-specific guidance: `namePrefix` only rewrites Kustomize-managed resources, so imperative Secrets must still use the literal name `streamlit-secrets`.
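Why the literal name matters can be sketched as follows (prefix and paths are illustrative):

```yaml
# overlays/prod/kustomization.yaml (sketch)
namePrefix: myapp-        # rewrites names of resources Kustomize renders,
resources:                # e.g. Deployments, Services, the IngressRoute
  - ../../base

# A Secret created imperatively, e.g.
#   kubectl create secret generic streamlit-secrets --from-literal=...
# is never seen by Kustomize, so it is NOT renamed to myapp-streamlit-secrets.
# Pod specs mounting it must therefore reference the literal name:
#   secretName: streamlit-secrets
```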
k8s: mount admin password from streamlit-secrets Secret
Factor node placement and memory sizing out of the base manifests into reusable Kustomize components (memory-tier-low / memory-tier-high), so each fork picks its tier with a single line in its overlay.

- base: remove per-pod `resources` from streamlit and rq-worker Deployments; sizing now comes from the tier component
- base: promote redis to Guaranteed QoS (requests == limits for both cpu and memory) so it bottoms the kernel OOM list
- base: add LimitRange so containers without explicit resources inherit safe defaults (512Mi/250m request, 2Gi/2 limit, 64Gi/16 max)
- components/memory-tier-low: nodeSelector=low, streamlit 512Mi/2Gi, rq-worker 1Gi/16Gi (Burstable)
- components/memory-tier-high: nodeSelector=high, streamlit 512Mi/4Gi, rq-worker 2Gi/180Gi (Burstable — uniform across heavy workers so a single active app can burst into the shared pool)
- overlays: rename template-app/ to prod/ (one overlay per repo; the repo itself identifies the app) and pull in memory-tier-low
- docs & skill: document the new overlays/prod/ path and the one-line tier selector; update CI to kustomize the renamed overlay

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
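The one-line tier selection in a fork's overlay might look like this (a sketch assuming the components/ layout named above):

```yaml
# overlays/prod/kustomization.yaml (sketch)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
components:
  - ../../components/memory-tier-low   # swap to memory-tier-high for heavy forks
```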
The memory-tier-low component adds nodeSelector openms.de/memory-tier=low to every Deployment. kind clusters have no such label, so after the rename to overlays/prod all pods stayed Pending and 'Wait for Redis to be ready' timed out. Label --all kind nodes in both the nginx and Traefik integration jobs before deploying so the nodeSelector matches.

Also raise the LimitRange max.memory from 64Gi to 200Gi. The original cap was written before memory-tier-high settled on a 180Gi rq-worker limit; without the bump, a high-tier fork (e.g. OpenDIAKiosk) would be rejected by admission when deployed into the shared openms namespace after the template's LimitRange is applied.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
…tp://127.0.0.1:34609/git/OpenMS/streamlit-template into claude/parallel-webapp-memory-optimization-RoNnJ
Completes the overlay rename started in 6c61365 now that the branch has merged main, which added the example file under the old path. Also rewrite two remaining docs references to overlays/<your-app-name>/ and the CI description to the new prod overlay. https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
Spin up a 2-node kind cluster (control-plane labeled memory-tier=low + ingress-ready, worker labeled memory-tier=high) so the Build-and-Test job passes regardless of which memory-tier component a fork's overlay pulls in. Previously we labeled --all nodes with a single tier after creation, which broke as soon as a fork flipped memory-tier-low to memory-tier-high.

- .github/kind-config.yaml: 2-node topology with per-node labels.
- .github/workflows/build-and-test.yml: point both helm/kind-action invocations (nginx build + traefik-integration) at the config and drop the now-redundant dynamic label step.

https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
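The two-node topology described above might be expressed like this (a sketch; the label key follows the commit messages, and kind's v1alpha4 config supports per-node `labels`):

```yaml
# .github/kind-config.yaml (sketch)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    labels:
      openms.de/memory-tier: low
      ingress-ready: "true"     # lets the ingress controller schedule here
  - role: worker
    labels:
      openms.de/memory-tier: high
```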
Previous run (2f28ed9) showed build + traefik-integration jobs still timing out on 'Wait for Redis'. Root cause: multi-node kind clusters apply node-role.kubernetes.io/control-plane:NoSchedule to the control-plane, which untolerated app pods can't land on even though the nodeSelector matches. The single-node kind used previously had no such taint, which is why CI worked until we added a second node. Add a kubeadmConfigPatches stanza setting nodeRegistration.taints to the empty list so the control-plane is schedulable. Labels and cluster shape (1 control-plane + 1 worker) stay the same. https://claude.ai/code/session_01LW4iBWt5YftuqFGc3jM5ZP
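The taint removal can be sketched as a kubeadm patch on the control-plane node (this is the documented kubeadm mechanism; exact surrounding config omitted):

```yaml
# Addition to .github/kind-config.yaml (sketch): clear the default
# node-role.kubernetes.io/control-plane:NoSchedule taint so app pods with a
# matching nodeSelector can land on the control-plane node.
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: InitConfiguration
        nodeRegistration:
          taints: []
```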
…imization-RoNnJ Refactor K8s deployment to use memory-tier components
Adds a seed-demos initContainer to the Streamlit Deployment that merges image-shipped demos into /workspaces-streamlit-template/.demos/ with cp -rn, so new demos in an image appear after redeploy while admin-saved demos and edits persist across redeploys.

- Point demo_workspaces.source_dirs at the PV path via the ConfigMap override (both streamlit and rq-worker pick this up through the jq settings merge at startup).
- Make get_demo_target_dir() settings-driven so "Save as Demo" writes to the PV, with backwards-compatible fallbacks for the legacy source_dir string and for environments without settings (tests).
- Skip hidden top-level dirs in clean-up-workspaces.py so the nightly cron does not garbage-collect .demos/.
- Document the .demos/ layout and the re-seed flow.

https://claude.ai/code/session_01Y87aULHSdyBobPdaD4L6tW
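The seed step might look like this (image ref, in-image demo path, and volume name are illustrative):

```yaml
# Sketch of the seed-demos initContainer on the Streamlit Deployment.
initContainers:
  - name: seed-demos
    image: ghcr.io/openms/streamlit-template:latest   # same app image
    # cp -rn copies demo dirs that do not exist yet but never overwrites
    # existing ones, so admin-saved demos and edits on the PV survive
    # redeploys while newly shipped demos still appear.
    command: ["sh", "-c", "cp -rn /app/demos/. /workspaces-streamlit-template/.demos/"]
    volumeMounts:
      - name: workspaces
        mountPath: /workspaces-streamlit-template
```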
…-azhkG Support configurable demo workspace source directories
The Secret used to be an out-of-band copy-the-example step, so forgetting the resources-list edit left the pod booting with an empty admin-secrets mount and a user-facing "Admin not configured" error for a feature that was never wired up in the first place. Now the Secret is committed to the base with an empty admin password and included in k8s/base/kustomization.yaml, so kubectl apply -k always creates it. The "Save as Demo" expander is gated on a non-empty password and is hidden entirely (no error box) when not configured. Operators enable the feature by patching the live Secret or by editing the file locally with git update-index --skip-worktree, both documented. Exception handling in is_admin_configured() is tightened to also catch StreamlitSecretNotFoundError so a missing secrets file never raises. https://claude.ai/code/session_01V1noocAR7uXWjWsC9oLGhz
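The committed base Secret could look like this (filename and key are illustrative; the Secret name follows the commit message):

```yaml
# k8s/base/admin-secret.yaml (sketch): committed with an empty password so
# 'kubectl apply -k' always creates the Secret. Operators enable the feature
# by patching the live object, or by editing this file locally behind
# 'git update-index --skip-worktree' so the real password is never committed.
apiVersion: v1
kind: Secret
metadata:
  name: streamlit-secrets
type: Opaque
stringData:
  admin-password: ""   # empty => Save-as-Demo expander stays hidden
```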
Hide Save-as-Demo UI when admin password is not configured
Split the build+test flow into three stages so the traefik ingress
test no longer rebuilds Dockerfile_simple from scratch:
build (matrix: full, simple)
-> uploads each image as a workflow artifact
test-nginx (matrix: full, simple)
-> downloads artifact, kind loads, tests nginx ingress
test-traefik (simple only)
-> downloads simple artifact, kind loads, tests traefik ingress
Artifacts (not GHCR) are used because the build job only pushes on
non-PR events and fork PRs cannot auth to GHCR at all, so registry
sharing would not work for every PR path.
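The artifact hand-off between jobs might be sketched like this (job/step names, image tag, and artifact name are illustrative):

```yaml
# build job: save the built image to a tarball and upload it
- name: Save image
  run: docker save openms-streamlit:${{ matrix.variant }} -o image.tar
- uses: actions/upload-artifact@v4
  with:
    name: image-${{ matrix.variant }}
    path: image.tar

# test job: download the tarball, load it, and hand it to kind
- uses: actions/download-artifact@v4
  with:
    name: image-${{ matrix.variant }}
- name: Load image into kind
  run: |
    docker load -i image.tar
    kind load docker-image openms-streamlit:${{ matrix.variant }}
```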
Mirror the build/test-nginx matrix so the traefik ingress test also covers the full and simple variants instead of just simple.
test-traefik (simple) failed in the combined "Wait for Redis and deployments to be ready" step because the deployment took longer than 120s to become available, and unlike the test-nginx wait the failure was not soft. Align test-traefik with test-nginx:

- Split the Redis wait (hard, 60s) from the deployment wait (soft, `|| true`).
- Bump the deployment timeout from 120s to 180s in both jobs.
- Widen the curl warm-up loop from 5x2s to 30x2s in both jobs so a marginally late deployment is tolerated; a real failure still surfaces via the trailing unconditional curl.
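The hard/soft split might look like this in the workflow (label selectors, deployment names, and the Host header are illustrative):

```yaml
- name: Wait for Redis (hard failure)
  run: kubectl wait --for=condition=available deploy/redis --timeout=60s

- name: Wait for deployments (soft failure)
  run: kubectl wait --for=condition=available deploy -l app=streamlit --timeout=180s || true

- name: Warm-up curl loop, then unconditional check
  run: |
    for i in $(seq 1 30); do
      curl -fsS -H "Host: template.webapps.openms.de" http://localhost/ && break
      sleep 2
    done
    # A real failure still surfaces here: no '|| true' on the final curl.
    curl -fsS -H "Host: template.webapps.openms.de" http://localhost/
```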
The previous skill was a manual find-and-replace checklist that assumed Claude could run kubectl against the cluster. Restructure it as an interview-driven file-editing guide with a clear handoff to a human operator (or CI) for the cluster apply.

- Drop kubectl, kubectl kustomize, and rollout-verification steps that Claude can't actually execute.
- Drop the nginx ingress fallback; production is Traefik-only.
- Add a Step 1 recon over a fixed set of base/overlay/CI files so defaults are derived from the repo, and bail on layouts the skill doesn't recognize.
- Replace the manual checklist with six interview questions (slug, GHCR ref, image tag, ingress subdomain, memory tier, workspace storage size), each paired with what it controls in the running deployment, the proposed default, and the reasoning.
- Make storage a single one-line edit to k8s/base/workspace-pvc.yaml when the user picks a non-default size; keep the PVC base name unchanged (namePrefix scopes it per-fork, so there are no collisions).
- Pin the default storage size to 500Gi to match the stock base, so the default needs zero file edits.
- Explain that images[0].name is a Kustomize match key and must not change.
Refactor CI workflow to build images once and reuse across jobs
Refactor k8s deployment skill to interview-driven overlay editing
The shared volume-group: workspaces label and required pod-affinity attracted every fork's workspace pods onto a single node per memory tier and deadlocked the first replica of any fork landing on an otherwise-empty tier (no peer pod for the required affinity to match). Per-fork RWO PVCs (<slug>-workspaces-pvc) already constrain all of a fork's workspace-using pods to the node the volume is attached to via the scheduler's VolumeBinding plugin, so the explicit affinity adds nothing on top. Removing it scopes co-location naturally to one fork and lets a fresh tier bootstrap without manual affinity-strip. NodeSelector continues to pick the memory tier; the RWO mount picks the specific node within that tier.
The kind integration jobs in build-and-test.yml hardcoded `template-app`
as the slug label and `template.webapps.openms.{de,org}` as the Traefik
hostnames. The configure-k8s-deployment skill rewrites those values when
a fork customizes its overlay, after which `kubectl wait -l app=...`
returns "no matching resources found" and Traefik curl tests hit the
wrong Host header. This broke OpenMS/quantms-web PR #19 on its first
overlay PR (run 24964475081).
Have test-nginx and test-traefik discover SLUG (from `commonLabels.app`)
and TRAEFIK_HOSTS (parsed from the rendered IngressRoute match) right
after deploy, and substitute them into the wait/curl steps. The nginx
hostnames stay hardcoded — they come from `k8s/base/ingress.yaml`, which
the skill never edits and Kustomize doesn't rewrite.
Update the configure-k8s-deployment skill to (a) check during recon that
the workflow uses dynamic discovery, (b) flag forks still on the old
hardcoded shape so the skill applies the patch before editing the
overlay, and (c) note in the handoff that no fork-specific workflow
edits are needed.
Remove pod-affinity rules; rely on RWO PVC for co-location
Make CI integration tests discover app slug and hosts dynamically
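The host discovery described above can be sketched in pure shell over a rendered IngressRoute match expression; the match string follows Traefik's Host(`a`) || Host(`b`) rule syntax, and the helper name is hypothetical:

```shell
# Extract hostnames from a Traefik IngressRoute match expression, e.g. as
# found in the output of 'kubectl kustomize overlays/prod'.
extract_hosts() {
  # grep -o emits each backtick-quoted Host(...) term on its own line;
  # sed then strips the Host(` prefix and `) suffix, leaving bare hostnames.
  echo "$1" | grep -o 'Host(`[^`]*`)' | sed 's/Host(`//; s/`)//'
}

MATCH='Host(`template.webapps.openms.de`) || Host(`template.webapps.openms.org`)'
TRAEFIK_HOSTS="$(extract_hosts "$MATCH")"
echo "$TRAEFIK_HOSTS"   # one hostname per line
```

The wait/curl steps can then loop over `$TRAEFIK_HOSTS` instead of a hardcoded pair.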