Fix celery autoscaler missing service config mount #771

Merged
dmchoiboi merged 1 commit into main from dmchoi/fix-celery-autoscaler-service-config
Feb 27, 2026

Conversation

dmchoiboi (Collaborator) commented Feb 27, 2026

Problem

PR #770 scoped list_deployments to use hmi_config().endpoint_namespace, but the celery autoscaler StatefulSet was never given the service config ConfigMap mount that gateway, builder, and cacher have. Without it, hmi_config() falls back to service_config_circleci.yaml, which doesn't exist in the pod:

Error in deployment loop: [Errno 2] No such file or directory: '/workspace/model-engine/service_configs/service_config_circleci.yaml'

This causes every list_deployments loop iteration to fail, breaking async endpoint autoscaling.
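The fallback behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual model-engine code: the function names `resolve_service_config_path` and `read_service_config` are hypothetical, the constant name `DEFAULT_SERVICE_CONFIG_PATH` follows the Greptile flowchart in this PR, and its value is taken from the error message.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Fallback path copied from the error message above. The real constant lives in
# model-engine and may be defined differently; this is an assumption.
DEFAULT_SERVICE_CONFIG_PATH = (
    "/workspace/model-engine/service_configs/service_config_circleci.yaml"
)


def resolve_service_config_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the DEPLOY_SERVICE_CONFIG_PATH override if set, else the default."""
    env = os.environ if environ is None else environ
    return env.get("DEPLOY_SERVICE_CONFIG_PATH", DEFAULT_SERVICE_CONFIG_PATH)


def read_service_config(environ: Optional[Mapping[str, str]] = None) -> str:
    """Read the raw config text; raises FileNotFoundError if the file is absent."""
    return Path(resolve_service_config_path(environ)).read_text()
```

With no override in the environment, the resolved path is the CircleCI default, so a pod that ships neither the env var nor the file fails on every read, which matches the error in the deployment loop.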

Fix

Two minimal additions to celery_autoscaler_stateful_set.yaml:

  1. DEPLOY_SERVICE_CONFIG_PATH env var pointing to /workspace/model-engine/service_configs/service_config.yaml
  2. Mount the launch_service_config key from the existing service config ConfigMap at /workspace/model-engine/service_configs/

This exactly mirrors what gateway/builder/cacher do — no extra volumes, no extra env vars.

Test plan

🤖 Generated with Claude Code

Greptile Summary

This PR fixes the celery autoscaler StatefulSet that was missing the service config ConfigMap mount, causing hmi_config() to fall back to a non-existent service_config_circleci.yaml file. This broke async endpoint autoscaling after PR #770 introduced hmi_config().endpoint_namespace usage in list_deployments.

  • Adds DEPLOY_SERVICE_CONFIG_PATH env var pointing to the mounted config path
  • Adds launch_service_config ConfigMap volume and mount, matching the pattern used by gateway/builder/cacher
  • Correctly restructures volumeMounts: and volumes: keywords to be unconditional (previously they were only rendered inside the {{- if .Values.aws }} block, which would break on non-AWS deployments needing only the service config mount)

Confidence Score: 4/5

  • This PR is safe to merge — it's a minimal, well-scoped fix that mirrors an established pattern used by other services
  • The change is small (one file, ~20 lines), directly mirrors proven patterns from gateway/builder/cacher, and fixes a clear production error. The volume/mount definitions and ConfigMap references are identical to the _helpers.tpl helpers. The only minor concern is the hardcoded env var path vs using the config.file conditional, but this matches the existing template style and config.file is not set in default values.
  • No files require special attention — the single changed file follows established patterns

Important Files Changed

Filename: charts/model-engine/templates/celery_autoscaler_stateful_set.yaml
Overview: Adds DEPLOY_SERVICE_CONFIG_PATH env var and service config ConfigMap volume/mount to the celery autoscaler StatefulSet, mirroring gateway/builder/cacher. Also correctly restructures volumeMounts/volumes to be unconditional. Minor style note: env var is hardcoded rather than using the config.file conditional from _helpers.tpl.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Celery Autoscaler Pod Starts] --> B{DEPLOY_SERVICE_CONFIG_PATH set?}
    B -->|Before PR| C[Not set - uses DEFAULT_SERVICE_CONFIG_PATH]
    C --> D[service_config_circleci.yaml]
    D --> E[FileNotFoundError - file missing in pod]
    E --> F[list_deployments fails every loop]
    B -->|After PR| G[Set via env var in StatefulSet]
    G --> H[service_config.yaml path configured]
    H --> I{ConfigMap mounted?}
    I -->|Before PR| J[No mount - file missing]
    I -->|After PR| K[launch_service_config mounted from ConfigMap]
    K --> L[hmi_config loads successfully]
    L --> M[endpoint_namespace resolved]
    M --> N[list_deployments scoped correctly]

Last reviewed commit: 3847687

PR #770 added hmi_config().endpoint_namespace inside list_deployments
to scope the namespace scan, but the celery autoscaler StatefulSet
never had the service config ConfigMap mounted — unlike gateway,
builder, and cacher which all use the modelEngine.volumeMounts helper.

Without the mount, hmi_config() falls back to service_config_circleci.yaml
which doesn't exist in the pod, causing every list_deployments loop to
fail with FileNotFoundError.

Fix: mount the service config ConfigMap (launch_service_config key only)
at /workspace/model-engine/service_configs/ and set DEPLOY_SERVICE_CONFIG_PATH
to point to it — matching exactly what the other components do.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
dmchoiboi enabled auto-merge (squash) February 27, 2026 03:19
dmchoiboi disabled auto-merge February 27, 2026 03:28
dmchoiboi merged commit 3470697 into main Feb 27, 2026
8 checks passed
dmchoiboi deleted the dmchoi/fix-celery-autoscaler-service-config branch February 27, 2026 03:44
dmchoiboi added a commit that referenced this pull request Feb 27, 2026
Two bugs introduced in PRs #770 and #771:

1. celery_autoscaler.py: hmi_config is an instance, not a callable.
   Change hmi_config().endpoint_namespace -> hmi_config.endpoint_namespace

2. celery_autoscaler_stateful_set.yaml: DEPLOY_SERVICE_CONFIG_PATH was
   hardcoded to the ConfigMap path, but prod uses config.file (config
   baked into the private image). Now correctly uses config.file.launch
   when set, falling back to the ConfigMap path when config.values is set.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
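
The two follow-up fixes can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `ServiceConfig` is a hypothetical stand-in for the real config class, `deploy_service_config_path` models the Helm conditional in plain Python rather than reproducing the template, and the `values` dictionary only mimics the shape of the chart values named in the commit message (`config.file.launch`, `config.values`).

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# ConfigMap-backed path from the PR description.
CONFIGMAP_CONFIG_PATH = "/workspace/model-engine/service_configs/service_config.yaml"


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical stand-in for the real config class."""

    endpoint_namespace: str


# hmi_config is a module-level instance, as the commit message states.
hmi_config = ServiceConfig(endpoint_namespace="example-namespace")


def namespace_before_fix() -> str:
    # Bug 1: calling an instance raises TypeError ("object is not callable").
    return hmi_config().endpoint_namespace


def namespace_after_fix() -> str:
    # Fix 1: plain attribute access on the instance.
    return hmi_config.endpoint_namespace


def deploy_service_config_path(values: Mapping[str, Any]) -> Optional[str]:
    """Fix 2, modeled in Python: prefer config.file.launch, else the ConfigMap path."""
    config = values.get("config") or {}
    file_cfg = config.get("file") or {}
    if file_cfg.get("launch"):
        return file_cfg["launch"]  # config baked into the image
    if config.get("values"):
        return CONFIGMAP_CONFIG_PATH  # config rendered into the ConfigMap
    return None  # neither set: leave the env var unset
```

Under these assumptions, a values file that sets `config.file.launch` yields that path unchanged, while one that only sets `config.values` yields the ConfigMap-mounted path.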