Cannot Overwrite DASK_SCHEDULER_ADDRESS in Worker env #873

Open
efeboyaci opened this issue Mar 6, 2024 · 1 comment

Comments

@efeboyaci
Describe the issue:
To use a different port or a non-default FQDN for the scheduler, we set the DASK_SCHEDULER_ADDRESS env var in the DaskCluster's .spec.worker.spec.containers[0].env. However, the operator then appends its own DASK_SCHEDULER_ADDRESS entry, so the worker deployment env ends up with two items of the same name. The last one overrides the first in the pod's environment, which means DASK_SCHEDULER_ADDRESS cannot be changed from the dask kubernetes objects.

Minimal Complete Verifiable Example:

from dask_kubernetes.operator import KubeCluster
cluster = KubeCluster(name="test", n_workers=1, namespace="dask-operator", env={"DASK_SCHEDULER_ADDRESS": "tcp://test-scheduler:8786"})

DaskWorkerGroup

env:
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler:8786

Worker Pod

env:
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler:8786
- name: DASK_WORKER_NAME
  value: dev-cluster-default-worker-5f4e5766de
- name: DASK_SCHEDULER_ADDRESS
  value: tcp://dev-cluster-scheduler.dask-operator.svc.cluster.local:8786
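
To make the effect concrete, here is a small sketch that resolves the worker pod env list above using the last-entry-wins behaviour described in this issue. The helper name `effective_env` is hypothetical and is not part of dask_kubernetes; the last-wins rule is taken from the observation above, not from the operator's code.

```python
def effective_env(env_list):
    """Collapse a Kubernetes-style env list into a dict; later entries win."""
    resolved = {}
    for item in env_list:
        # A repeated name overwrites the earlier value
        resolved[item["name"]] = item["value"]
    return resolved


# The env list from the worker pod shown above
worker_pod_env = [
    {"name": "DASK_SCHEDULER_ADDRESS", "value": "tcp://dev-cluster-scheduler:8786"},
    {"name": "DASK_WORKER_NAME", "value": "dev-cluster-default-worker-5f4e5766de"},
    {
        "name": "DASK_SCHEDULER_ADDRESS",
        "value": "tcp://dev-cluster-scheduler.dask-operator.svc.cluster.local:8786",
    },
]

print(effective_env(worker_pod_env)["DASK_SCHEDULER_ADDRESS"])
```

Under that assumption, the user-supplied address is shadowed by the operator-supplied one.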

Anything else we need to know?:

Items in env whose names already exist in container["env"] should be removed from env before it is added to the container configuration.
Current:

    for container in deployment_spec["spec"]["template"]["spec"]["containers"]:
        if "env" in container:
            container["env"].extend(env)
        else:
            container["env"] = env

    for container in pod_spec["spec"]["containers"]:
        if "env" in container:
            container["env"].extend(env)
        else:
            container["env"] = env

Fix:

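    for container in deployment_spec["spec"]["template"]["spec"]["containers"]:
        if "env" in container:
            # Names already set on this container take precedence
            container_env_names = {env_item["name"] for env_item in container["env"]}
            # Filter into a new sequence rather than calling env.remove() while
            # iterating, which skips items and mutates the list shared by all containers
            container["env"].extend(
                env_item
                for env_item in env
                if env_item["name"] not in container_env_names
            )
        else:
            # Copy so that containers do not share one list object
            container["env"] = list(env)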
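
The de-duplication idea can also be sketched as a standalone helper that is easy to test in isolation. This is a minimal sketch, not the operator's actual implementation: `merge_env` is a hypothetical name, and it assumes every env item is a dict with a "name" key, as in the examples in this issue.

```python
def merge_env(container_env, operator_env):
    """Return container_env plus the operator_env items whose names are not already set.

    Neither input list is modified, so the same operator_env can be reused
    for several containers.
    """
    existing_names = {item["name"] for item in container_env}
    extra = [item for item in operator_env if item["name"] not in existing_names]
    return list(container_env) + extra


# Values set by the user on the container
user_env = [
    {"name": "DASK_SCHEDULER_ADDRESS", "value": "tcp://test-scheduler:8786"},
]
# Values the operator would otherwise append
operator_env = [
    {"name": "DASK_WORKER_NAME", "value": "test-default-worker-abc123"},
    {
        "name": "DASK_SCHEDULER_ADDRESS",
        "value": "tcp://test-scheduler.dask-operator.svc.cluster.local:8786",
    },
]

merged = merge_env(user_env, operator_env)
for item in merged:
    print(item["name"], "=", item["value"])
```

Building a new list avoids removing items from env while iterating over it, and it keeps the user's DASK_SCHEDULER_ADDRESS as the only entry with that name.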

Environment:
2024.3.0 dask operator

@jacobtomlinson
Member

@jonded94 given you recently fixed a related problem in #869, do you have any interest in looking into what is going on here?
