[Runtimes][Dask] Ensure that MLRUN_DBPATH is available for workers #4172

laurybueno · 2023-08-29T15:45:40Z

The traditional process for populating Job pods with MLRun-related environment variables doesn't work for Dask pods: their scheduler runs separately from the MLRun API pod and is therefore not aware of such variables.

This PR ensures that the MLRun API pod adds the relevant variables to the pod templates used to create both scheduler and worker pods on the Dask runtime.
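
For readers following along, here is a minimal sketch of the idea described above. It assumes the pod templates are plain dictionaries shaped like Kubernetes pod specs. The function name `inject_env`, the template layout, the image name, and the URL are illustrative assumptions, not MLRun's actual code.

```python
from copy import deepcopy


def inject_env(pod_template: dict, runtime_env: dict) -> dict:
    """Return a copy of pod_template with runtime_env added to every container.

    Hypothetical helper: variables already present in a container are left as-is.
    """
    template = deepcopy(pod_template)
    for container in template["spec"]["containers"]:
        env = container.setdefault("env", [])
        existing = {item["name"] for item in env}
        for name, value in runtime_env.items():
            if name not in existing:
                env.append({"name": name, "value": str(value)})
    return template


# Example values only; the real URL depends on the deployment.
runtime_env = {"MLRUN_DBPATH": "http://mlrun-api:8080"}
base = {"spec": {"containers": [{"name": "base", "image": "example/image:latest"}]}}

# The same variables go into both templates, so scheduler and workers agree.
scheduler_template = inject_env(base, runtime_env)
worker_template = inject_env(base, runtime_env)
```

Working on a copy keeps the base template reusable for both pod kinds without one injection leaking into the other.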

liranbg · 2023-08-29T19:01:12Z

mlrun/runtimes/daskjob.py

@@ -457,7 +457,7 @@ def _run(self, runobj: RunObject, execution):
        handler = runobj.spec.handler
        self._force_handler(handler)

-        extra_env = self._generate_runtime_env(runobj)
+        extra_env = self.generate_runtime_env(runobj)
        environ.update(extra_env)


not related, but this looks like a big hazard. it overrides the environ with runtime-specific envvars?
if this code runs on server side (and it might), it is exposed to a race condition of overridden envvars. i'm not sure where it is being used afterwards, but let's add a TODO here to make sure we understand it does not run on server-side.

Agreed. I considered removing this code, because it seems to me that its original intention was to mirror the responsibility split employed when creating functions of type Job. In that case, the role of acquiring the runtime variables falls to _run (see here and here).

During my testing, however, while _run executes in the MLRun API Pod for functions of type Job, it does not for functions of type Dask. In this case, _run seems to run only in the Dask Scheduler Pod, so it's already too late to get the traditional runtime variables (and also too late to incorrectly overwrite the environment on MLRun API Pods). That's why MLRUN_DBPATH wasn't available originally for Dask functions.

I added a comment with a TODO here so we can reconsider this code later. Should I create a Jira task as well?

🚀 TIL
Yes please!

For future reference, this discussion will be followed up on ML-4515.
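
To illustrate the hazard raised in this thread: `environ.update(...)` mutates process-wide state, so in a long-lived server process handling several requests, one request's variables could overwrite another's. The sketch below shows one common alternative, building a merged copy instead of mutating `os.environ`. It is not what this PR does (the PR only adds a TODO), and `build_child_env` and the URL are hypothetical names and values.

```python
import os


def build_child_env(extra_env: dict) -> dict:
    """Merge runtime-specific variables into a copy of the current environment.

    Hypothetical helper: the process-wide os.environ is never modified.
    """
    merged = dict(os.environ)
    merged.update(extra_env)
    return merged


extra_env = {"MLRUN_DBPATH": "http://mlrun-api:8080"}  # example value only
before = os.environ.get("MLRUN_DBPATH")

child_env = build_child_env(extra_env)

# The copy carries the new value; the shared environment is unchanged.
unchanged = os.environ.get("MLRUN_DBPATH") == before
```

The merged dictionary can then be passed explicitly to whatever needs it, for example as the `env` argument of `subprocess.run`.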

liranbg · 2023-08-29T19:03:01Z

mlrun/api/runtime_handlers/daskjob.py

+    env.extend(
+        [{"name": k, "value": v} for k, v in function.generate_runtime_env().items()]
+    )


code dup with mlrun/runtimes/mpijob/v1alpha1.py

extra_env = self.generate_runtime_env(runobj)
extra_env = [{"name": k, "value": v} for k, v in extra_env.items()]

Perhaps you can expose "generate_runtime_k8s_env", which uses _generate_runtime_env. wdyt?

Good thinking. I did what you suggested. See if it looks better now.
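
A rough sketch of the layering suggested in this thread, for illustration only: a public method returns Kubernetes-style name/value entries and delegates to a private method that returns a plain dict. The class `RuntimeSketch`, its constructor, and the URL are assumptions made for the example; the real method bodies in MLRun may differ.

```python
class RuntimeSketch:
    """Hypothetical stand-in for a runtime class; not MLRun's BaseRuntime."""

    def __init__(self, dbpath: str):
        self.dbpath = dbpath

    def generate_runtime_k8s_env(self, runobj=None) -> list:
        """Public: runtime variables as Kubernetes-style name/value entries."""
        return [
            {"name": k, "value": v}
            for k, v in self._generate_runtime_env(runobj).items()
        ]

    def _generate_runtime_env(self, runobj=None) -> dict:
        """Private: runtime variables as a plain dict.

        runobj is accepted for signature parity but ignored in this sketch.
        """
        return {"MLRUN_DBPATH": self.dbpath}


# Callers extend a container's env list without repeating the conversion.
env = []
env.extend(RuntimeSketch("http://mlrun-api:8080").generate_runtime_k8s_env())
```

With this split, the dict-to-list conversion lives in one place instead of being repeated in each runtime handler.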

liranbg · 2023-08-29T19:04:14Z

mlrun/runtimes/base.py

@@ -374,15 +374,16 @@ def _get_db_run(self, task: RunObject = None):
        if task:
            return task.to_dict()

-    def _generate_runtime_env(self, runobj: RunObject):
+    def generate_runtime_env(self, runobj: RunObject = None):


add some docs and move the function up, as we (mostly) try to keep public functions up, private functions down

Good point. I created the public function you suggested on another comment and moved it up.

quaark

LGTM!

Ensure that MLRUN_DBPATH is available for Dask workers

62a7e53

laurybueno marked this pull request as ready for review August 29, 2023 18:22

liranbg requested changes Aug 29, 2023

Improve code reusability when preparing variables for Kubernetes templates

4b7a9a2

laurybueno requested a review from liranbg August 29, 2023 21:24

Apply linter

e7136dd

quaark approved these changes Aug 30, 2023

liranbg approved these changes Aug 30, 2023

liranbg merged commit 163260a into mlrun:development Aug 30, 2023
9 checks passed
