Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] [runtime env] Don't delete working_dir from runtime env #16475

Merged
merged 13 commits into from
Jun 18, 2021

Conversation

architkulkarni
Copy link
Contributor

@architkulkarni architkulkarni commented Jun 16, 2021

Why are these changes needed?

Related issue number

Based off of #16378

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@architkulkarni
Copy link
Contributor Author

cc @iycheng I would be grateful if you could take a look and let me know if this makes sense. My understanding is that currently (before this PR), "RAY_RUNTIME_ENV_FILES" was never set and never used, because the field working_dir is never present in the runtime env dict (it gets deleted before the dict is created.). Since we don't support working_dir per-task and per-actor, I deleted the part of worker.py that reads this env var. https://github.com/ray-project/ray/pull/16475/files#diff-e125e536c0f84a61d00bb9d0c7613c0984892ef3e89ac565605d40f93d7a68e2L1282-L1286. If I didn't delete this, it would try to interpret the value of "RAY_RUNTIME_ENV_FILES" (which is a path) as a URI, which would cause an error.

I also tried to delete https://github.com/ray-project/ray/blob/master/python/ray/_private/runtime_env.py#L165-L170, but it caused Ray Client to hang for some reason, which gave me some trouble. It seems that if I leave the block in it fixes the hanging, but I don't really understand why.

@architkulkarni architkulkarni linked an issue Jun 17, 2021 that may be closed by this pull request
2 tasks
@fishbone
Copy link
Contributor

I think RAY_RUNTIME_ENV_FILES is a temporary solution to pass uris around.

But this won't merge with job config which means #16481 still won't work.

@fishbone
Copy link
Contributor

This one is more like give the GCS some way to fetch the working dir.

@architkulkarni
Copy link
Contributor Author

The PR to merge the actor/task runtime env with the job_config has been merged #16378 , does that address this?

Regarding RAY_RUNTIME_ENV_FILES I'm still confused, because I don't see how it could ever be set. Here is the block of code where it's set in master https://github.com/ray-project/ray/blob/master/python/ray/_private/runtime_env.py#L165-L170, but I don't think this was ever called, because working_dir was removed from the dict before these lines ran.

@architkulkarni
Copy link
Contributor Author

With the current PR, my understanding is that we're passing the field "working_dir" in the runtime env dict just so it can be read and displayed externally. But the actual mechanism of working_dir for tasks and actors is just to get the URIs from the job_config (https://github.com/ray-project/ray/pull/16475/files#diff-e125e536c0f84a61d00bb9d0c7613c0984892ef3e89ac565605d40f93d7a68e2R1282).

@architkulkarni
Copy link
Contributor Author

architkulkarni commented Jun 17, 2021

test_experimental_package, test_experimental_package_lazy and test_experimental_package_github failed with this PR so I must be misunderstanding this codepath. https://buildkite.com/ray-project/ray-builders-pr/builds/7337#30496bea-c82a-4d8e-abe7-67c95e80fc34/6-9348

@architkulkarni
Copy link
Contributor Author

I see, RAY_RUNTIME_ENV_FILES is used for the experimental package loading. Let me take a closer look

@architkulkarni architkulkarni changed the title [WIP] [Core] Pass working dir to tasks/actors [Core] [runtime env] Don't delete working_dir from runtime env Jun 18, 2021
@architkulkarni architkulkarni added the tests-ok The tagger certifies test failures are unrelated and assumes personal liability. label Jun 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[runtime_env] Don't delete working dir in runtime env.
3 participants