[runtime envs] pip package in runtime env is not installed #26342

Closed
onlyone2019 opened this issue Jul 7, 2022 · 12 comments
Labels
  • bug: Something that is supposed to be working; but isn't
  • stale: The issue is stale. It will be closed within 7 days unless there are further conversation
  • triage: Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@onlyone2019

onlyone2019 commented Jul 7, 2022

What happened + What you expected to happen

ray job submit --address='http://192.168.0.166:8265' --runtime-env-json='{"working_dir": "./", "pip": ["smmap"]}' -- /home/wanjia/conda/env1/bin/python ./job.py

We submitted a job with the command above to check whether Ray workers install Python packages at runtime. According to the documentation, the Ray worker should install smmap and the job should run correctly.

However, we got some error messages:

Job submission server address: http://192.168.0.166:8265
2022-07-06 14:55:02,101	INFO dashboard_sdk.py:272 -- Uploading package gcs://_ray_pkg_90994eb4a30dc674.zip.
2022-07-06 14:55:02,101	INFO packaging.py:479 -- Creating a file package for local directory './'.

-------------------------------------------------------
Job 'raysubmit_z9DDa2AMUUGxzb8Z' submitted successfully
-------------------------------------------------------

Next steps
  Query the logs of the job:
    ray job logs raysubmit_z9DDa2AMUUGxzb8Z
  Query the status of the job:
    ray job status raysubmit_z9DDa2AMUUGxzb8Z
  Request the job to be stopped:
    ray job stop raysubmit_z9DDa2AMUUGxzb8Z

Tailing logs until the job exits (disable with --no-wait):
Traceback (most recent call last):
  File "./job.py", line 3, in <module>
    import smmap
ModuleNotFoundError: No module named 'smmap'

---------------------------------------
Job 'raysubmit_z9DDa2AMUUGxzb8Z' failed
---------------------------------------

Status message: Job failed due to an application error, last available logs:
Traceback (most recent call last):
  File "./job.py", line 3, in <module>
    import smmap
ModuleNotFoundError: No module named 'smmap'

Versions / Dependencies

ray : 3.0.0.dev0
python : 3.8.13

Reproduction script

job.py

import ray
import requests
import smmap
import sys

ray.init()

@ray.remote
class Counter:
    def __init__(self):
        self.counter = 0

    def inc(self):
        print("ok")

counter = Counter.remote()
for _ in range(5):
    ray.get(counter.inc.remote())
print(sys.version)

command

ray job submit --address='http://192.168.0.166:8265' --runtime-env-json='{"working_dir": "./", "pip": ["smmap"]}' -- /home/wanjia/conda/env1/bin/python ./job.py

NOTE:
We submit the job to a remote Ray worker; 192.168.0.166 is its address. "/home/wanjia/conda/env1/bin/python" is the path to the Python interpreter in the remote worker's conda environment.

We did not test only “smmap”, a small package that takes little time to download. We also tested packages such as numpy and torch, and got the same error.

Issue Severity

No response

@onlyone2019 onlyone2019 added bug Something that is supposed to be working; but isn't triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jul 7, 2022
@tupui
Member

tupui commented Jul 7, 2022

Hi @onlyone2019, thank you for reporting. I suspect a conflict between pip and conda. Ray would not know to use /home/wanjia/conda/env1/bin/pip by itself. I would suggest activating the environment as a first step instead of using /home/wanjia/conda/env1/bin/python. Something like this may work: conda activate env1;python job.py

@architkulkarni
Contributor

Ah @tupui thanks, I do think the issue is related to this. conda activate env1;python job.py is worth a shot, but I worry that env1 still won't have access to the packages due to the isolated nature of conda environments. Ray does modify PYTHONPATH to allow the newly installed packages to be imported from their install location at /tmp/ray/session_latest/runtime_resources/pip, but conda might be overwriting PYTHONPATH completely to ensure the isolation.

In general, these issues can be debugged by printing out sys.path, and comparing with /tmp/ray/session_latest/dashboard_agent.log on the head node to see the detailed output of the runtime_env's pip install command.
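
A minimal sketch of that sys.path check, which could be pasted at the top of job.py before the failing import. The helper names describe_import_environment and has_runtime_env_pip_entry are hypothetical, and the "runtime_resources/pip" marker is only an assumption taken from the install location mentioned above:

```python
import os
import sys


def describe_import_environment():
    """Collect the interpreter path, PYTHONPATH, and sys.path for inspection."""
    return {
        "executable": sys.executable,
        "pythonpath": os.environ.get("PYTHONPATH", ""),
        "sys_path": list(sys.path),
    }


def has_runtime_env_pip_entry(paths, marker="runtime_resources/pip"):
    """Return True if any path entry contains the (assumed) pip install location."""
    return any(marker in entry for entry in paths)


if __name__ == "__main__":
    info = describe_import_environment()
    print("executable:", info["executable"])
    print("PYTHONPATH:", info["pythonpath"])
    for entry in info["sys_path"]:
        print("sys.path entry:", entry)
    print("runtime_env pip dir on sys.path:",
          has_runtime_env_pip_entry(info["sys_path"]))
```

If the last line prints False inside the job, the interpreter that ran job.py did not pick up the directory where the runtime_env packages were installed, which would be consistent with the PYTHONPATH explanation above.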

@onlyone2019 To help you better, may I ask your reasons for using a conda environment? I think the behavior of dynamically installing "pip" packages into a user's conda environment might not be currently supported by Ray (though we should certainly keep this open as an enhancement request, since using conda is pretty common.)

@tupui
Member

tupui commented Jul 7, 2022

What I don't get from the doc is that there is a way to install a package using conda, but I did not see how to specify the environment to use.

@architkulkarni
Contributor

architkulkarni commented Jul 7, 2022

Ah, for the "conda" field of runtime_env, you can either

  • provide a conda environment.yaml (as a Dict or a file), which installs the environment from scratch, or
  • provide the name of a conda environment that already exists on all nodes (in this case, I guess it's env1), and this just activates the environment.

But what's not yet supported is modifying existing conda environments, for example installing pip packages into an existing conda environment on all nodes.
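
A rough sketch of the two "conda" forms described above, written as plain Python dicts. The package list is only illustrative, env1 is the environment name from this thread, and to_runtime_env_json is a hypothetical helper, not part of Ray:

```python
import json

# Form 1: a conda environment spec given as a dict. The environment is
# built from scratch; pip packages are listed under a nested "pip" key,
# following the usual conda environment.yaml layout.
conda_from_spec = {
    "conda": {
        "dependencies": ["pip", {"pip": ["smmap"]}],
    }
}

# Form 2: the name of a conda environment that already exists on all
# nodes. This only activates the environment; nothing is installed.
conda_by_name = {"conda": "env1"}


def to_runtime_env_json(runtime_env):
    """Serialize a runtime_env dict into a string for --runtime-env-json."""
    return json.dumps(runtime_env)


if __name__ == "__main__":
    print(to_runtime_env_json(conda_from_spec))
    print(to_runtime_env_json(conda_by_name))
```

Either dict could be passed as the runtime_env, or serialized and given to --runtime-env-json. Combining form 2 with a separate top-level "pip" list is the unsupported case discussed here.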

Did that clarify things? We're currently pushing to improve the docs so any feedback is very valuable.

@stephanie-wang stephanie-wang changed the title [<Ray component: Core|RLlib|etc...>] [runtime envs] pip package in runtime env is not installed Jul 7, 2022
@tupui
Member

tupui commented Jul 7, 2022

But what's not yet supported is modifying existing conda environments, for example installing pip packages into an existing conda environment on all nodes.

But the doc mentions that we can install packages in a conda env, either from PyPI or from conda. See here https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#api-reference

In the end, I don't think that being able to modify a conda env when submitting a job is that good an idea. IMO the env should either be fully specified in a yml (then cached so it is not recreated on the same node, or built on one node and propagated to the others) or exist already.

@architkulkarni
Contributor

But the doc mentions that we can install packages in a conda env, either from PyPI or from conda. See here https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#api-reference

Sorry, I wasn't sure which part exactly. Do you mind pointing to the exact line? Then I can make sure to update it to be clearer.

In the end, I don't think that being able to modify a conda env when submitting a job is that good an idea. IMO the env should either be fully specified in a yml (then cached so it is not recreated on the same node, or built on one node and propagated to the others) or exist already.

Makes sense. Currently we use the first approach (caching so the environment is not recreated on the same node).

@onlyone2019
Author

Hi @onlyone2019, thank you for reporting. I suspect a conflict between pip and conda. Ray would not know to use /home/wanjia/conda/env1/bin/pip by itself. I would suggest activating the environment as a first step instead of using /home/wanjia/conda/env1/bin/python. Something like this may work: conda activate env1;python job.py

@tupui Thanks for your suggestions. I have tried conda activate env1;python job.py, but it does not seem to work. Maybe @architkulkarni is right.

env1 still won't have access to the packages due to the isolated nature of conda environments.
Ray does modify PYTHONPATH to allow the newly installed packages to be imported from their install location at 
/tmp/ray/session_latest/runtime_resources/pip, but conda might be overwriting PYTHONPATH completely to 
ensure the isolation.

Unfortunately, I don't know how to debug this. Next, I'll do some experiments with environment.yaml.

@onlyone2019
Author

Hi @architkulkarni, thank you sincerely! I want to use different versions of Python, and I use Anaconda to install them, so I only have conda environments on my PC. If I want to use a specific Python environment for each job, providing a conda environment.yaml to install the environment from scratch may be the better way. Next, I'll do some experiments with environment.yaml.

@onlyone2019
Author

Ah, for the "conda" field of runtime_env, you can either

* provide a conda `environment.yaml` (as a `Dict` or a file), which installs the environment from scratch, or

* provide the name of a conda environment that already exists on all nodes (in this case, I guess it's `env1`), and this just activates the environment.

But what's not yet supported is modifying existing conda environments, for example installing pip packages into an existing conda environment on all nodes.

Did that clarify things? We're currently pushing to improve the docs so any feedback is very valuable.

Hi @architkulkarni. I want to submit a job that uses some Python packages that are not installed on my Ray cluster. Could you give me an example of getting the cluster to install packages at runtime, using either pip or conda?

@architkulkarni
Contributor

Hi @onlyone2019, can you check the examples listed at https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#api-reference under "pip" or "conda"? Let me know if anything is missing or unclear. Both of these options will install packages at runtime.
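
As an illustration only, the submit command from this issue could be generated programmatically so that the JSON and its shell quoting are always well formed. build_submit_command is a hypothetical helper; the flags are the ones used in the original command, and the plain python entrypoint is an assumption (the thread does not confirm that it resolves the problem):

```python
import json
import shlex


def build_submit_command(address, runtime_env, entrypoint):
    """Build the argv list for `ray job submit` with a JSON runtime_env."""
    return [
        "ray", "job", "submit",
        f"--address={address}",
        f"--runtime-env-json={json.dumps(runtime_env)}",
        "--",
        *entrypoint,
    ]


# runtime_env with a "pip" field, as in the original report.
cmd = build_submit_command(
    "http://192.168.0.166:8265",
    {"working_dir": "./", "pip": ["smmap"]},
    ["python", "./job.py"],
)

if __name__ == "__main__":
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The resulting list can also be passed directly to subprocess.run, which avoids shell quoting altogether.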

@stale

stale bot commented Nov 9, 2022

Hi, I'm a bot from the Ray team :)

To help human contributors to focus on more relevant issues, I will automatically add the stale label to issues that have had no activity for more than 4 months.

If there is no further activity in the 14 days, the issue will be closed!

  • If you'd like to keep the issue open, just leave any comment, and the stale label will be removed!
  • If you'd like to get more attention to the issue, please tag one of Ray's contributors.

You can always ask for help on our discussion forum or Ray's public slack channel.

@stale stale bot added the stale The issue is stale. It will be closed within 7 days unless there are further conversation label Nov 9, 2022
@stale

stale bot commented Nov 25, 2022

Hi again! The issue will be closed because there has been no more activity in the 14 days since the last message.

Please feel free to reopen or open a new issue if you'd still like it to be addressed.

Again, you can always ask for help on our discussion forum or Ray's public slack channel.

Thanks again for opening the issue!

@stale stale bot closed this as completed Nov 25, 2022