Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "[serve] Replace Ray Client with Ray Job Submission for `serve… #32976

Merged
merged 4 commits into from Mar 3, 2023

Conversation

zcin
Copy link
Contributor

@zcin zcin commented Mar 2, 2023

… run` (#30913)"

This reverts commit 72f7a7d.

Why are these changes needed?

Related issue number

Closes #32881

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

zcin added 4 commits March 2, 2023 10:33
… run` (ray-project#30913)"

This reverts commit 72f7a7d.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
Copy link
Contributor

@architkulkarni architkulkarni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving the revert, but let us know if you had to make any manual changes (e.g. to fix merge conflicts)

@zcin
Copy link
Contributor Author

zcin commented Mar 3, 2023

Yup, the following manual changes were made:

  • There were changes made to Ray Jobs (see changes in original PR) that I removed from this revert PR. tldr those changes allowed sending SIGINT instead of the SIGTERM as a job stop signal, and I didn't see the need to revert those changes, as there were merge conflicts and it's not related to the issue we are trying fix (LMK if there are concerns abt this).
  • I included the changes from [serve] Tear down serve on each serve run #31342 and [Serve] Change API to use ServeDeploySchema to replace ServeApplicationSchema #32285, which were both changes that affected run_script.py (which was a new file in the original PR). So with this revert, those changes needed to be added back into scripts.py. (However note that one of them was made after the 2.3.0 branch cut, so it will be removed when cherry-picked onto the release branch)
  • I edited the test test_run_teardown to work with the non-jobs-version of serve run (an issue with stdout vs stderr logs) (details).

@zcin
Copy link
Contributor Author

zcin commented Mar 3, 2023

Also, I made the same corresponding changes to the test PR on the release branch and tested it on workspaces and services, both work!

@edoakes edoakes merged commit 80384c3 into ray-project:master Mar 3, 2023
zcin added a commit to zcin/ray that referenced this pull request Mar 3, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
zcin added a commit to zcin/ray that referenced this pull request Mar 3, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
zcin added a commit to zcin/ray that referenced this pull request Mar 3, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
@zcin zcin deleted the serve-run-revert-master branch March 3, 2023 19:49
zcin added a commit that referenced this pull request Mar 3, 2023
#32976) (#33006)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in #30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue #32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Cindy Zhang <cindyzyx9@gmail.com>
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request Mar 21, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Jack He <jackhe2345@gmail.com>
cadedaniel pushed a commit to cadedaniel/ray that referenced this pull request Mar 22, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.
edoakes pushed a commit to edoakes/ray that referenced this pull request Mar 22, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
peytondmurray pushed a commit to peytondmurray/ray that referenced this pull request Mar 22, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.
scottsun94 pushed a commit to scottsun94/ray that referenced this pull request Mar 28, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.
cassidylaidlaw pushed a commit to cassidylaidlaw/ray that referenced this pull request Mar 28, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.
elliottower pushed a commit to elliottower/ray that referenced this pull request Apr 22, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: elliottower <elliot@elliottower.com>
ProjectsByJackHe pushed a commit to ProjectsByJackHe/ray that referenced this pull request May 4, 2023
ray-project#32976)

- Previously, if connecting to a local cluster, `serve run` will directly call `ray.init()` and `serve.run`
- With the change in ray-project#30913, even when connecting to a local cluster, `serve run` would submit a Ray Job to the local cluster and then call `ray.init()` and `serve.run()` inside that job. This caused the issue ray-project#32881, where if a user ran `serve run` inside a Job (or any other process with a runtime environment), they should expect to have access to the dependencies installed in that runtime environment. However since `serve run` goes through another layer of Job submission, they don't.

Signed-off-by: Jack He <jackhe2345@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[serve] serve run command within job entrypoint fails if a working_dir is specified
5 participants