[release] add golden notebook release test for torch/tune/serve #16619

Merged
merged 2 commits into from
Jun 29, 2021

Conversation

matthewdeng
Contributor

@matthewdeng matthewdeng commented Jun 22, 2021

Why are these changes needed?

This test exercises the integration between PyTorch, Ray Tune, and Ray Serve.

At a high level, the MNIST dataset is loaded and Tune is used to train a ResNet model. The trained model is then exposed for predictions through Serve, and a sample of 10 images is run through the deployed prediction service.

Notes

  1. Models are typically passed through a more persistent store such as S3 rather than through the Ray Object Store. Because this test is short-lived and both Tune and Serve run in the same session, the Ray Object Store is used to transfer the best model found by the Tune step to the backend services deployed via Serve.
  2. In the prediction phase of the test, we wrap predict_and_validate in a @ray.remote decorator to parallelize the calls. As a result, the requests made to http://localhost:8000/mnist may be executed on the head node or on any worker node. To enable routing on the worker nodes, we set "location": "EveryNode" when starting Serve. In a real user setting, requests typically come from external sources and would be directed to the head node URL alone.
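
As a rough illustration of the fan-out in note 2, here is a minimal sketch that uses only the Python standard library. It is not the code in this PR: a thread pool stands in for the @ray.remote tasks, and `fake_mnist_endpoint` is a hypothetical stub standing in for the Serve endpoint at http://localhost:8000/mnist. Only the name predict_and_validate comes from the description above; its signature here is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_mnist_endpoint(image_bytes: bytes) -> int:
    """Hypothetical stub for the prediction service.

    It "predicts" the digit encoded in the first byte of the payload,
    so the sketch runs without a model or an HTTP server.
    """
    return image_bytes[0] % 10


def predict_and_validate(index: int, image_bytes: bytes, label: int):
    """Send one sample to the (stubbed) service and compare with its label."""
    prediction = fake_mnist_endpoint(image_bytes)
    return index, prediction == label


def run_samples(samples):
    """Fan the samples out in parallel and collect results in order.

    The thread pool is a stand-in for launching one remote task per sample.
    """
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(predict_and_validate, i, image, label)
            for i, (image, label) in enumerate(samples)
        ]
        return [future.result() for future in futures]


# Ten synthetic "images", one per digit, mirroring the sample size in the test.
samples = [(bytes([digit]), digit) for digit in range(10)]
results = run_samples(samples)
```

In the real test each call may land on a different node, which is why the HTTP proxy has to be reachable on every node and not only on the head node.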

Related issue number

n/a

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@matthewdeng matthewdeng changed the title [WIP][release] add golden notebook release test for torch/tune/serve [release] add golden notebook release test for torch/tune/serve Jun 23, 2021
@simon-mo
Contributor

cc @ray-project/ray-serve

Contributor

@amogkam amogkam left a comment

This looks great!

@matthewdeng just confirming, this works when you run it with e2e.py right?

@matthewdeng
Contributor Author

@amogkam yep this runs successfully with e2e, see example cluster!

4 participants