[release] add golden notebook release test for torch/tune/serve #16619

matthewdeng · 2021-06-22T21:18:56Z

Why are these changes needed?

This test exercises the integration between PyTorch, Ray Tune, and Ray Serve.

At a high level, the MNIST dataset is loaded and Tune is used to train a ResNet model. This trained model is exposed for predictions through Serve, and a sample of 10 images is run through the deployed prediction service.

Notes

Models are typically passed through a persistent store such as S3 rather than through the Ray Object Store. Because this test is short-lived and runs Tune and Serve in the same session, it uses the Ray Object Store to transfer the best model found by the Tune step to the backend services deployed via Serve.
In the prediction phase of the test, we wrap predict_and_validate in the @ray.remote decorator to parallelize the calls. As a result, the requests made to http://localhost:8000/mnist may be executed on the head node or on any worker node. To enable routing on the worker nodes, we set "location": "EveryNode" when starting Serve. In a real user setting, requests typically come from external sources and would be directed to the head node URL alone.
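
For readers who want to see the shape of the prediction phase without a cluster, here is a minimal sketch of the fan-out-and-validate pattern. It is not the code in this PR: a thread pool stands in for the @ray.remote tasks, a stub HTTP server stands in for the Serve endpoint, and `StubMnistHandler`, `run_sample`, the JSON payload format, and the "sum of pixels mod 10" prediction rule are all hypothetical. Only the name predict_and_validate and the /mnist path come from the description above.

```python
import json
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Ignore any proxy settings so that requests to 127.0.0.1 stay local.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class StubMnistHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Serve endpoint (not a real model)."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        pixels = json.loads(self.rfile.read(length))
        # Fake "prediction": sum of the pixel values modulo 10.
        body = json.dumps({"label": sum(pixels) % 10}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # Silence per-request logging.


def predict_and_validate(url, pixels, expected_label):
    """POST one sample to the endpoint and check the returned label."""
    request = urllib.request.Request(
        url,
        data=json.dumps(pixels).encode(),
        headers={"Content-Type": "application/json"},
    )
    with OPENER.open(request, timeout=10) as response:
        predicted = json.loads(response.read())["label"]
    return predicted == expected_label


def run_sample(num_images=10):
    """Start the stub server, send num_images requests in parallel, validate each."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), StubMnistHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/mnist"
    # Hypothetical samples: three "pixels" each, labelled by the stub's own rule.
    samples = [([i, i + 1, i + 2], (3 * i + 3) % 10) for i in range(num_images)]
    try:
        # The thread pool is a stand-in for submitting @ray.remote tasks.
        with ThreadPoolExecutor(max_workers=4) as pool:
            futures = [
                pool.submit(predict_and_validate, url, pixels, label)
                for pixels, label in samples
            ]
            return [future.result() for future in futures]
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(all(run_sample()))
```

In this sketch every caller shares one machine, so "localhost" always reaches the stub server. With Ray tasks the caller may be on a worker node, where localhost refers to that worker, which is why the test starts Serve with "location": "EveryNode".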

Related issue number

n/a

Checks

I've run scripts/format.sh to lint the changes in this PR.
I've included any doc changes needed for https://docs.ray.io/en/master/.
I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
Testing Strategy
- Unit tests
- Release tests
- This PR is not tested :(

simon-mo · 2021-06-28T17:05:26Z

cc @ray-project/ray-serve

amogkam

This looks great!

@matthewdeng just confirming, this works when you run it with e2e.py right?

matthewdeng · 2021-06-28T23:58:44Z

@amogkam yep this runs successfully with e2e, see example cluster!

[release] add golden notebook release test for torch/tune/serve

b416df3

matthewdeng assigned amogkam Jun 22, 2021

start serve on all nodes so remote localhost works

8026bf2

matthewdeng changed the title ~~[WIP][release] add golden notebook release test for torch/tune/serve~~ [release] add golden notebook release test for torch/tune/serve Jun 23, 2021

matthewdeng assigned krfricke and simon-mo Jun 24, 2021

amogkam approved these changes Jun 28, 2021


amogkam merged commit b0f304a into ray-project:master Jun 29, 2021

matthewdeng deleted the golden-notebooks branch June 29, 2021 17:39

This was referenced Jun 29, 2021

[release] update torch_tune_serve_test to use anyscale connect #16754

Merged

add golden notebook nightly tests ray-project/releaser#17

Merged

matthewdeng mentioned this pull request Nov 9, 2022

[ml-release][no-ci] fix torch tune serve test. #30095

Merged

7 tasks
