Add example for inference server #994
Conversation
Hey @iojw, thanks for submitting this PR! Just curious: what's the expected usage? Is it for real-time inference or offline inference? Is the inference request repeated over time? And why do we need the Sky Python API for this? I think we need to make a concrete story here.
@WoosukKwon I believe the expected usage is for real-time inference, with the inference request changing based on the input image being passed in the HTTP request. I think the case for using the Python API is that it is much more natural than using …
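For illustration, here is a minimal sketch of what that per-request flow could look like with the Python API (the run command, paths, and cluster name are placeholders, not the code in this PR):

```python
import sky

def handle_request(local_image_path: str) -> None:
    # Build a task whose input is the image from this HTTP request.
    task = sky.Task(
        run='python predict.py /remote/inputs/image.png',  # placeholder command
        workdir='.',
    )
    task.set_resources(sky.Resources(cloud=sky.Azure()))
    # Ship the uploaded image to the cluster before the task runs.
    task.set_file_mounts({'/remote/inputs/image.png': local_image_path})
    # Provision a cluster (or reuse one with the same name) and run the task.
    sky.launch(task, cluster_name='inference')
```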
Thanks @iojw for submitting this PR! Left some comments. Please check them out.
workdir=workdir,
run=run_fn)

resources = sky.Resources(cloud=sky.Azure())
Is there any reason for using Azure here?
I only had access to Azure at the time, so I used it for easier local testing. I recently gained access to AWS as well; would you suggest leaving it as the default in this case?
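One option (just my assumption, not something decided in this PR) would be to leave the cloud unspecified and let Sky's optimizer pick among whatever clouds the user has enabled:

```python
import sky

# Let the optimizer choose among the user's enabled clouds...
resources = sky.Resources()
# ...or pin a specific cloud explicitly:
# resources = sky.Resources(cloud=sky.AWS())
```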
@@ -0,0 +1,18 @@
import os
Can we have a more realistic example in addition to (or instead of) this one? As pointed out in server.py, we need to send model weights and inputs to the cluster and get the prediction outputs from the cluster. Why don't we show such a complete example here?
That makes sense! I went with a simpler example because I don't have much experience with ML and inference - what do you think would be a good library and model to implement here?
# Copy model weights to the cluster
# Instead of local path, can also specify a cloud object store URI
'/remote/path/to/model-weights': 'local/path/to/model-weights',
# Copy image to the cluster
remote_image_path: local_image_path,
Nice point! Our storage APIs are indeed useful for sending model weights and inputs.
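For instance, a sketch of such file mounts (the bucket name and paths are placeholders, not the PR's actual values):

```python
import sky

task = sky.Task(run='python predict.py')  # placeholder command
task.set_file_mounts({
    # Copy model weights to the cluster from a local path...
    '/remote/path/to/model-weights': 'local/path/to/model-weights',
    # ...or from a cloud object store URI, e.g.:
    # '/remote/path/to/model-weights': 's3://my-bucket/model-weights',
    # Copy the uploaded image for this request to the cluster.
    '/remote/path/to/image.png': 'local/path/to/image.png',
})
```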
examples/inference_server/server.py
LOCAL_UPLOAD_FOLDER = '/Users/isaac/Dropbox/Berkeley/Sky/sky/examples/inference_server/uploads'
REMOTE_UPLOAD_FOLDER = '/remote/path/to/folder'
Can we make this example easier to run? It would be nice if users can run this example without any modification to the code.
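One way to do that (a sketch, assuming the uploads folder can simply live next to server.py) is to derive the local path from the script's own location instead of hard-coding a user-specific path:

```python
import os

# Store uploads next to server.py so users don't need to edit any paths.
LOCAL_UPLOAD_FOLDER = os.path.join(
    os.path.dirname(os.path.abspath(__file__)), 'uploads')
os.makedirs(LOCAL_UPLOAD_FOLDER, exist_ok=True)
REMOTE_UPLOAD_FOLDER = '/tmp/inference_server_uploads'
```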
This PR is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
This PR was closed because it has been stalled for 10 days with no activity.
This is a first pass at how a server executing computations on Sky might be implemented. The use case I have in mind is when a user wants to execute certain computations on Sky, triggered by HTTP requests and using the request parameters as input.
To test it out:
1. Modify the `LOCAL_UPLOAD_FOLDER` variable and the data init file mount path.
2. Run `FLASK_APP=examples/inference_server/server.py flask run`, then access `http://127.0.0.1:5000` and upload an image.

The server spins up a Sky cluster, initializes the cluster with some model weights, runs some computation on the uploaded image, and returns the result as the HTTP response.

Some thoughts on the implementation:
- `sky.launch` takes the bulk of the time. If all the computations can be executed on the same cluster, we can have a setup function run before the first request is made that spins up the cluster, and subsequent computations would all be run on this cluster using `sky.exec` (see the sketch below).

Let me know what you think!
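A rough sketch of that launch-once / exec-many pattern (hypothetical names and placeholder commands, not the actual server.py in this PR):

```python
import sky
from flask import Flask

app = Flask(__name__)
CLUSTER_NAME = 'inference-server'
_cluster_ready = False

def _ensure_cluster():
    """Provision the cluster on the first request; later requests reuse it."""
    global _cluster_ready
    if not _cluster_ready:
        setup_task = sky.Task(run='echo cluster ready')  # placeholder setup
        setup_task.set_resources(sky.Resources(cloud=sky.Azure()))
        sky.launch(setup_task, cluster_name=CLUSTER_NAME)  # slow: provisions a VM
        _cluster_ready = True

@app.route('/predict', methods=['POST'])
def predict():
    _ensure_cluster()
    infer_task = sky.Task(run='python predict.py')  # placeholder inference command
    sky.exec(infer_task, cluster_name=CLUSTER_NAME)  # fast: reuses the live cluster
    return 'ok'
```

With this shape, only the first request pays the provisioning cost of `sky.launch`; every later request only pays for `sky.exec` on the already-running cluster.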