
Update error message when attempting inference on model with 0 workers #29

Closed
chauhang opened this issue Feb 16, 2020 · 4 comments · Fixed by #305
@chauhang (Contributor)

On adding a model via the management API, the default min/max workers for the model is set to 0. As a result, running a prediction against the model right after registering it returns a 503 error with the detail 'No worker is available to serve request: densenet161'. This will be confusing for users who register a model and then try the inference API.

Register model using:
curl -X POST "http://:/models?url=https://<s3_path>/densenet161.mar"
{
"status": "Model densenet161 registered"
}

Inference using:
curl -X POST http://:/predictions/densenet161 -T cutekit.jpeg
{
"code": 503,
"type": "ServiceUnavailableException",
"message": "No worker is available to serve request: densenet161"
}

Model details:
curl -X GET http://:/models/densenet161
[
{
"modelName": "densenet161",
"modelVersion": "1.0",
"modelUrl": "https:///densenet161.mar",
"runtime": "python",
"minWorkers": 0,
"maxWorkers": 0,
"batchSize": 1,
"maxBatchDelay": 100,
"loadedAtStartup": false,
"workers": []
}
]
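For readers hitting this today, the empty "workers" array above is exactly why inference returns 503. A minimal Python sketch (helper name is illustrative, not a TorchServe API; the JSON shape is taken from the describe-model output above) for checking worker availability before calling the inference API:

```python
import json

# Describe-model output as shown above: "workers" is empty because the
# default min/max workers is 0 at registration time.
describe_output = json.loads("""
[
  {
    "modelName": "densenet161",
    "minWorkers": 0,
    "maxWorkers": 0,
    "workers": []
  }
]
""")

def has_workers(models):
    # True if any listed model version has at least one worker assigned.
    return any(model.get("workers") for model in models)

if not has_workers(describe_output):
    print("No workers: inference will 503 until workers are scaled up")
```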

@chauhang chauhang added the usability Usability issue label Feb 16, 2020
@harshbafna (Contributor)

@chauhang

Users can provide initial_workers while registering models through the management API.

As documented in the management API doc:

Users may want to create workers while registering. Since creating initial workers may take some time, the user can choose between a synchronous or asynchronous call to make sure initial workers are created properly.

The asynchronous call will return before trying to create workers with HTTP code 202:

curl -v -X POST "http://localhost:8081/models?initial_workers=1&synchronous=false&url=https://<s3_path>/squeezenet_v1.1.mar"

< HTTP/1.1 202 Accepted
< content-type: application/json
< x-request-id: 29cde8a4-898e-48df-afef-f1a827a3cbc2
< content-length: 33
< connection: keep-alive
< 
{
  "status": "Worker updated"
}

The synchronous call will return after all workers have been adjusted, with HTTP code 200.

curl -v -X POST "http://localhost:8081/models?initial_workers=1&synchronous=true&url=https://<s3_path>/squeezenet_v1.1.mar"

< HTTP/1.1 200 OK
< content-type: application/json
< x-request-id: c4b2804e-42b1-4d6f-9e8f-1e8901fc2c6c
< content-length: 32
< connection: keep-alive
< 
{
  "status": "Worker scaled"
}
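For the asynchronous path, one way for a client to know when the model is ready is to poll describe-model until the "workers" list is non-empty. A hedged sketch (the fetch_describe callable stands in for a GET on /models/<model_name>; all names here are illustrative, not part of TorchServe):

```python
import time

def wait_for_workers(fetch_describe, timeout_s=60.0, interval_s=1.0):
    # Poll until at least one model version reports a worker, or time out.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if any(model.get("workers") for model in fetch_describe()):
            return True
        time.sleep(interval_s)
    return False

# Simulated polling: the first response has no workers, the second has one.
responses = iter([
    [{"modelName": "squeezenet_v1.1", "workers": []}],
    [{"modelName": "squeezenet_v1.1", "workers": [{"id": "9000"}]}],
])
print(wait_for_workers(lambda: next(responses), timeout_s=5.0, interval_s=0.0))
```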

We are planning to update the error message as follows:

{
"code": 503,
"type": "ServiceUnavailableException",
"message": "No worker available to serve request for model <model_name>. Please use scale workers api to add workers."
}
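On the client side, the richer message could be surfaced directly. A small illustrative handler (not part of TorchServe; the field names match the proposed error body above, and the hint text is an assumption):

```python
def describe_failure(error_body):
    # Turn the proposed 503 body into an actionable hint for the caller;
    # pass through any other error message unchanged.
    if (error_body.get("code") == 503
            and error_body.get("type") == "ServiceUnavailableException"):
        return f"{error_body['message']} (hint: scale workers, then retry)"
    return error_body.get("message", "unknown error")

error_body = {
    "code": 503,
    "type": "ServiceUnavailableException",
    "message": "No worker available to serve request for model densenet161. "
               "Please use scale workers api to add workers.",
}
print(describe_failure(error_body))
```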

Please let us know your thoughts.

@mycpuorg (Collaborator)

Changing the error message sounds like a reasonable approach. Should we also set the default to 1 worker so clients can run inference?

@fbbradheintz (Contributor)

Leaving the default workers as 0 is the safest option, and the user still has the option of specifying initial_workers when registering the model. I would favor leaving the default as it is.

The two measures I'd recommend:

  • Change the 503 error message, as @mycpuorg suggested
  • Include a warning in the response to the call that registers the model, reminding the user that they have no workers dedicated to the model.

With this added communication to mitigate user surprise, I think we'd be in good shape.

@fbbradheintz fbbradheintz changed the title On adding a model via API, the default min/max workers is 0 Update error message when attempting inference on model with 0 workers Feb 27, 2020
@harshbafna harshbafna assigned shivamshriwas and unassigned mjpsl May 19, 2020
@maaquib maaquib moved this from To do to In progress in TorchServe v0.1.1 Issues Lifecycle May 21, 2020
@mycpuorg mycpuorg moved this from In progress to In Testing in TorchServe v0.1.1 Issues Lifecycle May 28, 2020
@mycpuorg mycpuorg moved this from In Testing to Verified (close after merge) in TorchServe v0.1.1 Issues Lifecycle May 28, 2020
@maaquib maaquib closed this as completed Jun 9, 2020
TorchServe v0.1.1 Issues Lifecycle automation moved this from Verified (close after merge) to Done Jun 9, 2020
@DSLituiev
I ran into a similar issue when following an example here:

curl -X POST "localhost:8081/models?model_name=resnet152&url=resnet-152-batch.mar&batch_size=4&max_batch_delay=5000&initial_workers=3&synchronous=true"

I plugged in my own model. Is this example outdated?
