
Is it a good idea to support predicting multiple instances upon one request? #2929

Closed
zyxue opened this issue Feb 5, 2021 · 4 comments

zyxue commented Feb 5, 2021

Currently, we're using the gRPC and predict_raw interface of Seldon for model serving. I wonder whether there are any cons to supporting batch prediction (< 20 instances per request).

E.g., we are thinking of using messages like

message BatchRequest {
    repeated PredictionRequest requests = 1;
}

message PredictionRequest {
    float feature_1 = 1;  // field types are illustrative
    float feature_2 = 2;
    // ...
}

which is serialized to the binData field of SeldonMessage.

Similarly, the response would be

message BatchResponse {
    repeated PredictionResponse responses = 1;
}
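
For concreteness, a minimal sketch of packing such a batch into binData on the client side might look like the following Python, assuming the messages above are compiled into a (hypothetical) batch_pb2 module and that the stock Seldon v1 prediction proto is available via the seldon_core package:

from seldon_core.proto import prediction_pb2
import batch_pb2  # hypothetical module generated from the BatchRequest proto above

batch = batch_pb2.BatchRequest(
    requests=[
        batch_pb2.PredictionRequest(feature_1=0.1, feature_2=0.2),
        batch_pb2.PredictionRequest(feature_1=0.3, feature_2=0.4),
    ]
)

# The serialized bytes travel opaquely through the inference graph in binData;
# the model server would be responsible for parsing them back into a BatchRequest.
msg = prediction_pb2.SeldonMessage(binData=batch.SerializeToString())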

Is it a good idea?

I wonder if supporting batch prediction would make it impossible to integrate with more complex features of Seldon in the future, like those in the "Complex Seldon Core Inference Graph" shown in this picture:

[Image: Complex Seldon Core Inference Graph]

adriangonz commented

Hey @zyxue, I'm not sure if you've checked out the early support for batch processing in Seldon Core yet? One of the points it tries to tackle is precisely how to send / process large batches of requests.

It would be good to hear your thoughts on that!


zyxue commented Feb 18, 2021

The batch processing in Seldon Core is for offline use (not real time), right?

ukclivecox commented

You can use gRPC with a batch as long as all components in your graph can handle a batch of instances. As @adriangonz mentioned, the batch processing route is probably better suited for offline tasks, but it sounds like you are looking for real time.
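
For the real-time path, a minimal sketch of sending a batch through the standard gRPC Predict endpoint, assuming the stock Seldon v1 prediction proto shipped with the seldon_core Python package (the endpoint address is a placeholder):

import grpc
from seldon_core.proto import prediction_pb2, prediction_pb2_grpc

# Three instances with two features each; the first tensor dimension is the batch size.
request = prediction_pb2.SeldonMessage(
    data=prediction_pb2.DefaultData(
        names=["feature_1", "feature_2"],
        tensor=prediction_pb2.Tensor(
            shape=[3, 2],
            values=[0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
        ),
    )
)

channel = grpc.insecure_channel("localhost:8000")
stub = prediction_pb2_grpc.SeldonStub(channel)

# Every component in the graph must treat the first dimension as the batch axis.
response = stub.Predict(request)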

ukclivecox commented

Closing now. Please reopen if further discussion is needed.
