Adaptive batching leads to parameters being cut off #1541

tobbber · 2024-01-17T14:10:34Z

Hi, I observed some odd behavior when using the REST API with adaptive batching enabled.
When sending a single request to the v2 REST endpoint /v2/models/<MODEL>/infer, the parameters within the ResponseOutput are cut off. If a parameter is not an iterable, a TypeError is raised, e.g. TypeError: 'int' object is not iterable

Note that this only happens when:

Adaptive batching is enabled
A single request is sent within the max_batch_time time window

How to Reproduce:

# model.py
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput


class EchoModel(MLModel):
    async def load(self) -> bool:
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        request_input = payload.inputs[0]
        # Return the first request input unchanged as the only output.
        output = ResponseOutput(**request_input.dict())
        return InferenceResponse(model_name=self.name, outputs=[output])

// model-settings.json
{
	"name": "echoModel",
	"max_batch_time": 2,
	"max_batch_size": 32,
	"implementation": "model.EchoModel"
}

Request Body:

// POST to localhost:8080/v2/models/echoModel/infer
{
	"inputs": [{
		"name": "docs",
		"shape": [2],
		"datatype": "INT32",
		"parameters": {
			"id": "123"
		},
		"data": [10,11]
	}]
}
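For convenience, the request above can also be sent from a short script. This is only a sketch using the Python standard library; the helper names build_request and send are made up for illustration, and the URL assumes the server from this report is running locally on port 8080.

```python
import json
import urllib.request

# Assumption: MLServer is reachable on localhost:8080 and serves "echoModel".
URL = "http://localhost:8080/v2/models/echoModel/infer"


def build_request(parameters: dict) -> urllib.request.Request:
    """Build the POST request from the report with the given input parameters."""
    body = {
        "inputs": [
            {
                "name": "docs",
                "shape": [2],
                "datatype": "INT32",
                "parameters": parameters,
                "data": [10, 11],
            }
        ]
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(parameters: dict) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(parameters)) as response:
        return json.load(response)
```

Calling send({"id": "123"}) against a running server and inspecting the parameters of the first entry in "outputs" should make it easy to compare what was sent with what comes back.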

Expected behavior: EchoModel returns the RequestInput as Output.

Actual behavior: Parameters in the output are cut off, or a TypeError is raised.

Examples:

input parameters: {"custom-param": "123"} --> output parameters: {"custom-param": "1"}
input parameters: {"custom-param": ["123", "456"]} --> output parameters: {"custom-param": "123"}
input parameters: {"custom-param": 123 } --> TypeError: 'int' object is not iterable

It seems like the Parameters are unbatched even if they were never batched in the first place.
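The three results above are consistent with each parameter value being split as if it were a per-request sequence. The following is a minimal sketch of that hypothesis, not MLServer's actual code; naive_unbatch is a hypothetical helper written only to reproduce the same symptoms.

```python
from typing import Any, Dict, List


def naive_unbatch(parameters: Dict[str, Any], batch_size: int) -> List[Dict[str, Any]]:
    """Hypothetical helper: hand one item of every parameter value to each request.

    This is only correct if every value is a list that was built by merging
    the parameters of several requests during batching.
    """
    per_request: List[Dict[str, Any]] = [{} for _ in range(batch_size)]
    for key, value in parameters.items():
        # iter() on a str yields characters, on a list yields elements,
        # and on an int raises TypeError.
        for request_params, item in zip(per_request, iter(value)):
            request_params[key] = item
    return per_request


# A batch of one request, with parameters that were never merged into lists.
print(naive_unbatch({"custom-param": "123"}, 1))
print(naive_unbatch({"custom-param": ["123", "456"]}, 1))
```

With a batch size of 1, the string "123" is reduced to its first character and the list to its first element, while an integer value fails as soon as it is iterated. That matches the three cases listed above.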


yaliqin · 2024-02-12T23:16:02Z

Hi @tobbber, can you share the Dockerfile you used? I wrapped my code in a similar way and set up the batch settings, then ran into the following prometheus_client error:

File "/opt/conda/lib/python3.8/site-packages/prometheus_client/metrics.py", line 121, in __init__
    registry.register(self)
File "/opt/conda/lib/python3.8/site-packages/prometheus_client/registry.py", line 29, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'batch_request_queue_count', 'batch_request_queue_bucket', 'batch_request_queue_created', 'batch_request_queue_sum'}

I used mlserver build, and the generated Dockerfile uses seldonio/mlserver:1.3.5-slim.

tobbber · 2024-02-13T12:23:33Z

Hi @yaliqin, I used the MLServer CLI directly with mlserver start mlserver_example/, with this structure:

mlserver_example/
├── model-settings.json
└── model.py

To install MLServer I used pip install mlserver==1.3.5

yaliqin · 2024-02-13T14:51:49Z

Thank you very much! Which python version are you using?


tobbber · 2024-02-14T13:09:30Z

I am using Python 3.11.6 on an arm64 machine (M1 Mac).

yaliqin · 2024-02-14T17:34:44Z

Thanks @tobbber. mlserver start . worked but the docker run failed. Will check the difference.
