My question: when I send a batched input request to the server as below (code attached), does the server expect the raw binary bytes in interleaved order, input_ids[0] + segment_ids[0] + input_mask[0] + input_ids[1] + segment_ids[1] + input_mask[1] + ... (the commented-out code), or in grouped order, all input_ids, then all segment_ids, then all input_mask (the variant under "# or Order of batch inputs #2" below)?
I ask because I currently can't match my expected outputs to my inputs: only the first input in the batch produces the correct corresponding output; the rest of the batched inputs don't match their expected outputs.
batch_size = len(input_ids.cpu().numpy().tolist())
input_ids = tensor_to_numpy_and_padd(input_ids)      # input tokens, padded with 0's if necessary
segment_ids = tensor_to_numpy_and_padd(segment_ids)  # this is always an all-zero array
input_mask = tensor_to_numpy_and_padd(input_mask)    # all 1's, padded with 0's

# Order of batch inputs #1
# x = b''
# for i in range(len(input_ids)):
#     x = x + input_ids[i] + segment_ids[i] + input_mask[i]
# data = x

# or Order of batch inputs #2?
x = b''
for i in range(len(input_ids)):
    x = x + input_ids[i]
for i in range(len(segment_ids)):
    x = x + segment_ids[i]
for i in range(len(input_mask)):
    x = x + input_mask[i]
data = x
inference_server_root = "http://{}:8000/api/infer/bert_mtdnn_onnx".format(inference_server_container_name)
r = requests.post(
    url=inference_server_root,
    headers={
        'NV-InferRequest': 'batch_size: ' + str(batch_size) + ' input [{ name: "input_ids" }, { name: "segment_ids"}, { name: "input_mask"}] output [{ name: "hedging_0" cls { count: 2 } }, { name: "sentiment_1" cls { count: 2 } }, { name: "explainer_2" cls { count: 2 } }, { name: "memoryloss_3" cls { count: 2 } }]',
        'Content-Type': 'application/octet-stream'
    },
    data=data
)
def tensor_to_numpy_and_padd(tens_variable):
    tens_variable = tens_variable.cpu().numpy().tolist()
    batch_list = []
    for i in tens_variable:
        out = [0] * max_length_bert  # pad to max_length_bert
        out[:len(i)] = i
        out = np.asarray(out, dtype=np.int64)  # dtype for all variables is INT64 ***
        out = out.tobytes()
        batch_list.append(out)
    return batch_list
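For reference, the padding helper above can be sketched in a vectorized form with NumPy; the `max_length_bert` value and the input lists here are illustrative placeholders, not taken from the actual model:

```python
import numpy as np

max_length_bert = 8  # placeholder; the real model's max sequence length

def pad_rows_to_bytes(rows):
    """Right-pad each token-id list with zeros and serialize as int64 bytes."""
    padded = np.zeros((len(rows), max_length_bert), dtype=np.int64)
    for i, row in enumerate(rows):
        padded[i, :len(row)] = row       # copy tokens, leaving trailing zeros
    return [r.tobytes() for r in padded]  # one bytes object per batch item

chunks = pad_rows_to_bytes([[101, 2023, 102], [101, 102]])
assert len(chunks[0]) == max_length_bert * 8  # int64 -> 8 bytes per token
```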
Thanks.
The text was updated successfully, but these errors were encountered:
jfdsr changed the title from "Raw binary data order of multiple inputs batch request + Response dimensions index order" to "Raw binary data order of multiple inputs batch request" on Apr 19, 2020.
The input tensor values are communicated in the body of the HTTP POST request as raw binary, in the order the inputs are listed in the request header. See detail.
Have you tried using our provided HTTP client library to check whether the results are correct?
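If that rule is read against the header above (which lists `input_ids`, `segment_ids`, `input_mask`), the body would carry each named input's full batch contiguously, which corresponds to the "#2" variant in the question. A minimal sketch of that packing, assuming int64 tensors; the shapes and token ids here are toy values, not from the actual model:

```python
import numpy as np

# Toy batch: 2 sequences, padded to length 4, dtype int64 (illustrative only).
batch_size, max_length = 2, 4
input_ids = np.array([[101, 7592, 102, 0], [101, 2088, 102, 0]], dtype=np.int64)
segment_ids = np.zeros((batch_size, max_length), dtype=np.int64)
input_mask = np.array([[1, 1, 1, 0], [1, 1, 1, 0]], dtype=np.int64)

# Pack in the order the inputs appear in the NV-InferRequest header:
# all of input_ids, then all of segment_ids, then all of input_mask.
data = input_ids.tobytes() + segment_ids.tobytes() + input_mask.tobytes()

# Each int64 is 8 bytes, so the body is 3 tensors * batch * length * 8 bytes.
assert len(data) == 3 * batch_size * max_length * 8
```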
My ONNX model configuration file is:
Thanks.