Add String data type support #58

marshallpierce · 2021-02-09T18:31:08Z

- Tensor construction now uses different logic for primitive types (using their native in-memory layout) and strings (filling the tensor with onnxruntime's `FillStringTensor`).
- Extended `TypeToTensorElementDataType` to also be able to expose utf8 contents, if present
- `Utf8Data` trait to make it possible to use both `String` and `&str`
- `call_ort` helper that takes care of mapping the ort status to a `Result` so you can't forget to do it
- `print_structure` example that shows the inputs and outputs of an .onnx model (names, types, etc)
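
As a rough illustration of the `Utf8Data` and `call_ort` items in the list above, here is a minimal, self-contained sketch. It is not the code from this PR: the method name `utf8_bytes`, the helper `total_utf8_len`, the error type `OrtCallError`, and the use of `Option<String>` as a stand-in for the raw ort status are all assumptions made for the example.

```rust
/// Hypothetical stand-in for the `Utf8Data` trait: anything that can
/// expose its contents as UTF-8 bytes.
pub trait Utf8Data {
    fn utf8_bytes(&self) -> &[u8];
}

impl Utf8Data for String {
    fn utf8_bytes(&self) -> &[u8] {
        self.as_bytes()
    }
}

impl<'a> Utf8Data for &'a str {
    fn utf8_bytes(&self) -> &[u8] {
        self.as_bytes()
    }
}

/// Generic over the trait, so callers can pass owned or borrowed strings.
/// (Illustrative helper, not part of the PR.)
pub fn total_utf8_len<T: Utf8Data>(items: &[T]) -> usize {
    items.iter().map(|s| s.utf8_bytes().len()).sum()
}

/// Hypothetical error type carrying the message from a failed call.
#[derive(Debug, PartialEq)]
pub struct OrtCallError(pub String);

/// Sketch of a `call_ort`-style helper. The raw status is modelled here
/// as `Option<String>` (`None` meaning success), which is an assumption
/// for this example. Because the helper returns a `Result`, the compiler
/// warns when a caller ignores the outcome.
pub fn call_ort<F>(f: F) -> Result<(), OrtCallError>
where
    F: FnOnce() -> Option<String>,
{
    match f() {
        None => Ok(()),
        Some(msg) => Err(OrtCallError(msg)),
    }
}
```

With these definitions, `total_utf8_len` accepts both a `&[&str]` and a `&[String]`, and every fallible call funnels through one place that turns the status into a `Result`.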

Without this, interacting with session inputs for a model with a string input would lead to a SIGABRT, as the auto-generated `Debug`, etc., didn't know what to do with an enum variant of 8 (onnxruntime's string type).
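
To illustrate the failure mode described above, one general way to guard against an unexpected raw enum value is a checked conversion that returns an error instead of assuming the value is a known variant. The sketch below is an assumption-laden example, not the approach taken in this PR: `ElementType`, `UnknownElementType`, and `describe` are hypothetical names, and only a subset of element types is listed.

```rust
use std::convert::TryFrom;

/// Hypothetical subset of tensor element types. The discriminants used
/// below follow onnxruntime's numbering, where 8 is the string type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElementType {
    Float,
    Int64,
    String,
}

/// Error returned when the raw value is not one we know how to handle.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownElementType(pub u32);

impl TryFrom<u32> for ElementType {
    type Error = UnknownElementType;

    fn try_from(raw: u32) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(ElementType::Float),
            7 => Ok(ElementType::Int64),
            8 => Ok(ElementType::String),
            other => Err(UnknownElementType(other)),
        }
    }
}

/// Formats a raw type code without ever treating an unknown value as a
/// valid enum variant.
pub fn describe(raw: u32) -> String {
    match ElementType::try_from(raw) {
        Ok(ty) => format!("{:?}", ty),
        Err(UnknownElementType(code)) => format!("unknown element type {}", code),
    }
}
```

Here an unrecognized code becomes an ordinary `Err` value that the caller can report, rather than something the `Debug` implementation has to cope with.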

If this looks good, I'll work on an integration test with a trivial model to ensure that feeding data through actually does work, but I wanted to get feedback on the approach first.

marshallpierce · 2021-02-18T23:16:18Z

String outputs aren't properly handled yet. Also, even a trivial string model that only applies TensorFlow's `unique` produces multiple outputs of different types, so output types will need to become more flexible. Returning to draft status until I have that resolved.

nbigaouette

Thanks for your work and sorry for my late reply!

Apart from small nitpicks, I wouldn't mind merging this. I would even like to see a small, simple .onnx model committed to test the functionality. Can you create one that takes up less than a few tens of KB?

Thanks!

onnxruntime/src/error.rs

onnxruntime/src/tensor/ort_tensor.rs

marshallpierce · 2021-02-22T16:53:05Z

I'll tidy up this stuff per your comments. I've found that just applying TensorFlow's `unique` as an ultra-simple "model" produces a tiny 424-byte .onnx artifact that doesn't need any custom operators, so no problem checking that in, but it produces outputs of both string and int types. I could add a test that it doesn't crash when you run it with string inputs, but I can't really meaningfully assert on the output until (1) string output and (2) dynamic type output are addressed in a follow-up PR.
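
To make the mixed-output problem concrete, here is a small sketch of what a `unique`-style operation returns and how outputs of different element types could be carried in a single collection. This is purely illustrative and not the design of the follow-up PR: `DynOutput` and this pure-Rust `unique` are hypothetical, and the 32-bit index type is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical container for outputs whose element type is only known
/// at runtime.
#[derive(Debug, PartialEq)]
pub enum DynOutput {
    Strings(Vec<String>),
    Int32s(Vec<i32>),
}

/// Pure-Rust imitation of a `unique` operation: returns the distinct
/// values in first-seen order, plus, for each input element, the index
/// of its value in that list.
pub fn unique(input: &[&str]) -> Vec<DynOutput> {
    let mut seen: HashMap<&str, i32> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut indices: Vec<i32> = Vec::with_capacity(input.len());

    for &s in input {
        // Index this value would get if it has not been seen before.
        let next = values.len() as i32;
        let idx = *seen.entry(s).or_insert_with(|| {
            values.push(s.to_string());
            next
        });
        indices.push(idx);
    }

    // Two outputs with different element types: strings and integers.
    vec![DynOutput::Strings(values), DynOutput::Int32s(indices)]
}
```

For the input `["a", "b", "a"]`, the first output holds the strings `a` and `b`, and the second holds the indices `0, 1, 0`, which is why a single statically typed output tensor is not enough for this model.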

onnxruntime/src/tensor/ort_tensor.rs

nbigaouette · 2021-02-24T01:51:47Z

Awesome! Thanks for this!! 👍

marshallpierce marked this pull request as ready for review February 16, 2021 22:12

marshallpierce marked this pull request as draft February 18, 2021 23:16

nbigaouette reviewed Feb 22, 2021

Better names, style fixes

f2ed12a

marshallpierce force-pushed the mp/string-type branch from 152a0cd to f2ed12a February 22, 2021 17:20

marshallpierce marked this pull request as ready for review February 22, 2021 17:20

Appease clippy

d5c5e0e

nbigaouette reviewed Feb 24, 2021

Fix test name

a98d630

nbigaouette merged commit 73d770f into nbigaouette:master Feb 24, 2021
