
Expose a Python interface for inference functions #4409

Merged
merged 19 commits into from Aug 15, 2022

Conversation


@jbachurski jbachurski commented Aug 4, 2022

Exposes a node-level API for performing type/shape inference through the Python bindings.

Description
(Updated)

  • Adds a low-level function OpSchema._infer_node_outputs, which takes serialised protobuf bytes. Arguments: the schema, the node, input types, and input data/other inference-context arguments; it returns the inferred output types.
  • Adds a public interface shape_inference.infer_node_outputs:
def infer_node_outputs(
    schema: onnx.defs.OpSchema,
    node: onnx.NodeProto,
    input_types: Dict[str, onnx.TypeProto],
    input_data: Optional[Dict[str, onnx.TensorProto]] = None,
    input_sparse_data: Optional[Dict[str, onnx.SparseTensorProto]] = None,
) -> Dict[str, onnx.TypeProto]:  # output_types
  • Under the hood this validates the passed-in node and types; ValidationError and InferenceError are raised on failure.
  • This should be the only function used directly; OpSchema._infer_node_outputs is internal to the implementation.
  • The keys of input_types are the input names as the node declares them.
  • input_data and input_sparse_data are used for passing in known constant inputs (as per C++ usage).
  • Passing in partial data-propagation results and a GraphInferenceContext could be implemented in the future.
  • Adds test cases exercising some inference calls from the Python side.

Motivation and Context

@jbachurski (Member Author):

@gramalingam Would you mind reviewing the proposal and checking whether this sort of interface would work? In the future it could be extended to support the other arguments; making them optional would keep it backward compatible.

onnx/defs/schema.h (review thread; outdated, resolved)
@gramalingam (Contributor):

Updating the pyi definition for OpSchema would be useful.

onnx/cpp2py_export.cc (review thread; outdated, resolved)
@jbachurski (Member Author):

I think it might be useful to promote an API like this as a wrapper around the raw C++ method (that is, make it part of the public API instead of something that lives only in this test file). But, for that, I don't think we would want node to be optional. I am not sure about num_outputs either.

When it comes to the Python-side interface: a node has input/output names, an operator name and domain, and attributes. Its own name and docstring don't matter. It seems we are missing basically only the operator version and the input types. I see some variations of the possible signature:

  • Inputs & outputs
    • list[TypeProto] - we already have an ordering of fields inside the node. Node inputs/outputs are positional anyway, so making the ordering explicit with a list seems okay.
    • dict[str, TypeProto] - since the node names the fields in its definition. On the other hand, since the node has to be created explicitly first in this approach anyway, we might as well use the names within the node. I think this shouldn't interfere with generics etc., since if a name is used multiple times it has to have the same type - we get a very simple consistency check for free.
  • Operator version
    • version: int - look up the current operator schema at this version based on what is in the node.
    • OpSchema - make the user provide the schema explicitly themselves. This is more redundant, since the node already defines the domain and identifier.

Also, where would we put this function? shape_inference would be fine, though really it should have been called inference (as it also infers types and other things). We could also put it in helper. Any ideas?

Let me know your thoughts and I'll get started on some improvements in the Python-interface area.
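The consistency-check argument for the dict-based variant can be illustrated with a small hypothetical helper (plain strings stand in for TypeProto; none of these names come from the PR):

```python
from typing import Dict, List

def positional_types(node_inputs: List[str], input_types: Dict[str, str]) -> List[str]:
    """Map a node's (possibly repeated) input names to positional types.

    A repeated input name necessarily resolves to the same type each time,
    which is the "very simple consistency check" the dict variant gives us.
    """
    missing = [name for name in node_inputs if name not in input_types]
    if missing:
        raise KeyError(f"missing types for inputs: {missing}")
    return [input_types[name] for name in node_inputs]

# "x" appears twice in the node's input list, but can only have one type.
print(positional_types(["x", "x", "b"], {"x": "tensor(float)", "b": "tensor(int64)"}))
# -> ['tensor(float)', 'tensor(float)', 'tensor(int64)']
```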

@jbachurski (Member Author):

@gramalingam I pushed some changes:

std::unordered_map<std::string, py::bytes> CallNodeInferenceFunction(
    OpSchema* schema,
    const py::bytes& nodeBytes,
    std::unordered_map<std::string, py::bytes> valueTypesByNameBytes,
    std::unordered_map<std::string, py::bytes> inputDataByNameBytes,
    std::unordered_map<std::string, py::bytes> inputSparseDataByNameBytes) {
  • Now returns a map, since the inputs are already a map. Entries where the output TypeProto was not initialised (inference failed implicitly?) are not returned.
  • Added OpSchema::Verify before the call to the inference (checks if the count of inputs/outputs is correct, etc.), and OpSchema::CheckInputOutputType afterwards (checks if the types are consistent with the signature). Both should throw ValidationError.
  • At the same time, inference itself may throw InferenceError. Interestingly, this may also throw errors related to attributes (so things like Cast will check here if exactly one of the right attributes is set).
  • Let me know if I should handle exceptions there in some more involved way. I think what is thrown already reflects most of what check_model and infer_shapes do.
# class OpSchema:
    def infer_node_outputs(self, node_proto: bytes, value_types: Dict[str, bytes],
                           input_data: Dict[str, bytes], input_sparse_data: Dict[str, bytes]
                           ) -> Dict[str, bytes]: ...
  • This is just a binding for the C++ function. I added it to defs.pyi as above.
  • My general feeling is that the arguments after value_types should have empty defaults so that in the future generatedShapeData and graphInferenceContext may be supported as well (though this involves, I think, significant work).
# shape_inference.py
def infer_node_outputs(
    schema: onnx.defs.OpSchema, node: onnx.NodeProto, input_types: Dict[str, onnx.TypeProto],
    input_data: Optional[Dict[str, onnx.TensorProto]] = None,
    input_sparse_data: Optional[Dict[str, onnx.SparseTensorProto]] = None
) -> Dict[str, onnx.TypeProto]: ...
  • The public-facing Python function does all of the conversions and expects proper types.
  • It already has default arguments, which we may change in the future; they reflect the actual underlying implementation.
  • It checks that all the expected keys are present and only copies over what is actually required on the C++ side, for performance.
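The "only copy what is required" step can be sketched as follows (a hypothetical helper, not the PR's actual code; bytes values stand in for serialized protos):

```python
from typing import Dict, Iterable, Optional

def select_for_node(node_inputs: Iterable[str], data: Optional[Dict[str, bytes]]) -> Dict[str, bytes]:
    """Keep only the entries of an optional data map that the node's
    declared inputs actually reference, before handing it to C++."""
    if not data:
        return {}
    wanted = set(node_inputs)
    return {name: blob for name, blob in data.items() if name in wanted}

# "unused" is dropped: the node never refers to it, so there is no point
# paying the serialization/copy cost for it.
print(select_for_node(["x", "b"], {"x": b"\x01", "unused": b"\x02"}))
```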

Let me know what you think of these changes! I also added some more tests.

@JakubBachurskiQC JakubBachurskiQC force-pushed the python-inference-function branch 2 times, most recently from 4363146 to da9b92a Compare August 11, 2022 09:26
@@ -85,6 +136,10 @@ PYBIND11_MODULE(onnx_cpp2py_export, onnx_cpp2py_export) {
return py::bytes(bytes);
})
.def_property_readonly("has_context_dependent_function", &OpSchema::HasContextDependentFunction)
.def("infer_node_outputs", CallNodeInferenceFunction,
py::arg("nodeBytes"), py::arg("valueTypesByNameBytes"),
py::arg("inputDataByNameBytes") = std::unordered_map<std::string, py::bytes>{},
Contributor:
None is better than an empty container as a default value.

Member Author:
I don't think I know what you mean here. This is the C++ side, so there's no None. Do you mean making it a pointer and using nullptr, or using std::optional?
I don't think that would be much better: the map isn't mutated, and here the natural default really is "no additional information" (the inference context needs some map to be passed in, and an empty one makes sense when there is no known input data).
Could you provide a code example for the change?

@jbachurski jbachurski marked this pull request as ready for review August 11, 2022 21:49
@jbachurski jbachurski requested a review from a team as a code owner August 11, 2022 21:49
@gramalingam (Contributor) left a comment:
Looks great, thanks very much!

Signed-off-by: KubinGH <kbachurski@gmail.com>
@jbachurski (Member Author):

Thanks!
Should be ready for merge, then?

@gramalingam gramalingam merged commit 1f3cecc into onnx:main Aug 15, 2022
def infer_node_outputs(
schema: onnx.defs.OpSchema,
node: onnx.NodeProto,
input_types: Dict[str, onnx.TypeProto],
Contributor:
Do we explicitly require a dictionary for these params? If not, prefer a more generic type like Mapping.

---

Use more generic types for input parameters and specific types for return values. For example, a function may take a Sequence and return a List.

Diagram for reference: https://gist.github.com/justinchuby/4021cebe9e093f636759a88de325c85f
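The guideline can be illustrated with two small hypothetical functions (not part of the PR): parameters use the abstract typing protocols, return values use the concrete containers.

```python
from typing import Dict, List, Mapping, Sequence

def double_each(values: Sequence[int]) -> List[int]:
    # Accepts a list, tuple, or range -- anything that is a Sequence --
    # but always hands back a concrete List.
    return [v * 2 for v in values]

def invert(mapping: Mapping[str, int]) -> Dict[int, str]:
    # Accepts a dict or any other read-only Mapping, returns a concrete Dict.
    return {v: k for k, v in mapping.items()}

print(double_each(range(3)))       # [0, 2, 4]
print(invert({"a": 1, "b": 2}))    # {1: 'a', 2: 'b'}
```

Callers then aren't forced to build a dict or list just to satisfy the signature, while the return types stay fully specified.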

broune pushed a commit to broune/onnx that referenced this pull request May 6, 2023
* Add an experimental infer_types implementation
* Create infers_.py
* Fix naming
* Replace example script with tests
* Use unittest
* Use unittest
* Fix wrapper function signature
* Get rid of existing getter
* Run C++ formatter
* Improve C++ implementation, Python interface, tests
* Fix CI
* Use explicit enums for tensor elements
* Run black and isort for new linter
* Specify reshape vector dtype
* Make binding-side attributes optional
* Run clang-format
* Make OpSchema method explicitly protected
* Missed stub file protected