Improve mapping and add more tests for make_tensor #4270

jcwchen · 2022-06-13T20:53:46Z

Describe your changes

Refactor mapping
Deprecate TENSOR_TYPE_TO_STORAGE_TENSOR_TYPE
Use functions instead of more maps to prevent confusion and save binary size
Add more tests (test_make_tensor) for more tensor types

Motivation and Context

Related to #4261 (provide a nicer API for accessing the proper member of TensorProto). The current mapping is quite confusing, and there are duplicate implementations for getting the field for a tensor type.
test_make_tensor is currently missing some tensor types, both with and without raw.

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

onnx/mapping.py

garymm

I think we should go much further and make only a small number of very useful, user-friendly things public and hide the rest.

Everything currently inside mapping.py should be made private, since I think it's all implementation details that clients shouldn't rely on.

I propose:

Add helper.tensor_field_for_data_type(TensorProto, DataType), helper.tensor_field_for_numpy_dtype(TensorProto, np.dtype). These do all of the mapping lookups and the getattr call.
Move mapping.py to _mapping.py
Add a new mapping.py that does something like:

import warnings
warnings.warn("onnx.mapping is deprecated and will be removed in a future release. Use functions in onnx.helper instead.", ...)

# for temporary backwards compatibility
from onnx._mapping import *

If there's something else in mapping.py that is truly needed by users, we can expose that in helper.py as well.

WDYT?
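As a rough illustration of the deprecation-shim idea above, here is a minimal, self-contained sketch. It is a variation on the proposal, not the code that was merged: instead of warning once at import time, it warns on each attribute access using a module-level `__getattr__` (PEP 562). The `_mapping` module object, its two-entry table, and `make_deprecated_shim` are hypothetical stand-ins so the snippet runs without onnx installed.

```python
import types
import warnings

# Hypothetical stand-in for the private module (the proposal's _mapping.py).
# The two entries are an illustrative subset, not the real table.
_mapping = types.ModuleType("_mapping")
_mapping.TENSOR_TYPE_TO_FIELD = {1: "float_data", 7: "int64_data"}


def make_deprecated_shim(private_module, public_name, replacement):
    """Return a module that forwards lookups to private_module and warns."""
    shim = types.ModuleType(public_name)

    def __getattr__(name):
        # Module-level __getattr__ (PEP 562) runs only when normal lookup
        # fails, so every forwarded name triggers the warning.
        warnings.warn(
            f"{public_name}.{name} is deprecated and will be removed in a "
            f"future release. Use functions in {replacement} instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return getattr(private_module, name)

    shim.__getattr__ = __getattr__
    return shim


# Hypothetical public module name and replacement, mirroring the proposal.
mapping = make_deprecated_shim(_mapping, "onnx.mapping", "onnx.helper")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    field = mapping.TENSOR_TYPE_TO_FIELD[1]

print(field, len(caught))
```

Warning per access is noisier than a single import-time warning, but it tells callers exactly which names they still depend on.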

onnx/test/hub_test.py

onnx/test/parser_test.py

onnx/test/helper_test.py

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

onnx/test/helper_test.py

onnx/helper.py

onnx/mapping.py

…o jcw/improve-raw Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com> # Conflicts: # onnx/mapping.py # onnx/test/helper_test.py

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

…o jcw/improve-raw Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com> # Conflicts: # onnx/numpy_helper.py

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

lgtm-com · 2022-07-14T16:22:43Z

This pull request introduces 1 alert when merging 8df7a7a into 0fc92e4 - view on LGTM.com

new alerts:

1 for Unused import

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

xadupre · 2022-09-15T08:08:14Z

onnx/numpy_helper.py

@@ -36,10 +36,9 @@ def to_array(tensor: TensorProto, base_dir: str = "") -> np.ndarray:
        raise TypeError("The element type in the input tensor is not defined.")

    tensor_dtype = tensor.data_type
-    np_dtype = mapping.TENSOR_TYPE_TO_NP_TYPE[tensor_dtype]
-    storage_type = mapping.TENSOR_TYPE_TO_STORAGE_TENSOR_TYPE[tensor_dtype]


In PR #4510, I remove the use of the dictionary TENSOR_TYPE_TO_STORAGE_TENSOR_TYPE. I think using int8 or uint8 to store bool should not be allowed.

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

gramalingam · 2022-09-22T05:35:14Z

onnx/numpy_helper.py

@@ -156,13 +155,9 @@ def to_list(sequence: SequenceProto) -> List[Any]:
    """
    lst: List[Any] = []
    elem_type = sequence.elem_type
-    value_field = mapping.STORAGE_ELEMENT_TYPE_TO_FIELD[elem_type]
-    values = getattr(sequence, value_field)
+    values = helper.get_attr_from_sequence_elem_type(sequence, elem_type)


IMHO, it would be simpler to rewrite the code as below:

if elem_type == SequenceProto.TENSOR:
    return [to_array(v) for v in sequence.tensor_values]
elif elem_type == SequenceProto.SPARSE_TENSOR:
    return [to_array(v) for v in sequence.sparse_tensor_values]
elif elem_type == SequenceProto.SEQUENCE:
    return [to_list(v) for v in sequence.sequence_values]
etc.

I don't think we need to define helper methods like get_attr_from_sequence_elem_type.
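To make the two styles under discussion concrete, the sketch below contrasts a map-plus-getattr lookup with explicit branching. It does not use onnx: `FakeSequence`, the constants `TENSOR` and `SEQUENCE`, and `FIELD_FOR_ELEM_TYPE` are hypothetical stand-ins for SequenceProto and its element types, and tensors are represented by plain Python values.

```python
from dataclasses import dataclass, field
from typing import Any, List

# Hypothetical stand-ins for SequenceProto element-type constants.
TENSOR, SEQUENCE = 1, 3


@dataclass
class FakeSequence:
    """Stand-in for SequenceProto with one repeated field per element type."""
    elem_type: int
    tensor_values: List[Any] = field(default_factory=list)
    sequence_values: List["FakeSequence"] = field(default_factory=list)


# Style 1: a lookup table plus getattr (the indirection being questioned).
FIELD_FOR_ELEM_TYPE = {TENSOR: "tensor_values", SEQUENCE: "sequence_values"}


def values_via_map(sequence: FakeSequence) -> List[Any]:
    return getattr(sequence, FIELD_FOR_ELEM_TYPE[sequence.elem_type])


# Style 2: explicit branches that name the field and convert in one place.
def to_list(sequence: FakeSequence) -> List[Any]:
    if sequence.elem_type == TENSOR:
        return list(sequence.tensor_values)
    if sequence.elem_type == SEQUENCE:
        return [to_list(v) for v in sequence.sequence_values]
    raise TypeError(f"Unsupported element type: {sequence.elem_type}")


nested = FakeSequence(
    SEQUENCE,
    sequence_values=[
        FakeSequence(TENSOR, tensor_values=[1, 2]),
        FakeSequence(TENSOR, tensor_values=[3]),
    ],
)
print(to_list(nested))
```

With explicit branches, the field name and the per-type conversion sit side by side, which is the readability argument made above; the table version keeps the mapping in one place at the cost of an extra hop.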

gramalingam · 2022-09-22T05:38:56Z

onnx/helper.py

+    return cast(str, mapping.STORAGE_ELEMENT_TYPE_TO_FIELD[elem_type])
+
+
+def get_attr_from_sequence_elem_type(tensor: SequenceProto, elem_type: int) -> Any:


I don't think we need these utility/helper methods. Please see my comment in numpy_helper.py on rewriting the code there. I think splitting the code into these two files actually makes the logic of either one harder to understand. (I understand this is a problem with the pre-existing code introduced by someone else previously, and not this PR.)

For readability, yes, I think keeping the if-else statements makes more sense. For maintainability, though, having a common function for the same mapping seems not too bad (we don't need to change the mapping in several places if it changes). However, if the attribute mapping for protos is quite stable, I am fine with having if-else statements. I have removed the attribute maps for SequenceProto and OptionalProto and used if-else statements instead.

Not sure whether it is a concern, but this also means that the attribute mappings for SequenceProto and OptionalProto, which were previously publicly usable, will be removed in the future.

gramalingam · 2022-09-22T05:40:46Z

onnx/numpy_helper.py

@@ -318,10 +313,9 @@ def to_optional(optional: OptionalProto) -> Optional[Any]:
    elem_type = optional.elem_type
    if elem_type == OptionalProto.UNDEFINED:
        return opt


Change to return None

gramalingam · 2022-09-22T05:41:56Z

onnx/numpy_helper.py

    # TODO: create a map and replace conditional branches
-    if elem_type == OptionalProto.TENSOR or elem_type == OptionalProto.SPARSE_TENSOR:
+    if elem_type in (OptionalProto.TENSOR, OptionalProto.SPARSE_TENSOR):
        opt = to_array(value)
    elif elem_type == OptionalProto.SEQUENCE:
        opt = to_list(value)


Change to return to_list(optional.tensor_value)

Likewise for all the cases. We don't need the helper utility method, whose purpose is only to do this switch/case anyway.

Done. Thanks!

gramalingam · 2022-09-22T05:55:04Z

onnx/helper.py

+    return mapping.TENSOR_TYPE_MAP[int(tensor_dtype)].name
+
+
+def tensor_dtype_to_storage_numpy_type(tensor_dtype: int) -> np.dtype:


I wonder if we can omit this function. Users can call the two functions themselves. It is not easy to document/describe, and too many such functions in the API can confuse a reader.

Makes sense to me. Just omitted.
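The point that users can call the two functions themselves can be sketched as follows. This is a stand-in model, not the onnx implementation: the `TensorDtypeMap` tuple mirrors the namedtuple mentioned in the commit list, but the table entries, the integer constants, and the use of plain strings instead of NumPy dtypes are assumptions made so the snippet runs without onnx or NumPy.

```python
from typing import Dict, NamedTuple


class TensorDtypeMap(NamedTuple):
    np_dtype: str       # NumPy dtype name, kept as a string in this sketch
    storage_dtype: int  # tensor dtype whose field stores the values
    name: str           # printable name


# Illustrative constants and entries only; treat the values as assumptions.
FLOAT, UINT16, INT32, BOOL, FLOAT16 = 1, 4, 6, 9, 10

TENSOR_TYPE_MAP: Dict[int, TensorDtypeMap] = {
    FLOAT: TensorDtypeMap("float32", FLOAT, "TensorProto.FLOAT"),
    UINT16: TensorDtypeMap("uint16", INT32, "TensorProto.UINT16"),
    INT32: TensorDtypeMap("int32", INT32, "TensorProto.INT32"),
    BOOL: TensorDtypeMap("bool", INT32, "TensorProto.BOOL"),
    FLOAT16: TensorDtypeMap("float16", UINT16, "TensorProto.FLOAT16"),
}


def tensor_dtype_to_np_dtype(tensor_dtype: int) -> str:
    return TENSOR_TYPE_MAP[tensor_dtype].np_dtype


def tensor_dtype_to_storage_tensor_dtype(tensor_dtype: int) -> int:
    return TENSOR_TYPE_MAP[tensor_dtype].storage_dtype


# The omitted helper is just the composition of the two functions above.
storage_np_for_float16 = tensor_dtype_to_np_dtype(
    tensor_dtype_to_storage_tensor_dtype(FLOAT16)
)
storage_np_for_bool = tensor_dtype_to_np_dtype(
    tensor_dtype_to_storage_tensor_dtype(BOOL)
)
print(storage_np_for_float16, storage_np_for_bool)
```

Because the composition is a one-liner, leaving it out of the public API costs callers little while keeping the helper surface smaller.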

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

onnx/test/helper_test.py

+            TensorProto.COMPLEX128,
+        }
+    ],
+    ids=lambda tensor_dtype: helper.tensor_dtype_to_string(tensor_dtype),


onnx/test/helper_test.py

+        for t in helper.get_all_tensor_types()
+        if t not in {TensorProto.BFLOAT16, TensorProto.STRING}
+    ],
+    ids=lambda tensor_dtype: helper.tensor_dtype_to_string(tensor_dtype),


Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

gramalingam · 2022-09-26T19:25:29Z

docs/PythonAPIOverview.md

@@ -148,6 +148,16 @@ Runnable IPython notebooks:
 - [make_model.ipynb](/onnx/examples/make_model.ipynb)
 - [Protobufs.ipynb](/onnx/examples/Protobufs.ipynb)

+## Conversion utilities for mapping attributes in ONNX IR


Nit: suggest moving this to the end of the existing sub-section on Manipulating TensorProto and Numpy Array instead of adding this new sub-section.

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

* Improve mapping and add more tests for make_tensor
* fix mypy typecheck failures
* fix issues
* first fix by reviews
* use double quotes instead of single one
* fix mypy typecheck
* move to function from mapping to helper
* remove duplicate import
* fix flake8
* fix getattr error due to np.bool
* use helper instead of mapping in all places
* update test coverage
* fix typecheck
* update test coverage
* update operators.md
* fix warnings
* introduce get_attr_from_sequence_elem_type and optional
* remove wrong import
* use namedtuple for better readability
* fix warnings and conflicts
* import KeysView from typing instead of collections
* fix mypy failures
* isort and black
* add docs
* use dobule quote and f-string
* remove space
* dtype instead of type
* more dtype
* fix isort and remove tensor_dtype_to_storage_numpy_type
* remove get_attr_from_sequence_elem_type and optional
* fix mypy
* move section in PythonAPI
* fix lint
* fix lint part 2
* cover call examples in PythonAPI

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

Improve mapping and add more tests for make_tensor

28d153f

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

jcwchen added test utility labels Jun 13, 2022

jcwchen requested a review from a team as a code owner June 13, 2022 20:53

fix mypy typecheck failures

fe8156b

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

gramalingam reviewed Jun 13, 2022

onnx/mapping.py (outdated, resolved)

gramalingam reviewed Jun 13, 2022

onnx/mapping.py (outdated, resolved)

gramalingam reviewed Jun 13, 2022

onnx/mapping.py (outdated, resolved)

gramalingam reviewed Jun 13, 2022

onnx/mapping.py (outdated, resolved)

garymm suggested changes Jun 13, 2022

jcwchen added 3 commits June 15, 2022 10:15

fix issues

6109bc6

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

first fix by reviews

ae61d33

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

use double quotes instead of single one

17cd8b1

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

garymm suggested changes Jun 17, 2022

onnx/test/helper_test.py (outdated, resolved)

onnx/test/helper_test.py (outdated, resolved)

onnx/helper.py (outdated, resolved)

onnx/mapping.py (outdated, resolved)

jcwchen marked this pull request as draft June 17, 2022 16:53

jcwchen changed the title ~~Improve mapping and add more tests for make_tensor~~ [WIP] Improve mapping and add more tests for make_tensor Jun 17, 2022

jcwchen added 7 commits June 29, 2022 22:31

Merge branch 'jcw/improve-raw' of https://github.com/jcwchen/onnx int…

5a8b9bb

…o jcw/improve-raw Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com> # Conflicts: # onnx/mapping.py # onnx/test/helper_test.py

Merge branch 'main' into jcw/improve-raw

6c7ed01

fix mypy typecheck

ba5a51b

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

move to function from mapping to helper

31b45c4

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

Merge branch 'jcw/improve-raw' of https://github.com/jcwchen/onnx int…

337c686

…o jcw/improve-raw Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com> # Conflicts: # onnx/numpy_helper.py

remove duplicate import

7a7f1c2

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

fix flake8

8df7a7a

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

fix getattr error due to np.bool

8e7c5bd

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

jcwchen added the "run release CIs" label Jul 15, 2022

jcwchen changed the title ~~[WIP] Improve mapping and add more tests for make_tensor~~ Improve mapping and add more tests for make_tensor Jul 15, 2022

jcwchen marked this pull request as ready for review July 15, 2022 00:47

use helper instead of mapping in all places

074c1b1

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

xadupre reviewed Sep 15, 2022

jcwchen added 2 commits September 21, 2022 16:07

dtype instead of type

4efb630

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

Merge branch 'main' into jcw/improve-raw

122e477

jcwchen mentioned this pull request Sep 21, 2022

Format Python code in documents with black #4530

Merged

more dtype

61f09f5

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

jcwchen requested review from gramalingam and xadupre September 21, 2022 23:38

gramalingam reviewed Sep 22, 2022

jcwchen added 2 commits September 23, 2022 17:47

Merge branch 'main' into jcw/improve-raw

aba5ceb

fix isort and remove tensor_dtype_to_storage_numpy_type

05f7d73

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

github-advanced-security bot found potential problems Sep 24, 2022

jcwchen added 2 commits September 26, 2022 11:13

remove get_attr_from_sequence_elem_type and optional

57e11d0

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

fix mypy

76b21d2

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

gramalingam reviewed Sep 26, 2022

move section in PythonAPI

0e098f0

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

gramalingam approved these changes Sep 26, 2022

jcwchen added 4 commits September 26, 2022 14:37

Merge branch 'main' into jcw/improve-raw

00a1561

fix lint

8f742d4

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

fix lint part 2

d36c09f

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

cover call examples in PythonAPI

640f6de

Signed-off-by: Chun-Wei Chen <jacky82226@gmail.com>

jcwchen linked an issue Sep 27, 2022 that may be closed by this pull request

provide a nicer API for accessing the proper member of TensorProto #4261

Closed

jcwchen merged commit de24599 into onnx:main Sep 28, 2022

jcwchen deleted the jcw/improve-raw branch September 28, 2022 04:22

jcwchen mentioned this pull request Sep 28, 2022

Fix several issues regarding recent mapping update #4551

Merged

xadupre mentioned this pull request Sep 28, 2022

New added function generates DeprecatedWarning #4552

Closed
