
Drop "one of" default attribute check in LabelEncoder #5673

Merged (5 commits) on Oct 16, 2023. Changes from 3 commits.
37 changes: 15 additions & 22 deletions onnx/defs/traditionalml/defs.cc
@@ -382,28 +382,6 @@ ONNX_ML_OPERATOR_SET_SCHEMA(
fail_shape_inference(
"At least one of values_tensor, values_strings, values_int64s, values_floats must be set.");
}

int default_length, default_type;
std::tie(default_type, default_length) = getAttributeElementTypeAndLength(
ctx, {"default_tensor", "default_string", "default_int64", "default_float"});
if (default_type != TensorProto::UNDEFINED) {
if (value_type != default_type) {
fail_shape_inference(
"The value type ",
value_type,
" and the default type ",
default_type,
" are different, which is not permitted for LabelEncoders.");
}

// Ensure default_tensor is a singleton if set
const AttributeProto* default_tensor = ctx.getAttribute("default_tensor");
if (default_tensor != nullptr &&
(default_tensor->t().dims_size() != 1 || default_tensor->t().dims(0) != 1)) {
fail_shape_inference("default_tensor must be a singleton if set.");
}
}

if (value_length != key_length) {
fail_shape_inference(
"The number of keys ",
@@ -413,6 +391,21 @@ ONNX_ML_OPERATOR_SET_SCHEMA(
" must be the same in the LabelEncoder.");
}

auto default_attr = ctx.getAttribute("default_tensor");
Contributor:
The new test only ensures the consistency of the operator when default_tensor is set. According to the PR description, it was failing in every other case. No unit test fails, so I assume that case was not tested. Is it possible to fix the behaviour and add a unit test to check that it works as expected?

Contributor Author:
It's not quite that. When implementing the operator in ORT, I found that type inference fails even if I have specified only one of the default_* attributes. This seems to be due to the fact that the default_* attributes tend to have values filled in, since they are optional.

Contributor Author:
Some failure cases are checked here. Is there something you think is missing?

Contributor:
All tests involve default_tensor; there is no test checking that a default value is not defined, or is defined with an unexpected attribute pairing such as (value_int64s, default_floats). A test like that would have failed with your change.

Contributor Author:
> there is no test to check that a default value is not defined

I don't think that is something we want, since the default value has a fallback (e.g. -1 for int64) specified in the spec. This is why, I believe, it was not tested in the previous implementation.

> defined with an unexpected attribute such as (value_int64s, default_floats)

I think this will run into the same issues with ORT unfortunately, but I can check.
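The fallback behaviour mentioned above can be illustrated with a small sketch (`label_encode` and its signature are hypothetical helpers for illustration, not the ONNX or ORT implementation): keys missing from the mapping take the spec-level default, which for int64 outputs is -1 when no `default_*` attribute is set.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Illustrative sketch of LabelEncoder lookup semantics: inputs not found in
// the key/value mapping fall back to a default value. Per the ONNX-ML spec,
// the fallback for int64 outputs is -1 when no default attribute is given.
std::vector<int64_t> label_encode(
    const std::vector<std::string>& inputs,
    const std::unordered_map<std::string, int64_t>& mapping,
    int64_t default_value = -1) {  // spec-level fallback for int64
  std::vector<int64_t> out;
  out.reserve(inputs.size());
  for (const auto& key : inputs) {
    auto it = mapping.find(key);
    out.push_back(it != mapping.end() ? it->second : default_value);
  }
  return out;
}
```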

Contributor:
Just to clarify Aditya's point (in answer to Xavier's question): since the op-spec specifies a default value for the various attributes, we need to be careful when flagging an error here. We cannot do so unless we can distinguish whether an attribute value was explicitly specified in the node or whether it comes from the op-spec's default (we should not flag an error when the value is the op-spec default). At the root of this is the fact that shape inference works with the InferenceContext abstract interface, which exposes the getAttribute method; if we want to do such error checking, we need to extend the interface to distinguish between those two cases.

So, undoing this check makes sense to me. My two cents.
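The ambiguity described here can be modelled with a toy sketch (`FakeInferenceContext` is illustrative, not the real InferenceContext): once default values are present in a node's attribute map — whether written by an exporter or merged in from the op-spec — a plain getAttribute-style lookup returns a value either way, so the checking code cannot tell the two provenances apart.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical model of the problem: a lookup returns the attribute value
// whether it was explicitly set in the model or filled in as a spec default,
// so shape inference cannot distinguish the two cases from the value alone.
struct FakeInferenceContext {
  // Node attributes, with any spec defaults already merged in.
  std::unordered_map<std::string, long long> attrs;

  // Mirrors a getAttribute-style lookup: nullptr only when truly absent.
  const long long* getAttribute(const std::string& name) const {
    auto it = attrs.find(name);
    return it == attrs.end() ? nullptr : &it->second;
  }
};
```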

Contributor Author:
Explained much better than me! Thanks

if (nullptr != default_attr) {
auto default_tensor = default_attr->t();
if (default_tensor.data_type() != value_type) {
fail_shape_inference(
"The default tensor type ",
default_tensor.data_type(),
" and the value type ",
value_type,
" must be the same in the LabelEncoder.");
}
if (1 != default_tensor.dims_size() || 1 != default_tensor.dims(0)) {
fail_shape_inference("The default tensor must be a singleton 1D tensor.");
}
}
// Propagate shape from input type and assign output type based on value type
ctx.getOutputType(0)->mutable_tensor_type()->set_elem_type(value_type);
propagateShapeFromInputToOutput(ctx, 0, 0);
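The retained validation can be restated as a standalone sketch (`FakeTensor` and `check_default_tensor` are illustrative names, not the ONNX API): when a default tensor is present, its element type must match the value type, and its shape must be the singleton [1].

```cpp
#include <stdexcept>
#include <vector>

// Stand-ins for the relevant parts of TensorProto.
struct FakeTensor {
  int data_type;                // stand-in for TensorProto::DataType
  std::vector<long long> dims;  // tensor shape
};

// Sketch of the check kept in the diff above: skip when the attribute is
// absent (the spec-level fallback applies), otherwise require a matching
// element type and a singleton 1-D shape.
void check_default_tensor(const FakeTensor* default_tensor, int value_type) {
  if (default_tensor == nullptr)
    return;  // attribute absent: nothing to validate
  if (default_tensor->data_type != value_type)
    throw std::runtime_error("default tensor type must match the value type");
  if (default_tensor->dims.size() != 1 || default_tensor->dims[0] != 1)
    throw std::runtime_error("default tensor must be a singleton 1D tensor");
}
```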