Extend context and name propagation in errors #5396
786f7e4 to 2584a08
!build
CI MESSAGE: [13808683]: BUILD STARTED
CI MESSAGE: [13808683]: BUILD FAILED
!build
CI MESSAGE: [13812812]: BUILD STARTED
CI MESSAGE: [13812812]: BUILD PASSED
dali/pipeline/graph/graph_descr.cc
Outdated
               "Operator `" + GetOpDisplayName(spec, true) + "` does not support in-place execution.");
  DALI_ENFORCE(spec.NumRegularInput() <= schema.MaxNumInput(),
-              "Operator '" + spec.SchemaName() +
-              "' supports a maximum of " + std::to_string(schema.MaxNumInput()) + " inputs, "
+              "Operator `" + GetOpDisplayName(spec, true) +
+              "` supports a maximum of " + std::to_string(schema.MaxNumInput()) + " inputs, "
               "but was passed " + std::to_string(spec.NumRegularInput()) + ".");
  DALI_ENFORCE(spec.NumRegularInput() >= schema.MinNumInput(),
-              "Operator '" + spec.SchemaName() +
-              "' supports a minimum of " + std::to_string(schema.MinNumInput()) + " inputs, "
+              "Operator `" + GetOpDisplayName(spec, true) +
+              "` supports a minimum of " + std::to_string(schema.MinNumInput()) + " inputs, "
               "but was passed " + std::to_string(spec.NumRegularInput()) + ".");
  DALI_ENFORCE(spec.NumOutput() == schema.CalculateOutputs(spec) + additional_outputs,
-              "Operator '" + spec.SchemaName() + "' supports "
+              "Operator `" + GetOpDisplayName(spec, true) + "` supports "
Upgrade to make_string? You'll be able to drop all those std::to_string
...
done
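For illustration, here is a minimal sketch of why a variadic string builder makes the `std::to_string` calls unnecessary. The `make_string_sketch` helper is a hypothetical stand-in written for this example, not DALI's actual `make_string`, and `MaxInputsMessage` is likewise an invented wrapper. The sketch assumes C++17 and simply streams every argument into a `std::ostringstream`.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a make_string-style helper (NOT DALI's implementation).
// Every argument is streamed with operator<<, so integers need no std::to_string.
template <typename... Args>
std::string make_string_sketch(const Args &...args) {
  std::ostringstream ss;
  (ss << ... << args);  // C++17 binary left fold over operator<<
  return ss.str();
}

// Hypothetical wrapper that rebuilds the "maximum of N inputs" message
// from the diff above, using plain ints instead of std::to_string(...).
inline std::string MaxInputsMessage(const std::string &op_name, int max_inputs, int passed) {
  return make_string_sketch("Operator `", op_name, "` supports a maximum of ", max_inputs,
                            " inputs, but was passed ", passed, ".");
}
```

With a helper of this shape, the call site lists the message pieces as plain arguments, which tends to read better than chained `+` with explicit conversions.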
@@ -96,7 +96,7 @@ void PropagateError(ErrorInfo error) {
   }
 }

-std::string GetErrorContextMessage(const OpSpec &spec) {
+std::string GetErrorContextMessage(const OpSpec &spec, const std::string &message_name) {
How about using a string_view in such functions?
-std::string GetErrorContextMessage(const OpSpec &spec, const std::string &message_name) {
+std::string GetErrorContextMessage(const OpSpec &spec, std::string_view message_name) {
done
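As a rough sketch of what the `std::string_view` suggestion buys: a `string_view` parameter accepts string literals and `std::string` objects alike, without constructing a temporary `std::string` at the call site. The `ContextLine` function below is a hypothetical helper invented for this example; it is not the function from the PR and its output format is an assumption.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper for illustration only (NOT the DALI signature).
// Both parameters are non-owning views, so literals and std::string
// arguments are passed without allocating a temporary std::string.
inline std::string ContextLine(std::string_view message_name, std::string_view op_name) {
  std::string out;
  out.reserve(message_name.size() + op_name.size() + 16);
  out.append(message_name);  // std::string::append accepts string_view-like types (C++17)
  out.append(" in operator `");
  out.append(op_name);
  out.append("`");
  return out;
}
```

The usual caveats apply: a `string_view` is not guaranteed to be null-terminated and must not outlive the buffer it refers to, so it suits parameters that are consumed within the call.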
dali/pipeline/pipeline.cc
Outdated
DALI_FAIL(make_string("Error for ", FormatArgument(spec, arg_name),
                      ". Named arguments inputs to operators must be CPU data nodes. "
                      "However, a GPU data node was provided."));
I don't like this message. It says that the input is GPU, but we're not really sure, are we? We know that it's not CPU. Perhaps there's some internal error and it's just empty? I think we should stick to the known facts and report those.
-DALI_FAIL(make_string("Error for ", FormatArgument(spec, arg_name),
-                      ". Named arguments inputs to operators must be CPU data nodes. "
-                      "However, a GPU data node was provided."));
+DALI_FAIL(make_string("Error for ", FormatArgument(spec, arg_name),
+                      ". Named arguments inputs to operators must be CPU data nodes. "));
Or otherwise make sure that we're telling the truth:
+assert(it->second.has_gpu);
 DALI_FAIL(make_string("Error for ", FormatArgument(spec, arg_name),
                       ". Named arguments inputs to operators must be CPU data nodes. "
                       "However, a GPU data node was provided."));
Done
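One way to picture the "stick to the known facts" point is the sketch below. It states only what the failed check established (the node is not a CPU node) and mentions the GPU only when that is known to be true. `NodeAvailability`, its `has_cpu` field, and `NamedArgumentError` are hypothetical names for this illustration; only `has_gpu` appears in the review above, and the message wording is an assumption.

```cpp
#include <string>

// Hypothetical record of where a data node is available (illustration only).
struct NodeAvailability {
  bool has_cpu = false;
  bool has_gpu = false;
};

// Builds the message for a named argument that failed the "must be CPU" check.
// The caller is assumed to invoke this only when node.has_cpu is false.
// The GPU clause is appended only if has_gpu is actually set, so the text
// never claims more than what was checked.
inline std::string NamedArgumentError(const std::string &arg_name, const NodeAvailability &node) {
  std::string msg = "Error for argument `" + arg_name +
                    "`. Named argument inputs to operators must be CPU data nodes.";
  if (node.has_gpu) {
    msg += " However, a GPU data node was provided.";
  }
  return msg;
}
```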
dali/pipeline/pipeline.cc
Outdated
make_string("Error for ", FormatOutput(spec, i), ". Output name \"", output_name,
            "\" conflicts with existing intermediate result name."));
-make_string("Error for ", FormatOutput(spec, i), ". Output name \"", output_name,
-            "\" conflicts with existing intermediate result name."));
+make_string("Error while specifying ", FormatOutput(spec, i), ". Output name \"", output_name,
+            "\" conflicts with an existing intermediate result name."));
done
dali/pipeline/pipeline.cc
Outdated
  eptr = std::current_exception();
}
if (eptr) {
  PropagateError({eptr,
                  "Critical error when building pipeline:\n" + GetErrorContextMessage(op_spec),
                  "\nCurrent pipeline object is no longer valid."});
What's wrong with the following?
-  eptr = std::current_exception();
-}
-if (eptr) {
-  PropagateError({eptr,
-                  "Critical error when building pipeline:\n" + GetErrorContextMessage(op_spec),
-                  "\nCurrent pipeline object is no longer valid."});
+  PropagateError({std::current_exception(),
+                  "Critical error when building pipeline:\n" + GetErrorContextMessage(op_spec),
+                  "\nCurrent pipeline object is no longer valid."});
+}
Right, much simpler, adjusted the other occurrences as well.
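The simplification can be sketched in isolation as follows. `RethrowWithContext` is a hypothetical stand-in for an error-propagation helper and does not reproduce DALI's `PropagateError` or its `ErrorInfo` type; `BuildAndReport` is an invented driver that throws a sample exception. The point is only that `std::current_exception()` can be called directly inside the `catch` handler, where it always refers to the exception being handled, so the intermediate `eptr` variable and the `if (eptr)` check are not needed.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an error-propagation helper (NOT DALI's PropagateError).
// Rethrows the captured exception; if it derives from std::exception, it is
// replaced by a std::runtime_error whose message is wrapped in the given context.
// Exceptions of other types propagate unchanged. eptr must not be null.
inline void RethrowWithContext(std::exception_ptr eptr, const std::string &prefix,
                               const std::string &suffix) {
  try {
    std::rethrow_exception(eptr);
  } catch (const std::exception &e) {
    throw std::runtime_error(prefix + e.what() + suffix);
  }
}

// Invented driver: the inner handler passes std::current_exception() straight
// to the helper, without storing it in a local exception_ptr first.
inline std::string BuildAndReport() {
  try {
    try {
      throw std::invalid_argument("bad argument");
    } catch (...) {
      RethrowWithContext(std::current_exception(),
                         "Critical error when building pipeline:\n",
                         "\nCurrent pipeline object is no longer valid.");
    }
  } catch (const std::runtime_error &e) {
    return e.what();
  }
  return "";
}
```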
!build
CI MESSAGE: [13836164]: BUILD STARTED
CI MESSAGE: [13836164]: BUILD FAILED
Signed-off-by: Krzysztof Lecki <klecki@nvidia.com>
71096a0 to e8d2079
!build
CI MESSAGE: [14134128]: BUILD STARTED
CI MESSAGE: [14134128]: BUILD PASSED
Category: Refactoring
Description:
Add error context related to a particular operator during graph construction and pipeline build:
Replace most of the SchemaName() occurrences that were used to indicate the operator name
with the fully formatted operator name in the correct API.
Make sure that the naming information is added as early as possible, so it can be accessed in a partially
constructed schema with the correct values present.
Note: This PR mostly adds the context like:
to places where it was not previously used, even though a single operator is being processed there.
Error messages are adjusted to show the user-facing input/output/argument name (in a uniform way) rather than the internal one.
Otherwise the checks and messages are preserved.
The types of error messages are not adjusted.
Additional information:
Affected modules and functionalities:
Pipeline and spec building
Key points relevant for the review:
Typos, broken formatting or conditions.
Tests:
Some of those error conditions are behind a double or triple layer of error checks and are not observable by the Python user.
Will use CI to determine how much adjusting the wording affects the tests.
The context added in AddOperator is verified to work, but not all conditions are easily verified from Python tests. I won't build loads of broken GTest pipelines just to match the exact error messages.
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A