Implement L1 graph transformer for free dimension override#1825
Conversation
|
Can you add a test case where you're actually overriding the dimension? |
|
/azp run |
|
Azure Pipelines successfully started running 21 pipeline(s). |
|
Added a testcase |
|
/azp run |
|
Azure Pipelines successfully started running 21 pipeline(s). |
|
/azp run |
|
Commenter does not have sufficient privileges for PR 1825 in repo microsoft/onnxruntime |
|
/azp run |
|
Azure Pipelines successfully started running 21 pipeline(s). |
| *new_dimension = dimension; | ||
|
|
||
| if (dimension.has_denotation()) { | ||
| // Convert to lowercase to perform case-insensitive comparison |
There was a problem hiding this comment.
I've typically seen a symbolic dimension specified in dim_param. How does that relate to the usage of denotation? Are they equivalent ways of doing the same thing, in which case we could/should support both?
There was a problem hiding this comment.
From this doc, it doesn't look like this is meant for specifying symbolic shapes. Are the WinML models using this instead of dim_param or both?
There was a problem hiding this comment.
In WinML, we exclusively use the denotation for overriding free dimensions. As far as I'm aware, the value stored in dim_param isn't really used.
There was a problem hiding this comment.
If dim_param is used, this optimization won't apply. We should probably check for both?
There was a problem hiding this comment.
Hmm, I'm not sure. Is that the intended use of dim_param? My understanding is that dim_param can be any arbitrary string, while denotations have defined values (like "DATA_BATCH") that hopefully all converters and tools can rely on. I would lean towards doing the conservative thing right now and using just the denotation. If we find a case where this is causing us to miss an optimization, it's something we can always add later.
There was a problem hiding this comment.
Based on that document, should something processing denotations stick to looking for the defined names like DATA_BATCH (possibly without the case insensitive matching too for simplicity)?
This also seems like a somewhat rare use case (at least from the models I have seen), so wondering if we really want to add it as a default transformer vs. making it available as part of ORT but having it manually added by people who want to use it. Also 'free dimension override' may be too general a name for something only dealing with denotations.
The way this transformer works could probably be generalized to support a similar thing with symbolic dimensions where the name to match comes from dim_param, but I'm not sure if there's a real use case for that and I don't think you want a single transformer to handle both. They could share an implementation but handling both in one transformer brings in questions like how to resolve a dimension with a dim_param and denotation.
There was a problem hiding this comment.
As an example of dim_param usage, the symbolic shape inferencing script used by nuphar to infer shapes just looks at dim_param for symbolic dimension names when propagating shape information through a graph. https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py
There was a problem hiding this comment.
Based on that document, should something processing denotations stick to looking for the defined names like DATA_BATCH (possibly without the case insensitive matching too for simplicity)?
This is actually how it's implemented in WinML - it looks specifically for the string "DATA_BATCH" and performs the override based on that. But because we've seen models with free dimensions other than the batch, this transformer generalizes the concept to use an arbitrary string. This allows callers to specify any of the defined denotations (e.g. onnx::DATA_FEATURE, onnx::DATA_CHANNEL).
This also seems like a somewhat rare use case (at least from the models I have seen), so wondering if we really want to add it as a default transformer vs. making it available as part of ORT but having it manually added by people who want to use it. Also 'free dimension override' may be too general a name for something only dealing with denotations.
Interestingly this is something we see quite commonly. @jeffbloo can give more details, but my understanding is that there are several cases where this is important. Most commonly free dimensions used to allow a model to have customizable batch sizes, but I believe we've also seen cases where models are using free width/height dimensions too (e.g. for convolutional models).
The denotation is how free dimensions get tagged, but ultimately what this transformer achieves is that it overrides free dimensions and replaces them with static dimensions. To clarify, this transformer doesn't do anything unless the caller actually specifies any overrides in the session options.
The way this transformer works could probably be generalized to support a similar thing with symbolic dimensions where the name to match comes from dim_param, but I'm not sure if there's a real use case for that and I don't think you want a single transformer to handle both. They could share an implementation but handling both in one transformer brings in questions like how to resolve a dimension with a dim_param and denotation.
Yes, I think this makes sense. Especially since dim_param and denotation are orthogonal, it seems unusual to look at both with only a single key. Today we have real models that have these denotations and use free dimension overrides in WinML, but I don't think we've seen any equivalent for dim_param.
There was a problem hiding this comment.
As an example of dim_param usage, the symbolic shape inferencing script used by nuphar to infer shapes just looks at dim_param for symbolic dimension names when propagating shape information through a graph.
I'm not sure if I understand... the goal of this PR is to replace dim_params with dim_values, so that dim_value is what gets propagated through the graph during shape inference. That is, the goal is to override the free dimensions (dim_param) with static shapes (dim_value) so that the static shapes are available to the operator kernels. This is what allows for the optimizations we can perform at initialization time.
|
|
||
| struct FreeDimensionOverride | ||
| { | ||
| std::string dimension_denotation; |
There was a problem hiding this comment.
Since this is going to be exposed in the C API as well, can you use std::basic_string<ORTCHAR_T> like other strings in SessionOptions? Examples are optimized_model_filepath and profile_file_prefix.
There was a problem hiding this comment.
This string represents an ONNX dimension denotation (e.g onnx::DATA_BATCH). Shouldn't that always be a regular char string? Either way is fine, just double-checking. :)
There was a problem hiding this comment.
I was thinking more in terms of dim_param. If these are going to be just denotations (based on the previous discussion), I guess this is ok.
|
Can you resolve the conflict? |
…_upstream # Conflicts: # onnxruntime/core/session/inference_session.h
This change implements a new L1 graph transformer that applies free dimension overrides to the inputs of a model. Free dimensions are tensor shapes which aren't statically known at model author time, and must be provided at runtime. Free dimensions are most often used for the batch size of a model's inputs, allowing for customizable batch sizes at runtime.
Free dimensions inhibit certain kinds of optimizations because assumptions can no longer be made about the shape of tensors during initialization. This is particularly important for EPs which perform optimizations up-front. For example, DirectML will select different more-optimal kernel implementations for Conv and Gemm depending on the exact shape of the input tensors. This kind of optimization isn't possible unless we know tensor shapes up-front.
This change adds the concept of "free dimension overrides" to the session options and implements a graph transformer which implements these overrides. Free dimension overrides allow a caller to specify the values of these free dimensions up-front at session creation time.
This is useful because there are many scenarios where you don't know the static shapes at model author time, but you do at runtime. Batch sizes are one example. Another example is e.g. a style transfer or artifact removal model which contains free dimensions for the width and height of the input image. At runtime during session creation, a caller will usually know the sizes of their inputs (e.g. the resolution of the user's webcam) and will be able to supply this information via the free dimension override, improving performance (particularly on GPU).
In addition to being generally useful for enabling various potential optimizations, this change is specifically necessary for achieving performance parity with WinML when running on GPU.