Refactor InferenceSession::Impl::Load code to remove duplication. #248
Is this function only for testing purposes?
@snnn it can be used for some graph transformers, I think.
There's a link in the description to the PR for constant folding that will use this.
yuanbyu
left a comment
Constant folding as it is currently implemented is not quite right. I don't understand why we need to make changes in the session API to accommodate something that should be completely internal.
 * @param model Model to use. InferenceSession::Load must not have been called previously.
 * @return OK if success. Error information if failure.
 */
common::Status Initialize(std::shared_ptr<Model>& model);
How do we ensure that the model is created with the exact same version of protobuf?
Scott, FYI: We intentionally removed this method from the interface after RS5 due to protobuf version mismatch issues. We'd added it initially to facilitate faster dev cycles for the Windows folks.
So what alternate approach is best to avoid the current ugly necessity of serializing a dynamically created Model and reloading it in order to execute it? This is for internal usage only.
e.g. #168 constant_folding.cc dynamically creates a model but is forced to serialize it to use the current API.
Do we need to refactor to split out internal aspects from InferenceSession so that those can be called directly in this sort of scenario instead of using the public InferenceSession API?
Removed from this PR. Constant folding is going to be refactored and shouldn't need an API change in InferenceSession.
Support use cases where the Model instance is already loaded.
e.g. #168 constant_folding.cc dynamically creates a model but is forced to serialize it to use the current API.
Add a generic loading method to InferenceSession::Impl and convert the existing methods to use it, so the load logic isn't cut-and-pasted in 4 places.
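A minimal sketch of the refactoring pattern described above, not the code in this PR: each public overload supplies only a small loader callback, and one private method holds the shared checks. `SessionSketch`, `LoadImpl`, `Loader`, and the simplified `Model` and `Status` types are hypothetical stand-ins for illustration and do not reflect the actual InferenceSession::Impl members.

```cpp
#include <functional>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are not the real Model or common::Status types.
struct Model {
  std::string graph_text;
};

struct Status {
  bool ok;
  std::string message;
};

class SessionSketch {
 public:
  // Each public overload only describes how to obtain a Model.
  Status Load(const std::string& text) {
    return LoadImpl([&text](std::shared_ptr<Model>& out) {
      out = std::make_shared<Model>(Model{text});
      return Status{true, ""};
    });
  }

  Status Load(std::istream& stream) {
    return LoadImpl([&stream](std::shared_ptr<Model>& out) {
      std::ostringstream buffer;
      buffer << stream.rdbuf();
      if (stream.bad()) return Status{false, "stream read failed"};
      out = std::make_shared<Model>(Model{buffer.str()});
      return Status{true, ""};
    });
  }

  const Model* model() const { return model_.get(); }

 private:
  using Loader = std::function<Status(std::shared_ptr<Model>&)>;

  // The shared logic lives in one place: reject a second load, run the
  // loader, validate its result, then store the model.
  Status LoadImpl(const Loader& loader) {
    if (model_) return Status{false, "model already loaded"};
    std::shared_ptr<Model> loaded;
    Status status = loader(loaded);
    if (!status.ok) return status;
    if (!loaded) return Status{false, "loader produced no model"};
    model_ = std::move(loaded);
    return Status{true, ""};
  }

  std::shared_ptr<Model> model_;
};
```

With this shape, supporting a new source (for example, an already constructed model) would mean adding one more thin overload rather than another copy of the checks.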