Add ONNX and ORT support + Docs for TensorRT #1857
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1857 +/- ##
==========================================
+ Coverage 41.67% 44.66% +2.99%
==========================================
Files 55 63 +8
Lines 2282 2624 +342
Branches 1 56 +55
==========================================
+ Hits 951 1172 +221
- Misses 1331 1452 +121
Besides the inline comments, we would also need to think about preprocessing for ONNX (to avoid custom handlers), since the input needs to be converted to NumPy. We might be able to have separate util files for ONNX, TRT, or future backends/dynamo, or alternatively have one backend utils file to keep all the related helper functions in. This may help keep the base_handler cleaner and more maintainable. cc @lxning on this as well.
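To illustrate the kind of helper such a shared backend utils file could hold (the function name and placement here are hypothetical, not the PR's code), a minimal sketch of converting handler inputs to NumPy for ONNX Runtime:

```python
import numpy as np

def to_numpy(data):
    """Convert common handler input types to a NumPy array for ORT.

    Torch tensors are detected via duck typing so this helper does not
    require torch itself to be installed.
    """
    if isinstance(data, np.ndarray):
        return data
    if hasattr(data, "detach") and hasattr(data, "cpu"):
        # Looks like a torch.Tensor: move to CPU and drop the autograd graph
        return data.detach().cpu().numpy()
    return np.asarray(data)
```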
requirements/developer.txt (Outdated)

@@ -13,4 +13,5 @@ pygit2==1.6.1
pyspelling
pre-commit
twine
+onnxruntime
It should be dynamically loaded based on the model runtime type.
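A hedged sketch of what dynamic loading could look like (the mapping, module names, and function name are all illustrative assumptions): the backend module is imported only when the model's runtime type actually asks for it.

```python
import importlib

# Hypothetical mapping from runtime type to the module that implements it.
BACKEND_MODULES = {
    "onnx": "onnxruntime",
    "trt": "torch_tensorrt",
}

def load_backend(runtime_type):
    """Import a backend module lazily, based on the model's runtime type."""
    module_name = BACKEND_MODULES.get(runtime_type)
    if module_name is None:
        raise ValueError(f"Unsupported runtime type: {runtime_type}")
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The dependency is only required when a model actually uses it.
        raise RuntimeError(
            f"{module_name} is required for runtime '{runtime_type}' "
            "but is not installed"
        ) from exc
```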
I think we should do that for our common dependencies, but for our developer dependencies we should install everything so we can run tests in CI.
Just a quick update: the main challenge for this PR will be correctly packaging setup.py with all the correct package versions, and I'm seeing all sorts of basic issues even with basic models while setting up an E2E test. Some relevant issues:
Is linking to specific versions enough? Should I use git submodules? When do I update them? How will Docker support work? EDIT: The installation experience is now
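One common approach to this kind of packaging question (a sketch only, not necessarily what this PR does, and the package names are illustrative) is optional extras in setup.py, so backend dependencies install only on demand:

```python
# Hypothetical extras_require mapping; with this, something like
# `pip install torchserve[onnx]` would pull in the ONNX deps,
# while a plain install stays slim.
EXTRAS_REQUIRE = {
    "onnx": ["onnx", "onnxruntime"],
    "tensorrt": ["torch-tensorrt"],
}

# In setup.py this dict would be passed as:
# setup(..., extras_require=EXTRAS_REQUIRE)
```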
And here's an example of an inference running from my logs. Repro is in
requirements/developer.txt (Outdated)

onnx
onnx-runtime
numpy
This would potentially make the Docker image too big (> 10 GB) if we want to support multiple platforms. We need to figure out a way to package for multi-platform support.
That's true, though I guess that seems fine for dev dependencies. I can remove these if we disable the ONNX tests by default, but I don't believe we should do that either.
Thanks @msaroufim, I added some comments, but more generally I wonder if we have plans to move all the backend initialization logic and utilities into backend-specific files/utils. This would make the base_handler easier to maintain and more readable. Maybe it's a good bootcamp task?
ts/torch_handler/base_handler.py (Outdated)

# Load class mapping for classifiers
mapping_file_path = os.path.join(model_dir, "index_to_name.json")
self.mapping = load_label_mapping(mapping_file_path)

self.initialized = True

def _load_onnx_model(self, model_onnx_path):
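For reference, a minimal sketch of what such a loader might do, assuming onnxruntime is installed (the standalone function name and the provider selection are assumptions, not the PR's actual implementation):

```python
def load_onnx_model(model_onnx_path, use_gpu=False):
    """Create an ORT inference session for a serialized .onnx file."""
    # Deferred import so handlers that never load ONNX models
    # do not need onnxruntime installed.
    import onnxruntime as ort

    providers = ["CPUExecutionProvider"]
    if use_gpu:
        # Prefer CUDA when a GPU is available; ORT falls back to CPU
        # if the CUDA provider cannot be initialized.
        providers.insert(0, "CUDAExecutionProvider")
    return ort.InferenceSession(model_onnx_path, providers=providers)
```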
Are we going to replace the above lines 120-137 with this function?
Yeah this is a typo, will fix
Discussed offline with Li and Hamid.

To merge this PR:

Future:
@HamidShojanazeri @lxning I made all the changes we discussed.

Logs

Pytest

Link check is failing because I'm linking to a test file that doesn't exist on master yet (https://github.com/pytorch/serve/actions/runs/3425428486/jobs/5706209933#step:5:867) - this won't be a problem after we merge.
Thanks @msaroufim, LGTM. I also suggest that for the next set of PRs, having linting/formatting as a separate PR makes it easier to focus on the changes.
EDIT 11/4: To make this easier to merge, I've cut the scope to only ONNX and ORT; I'll revisit the rest after pytorch/tensorrt gets official PyPI binaries, but this is now ready for review.
EDIT 11/6: I'm going to write a brief doc page on using different optimization runtimes
EDIT 11/8: Addressed most feedback, rest will be addressed in future work
This PR

- Adds support for passing a `--serialized-file` that's in `.onnx` format, which will be correctly loaded by the base handler using an `ort.InferenceSession()` (https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html)
- Adds docs for TensorRT models with a `.ts` extension, which can be loaded via `torch.jit.load()` (https://pytorch.org/TensorRT/getting_started/getting_started_with_python_api.html#getting-started-with-python-api)
- Adds a `model_config.json`, similar to `index_to_name.json`, which would hold model-specific information you can access via `model_config.get("property")`. For now I'm not using any special configs.

Open question

- Should we convert models inside `initialize()`? My gut is no: we don't ask users to train models in `initialize()`, they should prepare their models and once prepared use torchserve, but I could be convinced otherwise. EDIT: NO
- What goes in `model_config.json`? I'd like to wait and see what schema gets chosen for "Modularize ipex optimization in `base_handler.py` into `ts/utils/ipex_optimization.py` #1664" and then decide.

Future work

- `torch.jit.script()`
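As a sketch of the `model_config.json` idea above (the keys here are hypothetical, since the PR doesn't use any special configs yet), a handler could read the file with plain stdlib JSON and treat every property as optional:

```python
import json

# Hypothetical model_config.json contents; keys are illustrative only.
config_text = '{"precision": "fp16", "max_batch_size": 4}'
model_config = json.loads(config_text)

# .get() returns None (or a supplied default) for absent keys, so the
# handler never crashes on a config that omits a property.
precision = model_config.get("precision")              # "fp16"
warmup = model_config.get("warmup_iters", 0)           # falls back to 0
```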