
Add ONNX and ORT support + Docs for TensorRT #1857

Merged
merged 30 commits into master on Nov 9, 2022
Conversation

@msaroufim (Member) commented Sep 12, 2022

EDIT 11/4: To make this easier to merge, I've cut the scope to ONNX and ORT only. I'll revisit the rest once pytorch/tensorrt gets official PyPI binaries. This is now ready for review.

EDIT 11/6: I'm going to write a brief doc page on using different optimization runtimes

EDIT 11/8: Addressed most feedback; the rest will be addressed in future work.

This PR

Open question

Future work

@codecov bot commented Sep 12, 2022

Codecov Report

Merging #1857 (0733614) into master (a5fc61f) will increase coverage by 2.99%.
The diff coverage is 25.67%.

❗ Current head 0733614 differs from pull request most recent head 6c9c7c5. Consider uploading reports for the commit 6c9c7c5 to get more accurate results

@@            Coverage Diff             @@
##           master    #1857      +/-   ##
==========================================
+ Coverage   41.67%   44.66%   +2.99%     
==========================================
  Files          55       63       +8     
  Lines        2282     2624     +342     
  Branches        1       56      +55     
==========================================
+ Hits          951     1172     +221     
- Misses       1331     1452     +121     
Impacted Files Coverage Δ
setup.py 0.00% <0.00%> (ø)
ts/torch_handler/base_handler.py 0.00% <0.00%> (ø)
ts/utils/util.py 37.50% <ø> (ø)
...l-archiver/model_archiver/model_packaging_utils.py 54.66% <54.28%> (ø)
...el-archiver/model_archiver/model_archiver_error.py 100.00% <0.00%> (ø)
workflow-archiver/workflow_archiver/version.py 100.00% <0.00%> (ø)
model-archiver/model_archiver/model_packaging.py 90.00% <0.00%> (ø)
...w-archiver/workflow_archiver/workflow_packaging.py 89.65% <0.00%> (ø)
model-archiver/model_archiver/version.py 100.00% <0.00%> (ø)
...hiver/workflow_archiver/workflow_archiver_error.py 100.00% <0.00%> (ø)
... and 2 more


@msaroufim msaroufim changed the title [WIP] Add ONNX support [WIP] Add ONNX and TensorRT support Sep 13, 2022
@msaroufim msaroufim changed the title [WIP] Add ONNX and TensorRT support Add ONNX and TensorRT support Sep 13, 2022
@HamidShojanazeri (Collaborator) commented Sep 20, 2022

Besides the inline comments, we also need to think about preprocessing for ONNX (to avoid custom handlers), since the input needs to be converted to numpy. We might have separate util files for ONNX, TRT, or future backends/dynamo, or alternatively one backend utils module that keeps all the related helper functions together. This would help keep the base_handler cleaner and more maintainable. cc @lxning on this as well.
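A minimal sketch of what such a backend preprocessing helper could look like (the function and input names here are hypothetical illustrations, not TorchServe's actual API):

```python
import numpy as np


def to_numpy(value):
    """Convert handler input to a numpy array for ONNX Runtime.

    Torch tensors are detached and moved to CPU first (detected via duck
    typing so this sketch has no hard torch dependency); everything else
    goes through np.asarray.
    """
    if hasattr(value, "detach"):  # torch.Tensor
        return value.detach().cpu().numpy()
    return np.asarray(value)


def onnx_preprocess(batch):
    # ORT sessions consume {input_name: numpy_array}; "input" is a
    # placeholder and depends on the names used when the model was exported.
    return {"input": to_numpy(batch)}
```

Helpers like these could live in the per-backend utils module suggested above, keeping the base_handler itself free of backend-specific conversion code.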

model-archiver/model_archiver/arg_parser.py
ts/torch_handler/base_handler.py (outdated)
ts/torch_handler/base_handler.py (outdated)
model-archiver/model_archiver/arg_parser.py
@@ -13,4 +13,5 @@ pygit2==1.6.1
pyspelling
pre-commit
twine
onnxruntime
Collaborator:

It should be dynamically loaded based on the model's runtime type.

msaroufim (Member Author):

I think we should do that for our common dependencies, but for our developer dependencies we should install everything so we can run tests in CI.
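For reference, the dynamic-loading suggestion above could be sketched like this (the runtime names and error message are illustrative assumptions, not TorchServe's actual mechanism):

```python
import importlib

# Hypothetical mapping from a model's declared runtime type to the package
# that backs it; the real set of runtimes is defined by TorchServe itself.
RUNTIME_PACKAGES = {
    "onnx": "onnxruntime",
    "tensorrt": "tensorrt",
}


def load_runtime(runtime_type):
    """Import the backend for runtime_type lazily, so common installs don't
    pay for optional backends and missing packages fail with an actionable
    message instead of a bare ImportError at startup."""
    package = RUNTIME_PACKAGES.get(runtime_type)
    if package is None:
        raise ValueError("Unknown runtime type: " + repr(runtime_type))
    try:
        return importlib.import_module(package)
    except ImportError as err:
        raise ImportError(
            "Runtime %r requires 'pip install %s'" % (runtime_type, package)
        ) from err
```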

ts/torch_handler/base_handler.py (outdated)
@msaroufim (Member Author) commented Oct 11, 2022

Just a quick update: the main challenge for this PR will be packaging setup.py with all the correct package versions. I'm hitting all sorts of issues even with basic models while setting up an E2E test. Some open questions:

Is linking to specific versions enough? Should I use git submodules? When do I update them? How will docker support work?

EDIT:

The installation experience is now pip install torchserve[tensorrt] (from PyPI) or pip install .[tensorrt] (from source), which installs the necessary packages. The full set of extras is onnx, tensorrt, and ipex.
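As a sketch, these extras would be declared via setuptools' extras_require; the dependency lists below are illustrative assumptions, not the exact pins the PR ships:

```python
# Passed as setup(..., extras_require=EXTRAS) in setup.py. Package names
# in the value lists are illustrative, not the PR's actual dependencies.
EXTRAS = {
    "onnx": ["onnx", "onnxruntime"],
    "tensorrt": ["nvidia-tensorrt"],
    "ipex": ["intel-extension-for-pytorch"],
}
```

With this layout, pip install .[onnx] pulls in only the ONNX-related packages, keeping the default install lean.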

@msaroufim msaroufim changed the title Add ONNX and TensorRT support Add ONNX and ORT support Nov 4, 2022
@msaroufim (Member Author):
And here's an example of inference running, taken from my logs. The repro is in test/pytest/test_onnx.py:

(base) ubuntu@ip-172-31-17-70:~$ curl -X POST http://127.0.0.1:8080/predictions/onnx --data-binary '1'
[
  -0.28003010153770447
](base) ubuntu@ip-172-31-17-70:~$ 

Comment on lines 17 to 19
onnx
onnx-runtime
numpy
Collaborator:

This would potentially make the Docker image too big (> 10 GB) if we want to support multiple platforms. We need to figure out a way to package for multi-platform support.

msaroufim (Member Author):

That's true, though I think that's fine for dev dependencies. I could remove these if we disable the ONNX tests by default, but I don't believe we should do that either.

model-archiver/model_archiver/arg_parser.py
ts/torch_handler/base_handler.py
setup.py
@HamidShojanazeri (Collaborator) left a comment:

Thanks @msaroufim, I added some comments. More generally, I wonder if we have plans to move all the backend initialization logic and utilities into backend-specific files/utils. This would make the base_handler easier to maintain and more readable. Maybe it's a good bootcamp task?

ts/torch_handler/base_handler.py

# Load class mapping for classifiers
mapping_file_path = os.path.join(model_dir, "index_to_name.json")
self.mapping = load_label_mapping(mapping_file_path)

self.initialized = True

def _load_onnx_model(self, model_onnx_path):
Collaborator:

Are we going to replace lines 120-137 above with this function?

msaroufim (Member Author):

Yeah this is a typo, will fix
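For context, loading a model with ONNX Runtime can be sketched roughly as below. This is a simplified stand-in for the _load_onnx_model method under discussion, not the PR's actual implementation:

```python
def select_providers(use_gpu):
    """Pick ORT execution providers; CUDA is tried first when a GPU is
    requested, with CPU as the fallback."""
    if use_gpu:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def load_onnx_model(model_onnx_path, use_gpu=False):
    # Imported lazily so environments without onnxruntime installed can
    # still import this module (see the dynamic-loading discussion above).
    import onnxruntime as ort

    return ort.InferenceSession(
        model_onnx_path, providers=select_providers(use_gpu)
    )
```

The returned session's run() method then consumes the {input_name: numpy_array} dict produced by preprocessing.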

ts/torch_handler/base_handler.py (outdated)
ts/torch_handler/base_handler.py (outdated)
@msaroufim (Member Author) commented Nov 8, 2022

Discussed offline with Li and Hamid

To merge this PR

  • Disable test by default in CI
  • Remove requirements from developer dependencies
  • Add documentation to describe how people should pre and post process with ONNX
  • Update test

Future

  • automation of pre and post processing
  • Refactor docker image to pull in necessary runtime dependencies optionally
  • Refactor base handler

@msaroufim (Member Author) commented Nov 9, 2022

@HamidShojanazeri @lxning I made all the changes we discussed

Logs

ubuntu@ip-172-31-17-70:~/serve/test/pytest$ curl -X POST http://127.0.0.1:8080/predictions/onnx --data-binary '1'
Prediction Succeeded

Pytest

(ort) ubuntu@ip-172-31-17-70:~/serve/test/pytest$ pytest test_onnx.py 
==================================== test session starts =====================================
platform linux -- Python 3.8.13, pytest-7.2.0, pluggy-1.0.0
rootdir: /home/ubuntu/serve
plugins: mock-3.10.0, cov-4.0.0
collected 5 items                                                                            

test_onnx.py .....                                                                     [100%]

===================================== 5 passed in 1.17s ======================================

The link check is failing because I'm linking to a test file that doesn't exist on master yet (https://github.com/pytorch/serve/actions/runs/3425428486/jobs/5706209933#step:5:867); this won't be a problem after we merge.

@msaroufim msaroufim changed the title Add ONNX and ORT support Add ONNX and ORT support + Docs for TensorRT Nov 9, 2022
@HamidShojanazeri (Collaborator) left a comment:

Thanks @msaroufim, LGTM. For the next set of PRs, I'd also suggest putting linting/formatting changes in a separate PR; that makes it easier to focus on the functional changes.

docs/performance_guide.md (outdated)
docs/performance_guide.md (outdated)
@msaroufim msaroufim requested a review from lxning November 9, 2022 16:23
@msaroufim msaroufim merged commit 53a5613 into master Nov 9, 2022
@msaroufim msaroufim deleted the archivdata branch November 9, 2022 23:34
@msaroufim msaroufim linked an issue Nov 14, 2022 that may be closed by this pull request
@msaroufim msaroufim mentioned this pull request Nov 14, 2022
Labels: optimization, p0 high priority
Linked issue (may be closed by this PR): Make TorchServe multi framework
3 participants