A guide for loading models in TorchServe #2592

Merged
merged 6 commits into from Sep 15, 2023
1 change: 1 addition & 0 deletions docs/README.md
@@ -7,6 +7,7 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
* [Serving Quick Start](https://github.com/pytorch/serve/blob/master/README.md#serve-a-model) - Basic server usage tutorial
* [Model Archive Quick Start](https://github.com/pytorch/serve/tree/master/model-archiver#creating-a-model-archive) - Tutorial that shows you how to package a model archive file.
* [Installation](https://github.com/pytorch/serve/blob/master/README.md#install-torchserve) - Installation procedures
* [Model loading](model_loading.md) - How to load a model in TorchServe
* [Serving Models](server.md) - Explains how to use TorchServe
* [REST API](rest_api.md) - Specification on the API endpoint for TorchServe
* [gRPC API](grpc_api.md) - TorchServe supports gRPC APIs for both inference and management calls
35 changes: 35 additions & 0 deletions docs/model_loading.md
@@ -0,0 +1,35 @@
# How to load a model in TorchServe

There are multiple ways to load a model in TorchServe. The flowchart below simplifies the decision process and shows the various options.



```mermaid
flowchart TD
id1[[How to load a model in TorchServe?]] --> id13{Handler has an initialize method?}
id13{Handler has an initialize method?} -- No, uses BaseHandler's initialize method --> id2{Model Type?} --> id3(PyTorch Eager) & id4(TorchScripted) & id5(ONNX) & id6(TensorRT)
id3(PyTorch Eager) --Required--> id7(Model File & weights file)
id4(TorchScripted) --Required--> id8(TorchScripted weights ending in '.pt')
id5(ONNX) --Required--> id9(Weights ending in '.onnx')
id6(TensorRT) --Required--> id10(TensorRT weights ending in '.pt')
id7(Model File & weights file) & id8(TorchScripted weights ending in '.pt') & id9(Weights ending in '.onnx') & id10(TensorRT weights ending in '.pt') --> id11(Create a model archive .mar file)
id13{Handler has an initialize method?} --Yes--> id21{"Does the initialize method inherit from BaseHandler?"}
id21{"Does the initialize method inherit from BaseHandler?"} -- Yes --> id2{Model Type?}
id21{Does the initialize method inherit from BaseHandler?} -- No --> id20("Create a custom method to
load the model in the handler") --> id11(Create a model archive .mar file)
id15["Create model archive by passing the
weights with --serialized-file option"]
id16["Specify path to the weights in model-config.yaml
Create model archive by specifying yaml file with --config-file "]
id11(Create a model archive .mar file) --> id14{"Is your model large?"} --No--> id22{"Do you want a self-contained model artifact?"} --Yes--> id15
id14{"Is your model large?"} --Yes--> id16
id22{"Do you want a self-contained model artifact?"} --No, I want model archiving & loading to be faster--> id16
id15 & id16 --> id17["Start TorchServe.
Two ways of starting TorchServe:
- Pass the mar file with --models
- Start TorchServe and call the register API with the mar file"]




```
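
For the self-contained path (id15 above), the archive is built by handing the weights to `torch-model-archiver` with `--serialized-file`. A minimal sketch follows; the model name, handler, and file paths are illustrative placeholders:

```bash
# Build a self-contained .mar archive: the weights travel inside the archive.
# Model name, files, and handler below are illustrative placeholders.
# --model-file is needed for eager-mode models; for TorchScript, ONNX, or
# TensorRT artifacts, --serialized-file alone carries the model.
torch-model-archiver --model-name my_model --version 1.0 \
    --model-file model.py \
    --serialized-file model.pth \
    --handler image_classifier \
    --export-path model_store
```

The resulting `model_store/my_model.mar` bundles everything TorchServe needs, at the cost of a larger archive and slower archiving.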
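For the large-model path (id16), the weights stay outside the archive and are referenced from `model-config.yaml`, which is passed with `--config-file`. The yaml keys that point at the weights depend on your setup (see TorchServe's large-model documentation); the command below is a sketch with placeholder names:

```bash
# Large-model path: keep the weights out of the .mar and point to them from
# model-config.yaml (the exact yaml keys depend on your setup; see the
# TorchServe large-model docs). Omitting --serialized-file keeps archiving
# and loading fast because the weights are not copied into the archive.
torch-model-archiver --model-name my_large_model --version 1.0 \
    --handler custom_handler.py \
    --config-file model-config.yaml \
    --export-path model_store
```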
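Finally, the two ways of starting TorchServe shown in id17 look roughly like this; the model store path and archive name are placeholders, and the management port shown is the default:

```bash
# Option 1: load the archive at startup with --models.
torchserve --start --ncs --model-store model_store --models my_model.mar

# Option 2: start TorchServe empty, then register the archive through the
# management API (default port 8081).
torchserve --start --ncs --model-store model_store
curl -X POST "http://localhost:8081/models?url=my_model.mar&initial_workers=1"
```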