## Model archiver

The key to understanding TorchServe is to first understand [torch-model-archiver](https://github.com/pytorch/serve/blob/master/model-archiver/README.md), which packages model artifacts into a single model archive file (`.mar`). `torch-model-archiver` needs the following inputs:

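### TorchScript

A TorchScript model needs only a serialized model checkpoint file; no separate model definition file is required.

### Eager Mode (more common)

An eager mode model needs a model definition file and a `state_dict` file.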
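To make the difference between the two modes concrete, here is a minimal sketch that builds the `torch-model-archiver` argument list for each. The helper `archiver_command` and the file names are hypothetical and only for illustration; the flags are the ones shown in the CLI help.

```python
import shlex


def archiver_command(model_name, serialized_file, handler,
                     model_file=None, version="1.0", export_path="model_store"):
    """Build a torch-model-archiver argument list.

    Pass model_file for an eager mode model; leave it as None for TorchScript.
    """
    cmd = [
        "torch-model-archiver",
        "--model-name", model_name,
        "--version", version,
        "--serialized-file", serialized_file,
        "--handler", handler,
        "--export-path", export_path,
    ]
    if model_file is not None:
        # Eager mode: the archive also needs the Python file that defines the model.
        cmd += ["--model-file", model_file]
    return cmd


# Hypothetical file names, for illustration only.
torchscript_cmd = archiver_command("my_model", "my_model_scripted.pt", "image_classifier")
eager_cmd = archiver_command("my_model", "my_model_state_dict.pth", "image_classifier",
                             model_file="model.py")

print(shlex.join(torchscript_cmd))
print(shlex.join(eager_cmd))
```

The only difference is `--model-file`: the eager mode command carries it, the TorchScript command does not.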

### CLI

The CLI produces a `.mar` file.  Below is [an example](https://github.com/pytorch/serve/blob/master/docs/getting_started.md#serve-a-model) of archiving an eager mode model.

```bash
torch-model-archiver --model-name densenet161 \
    --version 1.0 \
    --model-file ./_serve/examples/image_classifier/densenet_161/model.py \
    --serialized-file densenet161-8d451a50.pth \
    --export-path model_store \
    --extra-files ./_serve/examples/image_classifier/index_to_name.json \
    --handler image_classifier \
    -f
```
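If you want to check what ended up inside the archive, a default-format `.mar` file can be opened as a zip file. The sketch below assumes the default archive format and that the manifest sits at `MAR-INF/MANIFEST.json`; the helper name `describe_mar` is my own.

```python
import json
import zipfile


def describe_mar(mar_path):
    """Return (file names, manifest dict or None) for a default-format .mar archive."""
    with zipfile.ZipFile(mar_path) as archive:
        names = archive.namelist()
        manifest = None
        # Assumption: the archiver stores its manifest at this path.
        if "MAR-INF/MANIFEST.json" in names:
            manifest = json.loads(archive.read("MAR-INF/MANIFEST.json"))
    return names, manifest


# Example usage, after running the archiver command above:
# names, manifest = describe_mar("model_store/densenet161.mar")
# print(names)
```

This is a quick way to confirm that the model file, the serialized weights, and any `--extra-files` were actually packaged.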



This is the model file:

```{.python include="./_serve/examples/image_classifier/densenet_161/model.py" filename="_serve/examples/image_classifier/densenet_161/model.py"}
```

Options for model archiver:

```bash
torch-model-archiver --help
```

```
usage: torch-model-archiver [-h] --model-name MODEL_NAME
                            [--serialized-file SERIALIZED_FILE]
                            [--model-file MODEL_FILE] --handler HANDLER
                            [--extra-files EXTRA_FILES]
                            [--runtime {python,python2,python3}]
                            [--export-path EXPORT_PATH]
                            [--archive-format {tgz,no-archive,default}] [-f]
                            -v VERSION [-r REQUIREMENTS_FILE]

Torch Model Archiver Tool

optional arguments:
  -h, --help            show this help message and exit
  --model-name MODEL_NAME
                        Exported model name. Exported file will be named as
                        model-name.mar and saved in current working directory if no --export-path is
                        specified, else it will be saved under the export path
  --serialized-file SERIALIZED_FILE
                        Path to .pt or .pth file containing state_dic
```

### Handler

TorchServe has the following built-in handlers that do pre- and post-processing:

- image_classifier
- object_detector
- text_classifier
- image_segmenter

You can implement your own custom handler by following [these docs](https://pytorch.org/serve/custom_service.html?highlight=handlers). Most of the time you only need to subclass `BaseHandler` and override `preprocess` and/or `postprocess`.
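As a rough illustration of that pattern, here is a minimal sketch of a handler that overrides both methods. `ScoreHandler` is hypothetical, and the fallback `BaseHandler` stub exists only so the snippet runs without TorchServe installed; it is not the real class. I'm assuming each request in the batch carries its payload under `data` or `body`, which is how the built-in handlers read it.

```python
import json

try:
    from ts.torch_handler.base_handler import BaseHandler
except ImportError:
    # Stand-in so the sketch runs without TorchServe installed; not the real class.
    class BaseHandler:
        pass


class ScoreHandler(BaseHandler):
    """Hypothetical handler for JSON inputs that returns one score per request."""

    def preprocess(self, data):
        # TorchServe passes a batch: a list of requests, each a dict whose
        # payload is assumed to be under "data" or "body".
        rows = []
        for request in data:
            payload = request.get("data") or request.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = json.loads(payload)
            rows.append(payload)
        # A real handler would typically convert rows to a torch.Tensor here.
        return rows

    def postprocess(self, data):
        # Return a list with one JSON-serializable entry per request in the batch.
        return [{"score": round(float(x), 4)} for x in data]
```

Model loading and inference are left to `BaseHandler`, which is why overriding these two methods is usually enough.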



##### `--extra-files ... index_to_name.json`:

From the docs:

> `image_classifier`, `text_classifier` and `object_detector` can all automatically map from numeric classes (0,1,2...) to friendly strings. To do this, simply include in your model archive a file, `index_to_name.json`, that contains a mapping of class number (as a string) to friendly name (also as a string).

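A minimal sketch of generating such a file is below. The three class names are made up for illustration; the only thing taken from the docs is that both keys and values are strings.

```python
import json

# Hypothetical three-class label set, for illustration only.
class_names = ["cat", "dog", "bird"]

# Keys are class indices as strings, values are friendly names.
index_to_name = {str(i): name for i, name in enumerate(class_names)}

with open("index_to_name.json", "w") as f:
    json.dump(index_to_name, f, indent=2)
```

The resulting file is what you pass to `--extra-files` when archiving.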
## Serving

After archiving, you can start the model server:

```bash
torchserve --start --ncs \
    --model-store model_store \
    --models densenet161.mar
```

By default, TorchServe uses ports `8080` / `8081` / `8082` for the REST inference, management, and metrics APIs, and `7070` / `7071` for the gRPC APIs.
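To keep the REST ports straight, here is a small sketch that maps each API to its default port and checks whether the server is up. The helper names `api_url` and `is_healthy` are my own; `/ping` is the health endpoint on the inference API, and the host assumes a server running locally.

```python
import urllib.request

# Default REST ports listed above.
DEFAULT_PORTS = {"inference": 8080, "management": 8081, "metrics": 8082}


def api_url(api, path, host="127.0.0.1"):
    """Build a URL for one of the REST APIs on its default port."""
    return f"http://{host}:{DEFAULT_PORTS[api]}/{path.lstrip('/')}"


def is_healthy(timeout=2.0):
    """Return True if the inference API answers /ping with HTTP 200."""
    try:
        with urllib.request.urlopen(api_url("inference", "ping"), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors and HTTP errors raised by urllib.
        return False


print(api_url("management", "models"))
print(api_url("metrics", "metrics"))
```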

```bash
torchserve --help
```

```
usage: torchserve [-h] [-v | --start | --stop] [--ts-config TS_CONFIG]
                  [--model-store MODEL_STORE]
                  [--workflow-store WORKFLOW_STORE]
                  [--models MODEL_PATH1 MODEL_NAME=MODEL_PATH2... [MODEL_PATH1 MODEL_NAME=MODEL_PATH2... ...]]
                  [--log-config LOG_CONFIG] [--foreground]
                  [--no-config-snapshots] [--plugins-path PLUGINS_PATH]

Torchserve

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         Return TorchServe Version
  --start               Start the model-server
  --stop                Stop the model-server
  --ts-config TS_CONFIG
                        Configuration file for model server
  --model-store MODEL_STORE
                        Model store location from where local or default
                        models can be loaded
  --workflow-store WORKFLOW_STORE
                        Workflow store location from where local or default
```

```bash
curl -O https://raw.githubusercontent.com/pytorch/serve/master/docs/images/kitten_small.jpg
```

```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  7341  100  7341    0     0   108k      0 --:--:-- --:--:-- --:--:--  108k
```


```bash
curl http://127.0.0.1:8080/predictions/densenet161 -T kitten_small.jpg
```

```
{
  "tabby": 0.4783327877521515,
  "lynx": 0.19989627599716187,
  "tiger_cat": 0.1682717651128769,
  "tiger": 0.061949197202920914,
  "Egyptian_cat": 0.05116736516356468
}
```
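The same request can be sent from Python with only the standard library. This is a sketch, not the official client: the helper names are hypothetical, and it sends a `POST` (the method the inference API documents) where `curl -T` sends a `PUT`.

```python
import json
import urllib.request


def build_prediction_request(model_name, payload, host="127.0.0.1", port=8080):
    """Build a request for the /predictions/<model_name> endpoint."""
    url = f"http://{host}:{port}/predictions/{model_name}"
    return urllib.request.Request(url, data=payload, method="POST")


def predict(model_name, image_path):
    """Send the raw image bytes and return the decoded JSON response."""
    with open(image_path, "rb") as f:
        request = build_prediction_request(model_name, f.read())
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Example usage, with the server from this section running:
# print(predict("densenet161", "kitten_small.jpg"))
```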

:::{.callout-warning}

I wouldn't recommend installing torchserve and running it on a VM.  It's probably easier to use Docker.

`docker pull pytorch/torchserve`

:::

### Docker

See [these docs](https://github.com/pytorch/serve/blob/master/docker/README.md#start-a-container-with-a-torchserve-image).  We have to mount the necessary files and run the same commands.  We also have to expose all the ports, etc.

:::{.callout-important}

Note that you have to supply the `torchserve` command, which implies you can run other things (but I don't know what those are).

:::

```bash
docker run --rm -it --gpus '"device=0"' \
    -p 8080:8080 \
    -p 8081:8081 \
    -p 8082:8082 \
    -p 7070:7070 \
    -p 7071:7071 \
    --mount type=bind,source=/home/hamel/hamel/notes/serving/torchserve/model_store,target=/tmp/models \
    pytorch/torchserve:latest-gpu \
    torchserve \
    --model-store /tmp/models \
    --models densenet161.mar
```

```bash
curl http://127.0.0.1:8080/predictions/densenet161 -T kitten_small.jpg
```

```
{
  "tabby": 0.4783327877521515,
  "lynx": 0.19989627599716187,
  "tiger_cat": 0.1682717651128769,
  "tiger": 0.061949197202920914,
  "Egyptian_cat": 0.05116736516356468
}
```

## Other Notes

I found these resources to be very important:

1. Source code for [BaseHandler](https://github.com/pytorch/serve/blob/master/ts/torch_handler/base_handler.py).
2. Performance guide: [Concurrency and number of workers](https://pytorch.org/serve/performance_guide.html#concurrency-and-number-of-workers).
3. `config.properties` [example 1](https://github.com/pytorch/serve/blob/master/examples/cloud_storage_stream_inference/config.properties) and [example 2](https://github.com/pytorch/serve/blob/master/docker/config.properties), which show how you can pass [configuration files](https://pytorch.org/serve/configuration.html).
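To give a feel for what a `config.properties` file looks like, here is a minimal sketch that writes one from a Python dict. The values are assumptions chosen to match the setup in these notes (default ports, the `/tmp/models` mount, one worker per model), and `render_properties` is a hypothetical helper; check the configuration docs before relying on any particular key.

```python
def render_properties(settings):
    """Render a dict as key=value lines, the format config.properties uses."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Assumed values, matching the ports and model store used in these notes.
settings = {
    "inference_address": "http://0.0.0.0:8080",
    "management_address": "http://0.0.0.0:8081",
    "metrics_address": "http://0.0.0.0:8082",
    "model_store": "/tmp/models",
    "load_models": "densenet161.mar",
    "default_workers_per_model": 1,
}

with open("config.properties", "w") as f:
    f.write(render_properties(settings))

print(render_properties(settings))
```

You would then point the server at the file with the `--ts-config` flag shown in the `torchserve --help` output.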