Feature: Add torchserve custom server bert sample #1185

121 changes: 121 additions & 0 deletions docs/samples/custom/torchserve/hugging-face-bert-sample.md
@@ -0,0 +1,121 @@
# TorchServe custom server with Hugging Face BERT sample

## Model archive file creation

[Torchserve Huggingface Transformers Bert example](https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers)
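
If you do not already have the example files locally, a minimal way to get them (assuming the layout of the pytorch/serve repository linked above) is:

```bash
# Clone the TorchServe repo and switch to the Hugging Face Transformers example
git clone https://github.com/pytorch/serve.git
cd serve/examples/Huggingface_Transformers
```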

## Download and install transformer models

```bash
pip install transformers
```
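
The model download and archiving steps below also rely on PyTorch and the TorchServe model archiver. If they are not already installed locally, they are available on PyPI (this assumes a plain pip environment):

```bash
pip install torch torchserve torch-model-archiver
```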

Modify `setup_config.json`:

```json
{
"model_name":"bert-base-uncased",
"mode":"sequence_classification",
"do_lower_case":"True",
"num_labels":"2",
"save_mode":"torchscript",
"max_length":"150"
}
```

Download models

```bash
python Download_Transformer_models.py
```
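
After the script finishes, the traced model and its config should have been written to the `Transformer_model` directory; a quick sanity check is:

```bash
# pytorch_model.bin and config.json are referenced by the archiver command below
ls Transformer_model/
```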

Create a requirements.txt and add `transformers` to it.
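
One way to do that:

```bash
echo "transformers" > requirements.txt
```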

With all the files in place, run the command below to generate the `.mar` file.

```bash
torch-model-archiver --model-name BERTSeqClassification --version 1.0 --serialized-file Transformer_model/pytorch_model.bin --handler ./Transformer_handler_generalized.py --extra-files "Transformer_model/config.json,./setup_config.json,./Seq_classification_artifacts/index_to_name.json" -r requirements.txt
```
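
This writes `BERTSeqClassification.mar` to the current directory. The Dockerfile in this sample copies the archive from a model store directory; the directory name below is only an assumption, so adjust it to whatever your Dockerfile expects:

```bash
mkdir -p model-store
mv BERTSeqClassification.mar model-store/
```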

## Build and push the sample Docker Image

The custom TorchServe image bundles the model inside the container and serves it with KFServing.

In this example we use Docker to build a TorchServe image that packages the `.mar` file and `config.properties` into the container.

Add the setting below to `config.properties` so that the model's Python dependencies (including `transformers`) are installed inside the container:

```properties
install_py_dep_per_model=true
```
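
For reference, a minimal `config.properties` sketch containing this flag might look like the following; the addresses and model store path are illustrative assumptions and should match your Dockerfile and container ports:

```bash
cat <<'EOF' > config.properties
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
model_store=/home/model-server/model-store
install_py_dep_per_model=true
EOF
```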

To build and push with Docker Hub, run these commands replacing {username} with your Docker Hub username:

```bash
# Build the container on your local machine

# For CPU
DOCKER_BUILDKIT=1 docker build --file Dockerfile -t torchserve:latest .

# For GPU
DOCKER_BUILDKIT=1 docker build --file Dockerfile --build-arg BASE_IMAGE=nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 -t torchserve:latest .

# Push the container to the Docker registry
docker tag torchserve:latest {username}/torchserve:latest
docker push {username}/torchserve:latest
```

## Create the InferenceService

In the `torchserve-custom.yaml` file edit the container image and replace {username} with your Docker Hub username.
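
For example, assuming the manifest references the image as `{username}/torchserve:latest`, the placeholder can be substituted with:

```bash
sed -i 's|{username}|<your-docker-hub-username>|' torchserve-custom.yaml
```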

Apply the InferenceService custom resource:

```bash
kubectl apply -f torchserve-custom.yaml
```

Expected Output

```bash
inferenceservice.serving.kubeflow.org/torchserve-custom created
```
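
Optionally, confirm that the InferenceService is ready before sending requests (the resource name matches the output above):

```bash
kubectl get inferenceservice torchserve-custom -n <namespace>
# Proceed once the READY column shows True
```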

## Run a prediction

The first step is to [determine the ingress IP and ports](../../../../README.md#determine-the-ingress-ip-and-ports) and set `INGRESS_HOST` and `INGRESS_PORT`.
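
For a cluster with an Istio ingress gateway backed by an external load balancer, the variables can typically be set like this (adjust for your setup; see the linked README for other configurations):

```bash
INGRESS_HOST=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
```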

```bash
MODEL_NAME=torchserve-custom
SERVICE_HOSTNAME=$(kubectl get route ${MODEL_NAME}-predictor-default -n <namespace> -o jsonpath='{.status.url}' | cut -d "/" -f 3)

curl -v -H "Host: ${SERVICE_HOSTNAME}" http://${INGRESS_HOST}:${INGRESS_PORT}/predictions/BERTSeqClassification -T serve/examples/Huggingface_Transformers/sample_text.txt
```

Expected Output

```bash
* Trying 44.239.20.204...
* Connected to a881f5a8c676a41edbccdb0a394a80d6-2069247558.us-west-2.elb.amazonaws.com (44.239.20.204) port 80 (#0)
> PUT /predictions/BERTSeqClassification HTTP/1.1
> Host: torchserve-custom-predictor-default.kfserving-test.example.com
> User-Agent: curl/7.47.0
> Accept: */*
> Content-Length: 79
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 200 OK
< cache-control: no-cache; no-store, must-revalidate, private
< content-length: 8
< date: Wed, 04 Nov 2020 10:54:49 GMT
< expires: Thu, 01 Jan 1970 00:00:00 UTC
< pragma: no-cache
< x-request-id: 4b54d3ac-185f-444c-b344-b8a785fdeb50
< x-envoy-upstream-service-time: 2085
< server: istio-envoy
<
* Connection #0 to host torchserve-custom-predictor-default.kfserving-test.example.com left intact
Accepted
```