# TFServing Deployment Template
**Date:** 2023-08-15

**Author:** example@example.com

## 1. Introduction
Brief description of the problem and the objectives of this analysis.

## 2. Load Libraries and Set Constants

### 2.1 Load Libraries

In [None]:
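import collections
import json
import os
from getpass import getpass

import requests
import tensorflow as tf
from transformers import AutoTokenizer, BatchEncoding, TFAutoModel

### 2.2 Set Constants

In [None]:
MODEL_PATH = "model"  # local directory with the pretrained checkpoint and tokenizer files
MODEL_VERSION = "1"  # numeric version subdirectory used by TensorFlow Serving
MODEL_SAVE_PATH = f"{MODEL_PATH}/{MODEL_VERSION}"
MODEL_NAME = "embedding_model"
PORT = "8501"  # host port mapped to the container's REST API port
STRING = "这是一条测试用字符串"  # Chinese test sentence: "This is a test string"

## 3. Save Model For TFServing

### 3.1 Load Model

In [None]:
# TFAutoModel loads a TensorFlow model; from_pt=True converts PyTorch weights on load.
model = TFAutoModel.from_pretrained(MODEL_PATH, from_pt=True)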

### 3.2 Transform To TF Model From PyTorch Model (Optional)

### 3.3 Save Model

In [None]:
# Export in SavedModel format into the versioned directory.
tf.saved_model.save(model, MODEL_SAVE_PATH)
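
TensorFlow Serving loads a model from a base directory that contains one numeric subdirectory per version, each holding a `saved_model.pb`. The following is a minimal sketch for checking that layout before starting the container. `find_servable_versions` is a hypothetical helper, not part of TensorFlow, and it only inspects file names, not the model contents.

```python
from pathlib import Path
from typing import List


def find_servable_versions(base_dir: str) -> List[int]:
    """Return the numeric version directories under base_dir that contain saved_model.pb."""
    base = Path(base_dir)
    if not base.is_dir():
        return []
    versions = []
    for child in base.iterdir():
        # Only subdirectories with integer names are treated as model versions.
        if child.is_dir() and child.name.isdigit() and (child / "saved_model.pb").is_file():
            versions.append(int(child.name))
    return sorted(versions)
```

After the save step, `find_servable_versions(MODEL_PATH)` would be expected to include `1`; an empty list suggests the export did not land where the server will look.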

## 4. Running the TensorFlow Serving Docker Image

In [None]:
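# Mount the model base directory so the server finds the numeric version subdirectory inside it.
# sudo is supplied when the command is run; -d detaches the container so later cells can execute.
command = f"docker run -d -p {PORT}:8501 --name={MODEL_NAME} --mount type=bind,source={os.getcwd()}/{MODEL_PATH},target=/models/{MODEL_NAME} -e MODEL_NAME={MODEL_NAME} tensorflow/serving"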

In [None]:
!echo {getpass()} | sudo -S {command}
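
The container needs a moment to load the model, so a request sent right after `docker run` may fail. TensorFlow Serving exposes a model status endpoint at `GET /v1/models/{MODEL_NAME}`. The sketch below assumes the response follows the documented `model_version_status` structure and checks whether any version reports the `AVAILABLE` state. `is_model_available` is a hypothetical helper name, and the sample payload is illustrative rather than captured output.

```python
import json


def is_model_available(status_body: str) -> bool:
    """Return True if any model version in the status response is AVAILABLE."""
    status = json.loads(status_body)
    return any(
        version.get("state") == "AVAILABLE"
        for version in status.get("model_version_status", [])
    )


# Illustrative payload in the shape returned by the model status endpoint.
sample_body = json.dumps(
    {
        "model_version_status": [
            {
                "version": "1",
                "state": "AVAILABLE",
                "status": {"error_code": "OK", "error_message": ""},
            }
        ]
    }
)
print(is_model_available(sample_body))  # True
```

In the notebook, the body could come from `requests.get(f"http://localhost:{PORT}/v1/models/{MODEL_NAME}").text`, polled with a short sleep until the helper returns `True`.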

## 5. Testing the Served Model

In [None]:
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

In [None]:
def tokenize_data_to_request(datas: BatchEncoding) -> str:
    """
    Convert tokenizer output into the JSON request body used to call
    the sentence BERT model through the TensorFlow Serving REST API.
    :param datas: tokenizer output, mapping each input name to a batch of values
    :return: JSON string in row format, {"instances": [...]}
    """
    tokenize_data_dict = {}
    batch_size = 0
    for k_name, data in datas.items():
        tokenize_data_dict[k_name] = data
        # Every field holds one entry per input sentence, so any field gives the batch size.
        batch_size = len(data)
    # Regroup the per-field batches into one dict per input sentence.
    instances = []
    for i in range(batch_size):
        instance = collections.defaultdict(list)
        for key in tokenize_data_dict.keys():
            instance[key] = tokenize_data_dict[key][i]
        instances.append(instance)

    tokenize_data_json = json.dumps({"instances": instances})
    return tokenize_data_json
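
To make the regrouping above concrete, here is a small sketch that applies the same column-to-row conversion to a plain dictionary, so it runs without a tokenizer. `to_row_format` is a hypothetical stand-in for the function above, and the token ids are made-up values for illustration only.

```python
import json


def to_row_format(columns: dict) -> str:
    """Turn {field: [batch of values]} into a row-format predict request body."""
    names = list(columns)
    # zip(*...) walks the batch dimension, yielding one tuple of field values per example.
    instances = [dict(zip(names, row)) for row in zip(*columns.values())]
    return json.dumps({"instances": instances})


# Made-up tokenizer-style output for a batch of two sequences.
example = {
    "input_ids": [[101, 2769, 102], [101, 872, 102]],
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
}
body = to_row_format(example)
print(body)
```

Each element of `instances` carries every input field for one sentence, which is the row format accepted by the `:predict` endpoint.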

In [None]:
encoded_input = tokenizer(
    [STRING],
    padding=True,
    truncation=True,
    max_length=128,
)
response = requests.post(
    url=f"http://localhost:{PORT}/v1/models/{MODEL_NAME}:predict",
    data=tokenize_data_to_request(datas=encoded_input),
)
response.raise_for_status()  # fail early on HTTP errors

In [None]:
predictions = json.loads(response.text)['predictions']
print(predictions)
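
When a request is rejected, TensorFlow Serving's REST API returns a JSON body with an `error` key instead of `predictions`, so indexing `['predictions']` directly raises a `KeyError` that hides the server message. The sketch below shows one way to surface that message. `extract_predictions` is a hypothetical helper, and both sample bodies are illustrative rather than real model output.

```python
import json


def extract_predictions(body: str) -> list:
    """Return the "predictions" list from a predict response, or raise on an error payload."""
    payload = json.loads(body)
    if "error" in payload:
        # Failed requests are reported as {"error": "<message>"}.
        raise RuntimeError(f"TensorFlow Serving returned an error: {payload['error']}")
    return payload["predictions"]


# Illustrative response bodies.
ok_body = '{"predictions": [[0.1, 0.2, 0.3]]}'
error_body = '{"error": "example error message"}'
print(extract_predictions(ok_body))  # [[0.1, 0.2, 0.3]]
```

For a model with several named outputs, each element of `predictions` is an object keyed by output name rather than a bare list, so downstream code may need to select the output it wants.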