
Export SavedModel for Serving #103

Merged (2 commits) on Jun 4, 2019

Conversation

@haoyuhu (Contributor) commented on Jun 3, 2019

from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import BLSTMModel

x_data, y_data = SMP2018ECDTCorpus.load_data()
classifier = BLSTMModel()
classifier.fit(x_data, y_data)
# export saved model to ./savedmodels/<timestamp>/
classifier.export('./savedmodels')
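classifier.export writes into a timestamped version subdirectory, which the saved_model_cli and serving commands below need. A small helper can locate the newest version; this is only a sketch (not part of the kashgari API) and assumes version directories are named by integer timestamps, as in the export above:

```python
import os

def latest_export(base_dir="./savedmodels"):
    """Return the newest timestamped SavedModel version dir, or None."""
    versions = [d for d in os.listdir(base_dir) if d.isdigit()]
    if not versions:
        return None
    return os.path.join(base_dir, max(versions, key=int))
```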
saved_model_cli show --dir /path/to/saved_models/1559562438/ --tag_set serve --signature_def serving_default
# Output:
# The given SavedModel SignatureDef contains the following input(s):
#  inputs['input:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: input:0
# The given SavedModel SignatureDef contains the following output(s):
#   outputs['dense/Softmax:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 32)
#       name: dense/Softmax:0
# Method name is: tensorflow/serving/predict

tensorflow_model_server --rest_api_port=9000 --model_name=blstm --model_base_path=/path/to/saved_models/ --enable_batching=true
# Output:
# 2019-06-03 08:28:56.639941: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: blstm version: 1559562438}
# 2019-06-03 08:28:56.645217: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
# 2019-06-03 08:28:56.647192: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:9000 ...

curl -H "Content-type: application/json" -X POST -d '{"instances": [{"input:0": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}]}'  "http://localhost:9000/v1/models/blstm:predict"
# Output:
# {
#     "predictions": [[5.76590492e-06, 0.0334293731, 9.58459859e-05, 0.00066432351, 0.500331104, 0.0521887243, 0.000985755469, 0.000161868113, 0.00147783163, 0.0171929933, 0.00085421023, 0.00599030638, 1.79303879e-05, 0.00050331495, 3.7246391e-05, 3.13154237e-06, 0.0201187711, 0.000672292779, 0.000196203022, 4.57693459e-05, 2.69985958e-06, 8.66179619e-07, 1.03102286e-06, 3.53154815e-06, 0.0478210114, 0.00725555047, 0.000683069753, 0.262197495, 4.151143e-05, 0.046125982, 2.19863551e-07, 0.000894303957]
#     ]
# }
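The same request can also be issued from Python. A minimal sketch: build_predict_request is a hypothetical helper (not part of kashgari or TF Serving), and it assumes the input tensor is named input:0 with a padded length of 15, as reported by saved_model_cli above:

```python
import json

def build_predict_request(token_ids):
    """Build the TF Serving REST payload: one dict per instance,
    keyed by the SavedModel input tensor name."""
    return json.dumps({"instances": [{"input:0": token_ids}]}).encode("utf-8")

payload = build_predict_request([0] * 15)  # 15 = padded sequence length

# Against a running tensorflow_model_server, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:9000/v1/models/blstm:predict",
#     data=payload,
#     headers={"Content-type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["predictions"])
```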

@BrikerMan (Owner)

This is cool, but have you tried this function with custom objects, like the multi-head layer from BERTEmbedding?

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

I tried BERTEmbedding as an example, and the prediction looked fine.

import logging
logging.basicConfig(level=logging.DEBUG)
import kashgari
from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import BLSTMModel
from kashgari.embeddings.bert_embedding import BERTEmbedding

x_data, y_data = SMP2018ECDTCorpus.load_data()
x_val, y_val = SMP2018ECDTCorpus.load_data('valid')
embedding = BERTEmbedding(task=kashgari.CLASSIFICATION, model_folder='/data/models/bert-base-chinese', from_saved_model=True, trainable=False)
classifier = BLSTMModel(embedding)
classifier.fit(x_data, y_data, x_val, y_val)
classifier.export('./savedmodels')

x_data[0]
# Output:
# ['现', '在', '有', '什', '么', '好', '看', '的', '电', '视', '剧', '?']
embedding.process_x_dataset([x_data[0]])
# Output:
# (array([[4385, 1762, 3300,  784,  720, 1962, 4692, 4638, 4510, 6228, 1196,
#         8043,    0,    0,    0]]), array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]))
embedding.process_y_dataset([y_data[0]])
# Output:
# array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
#         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]],
#       dtype=float32)
classifier.predict([x_data[0]], debug_info=True)
# Output:
# INFO:root:output: [[3.6619654e-06 6.7627948e-04 5.0349574e-04 7.7625271e-04 1.3573820e-03
#   3.1325928e-04 3.9416406e-02 1.4551771e-02 2.4546034e-04 3.5036734e-04
#   2.1456789e-04 4.1255872e-03 2.7694383e-03 5.6802204e-05 5.3648418e-04
#   6.5579050e-04 3.7221718e-05 8.9914782e-04 5.2839550e-03 4.0301870e-04
#   3.8410389e-04 4.3760013e-04 4.9373659e-04 1.7610782e-04 4.3451432e-03
#   2.4513726e-04 5.5251090e-04 1.2465163e-04 9.1961819e-01 9.2061753e-05
#   1.6616248e-04 1.8825983e-04]]
# INFO:root:output argmax: [28]
# ['epg']

Serving:

bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --rest_api_port=9000 --model_name=bert_blstm --model_base_path=/root/tf_serving/bert_blstm/ --enable_batching=true
# Output:
# 2019-06-04 00:38:23.845690: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: bert_blstm version: 1559618962}
# 2019-06-04 00:38:23.863891: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
# 2019-06-04 00:38:23.920985: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:9000 ...

saved_model_cli show --dir /root/tf_serving/bert_blstm/1559618962/ --tag_set serve --signature_def serving_default
# Output:
# The given SavedModel SignatureDef contains the following input(s):
#   inputs['Input-Segment:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Segment:0
#   inputs['Input-Token:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Token:0
# The given SavedModel SignatureDef contains the following output(s):
#   outputs['dense/Softmax:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 32)
#       name: dense/Softmax:0
# Method name is: tensorflow/serving/predict

curl -H "Content-type: application/json" -X POST -d '{"instances": [{"Input-Segment:0": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0], "Input-Token:0": [4385,1762,3300,784,720,1962,4692,4638,4510,6228,1196,8043,0,0,0]}]}'  "http://localhost:9000/v1/models/bert_blstm:predict"

# Output:
# {
#     "predictions": [[3.66197105e-06, 0.000676278665, 0.000503493298, 0.000776251429, 0.00135738368, 0.000313259545, 0.0394166, 0.0145518212, 0.000245459785, 0.000350368762, 0.000214567845, 0.0041255923, 0.00276944018, 5.68024116e-05, 0.000536485342, 0.000655789045, 3.7221711e-05, 0.000899151899, 0.00528393872, 0.00040301861, 0.000384103041, 0.000437600684, 0.00049373717, 0.000176107773, 0.00434515439, 0.000245138333, 0.000552510261, 0.00012465172, 0.919617951, 9.20617313e-05, 0.000166162121, 0.000188259597]
#     ]
# }
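To turn a serving response back into a class label, take the argmax over the probability vector. A pure-Python sketch; the idx2label mapping shown is hypothetical — in kashgari it comes from the trained model's processor, not from a hard-coded dict:

```python
def decode_prediction(predictions, idx2label):
    """Map each probability vector in `predictions` to its argmax label."""
    labels = []
    for probs in predictions:
        best = max(range(len(probs)), key=lambda i: probs[i])
        labels.append(idx2label.get(best, str(best)))
    return labels

# In the response above, index 28 carries ~0.92 probability, so a mapping
# containing {28: "epg"} reproduces the debug_info result ['epg'].
```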

@BrikerMan BrikerMan merged commit 7ba5c6d into BrikerMan:tf.keras-version Jun 4, 2019
@haoyuhu (Contributor, Author) commented on Jun 4, 2019

BTW, I found that currently we cannot pass trainable=True to BERTEmbedding. I will dive into this later.

@haoyuhu haoyuhu deleted the tf.keras-version branch June 4, 2019 07:13
@BrikerMan (Owner)

> BTW, I found that currently we cannot pass trainable=True to BERTEmbedding. I will dive into this later.

@alexwwang is working on that, #102.

@alexwwang (Collaborator) commented on Jun 4, 2019

Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.

My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.

Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

@BrikerMan (Owner)

> Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.
>
> My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.
>
> Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

I think we don't have to worry about model saving. When you save the model, it will save its current weights, which include the fine-tuned weights. And we don't need to modify our loading function.

@alexwwang (Collaborator)

Yes, I've got feedback from @CyberZHG, and you two gave a consistent answer.
I'll make another PR later to get rid of that warning.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

> Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.
>
> My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.
>
> Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

I think training indicates the stage, such as predict or train, while trainable indicates whether the BERT embedding is fixed or dynamic. We should use training=True and trainable=False if we need a fixed BERT embedding, or training=True and trainable=True to train the model together with the BERT embedding.

@CyberZHG (Contributor) commented on Jun 4, 2019

@haoyuhu

training=False for embedding.

@alexwwang (Collaborator) commented on Jun 4, 2019

@haoyuhu
My understanding, based on the original code, is:

  • training=False: output feature embeddings; trainable has no effect;
  • training=trainable=True: load the model to fine-tune its weights.

Other combinations would be invalid.

@CyberZHG right?

@CyberZHG (Contributor) commented on Jun 4, 2019

Although the names of the two arguments look alike, you don't need to set them to the same value. training controls the structure: set it to False if you only need an embedding, set it to True if you want to tune the output of NSP (one way of extracting sentence embeddings).

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L112
I think using training=False is all we need. Just embed.

@alexwwang (Collaborator) commented on Jun 4, 2019

Yes, if training=False, trainable is meaningless.
While training=True, trainable=False is set for training a model from scratch.

@haoyuhu
I'm considering adding support for BERT fine-tuning to our project, based on keras-bert, which should not be too hard.

@CyberZHG (Contributor) commented on Jun 4, 2019

@alexwwang

😿 trainable is never meaningless because training has nothing to do with the trainability of the model.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

@alexwwang training=False and trainable=True does make sense.

@alexwwang (Collaborator)

> @alexwwang
>
> 😿 trainable is never meaningless because training has nothing to do with the trainability of the model.

Sorry, I will check it again.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

training=True returns a built model for training.
https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70

@alexwwang (Collaborator) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70

Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.

trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.

These three layers are fine-tune related, I assume.

Updated:
These three layers are not fine-tune related.
If we need feature fine-tuning, just set training=False and trainable=True.

@CyberZHG (Contributor) commented on Jun 4, 2019

I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹

training means whether we are training the language model.

training=True & trainable=True: train language model
training=False & trainable=False: use it as embedding
training=False & trainable=True: use it as trainable embedding
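The three combinations above can be restated as a small lookup helper. This is only a summary of the combinations listed in this thread, not part of the keras-bert API; the fourth combination's description follows alexwwang's "train from scratch" reading earlier in the thread:

```python
def bert_flag_semantics(training: bool, trainable: bool) -> str:
    """Summarize keras-bert's training/trainable flags per this thread:
    `training` selects the structure get_model returns (whole language
    model vs. embedding output); `trainable` controls weight updates."""
    if training and trainable:
        return "train language model"
    if not training and not trainable:
        return "use it as embedding"
    if not training and trainable:
        return "use it as trainable embedding"
    return "language-model structure with frozen weights"
```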

@CyberZHG (Contributor) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70
>
> Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.
>
> trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.
>
> These three layers are fine-tune related, I assume.

All layers will be influenced by trainable.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70
>
> Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.
>
> trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.
>
> These three layers are fine-tune related, I assume.

Yes, you are right. :P Params in BERT will be trainable.

@alexwwang (Collaborator) commented on Jun 4, 2019

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
>
> training means whether we are training the language model.
>
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding

As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
So I assume training=True & trainable=True is set for continuing to train a language model.
And with training=False & trainable=True, we would indeed get a trainable embedding, which could be the fine-tune feature we need.

@CyberZHG (Contributor) commented on Jun 4, 2019

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
> training means whether we are training the language model.
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding
>
> As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
> So I assume training=True & trainable=True is set for continuing to train a language model.
> But I don't understand why training=False & trainable=True would make the embedding trainable, since the layers are returned before trainable takes effect in the get_model method.
https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L122

https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L110

@alexwwang (Collaborator)

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
> training means whether we are training the language model.
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding
>
> As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
> So I assume training=True & trainable=True is set for continuing to train a language model.
> But I don't understand why training=False & trainable=True would make the embedding trainable, since the layers are returned before trainable takes effect in the get_model method.
> https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L122
>
> https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L110

Yes, yes, I found that almost as soon as I posted the original one. 😹

@alexwwang (Collaborator)

#104

@haoyuhu You can give fine-tuning a try once this PR is merged. 😃

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

@alexwwang So great! I will try it later.

@alexwwang (Collaborator) commented on Jun 4, 2019

@haoyuhu However, I have to point out that according to the original paper, the feature-extraction approach does not include any fine-tuning design; all the fine-tuning experiments were done on the whole model, end to end. So do you think we should support whole-model fine-tuning here (allow setting training=True and support passing in more training parameters)?

@haoyuhu (Contributor, Author) commented on Jun 5, 2019

> @haoyuhu However, I have to point out that according to the original paper, the feature-extraction approach does not include any fine-tuning design; all the fine-tuning experiments were done on the whole model, end to end. So do you think we should support whole-model fine-tuning here (allow setting training=True and support passing in more training parameters)?

In my use case, training only a classifier on top of BERT instead of fine-tuning (training=False and trainable=True) costs 2~3% of eval_accuracy. In my opinion, we should allow setting trainable=True. I think training=True is for pre-training, am I right? @alexwwang

@alexwwang (Collaborator)

Now we allow setting trainable=True.

The training parameter controls the output of the get_model method:

  • False produces a feature-layer output whose size depends on output_layer_nums;
  • True produces a whole-model output, which contains the MLM and NSP task layers used to train the respective unsupervised tasks during pre-training.

However, the orthodox fine-tuning procedure happens only when both training and trainable are set to True, executing the "end-to-end" tuning, which may lead to an even longer training time. And I think it may be unnecessary to attach any model structure more complex than a single softmax layer behind the BERT model if we want to use an orthodox fine-tuned BERT model.

What do you think? @haoyuhu

@haoyuhu (Contributor, Author) commented on Jun 5, 2019

Thank you for your detailed explanation. I agree with you; it's unnecessary, I think.
