
Export SavedModel for Serving #103

Merged (2 commits) on Jun 4, 2019

Conversation

@haoyuhu (Contributor) commented on Jun 3, 2019

from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import BLSTMModel

x_data, y_data = SMP2018ECDTCorpus.load_data()
classifier = BLSTMModel()
classifier.fit(x_data, y_data)
# export saved model to ./savedmodels/<timestamp>/
classifier.export('./savedmodels')
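classifier.export writes into a timestamped version subdirectory, which the saved_model_cli and serving commands below need. A small helper can locate the newest version; this is only a sketch (not part of the kashgari API) and assumes version directories are named by integer timestamps, as in the export above:

```python
import os

def latest_export(base_dir="./savedmodels"):
    """Return the newest timestamped SavedModel version dir, or None."""
    versions = [d for d in os.listdir(base_dir) if d.isdigit()]
    if not versions:
        return None
    return os.path.join(base_dir, max(versions, key=int))
```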
saved_model_cli show --dir /path/to/saved_models/1559562438/ --tag_set serve --signature_def serving_default
# Output:
# The given SavedModel SignatureDef contains the following input(s):
#  inputs['input:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: input:0
# The given SavedModel SignatureDef contains the following output(s):
#   outputs['dense/Softmax:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 32)
#       name: dense/Softmax:0
# Method name is: tensorflow/serving/predict

tensorflow_model_server --rest_api_port=9000 --model_name=blstm --model_base_path=/path/to/saved_models/ --enable_batching=true
# Output:
# 2019-06-03 08:28:56.639941: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: blstm version: 1559562438}
# 2019-06-03 08:28:56.645217: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
# 2019-06-03 08:28:56.647192: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:9000 ...

curl -H "Content-type: application/json" -X POST -d '{"instances": [{"input:0": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]}]}'  "http://localhost:9000/v1/models/blstm:predict"
# Output:
# {
#     "predictions": [[5.76590492e-06, 0.0334293731, 9.58459859e-05, 0.00066432351, 0.500331104, 0.0521887243, 0.000985755469, 0.000161868113, 0.00147783163, 0.0171929933, 0.00085421023, 0.00599030638, 1.79303879e-05, 0.00050331495, 3.7246391e-05, 3.13154237e-06, 0.0201187711, 0.000672292779, 0.000196203022, 4.57693459e-05, 2.69985958e-06, 8.66179619e-07, 1.03102286e-06, 3.53154815e-06, 0.0478210114, 0.00725555047, 0.000683069753, 0.262197495, 4.151143e-05, 0.046125982, 2.19863551e-07, 0.000894303957]
#     ]
# }
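The same request can also be issued from Python. A minimal sketch: build_predict_request is a hypothetical helper (not part of kashgari or TF Serving), and it assumes the input tensor is named input:0 with a padded length of 15, as reported by saved_model_cli above:

```python
import json

def build_predict_request(token_ids):
    """Build the TF Serving REST payload: one dict per instance,
    keyed by the SavedModel input tensor name."""
    return json.dumps({"instances": [{"input:0": token_ids}]}).encode("utf-8")

payload = build_predict_request([0] * 15)  # 15 = padded sequence length

# Against a running tensorflow_model_server, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:9000/v1/models/blstm:predict",
#     data=payload,
#     headers={"Content-type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read())["predictions"])
```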

@BrikerMan (Owner)

This is cool, but have you tried this function with custom objects, like the multi-head layer from BERTEmbedding?

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

I tried BERTEmbedding as an example, and the prediction looked fine.

import logging
logging.basicConfig(level=logging.DEBUG)
import kashgari
from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import BLSTMModel
from kashgari.embeddings.bert_embedding import BERTEmbedding

x_data, y_data = SMP2018ECDTCorpus.load_data()
x_val, y_val = SMP2018ECDTCorpus.load_data('valid')
embedding = BERTEmbedding(task=kashgari.CLASSIFICATION, model_folder='/data/models/bert-base-chinese', from_saved_model=True, trainable=False)
classifier = BLSTMModel(embedding)
classifier.fit(x_data, y_data, x_val, y_val)
classifier.export('./savedmodels')

x_data[0]
# Output:
# ['现', '在', '有', '什', '么', '好', '看', '的', '电', '视', '剧', '?']
embedding.process_x_dataset([x_data[0]])
# Output:
# (array([[4385, 1762, 3300,  784,  720, 1962, 4692, 4638, 4510, 6228, 1196,
#         8043,    0,    0,    0]]), array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]))
embedding.process_y_dataset([y_data[0]])
# Output:
# array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
#         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]],
#       dtype=float32)
classifier.predict([x_data[0]], debug_info=True)
# Output:
# INFO:root:output: [[3.6619654e-06 6.7627948e-04 5.0349574e-04 7.7625271e-04 1.3573820e-03
#   3.1325928e-04 3.9416406e-02 1.4551771e-02 2.4546034e-04 3.5036734e-04
#   2.1456789e-04 4.1255872e-03 2.7694383e-03 5.6802204e-05 5.3648418e-04
#   6.5579050e-04 3.7221718e-05 8.9914782e-04 5.2839550e-03 4.0301870e-04
#   3.8410389e-04 4.3760013e-04 4.9373659e-04 1.7610782e-04 4.3451432e-03
#   2.4513726e-04 5.5251090e-04 1.2465163e-04 9.1961819e-01 9.2061753e-05
#   1.6616248e-04 1.8825983e-04]]
# INFO:root:output argmax: [28]
# ['epg']

Serving:

bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --rest_api_port=9000 --model_name=bert_blstm --model_base_path=/root/tf_serving/bert_blstm/ --enable_batching=true
# Output:
# 2019-06-04 00:38:23.845690: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: bert_blstm version: 1559618962}
# 2019-06-04 00:38:23.863891: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
# 2019-06-04 00:38:23.920985: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:9000 ...

saved_model_cli show --dir /root/tf_serving/bert_blstm/1559618962/ --tag_set serve --signature_def serving_default
# Output:
# The given SavedModel SignatureDef contains the following input(s):
#   inputs['Input-Segment:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Segment:0
#   inputs['Input-Token:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Token:0
# The given SavedModel SignatureDef contains the following output(s):
#   outputs['dense/Softmax:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 32)
#       name: dense/Softmax:0
# Method name is: tensorflow/serving/predict

curl -H "Content-type: application/json" -X POST -d '{"instances": [{"Input-Segment:0": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0], "Input-Token:0": [4385,1762,3300,784,720,1962,4692,4638,4510,6228,1196,8043,0,0,0]}]}'  "http://localhost:9000/v1/models/bert_blstm:predict"

# Output:
# {
#     "predictions": [[3.66197105e-06, 0.000676278665, 0.000503493298, 0.000776251429, 0.00135738368, 0.000313259545, 0.0394166, 0.0145518212, 0.000245459785, 0.000350368762, 0.000214567845, 0.0041255923, 0.00276944018, 5.68024116e-05, 0.000536485342, 0.000655789045, 3.7221711e-05, 0.000899151899, 0.00528393872, 0.00040301861, 0.000384103041, 0.000437600684, 0.00049373717, 0.000176107773, 0.00434515439, 0.000245138333, 0.000552510261, 0.00012465172, 0.919617951, 9.20617313e-05, 0.000166162121, 0.000188259597]
#     ]
# }
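To turn a serving response back into a class label, take the argmax over the probability vector. A pure-Python sketch; the idx2label mapping shown is hypothetical — in kashgari it comes from the trained model's processor, not from a hard-coded dict:

```python
def decode_prediction(predictions, idx2label):
    """Map each probability vector in `predictions` to its argmax label."""
    labels = []
    for probs in predictions:
        best = max(range(len(probs)), key=lambda i: probs[i])
        labels.append(idx2label.get(best, str(best)))
    return labels

# In the response above, index 28 carries ~0.92 probability, so a mapping
# containing {28: "epg"} reproduces the debug_info result ['epg'].
```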

@BrikerMan BrikerMan merged commit 7ba5c6d into BrikerMan:tf.keras-version Jun 4, 2019
@haoyuhu (Contributor, Author) commented on Jun 4, 2019

BTW, I found that currently we cannot pass trainable=True to BERTEmbedding. I will dive into this later.

@haoyuhu haoyuhu deleted the tf.keras-version branch June 4, 2019 07:13
@BrikerMan (Owner)

> BTW, I found that currently we cannot pass trainable=True to BERTEmbedding. I will dive into this later.

@alexwwang is working on that, #102.

@alexwwang (Collaborator) commented on Jun 4, 2019

Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.

My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.

Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

@BrikerMan (Owner)

> Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.
>
> My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.
>
> Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

I think we don't have to worry about model saving. When you save the model, it will save its current weights, which include the fine-tuned weights. And we don't need to modify our loading function.

@alexwwang (Collaborator)

Yes, I've got feedback from @CyberZHG, and you two gave a consistent answer.
I'll make another PR later to get rid of that warning.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

> Yes, I set a warning there, because I have not gotten feedback from @CyberZHG about the expected behavior if we set trainable=True when loading a trained model from a checkpoint. You may check that issue here or below.
>
> My instinct is that setting trainable to True would make the BERT model trainable, which would in effect be a fine-tuning process. But I need assurance from him, especially on the new model-saving process.
>
> Say we fine-tuned the model: where would it be saved? The original checkpoint, or a new place we assign? And when we need to load the fine-tuned one in practice, would we have to modify our current code, and how?

I think training indicates the stage, such as predict or train, while trainable indicates whether the BERT embedding is fixed or dynamic. We should use training=True and trainable=False if we need a fixed BERT embedding, or training=True and trainable=True to train the model together with the BERT embedding.

@CyberZHG (Contributor) commented on Jun 4, 2019

@haoyuhu

training=False for embedding.

@alexwwang (Collaborator) commented on Jun 4, 2019

@haoyuhu
My understanding, based on the original code, is:

  • training=False: output feature embeddings; trainable has no effect;
  • training=trainable=True: load the model to fine-tune its weights.

Other combinations would be invalid.

@CyberZHG right?

@CyberZHG (Contributor) commented on Jun 4, 2019

Although the names of the two arguments look alike, you don't need to set them to the same value. training controls the structure: set it to False if you only need an embedding, set it to True if you want to tune the output of NSP (one way of extracting sentence embeddings).

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L112
I think using training=False is all we need. Just embed.

@alexwwang (Collaborator) commented on Jun 4, 2019

Yes, if training=False, trainable is meaningless.
While training=True, trainable=False is set for training a model from scratch.

@haoyuhu
I'm considering adding support for BERT fine-tuning to our project, based on keras-bert, which should not be too hard.

@CyberZHG (Contributor) commented on Jun 4, 2019

@alexwwang

😿 trainable is never meaningless because training has nothing to do with the trainability of the model.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

@alexwwang training=False and trainable=True does make sense.

@alexwwang (Collaborator)

> @alexwwang
>
> 😿 trainable is never meaningless because training has nothing to do with the trainability of the model.

Sorry, I will check it again.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

training=True returns a built model for training.
https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70

@alexwwang (Collaborator) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70

Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.

trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.

These three layers are fine-tune related, I assume.

Updated:
These three layers are not fine-tune related.
If we need feature fine-tuning, just set training=False and trainable=True.

@CyberZHG (Contributor) commented on Jun 4, 2019

I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹

training means whether we are training the language model.

training=True & trainable=True: train language model
training=False & trainable=False: use it as embedding
training=False & trainable=True: use it as trainable embedding
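The three combinations above can be restated as a small lookup helper. This is only a summary of the combinations listed in this thread, not part of the keras-bert API; the fourth combination's description follows alexwwang's "train from scratch" reading earlier in the thread:

```python
def bert_flag_semantics(training: bool, trainable: bool) -> str:
    """Summarize keras-bert's training/trainable flags per this thread:
    `training` selects the structure get_model returns (whole language
    model vs. embedding output); `trainable` controls weight updates."""
    if training and trainable:
        return "train language model"
    if not training and not trainable:
        return "use it as embedding"
    if not training and trainable:
        return "use it as trainable embedding"
    return "language-model structure with frozen weights"
```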

@CyberZHG (Contributor) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70
>
> Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.
>
> trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.
>
> These three layers are fine-tune related, I assume.

All layers will be influenced by trainable.

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

> training=True returns a built model for training.
> https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L70
>
> Yes. Have you noticed that, if training=False, get_model returns the feature layer before the trainable parameter takes effect? So for feature-output purposes, training=False is what we need.
>
> trainable is a parameter controlling whether weights can be updated in mlm_dense_layer, nsp_dense_layer, and nsp_pred_layer.
>
> These three layers are fine-tune related, I assume.

Yes, you are right. :P Params in BERT will be trainable.

@alexwwang (Collaborator) commented on Jun 4, 2019

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
>
> training means whether we are training the language model.
>
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding

As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
So I assume training=True & trainable=True is set for continuing to train a language model.
And with training=False & trainable=True, we would indeed get a trainable embedding, which could be the fine-tune feature we need.

@CyberZHG (Contributor) commented on Jun 4, 2019

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
> training means whether we are training the language model.
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding
>
> As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
> So I assume training=True & trainable=True is set for continuing to train a language model.
> But I don't understand why training=False & trainable=True would make the embedding trainable, since the layers are returned before trainable takes effect in the get_model method.
https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L122

https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L110

@alexwwang (Collaborator)

> I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
> training means whether we are training the language model.
> training=True & trainable=True: train language model
> training=False & trainable=False: use it as embedding
> training=False & trainable=True: use it as trainable embedding
>
> As you mentioned on that issue page in your project, training=True & trainable=False is set for training a whole new BERT model from scratch, right?
> So I assume training=True & trainable=True is set for continuing to train a language model.
> But I don't understand why training=False & trainable=True would make the embedding trainable, since the layers are returned before trainable takes effect in the get_model method.
> https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L122
>
> https://github.com/CyberZHG/keras-bert/blob/be1f2cdadeedfd39d6e05e410d47e60d27bd4d96/keras_bert/bert.py#L110

Yes, yes, I found that almost as soon as I posted the original one. 😹

@alexwwang (Collaborator)

#104

@haoyuhu You can give fine-tuning a try once this PR is merged. 😃

@haoyuhu (Contributor, Author) commented on Jun 4, 2019

@alexwwang So great! I will try it later.

@alexwwang (Collaborator) commented on Jun 4, 2019

@haoyuhu However, I have to point out that according to the original paper, the feature-extraction approach does not include any fine-tuning design; all the fine-tuning experiments were done on the whole model, end to end. So do you think we should support whole-model fine-tuning here (allow setting training=True and support passing in more training parameters)?

@haoyuhu (Contributor, Author) commented on Jun 5, 2019

> @haoyuhu However, I have to point out that according to the original paper, the feature-extraction approach does not include any fine-tuning design; all the fine-tuning experiments were done on the whole model, end to end. So do you think we should support whole-model fine-tuning here (allow setting training=True and support passing in more training parameters)?

In my use case, training only a classifier on top of BERT instead of fine-tuning (training=False and trainable=True) costs 2~3% of eval_accuracy. In my opinion, we should allow setting trainable=True. I think training=True is for pre-training, am I right? @alexwwang

@alexwwang (Collaborator)

Now we allow setting trainable=True.

The training parameter controls the output of the get_model method:

  • False produces a feature-layer output whose size depends on output_layer_nums;
  • True produces a whole-model output, which contains the MLM and NSP task layers used to train the respective unsupervised tasks during pre-training.

However, the orthodox fine-tuning procedure happens only when both training and trainable are set to True, executing the "end-to-end" tuning, which may lead to an even longer training time. And I think it may be unnecessary to attach any model structure more complex than a single softmax layer behind the BERT model if we want to use an orthodox fine-tuned BERT model.

What do you think? @haoyuhu

@haoyuhu (Contributor, Author) commented on Jun 5, 2019

Thank you for your detailed explanation. I agree with you; it's unnecessary, I think.
