Export SavedModel for Serving #103
haoyuhu commented on Jun 3, 2019:
> This is cool, but have you tried this function with custom objects, like the multi-head layer from BERTEmbedding?
I tried BERTEmbedding as an example, and the prediction looked fine.

```python
import logging
logging.basicConfig(level=logging.DEBUG)

import kashgari
from kashgari.corpus import SMP2018ECDTCorpus
from kashgari.tasks.classification import BLSTMModel
from kashgari.embeddings.bert_embedding import BERTEmbedding

x_data, y_data = SMP2018ECDTCorpus.load_data()
x_val, y_val = SMP2018ECDTCorpus.load_data('valid')

embedding = BERTEmbedding(task=kashgari.CLASSIFICATION,
                          model_folder='/data/models/bert-base-chinese',
                          from_saved_model=True,
                          trainable=False)
classifier = BLSTMModel(embedding)
classifier.fit(x_data, y_data, x_val, y_val)
classifier.export('./savedmodels')

x_data[0]
# Output:
# ['现', '在', '有', '什', '么', '好', '看', '的', '电', '视', '剧', '?']

embedding.process_x_dataset([x_data[0]])
# Output:
# (array([[4385, 1762, 3300,  784,  720, 1962, 4692, 4638, 4510, 6228, 1196,
#          8043,    0,    0,    0]]),
#  array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]))

embedding.process_y_dataset([y_data[0]])
# Output:
# array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
#         0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]],
#       dtype=float32)

classifier.predict([x_data[0]], debug_info=True)
# Output:
# INFO:root:output: [[3.6619654e-06 6.7627948e-04 5.0349574e-04 7.7625271e-04 1.3573820e-03
#   3.1325928e-04 3.9416406e-02 1.4551771e-02 2.4546034e-04 3.5036734e-04
#   2.1456789e-04 4.1255872e-03 2.7694383e-03 5.6802204e-05 5.3648418e-04
#   6.5579050e-04 3.7221718e-05 8.9914782e-04 5.2839550e-03 4.0301870e-04
#   3.8410389e-04 4.3760013e-04 4.9373659e-04 1.7610782e-04 4.3451432e-03
#   2.4513726e-04 5.5251090e-04 1.2465163e-04 9.1961819e-01 9.2061753e-05
#   1.6616248e-04 1.8825983e-04]]
# INFO:root:output argmax: [28]
# ['epg']
```

Serving:

```bash
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --rest_api_port=9000 --model_name=bert_blstm --model_base_path=/root/tf_serving/bert_blstm/ --enable_batching=true
# Output:
# 2019-06-04 00:38:23.845690: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: bert_blstm version: 1559618962}
# 2019-06-04 00:38:23.863891: I tensorflow_serving/model_servers/server.cc:324] Running gRPC ModelServer at 0.0.0.0:8500 ...
# 2019-06-04 00:38:23.920985: I tensorflow_serving/model_servers/server.cc:344] Exporting HTTP/REST API at:localhost:9000 ...
```
```bash
saved_model_cli show --dir /root/tf_serving/bert_blstm/1559618962/ --tag_set serve --signature_def serving_default
# Output:
# The given SavedModel SignatureDef contains the following input(s):
#   inputs['Input-Segment:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Segment:0
#   inputs['Input-Token:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 15)
#       name: Input-Token:0
# The given SavedModel SignatureDef contains the following output(s):
#   outputs['dense/Softmax:0'] tensor_info:
#       dtype: DT_FLOAT
#       shape: (-1, 32)
#       name: dense/Softmax:0
# Method name is: tensorflow/serving/predict
```
```bash
curl -H "Content-type: application/json" -X POST -d '{"instances": [{"Input-Segment:0": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0], "Input-Token:0": [4385,1762,3300,784,720,1962,4692,4638,4510,6228,1196,8043,0,0,0]}]}' "http://localhost:9000/v1/models/bert_blstm:predict"
# Output:
# {
#   "predictions": [[3.66197105e-06, 0.000676278665, 0.000503493298, 0.000776251429, 0.00135738368, 0.000313259545, 0.0394166, 0.0145518212, 0.000245459785, 0.000350368762, 0.000214567845, 0.0041255923, 0.00276944018, 5.68024116e-05, 0.000536485342, 0.000655789045, 3.7221711e-05, 0.000899151899, 0.00528393872, 0.00040301861, 0.000384103041, 0.000437600684, 0.00049373717, 0.000176107773, 0.00434515439, 0.000245138333, 0.000552510261, 0.00012465172, 0.919617951, 9.20617313e-05, 0.000166162121, 0.000188259597]]
# }
```
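Equivalently, here is a minimal Python client sketch for the same REST endpoint. It assumes the `requests` and `numpy` packages; the token ids and tensor names are the ones shown above.

```python
import requests
import numpy as np

# Token and segment ids produced by embedding.process_x_dataset([x_data[0]]).
tokens = [4385, 1762, 3300, 784, 720, 1962, 4692, 4638, 4510, 6228, 1196,
          8043, 0, 0, 0]
segments = [0] * 15

payload = {"instances": [{"Input-Token:0": tokens,
                          "Input-Segment:0": segments}]}
resp = requests.post(
    "http://localhost:9000/v1/models/bert_blstm:predict", json=payload)

probs = np.array(resp.json()["predictions"])
print(probs.argmax(axis=-1))  # -> [28], which maps to 'epg' in this label set
```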
BTW, I found that currently we cannot pass `trainable=True`.
@alexwwang is working on that, #102.
Yes, I set a warning there, because I have not got feedback from @CyberZHG about the expected behavior if we set `trainable=True`. According to my instinct, when `trainable` is set to `True`, the weights in BERT will be tuned during training. Say we fine-tuned the model, where would it be saved? The original checkpoint, or a new place we assign? Then, when we need to load the fine-tuned one in practice, should we modify our current code, and how?
I think we don't have to worry about model saving. When you save the model, it will save its current weights, which include the fine-tuned weights. And we don't need to modify our loading function.
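A rough sketch of that flow, assuming Kashgari's `save`/`load_model` helpers work as named here (the directory path is a placeholder):

```python
# Sketch only: saving after fit() persists the current (fine-tuned) weights,
# so loading should need no changes.
from kashgari.utils import load_model

classifier.save('./finetuned_model')       # writes current weights to disk
loaded = load_model('./finetuned_model')   # loads fine-tuned weights as-is
loaded.predict([x_data[0]])
```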
Yes, I've got feedback from @CyberZHG, and you two gave a consistent answer.
I think …
Although the names of the two arguments look alike, you don't need to set them to the same value.
https://github.com/CyberZHG/keras-bert/blob/master/keras_bert/bert.py#L112
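For reference, a small sketch of the two keras-bert flags in question (the signature follows the `bert.py` linked above; the file paths are placeholders):

```python
from keras_bert import load_trained_model_from_checkpoint

model = load_trained_model_from_checkpoint(
    'bert_config.json',   # placeholder: BERT config path
    'bert_model.ckpt',    # placeholder: BERT checkpoint path
    training=False,       # False: strip the MLM/NSP pre-training heads
    trainable=True,       # True: allow the remaining BERT weights to be tuned
    seq_len=15,
)
```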
Yes, if … @haoyuhu
😿
@alexwwang …
Sorry, I will check it again.
Yes. Have you noticed that, if … These three layers are fine-tune related, I assume. Updated: …
I think I have to update the docs and comments later. I didn't realize the ambiguity. 😹
All layers will be influenced by `trainable`.
Yes, you are right. :P Params in BERT will be trainable.
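One way to check this claim, reusing the keras-bert `model` from the sketch above (plain Keras attributes, nothing Kashgari-specific):

```python
# With trainable=True, every layer of the loaded BERT graph reports
# trainable == True; with trainable=False, they all report False.
for layer in model.layers:
    print(layer.name, layer.trainable)
```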
As you mentioned on that issue page in your project, you said …
Yes, yes, I found that almost as soon as I posted the original one. 😹
@alexwwang So great! I will try it later.
@haoyuhu However, I have to point out that, according to the original paper, the feature-extraction approach does not involve any fine-tuning. All the fine-tuning experiments were made on the whole model, end-to-end. So do you think we should support whole-model fine-tuning (allow setting `trainable=True`)?
In my use case, it will bring a 2~3% reduction in …
Now we have allowed setting `trainable=True`. The … However, the orthodox fine-tuning procedure should only be carried out when both … What do you think? @haoyuhu
Thank you for your detailed explanation. I agree with you. It's unnecessary, I think.
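For readers following along, a hedged sketch of the whole-model fine-tune setup discussed above; it simply mirrors the earlier example with `trainable=True` (how `trainable` interacts with `from_saved_model` is not settled in this thread, so that flag is omitted here):

```python
# Sketch only: same pipeline as above, but with trainable=True so the BERT
# weights are updated end-to-end during fit().
embedding = BERTEmbedding(task=kashgari.CLASSIFICATION,
                          model_folder='/data/models/bert-base-chinese',
                          trainable=True)
classifier = BLSTMModel(embedding)
classifier.fit(x_data, y_data, x_val, y_val)
classifier.export('./savedmodels_finetuned')
```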