Cannot load pretrained BERT weights when loading chinese_L-12_H-768_A-12/bert_model.ckpt #80

Here are my code snippets. When I run them, I get an error while loading the pretrained weights. Can anyone help me? Thanks!

Comments
I'm not able to reproduce this. Can you try posting a minimal but complete executable example, i.e. something like:

```python
import os

import bert
from tensorflow import keras

model_name = "chinese_L-12_H-768_A-12"
model_dir = bert.fetch_google_bert_model(model_name, ".models")
model_ckpt = os.path.join(model_dir, "bert_model.ckpt")

bert_params = bert.params_from_pretrained_ckpt(model_dir)
l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

# use in a Keras model here, and call model.build()
model = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(128,)),
    l_bert,
    keras.layers.Lambda(lambda x: x[:, 0, :]),  # take the [CLS] token output
    keras.layers.Dense(2)
])
model.build(input_shape=(None, 128))

bert.load_bert_weights(l_bert, model_ckpt)
model.summary()
```
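For completeness, feeding such a model requires token ids produced with the checkpoint's own vocabulary. Below is a sketch (not from the original thread) that assumes the example above has run and that the standard vocab.txt ships in the model directory; it uses the FullTokenizer bundled with bert-for-tf2:

```python
import numpy as np

# the Google checkpoints ship a vocab.txt next to bert_model.ckpt
vocab_file = os.path.join(model_dir, "vocab.txt")
tokenizer = bert.bert_tokenization.FullTokenizer(vocab_file, do_lower_case=True)

tokens = ["[CLS]"] + tokenizer.tokenize("这是一个测试") + ["[SEP]"]
token_ids = tokenizer.convert_tokens_to_ids(tokens)
token_ids = token_ids + [0] * (128 - len(token_ids))  # pad to seq_len=128

print(model.predict(np.array([token_ids])))  # logits of shape (1, 2)
```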
Ohh... I see, you are trying to replace/extend the default embeddings layer, cool! I believe the prefix/name_scope is missing because of the direct call `output = bert_layer.encoders_layer(new_emb)`. As a workaround, you could put the relevant pieces (or everything) in a name_scope like this:

```python
from tensorflow.python.keras import backend as K

# https://github.com/tensorflow/tensorflow/issues/27298
with K.get_graph().as_default(), K.name_scope('bert'):
    emb_mask = bert_layer.embeddings_layer(mask_ids)  # shape (1, seq_len, emb_size)
    output = bert_layer.encoders_layer(new_emb)
```

As a minimal example:

```python
import os

import bert
from tensorflow import keras
from tensorflow.python.keras import backend as K

model_name = "chinese_L-12_H-768_A-12"
model_dir = bert.fetch_google_bert_model(model_name, ".models")
model_ckpt = os.path.join(model_dir, "bert_model.ckpt")

bert_params = bert.params_from_pretrained_ckpt(model_dir)

# https://github.com/tensorflow/tensorflow/issues/27298
with K.get_graph().as_default(), K.name_scope('bert'):
    l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")
    inp_ids = keras.layers.Input(shape=(128,), dtype='int32')
    new_emb = l_bert.embeddings_layer(inp_ids)
    output = l_bert.encoders_layer(new_emb)

output = keras.layers.Dense(3, activation='softmax')(output + new_emb)
model = keras.models.Model(inp_ids, output, name='bert')

bert.load_bert_weights(l_bert, model_ckpt)
model.summary()
```

As an alternative, consider extending `BertModelLayer`.
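A minimal sketch of that alternative, assuming all you need is to post-process the embeddings before they reach the encoder (the subclass and its `transform` hook are hypothetical, not part of bert-for-tf2):

```python
import bert

class CustomEmbeddingBert(bert.BertModelLayer):
    """Hypothetical subclass: the embeddings/encoder calls happen inside the
    layer's own call(), so the sub-layers are built under the layer's name
    scope and load_bert_weights() should find the expected variable names."""

    def call(self, inputs, training=None):
        emb = self.embeddings_layer(inputs)  # (batch, seq_len, emb_size)
        emb = self.transform(emb)            # your custom embedding logic
        return self.encoders_layer(emb)

    def transform(self, emb):
        return emb  # identity placeholder; override/replace as needed
```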
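Either way, `bert.load_bert_weights()` maps the Keras weight names to the checkpoint names, so you can eyeball a mismatch by comparing the two sides with standard TensorFlow utilities:

```python
import tensorflow as tf

# (name, shape) pairs stored in the checkpoint
print(tf.train.list_variables(model_ckpt)[:5])

# names Keras assigned to the layer's weights; these need the "bert/"
# prefix for the loader to match them against the checkpoint entries
print([w.name for w in l_bert.weights][:5])
```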
Thanks for your reply. And yes, it was because the prefix/name_scope was missing. Your example works. Cool!