Exposing more than last layer with BERT? #290
We don't have plans to include a signature for intermediate layers, since this would mean that different BERT modules would have slightly different signatures. However, this is doable post-hoc with a few lines of extra code, shown below.
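For reference, the helper used later in this thread might look roughly like the following. This is a sketch that assumes TF1.x graph mode, where every layer's output tensor has a recoverable name and the intermediate layer's tensor name differs from the last layer's only by a layer index:

```python
import tensorflow as tf  # TF1.x, graph mode assumed


def get_intermediate_layer(last_layer, total_layers, desired_layer):
    """Recover an intermediate encoder layer from the graph by tensor name.

    Assumes the intermediate tensor's name differs from the last layer's
    name only by a layer index (e.g. '13' -> '7' to get layer 6 of 12).
    """
    intermediate_layer_name = last_layer.name.replace(
        str(total_layers + 1), str(desired_layer + 1))
    return tf.get_default_graph().get_tensor_by_name(intermediate_layer_name)
```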
Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!
@rmothukuru Hi there, I am trying the solution you provided to get the intermediate output of BERT, but when I run the code it fails. I have tried the following:

```python
import tensorflow as tf
import tensorflow_hub as hub

module = hub.load("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
# token_signature = module.signatures["tokens"]  # this raises a KeyError
token_signature = module.signatures["serving_default"]
module_input = dict(
    input_word_ids=tf.constant(3, shape=[1, 4]),
    input_mask=tf.constant(1, shape=[1, 4]),
    input_type_ids=tf.constant(4, shape=[1, 4]),
)
output = token_signature(**module_input)
print(output)

layer_6 = get_intermediate_layer(
    last_layer=output["bert_model_1"],  # output[1] raises a KeyError
    total_layers=12,
    desired_layer=6)
print(layer_6)
```

And it gives me an error.

I got the same error with:

```python
import tensorflow as tf
import tensorflow_hub as hub

input_ids = tf.zeros([1, 1], tf.int32)
input_mask = tf.zeros([1, 1], tf.int32)
segment_ids = tf.zeros([1, 1], tf.int32)
bert_inputs = dict(
    input_ids=input_ids,
    input_mask=input_mask,
    segment_ids=segment_ids)

module = hub.KerasLayer("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
output = module([input_ids, input_mask, segment_ids])
layer_6 = get_intermediate_layer(
    last_layer=output[1],
    total_layers=12,
    desired_layer=6)
print(layer_6)
```
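Note that both snippets above run the SavedModel in TF2, where eager tensors have no graph names to look up, so a name-based helper like `get_intermediate_layer` cannot work there. The newer TF2 BERT encoders on TF Hub instead expose every transformer block directly through an `encoder_outputs` list, which removes the need for the helper entirely. A minimal sketch, assuming the tensorflow/bert_en_uncased_L-12_H-768_A-12 (version 4) encoder and its documented input/output dicts:

```python
import tensorflow as tf
import tensorflow_hub as hub

# TF2 BERT encoder whose output dict includes "encoder_outputs": a list
# with one [batch, seq_len, hidden] tensor per transformer block.
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=False)

encoder_inputs = dict(
    input_word_ids=tf.constant([[101, 7592, 2088, 102]], tf.int32),  # [CLS] hello world [SEP]
    input_mask=tf.ones([1, 4], tf.int32),
    input_type_ids=tf.zeros([1, 4], tf.int32),
)
outputs = encoder(encoder_inputs)

layer_6 = outputs["encoder_outputs"][5]  # zero-based: output of the 6th block
print(layer_6.shape)  # (1, 4, 768)
```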
It looks like the BERT hub model currently only exposes the last layer of outputs, via 'sequence_output' (https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1).
Are there any plans to expose the intermediate layers?
Section 5.4 in the original paper (https://arxiv.org/pdf/1810.04805.pdf) suggests a useful feature-based approach: feeding activations from intermediate layers into a downstream model instead of fine-tuning. This would be valuable in cases where fine-tuning the entire model is prohibitive (e.g., very long documents). Simply taking the last layer is an option, but performance is clearly below what it otherwise could be.
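Concretely, the best feature-based setup in Section 5.4 concatenates the activations of the top four encoder layers. With a model that exposes per-layer outputs (such as the `encoder_outputs` list from the sketch above), this could look like:

```python
# Concatenate the top four transformer layers along the hidden axis,
# mirroring the paper's best feature-based configuration.
top_four = outputs["encoder_outputs"][-4:]
features = tf.concat(top_four, axis=-1)  # [batch, seq_len, 4 * 768]
```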