
Exposing more than last layer with BERT? #290

Closed
cbockman opened this issue May 7, 2019 · 3 comments


cbockman commented May 7, 2019

It looks like the BERT hub module currently exposes only the last layer of outputs, via 'sequence_output' (https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1).

Are there any plans to expose the intermediate layers?

Section 5.4 of the original paper (https://arxiv.org/pdf/1810.04805.pdf) suggests a useful approach built around using BERT as a feature extractor. This would be valuable in cases where fine-tuning the entire model is prohibitive (e.g., very long documents). Simply taking the last layer is an option, but performance is clearly below what it otherwise could be.
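For reference, the feature-based approach in Section 5.4 gets its best results by concatenating the token representations from the top four hidden layers. A minimal NumPy sketch of that concatenation step (shapes are illustrative; the random arrays stand in for real hidden states):

```python
import numpy as np

# Stand-ins for the 12 encoder layers' hidden states, each of shape
# [batch, seq_len, hidden_size] (values are random placeholders).
hidden_states = [np.random.rand(1, 8, 768) for _ in range(12)]

# Concatenate the top four layers along the feature axis, as in the
# paper's best-performing feature-based setup.
features = np.concatenate(hidden_states[-4:], axis=-1)
print(features.shape)  # (1, 8, 3072)
```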

@rmothukuru

We don't have plans to include a signature for intermediate layers, since this would mean that different BERT modules would have slightly different signatures.

However, this is doable post-hoc with a few lines of extra code, shown below.

import tensorflow as tf
import tensorflow_hub as hub

def get_intermediate_layer(last_layer, total_layers, desired_layer):
  # The tensor for layer k shares the last layer's name, up to the numeric
  # index embedded in it; swap that index to address an earlier layer.
  intermediate_layer_name = last_layer.name.replace(str(total_layers + 1),
                                                    str(desired_layer + 1))
  print("Intermediate layer name:", intermediate_layer_name)
  return tf.get_default_graph().get_tensor_by_name(intermediate_layer_name)

with tf.Graph().as_default() as g:
  input_ids = tf.zeros([1, 1], tf.int32)
  input_mask = tf.zeros([1, 1], tf.int32)
  segment_ids = tf.zeros([1, 1], tf.int32)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_module = hub.Module("https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1")
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)
  layer_6 = get_intermediate_layer(
      last_layer=bert_outputs["sequence_output"],
      total_layers=12,
      desired_layer=6)
  with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print(session.run(layer_6))
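Note that the trick above relies purely on the layer index embedded in the graph's tensor names. A toy, TF-free illustration of the string replacement (the tensor name here is hypothetical; actual names in the BERT graph may differ):

```python
def get_intermediate_layer_name(last_layer_name, total_layers, desired_layer):
    # Same replacement as above: swap the index of the last layer's op
    # for the index of the desired layer.
    return last_layer_name.replace(str(total_layers + 1),
                                   str(desired_layer + 1))

# Hypothetical tensor name for the final (12th) layer's output.
name = "module/bert/encoder/Reshape_13:0"
print(get_intermediate_layer_name(name, 12, 6))  # module/bert/encoder/Reshape_7:0
```

One caveat: str.replace substitutes every occurrence, so this only works when the layer index string appears exactly once in the tensor name.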

@rmothukuru

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!


iamxpy commented Jan 3, 2020

@rmothukuru Hi there, I am trying the solution you provided to get the intermediate output of BERT, but the call to hub.Module() raises the error RuntimeError: Missing implementation that supports: loader .... As this comment points out, I am using a SavedModel in TensorFlow 2 format, so I do not have the file tfhub_module.pb. In this case, is there another way to access the middle-layer output of BERT? For example, could you please provide the code in TF 2.0 style, or is this not achievable in TF2? Thanks in advance!

BTW, I have tried the following code:

module = hub.load("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
# token_signature = module.signatures["tokens"]  # this will give me KeyError
token_signature = module.signatures["serving_default"]
module_input = dict(
    input_word_ids=tf.constant(3, shape=[1, 4]),
    input_mask=tf.constant(1, shape=[1, 4]),
    input_type_ids=tf.constant(4, shape=[1, 4]),
)
output = token_signature(**module_input)
print(output)

layer_6 = get_intermediate_layer(
    last_layer=output["bert_model_1"],  # output[1] will give me KeyError
    total_layers=12,
    desired_layer=6)

print(layer_6)

And it gives me AttributeError: Tensor.name is meaningless when eager execution is enabled.

I got the same AttributeError when I changed to hub.KerasLayer with the following code:

input_ids = tf.zeros([1, 1], tf.int32)
input_mask = tf.zeros([1, 1], tf.int32)
segment_ids = tf.zeros([1, 1], tf.int32)
bert_inputs = dict(
    input_ids=input_ids,
    input_mask=input_mask,
    segment_ids=segment_ids)

module = hub.KerasLayer("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
output = module([input_ids, input_mask, segment_ids])

layer_6 = get_intermediate_layer(
    last_layer=output[1],
    total_layers=12,
    desired_layer=6)

print(layer_6)
