
Exposing more than last layer with BERT? #290

Closed
cbockman opened this issue May 7, 2019 · 3 comments


cbockman commented May 7, 2019

It looks like the BERT hub module currently exposes only the last layer of outputs, via 'sequence_output' (https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1).

Are there any plans to expose the intermediate layers?

Section 5.4 of the original paper (https://arxiv.org/pdf/1810.04805.pdf) suggests a useful approach built around using BERT as a feature extractor. This would be valuable in cases where fine-tuning the entire model is prohibitive (e.g., very long documents). Simply taking the last layer is an option, but performance is clearly below what it otherwise could be.
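For reference, the feature-based approach in Section 5.4 gets its best results by concatenating the token representations from the top four hidden layers. A minimal NumPy sketch of that concatenation step (shapes are illustrative; the random arrays stand in for real hidden states):

```python
import numpy as np

# Stand-ins for the 12 encoder layers' hidden states, each of shape
# [batch, seq_len, hidden_size] (values are random placeholders).
hidden_states = [np.random.rand(1, 8, 768) for _ in range(12)]

# Concatenate the top four layers along the feature axis, as in the
# paper's best-performing feature-based setup.
features = np.concatenate(hidden_states[-4:], axis=-1)
print(features.shape)  # (1, 8, 3072)
```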

@rmothukuru

We don't have plans to include a signature for intermediate layers, since this would mean that different BERT modules would have slightly different signatures.

However, this is doable post-hoc with a few lines of extra code, shown below.

import tensorflow as tf
import tensorflow_hub as hub

def get_intermediate_layer(last_layer, total_layers, desired_layer):
  # The tensor for layer k shares the last layer's name, up to the numeric
  # index embedded in it; swap that index to address an earlier layer.
  intermediate_layer_name = last_layer.name.replace(str(total_layers + 1),
                                                    str(desired_layer + 1))
  print("Intermediate layer name:", intermediate_layer_name)
  return tf.get_default_graph().get_tensor_by_name(intermediate_layer_name)

with tf.Graph().as_default() as g:
  input_ids = tf.zeros([1, 1], tf.int32)
  input_mask = tf.zeros([1, 1], tf.int32)
  segment_ids = tf.zeros([1, 1], tf.int32)
  bert_inputs = dict(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids)
  bert_module = hub.Module("https://tfhub.dev/google/bert_uncased_L-12_H-768_A-12/1")
  bert_outputs = bert_module(
      inputs=bert_inputs,
      signature="tokens",
      as_dict=True)
  layer_6 = get_intermediate_layer(
      last_layer=bert_outputs["sequence_output"],
      total_layers=12,
      desired_layer=6)
  with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print(session.run(layer_6))
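Note that the trick above relies purely on the layer index embedded in the graph's tensor names. A toy, TF-free illustration of the string replacement (the tensor name here is hypothetical; actual names in the BERT graph may differ):

```python
def get_intermediate_layer_name(last_layer_name, total_layers, desired_layer):
    # Same replacement as above: swap the index of the last layer's op
    # for the index of the desired layer.
    return last_layer_name.replace(str(total_layers + 1),
                                   str(desired_layer + 1))

# Hypothetical tensor name for the final (12th) layer's output.
name = "module/bert/encoder/Reshape_13:0"
print(get_intermediate_layer_name(name, 12, 6))  # module/bert/encoder/Reshape_7:0
```

One caveat: str.replace substitutes every occurrence, so this only works when the layer index string appears exactly once in the tensor name.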

@rmothukuru

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!


iamxpy commented Jan 3, 2020

@rmothukuru Hi there, I am trying the solution you provided to get the intermediate output of BERT, but the call to hub.Module() raises the error RuntimeError: Missing implementation that supports: loader .... As this comment points out, I am using a SavedModel in TensorFlow 2 format, so I do not have the file tfhub_module.pb. In this case, is there another way to access the middle-layer output of BERT? For example, could you please provide the code in TF 2.0 style, or is this not achievable in TF2? Thanks in advance!

BTW, I have tried the following code:

module = hub.load("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
# token_signature = module.signatures["tokens"]  # this will give me KeyError
token_signature = module.signatures["serving_default"]
module_input = dict(
    input_word_ids=tf.constant(3, shape=[1, 4]),
    input_mask=tf.constant(1, shape=[1, 4]),
    input_type_ids=tf.constant(4, shape=[1, 4]),
)
output = token_signature(**module_input)
print(output)

layer_6 = get_intermediate_layer(
    last_layer=output["bert_model_1"],  # output[1] will give me KeyError
    total_layers=12,
    desired_layer=6)

print(layer_6)

And it gives me AttributeError: Tensor.name is meaningless when eager execution is enabled.

I got the same AttributeError when I changed to hub.KerasLayer with the following code:

input_ids = tf.zeros([1, 1], tf.int32)
input_mask = tf.zeros([1, 1], tf.int32)
segment_ids = tf.zeros([1, 1], tf.int32)
bert_inputs = dict(
    input_ids=input_ids,
    input_mask=input_mask,
    segment_ids=segment_ids)

module = hub.KerasLayer("/home/xiepengyu/google_quest/input/bert-base-from-tfhub/bert_en_uncased_L-12_H-768_A-12/")
output = module([input_ids, input_mask, segment_ids])

layer_6 = get_intermediate_layer(
    last_layer=output[1],
    total_layers=12,
    desired_layer=6)

print(layer_6)
