Implement Pooler layer in BertModelLayer #82

Open
wants to merge 3 commits into
base: master

@mrinaald mrinaald commented Nov 9, 2020

Implement the Pooler layer from the BERT model architecture, which creates a pooled feature vector from the first token of the output sequence. Many online blogs and examples take this pooled output from BERT directly and stack dense (or other) layers on top of it.

With this change, the pooler layer weights available in the downloaded checkpoint files of various models can also be loaded into the BertModelLayer object.
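For readers unfamiliar with the pooler, here is a minimal sketch of the computation as it is defined in the original BERT architecture: a dense layer with a tanh activation applied to the hidden state of the first token. It is written in plain Python to stay dependency-free and is not the code from this PR; the function name `pool_first_token` and the toy values are assumptions made for illustration only.

```python
import math


def pool_first_token(sequence_output, kernel, bias):
    """Sketch of a BERT-style pooler: dense layer + tanh on the first token.

    sequence_output: token vectors for one example, shape [seq_len][hidden]
    kernel: dense weights, shape [hidden][hidden] (cf. bert/pooler/dense/kernel)
    bias: dense bias, shape [hidden] (cf. bert/pooler/dense/bias)
    """
    first_token = sequence_output[0]  # hidden state of the first ([CLS]) token
    pooled = []
    for j in range(len(bias)):
        # Dot product of the first-token vector with column j of the kernel.
        z = sum(x * kernel[i][j] for i, x in enumerate(first_token)) + bias[j]
        pooled.append(math.tanh(z))
    return pooled


# Toy example (hypothetical values): hidden size 2, identity kernel, zero bias.
seq = [[0.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
pooled = pool_first_token(seq, identity, [0.0, 0.0])
```

Only the first row of `seq` affects the result; the remaining token vectors are ignored, which is why the pooled vector is often used as a fixed-size summary of the whole sequence.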

  • Original Behaviour:
    Done loading 37 BERT weights from: ~/Downloads/BERT/BERT-Weights/uncased_L-2_H-128_A-2/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7f6a9c64df40> (prefix:bert_orig). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
    Unused weights from checkpoint:
    bert/pooler/dense/bias
    bert/pooler/dense/kernel
    cls/predictions/output_bias
    cls/predictions/transform/LayerNorm/beta
    cls/predictions/transform/LayerNorm/gamma
    cls/predictions/transform/dense/bias
    cls/predictions/transform/dense/kernel
    cls/seq_relationship/output_bias
    cls/seq_relationship/output_weights

  • Modified Behaviour:
    Done loading 39 BERT weights from: ~/Downloads/BERT/BERT-Weights/uncased_L-2_H-128_A-2/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7f6a9d026a30> (prefix:bert_pooled). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
    Unused weights from checkpoint:
    cls/predictions/output_bias
    cls/predictions/transform/LayerNorm/beta
    cls/predictions/transform/LayerNorm/gamma
    cls/predictions/transform/dense/bias
    cls/predictions/transform/dense/kernel
    cls/seq_relationship/output_bias
    cls/seq_relationship/output_weights

To get the pooler layer output, initialize the BertModelLayer as follows:
    bert_params.return_pooler_output = True
    l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

Ahmedn1 commented Aug 31, 2021

Would someone merge this branch, please?
