This repository has been archived by the owner on Jun 9, 2021. It is now read-only.

Transformer Hugging Face BERT model not working #271

Open
bksaini078 opened this issue May 21, 2021 · 3 comments

Comments


bksaini078 commented May 21, 2021

While fine-tuning the Transformers model, i.e. transformers.TFDistilBertModel.from_pretrained(pretrained_weights), I got the error message below.

[screenshot of the error message]

Can someone please help me resolve this issue? Or has anyone been able to run the Transformer BERT models on a Mac M1?

Reference code:

import tensorflow as tf
import transformers
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

def BERT_model(max_len, pretrained_weights):
    '''BERT model creation with pretrained weights.
    max_len: input sequence length
    pretrained_weights: name or path of the pretrained DistilBERT checkpoint'''
    # parameter declaration
    learning_rate = 2e-5
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    bert = transformers.TFDistilBertModel.from_pretrained(pretrained_weights)

    # declaring inputs; DistilBERT takes input_ids and attention_mask as input
    input_ids = Input(shape=(max_len,), dtype=tf.int32, name='input_ids')
    attention_mask = Input(shape=(max_len,), dtype=tf.int32, name='attention_mask')

    distillbert = bert(input_ids, attention_mask=attention_mask)
    x = distillbert[0][:, 0, :]  # hidden state of the [CLS] token
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(64)(x)
    x = tf.keras.layers.Dense(32)(x)

    output = tf.keras.layers.Dense(2, activation='sigmoid')(x)

    model = Model(inputs=[input_ids, attention_mask], outputs=[output])
    # compiling model
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

model.fit(x_train, y_train, batch_size=8, epochs=3, validation_split=0.2, verbose=1)
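
For reference, a minimal sketch of how this model could be built and trained end to end, assuming a DistilBERT tokenizer; the checkpoint name, texts, and labels below are illustrative placeholders rather than the original x_train/y_train:

import numpy as np
import transformers

max_len = 128                                   # example sequence length
pretrained_weights = 'distilbert-base-uncased'  # example checkpoint (assumption)

tokenizer = transformers.DistilBertTokenizerFast.from_pretrained(pretrained_weights)

texts = ["a short example sentence", "another example sentence"]
labels = np.array([[1, 0], [0, 1]], dtype=np.float32)  # one-hot targets for the 2-unit output

# tokenize to fixed-length input_ids / attention_mask arrays
enc = tokenizer(texts, padding='max_length', truncation=True,
                max_length=max_len, return_tensors='np')
input_ids = enc['input_ids'].astype('int32')
attention_mask = enc['attention_mask'].astype('int32')

model = BERT_model(max_len, pretrained_weights)
model.fit([input_ids, attention_mask], labels, batch_size=8, epochs=3, verbose=1)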

haesookimDev commented May 27, 2021

Change the final layers to:

x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64)(x)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(2, activation='sigmoid')(x)
output = tf.keras.layers.Dropout(0)(x)

There is a problem if the last layer has an activation function, so I add a Dropout layer that does nothing (rate 0) as the final layer to work around it.
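
Applied to the original BERT_model function above, the head would then end like this (a sketch; the rest of the function is unchanged):

x = distillbert[0][:, 0, :]                # [CLS] token representation
x = tf.keras.layers.Dropout(0.2)(x)
x = tf.keras.layers.Dense(64)(x)
x = tf.keras.layers.Dense(32)(x)
x = tf.keras.layers.Dense(2, activation='sigmoid')(x)
output = tf.keras.layers.Dropout(0)(x)     # no-op layer so the model no longer ends in an activation

model = Model(inputs=[input_ids, attention_mask], outputs=[output])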

bksaini078 (Author) commented

Thank you for your reply.
I tried the proposed approach; unfortunately, it shows the same error message.
Did you manage to run the BERT model successfully on your end?

jaismith commented

Hey there @bksaini078, were you able to load BERT from tensorflow-hub? If so, would you mind showing how you did that? I'm unable to load the BERT model using hub.KerasLayer (see #276).
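
For reference, a minimal sketch of the hub.KerasLayer approach in question, assuming the standard tfhub.dev BERT preprocessor/encoder pair (the exact handles and versions are illustrative):

import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text  # registers the custom ops required by the BERT preprocessing model

# example handles from tfhub.dev; versions may differ
preprocess = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
encoder = hub.KerasLayer(
    "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4",
    trainable=True)

text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name="text")
encoder_inputs = preprocess(text_input)
outputs = encoder(encoder_inputs)
pooled = outputs["pooled_output"]  # [batch_size, 768] sentence embedding

model = tf.keras.Model(text_input, pooled)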
