
Conversation

susnato (Contributor) commented Feb 23, 2023

What does this PR do?

As discussed in this issue, this PR updates the previous training script for finetuning BERT on SST2.

Colab link: https://colab.research.google.com/drive/1afTO0ahF3vZrJtkVSGXwBLV2OGeDfomL?usp=sharing

Validation scores achieved:

Epoch 1/2
4210/4210 [==============================] - 828s 191ms/step - loss: 0.3782 - sparse_categorical_accuracy: 0.8299 - val_loss: 0.4344 - val_sparse_categorical_accuracy: 0.8165
Epoch 2/2
4210/4210 [==============================] - 783s 186ms/step - loss: 0.2409 - sparse_categorical_accuracy: 0.9039 - val_loss: 0.4626 - val_sparse_categorical_accuracy: 0.8222

cc @chenmoneygithub

google-cla bot commented Feb 23, 2023

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

susnato (Author) commented Feb 23, 2023

I signed the CLA, so cla/google is green now.

susnato changed the title from "bert_tiny_uncased_en_sst2 added" to "retrained bert_tiny_uncased_en_sst2_training.ipynb" on Feb 23, 2023
susnato (Author) commented Feb 26, 2023

Hi @chenmoneygithub, I fine-tuned BertClassifier on SST2 again as you suggested. Please take a look.

chenmoneygithub (Contributor) commented

@susnato Thanks a lot! Looks beautiful overall!

Another thing you may want to do is to use a decayed learning rate:

lr = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=5e-5,
    decay_steps={total_training_steps},
    end_learning_rate=0.0,
)
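For context, `{total_training_steps}` above is a placeholder for steps_per_epoch × num_epochs. The GLUE SST2 train split has 67,349 examples, and the 4210 steps/epoch in the logs above is consistent with a batch size of 16 (an inference, not something stated in the thread). A minimal pure-Python sketch of the step count and of the decay curve that `PolynomialDecay` produces with its default `power=1.0`:

```python
import math

# SST2 (GLUE) train split size; batch size of 16 is inferred from the
# 4210 steps/epoch shown in the training logs above.
train_examples = 67349
batch_size = 16
epochs = 2

steps_per_epoch = math.ceil(train_examples / batch_size)  # 4210
total_training_steps = steps_per_epoch * epochs           # 8420

def polynomial_decay(step, initial_lr=5e-5, decay_steps=8420,
                     end_lr=0.0, power=1.0):
    """Mirrors the formula used by tf.keras.optimizers.schedules.PolynomialDecay
    (non-cycling): (initial - end) * (1 - step/decay_steps)**power + end."""
    step = min(step, decay_steps)
    frac = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * frac ** power + end_lr

print(steps_per_epoch)                          # 4210
print(polynomial_decay(0))                      # 5e-05
print(polynomial_decay(total_training_steps))   # 0.0
```

With `power=1.0` this is just a linear ramp from 5e-5 down to 0 over the whole run, so halfway through training (step 4210 of 8420) the learning rate is 2.5e-5.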

Please let me know how it works, thx again!

mattdangerw (Member) commented

Just a drive-by comment, but we should replace, not copy, the old colab in this PR. We don't want to keep spawning new versions of the colab each time we update something.

susnato (Author) commented Mar 1, 2023

Hi @chenmoneygithub, thanks for the reply! I ran it as you instructed, and it gave:

Epoch 1/2
4210/4210 [==============================] - 871s 201ms/step - loss: 0.4560 - sparse_categorical_accuracy: 0.7970 - val_loss: 0.5304 - val_sparse_categorical_accuracy: 0.7534
Epoch 2/2
4210/4210 [==============================] - 818s 194ms/step - loss: 0.3556 - sparse_categorical_accuracy: 0.8541 - val_loss: 0.5900 - val_sparse_categorical_accuracy: 0.7385

Since this works relatively worse, should I stick with the constant lr?

chenmoneygithub (Contributor) commented

@susnato Hi! Sorry for the late reply, I was on vacation last week. This is a bit odd to me; in my experiments a decayed lr generally works better (though I tried it with the BERT base model). If the constant lr is stable on your side, staying with it sounds good.

Review comment (Contributor) on the notebook's closing JSON (`"nbformat_minor": 0`):

nit: newline at the end

Review comment (Contributor) on an empty code cell in the notebook:

nit: delete the empty cell

susnato force-pushed the bert_tiny_uncased_en_sst2 branch from 6c5136d to 1cfd306 on Mar 6, 2023
susnato (Author) commented Mar 6, 2023

Hi @chenmoneygithub, thanks a lot for your comments! I made the changes you requested; please take a look.

chenmoneygithub (Contributor) commented

@susnato Oh, one more thing - could you run ./shell/format.sh in your branch? The style check is failing.

susnato (Author) commented Mar 7, 2023

@chenmoneygithub Done!

susnato force-pushed the bert_tiny_uncased_en_sst2 branch from 03866b4 to bceae22 on Mar 7, 2023
susnato (Author) commented Mar 8, 2023

Hi @chenmoneygithub, the "Check the code format" check passed! But there seems to be an error with keras-nlp-accelerator-testing; is this something related to my code?

chenmoneygithub (Contributor) commented

@susnato That one is okay, don't worry.

@chenmoneygithub chenmoneygithub merged commit 54e7b16 into keras-team:master Mar 10, 2023