Added example for fine tuning BERT on Text Extraction task (SQuAD) #46
Conversation
updating repo
Update fork
Thank you for the PR. It looks great! Very useful example.
I fixed various minor nits. Other than that, my comments deal with documentation improvements.
@@ -0,0 +1,318 @@
"""
Title: BERT for Text Extraction
Let's go with "BERT (from HuggingFace Transformers) for Text Extraction" -- in the future we will have BERT examples that don't use HuggingFace.
Oh okay. By the way, I am very interested in writing examples of BERT and GPT with just tf.keras. Maybe just the pretraining part, for example. We can discuss it over a separate issue if you think it would be a good addition.
Author: [Apoorv Nandan](https://twitter.com/NandanApoorv)
Date created: 2020/05/23
Last modified: 2020/05/23
Description: Fine tune pretrained BERT from HuggingFace on SQuAD.
"HuggingFace Transformers"
return model


use_tpu = True
Please add a paragraph of text explaining that this example should preferably be run on the Colab TPU runtime.
class ExactMatch(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        pred_start, pred_end = self.model.predict(x_eval)
Rather than using x_eval from the outer scope, pass it to __init__ and set it as a callback attribute. In general: data should be passed as an argument; functions / classes can be fetched from the outer scope.
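A minimal sketch of that suggestion, assuming the x_eval / y_eval names from the surrounding script (the span-matching logic is elided):

from tensorflow import keras


class ExactMatch(keras.callbacks.Callback):
    """Computes the exact-match score on the evaluation data at the end of each epoch.

    The evaluation data is passed to the constructor and stored as attributes,
    rather than being read from the outer scope.
    """

    def __init__(self, x_eval, y_eval):
        super().__init__()
        self.x_eval = x_eval
        self.y_eval = y_eval

    def on_epoch_end(self, epoch, logs=None):
        # Predict start/end token positions for every evaluation sample.
        pred_start, pred_end = self.model.predict(self.x_eval)
        # ... compare the predicted spans against self.y_eval and print the score.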
return text


class ExactMatch(keras.callbacks.Callback):
Please add a docstring with a quick explanation of how the callback works
print(f"{len(eval_squad_examples)} evaluation points created.") | ||
|
||
""" | ||
Create the Question Answering Model using BERT and Functional API |
Question-Answering
use_tpu = True
if use_tpu:
    # create distribution strategy
Nit: please capitalize comments
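For context, the TPU setup referenced in this snippet typically looks something like the following (a sketch, not the exact code from the PR; the strategy class name varies across TF 2.x versions, with older releases using tf.distribute.experimental.TPUStrategy):

import tensorflow as tf

use_tpu = True
if use_tpu:
    # Connect to the Colab TPU cluster and build a distribution strategy.
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.TPUStrategy(tpu)
else:
    # Fall back to the default (single-device) strategy.
    strategy = tf.distribute.get_strategy()

# The model should then be built under the strategy scope, e.g.:
# with strategy.scope():
#     model = create_model()  # Assumed name of the model-building function.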
Thanks for the comments. Making the required changes.
LGTM, thank you. Please add the generated files (or let me know if you want me to generate them instead).
I'll be running with use_tpu = False and will add some lines like:

# Remove these lines to train on the entire data
x_train = [_[:10, :] for _ in x_train]
y_train = [_[:10, :] for _ in y_train]
model.fit(x_train, y_train, ...)

Or is there a better way?
How long does it take to run the example on a V100 (as an approximate estimate)? If it takes less than ~20 min on a V100, I can run it on my side and generate the files.
1 epoch took 1 hour 12 min on the Colab GPU (a K80, I think?). Anyhow, I copied everything over to Colab and ran it.
model.fit(
    x_train,
    y_train,
    epochs=1,
Please add a comment here that the recommended number of epochs is 3, not 1. You don't need to regenerate the files; you can just edit the ipynb and md files directly to add the comment.
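For instance, the note could be attached inline to the epochs argument (a sketch only; the other fit() arguments shown here are assumed rather than copied from the PR):

model.fit(
    x_train,
    y_train,
    epochs=1,  # For demonstration; the recommended number of epochs is 3.
    verbose=2,  # Assumed argument.
    batch_size=64,  # Assumed argument.
    callbacks=[exact_match_callback],  # Assumed name of the ExactMatch callback instance.
)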
LGTM, thank you! This is a very nice script, it will be valuable.
It uses bert-base-uncased from HuggingFace and their tokenizers.