Added example for fine tuning BERT on Text Extraction task (SQuAD) #46

Merged (18 commits) on May 24, 2020

Conversation

apoorvnandan (Contributor)

  • Uses pretrained bert-base-uncased from HuggingFace and their tokenizers (see the loading sketch below).
  • Gets an exact match score of 79.4 after 2 epochs on TPU. (The paper reports 80.8; the gap is most likely due to a number of extra post-processing steps in the original implementation when converting the predicted token indexes back to the answer text.)
  • We will have to limit the data before training begins, because training on the full dataset would take hours on CPU.
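
For reference, a minimal sketch of the loading step, assuming the transformers and tokenizers packages and a local directory for the saved vocabulary (paths and names here are illustrative, not the PR's final code):

import os

from tokenizers import BertWordPieceTokenizer
from transformers import BertTokenizer, TFBertModel

# Save the vocabulary via the slow tokenizer, then build the fast WordPiece
# tokenizer from the saved vocab file.
slow_tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
save_path = "bert_base_uncased/"
os.makedirs(save_path, exist_ok=True)
slow_tokenizer.save_pretrained(save_path)
tokenizer = BertWordPieceTokenizer(save_path + "vocab.txt", lowercase=True)

# Pretrained BERT encoder with TensorFlow weights.
encoder = TFBertModel.from_pretrained("bert-base-uncased")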

@fchollet (Member) left a comment

Thank you for the PR. It looks great! Very useful example.

I fixed various minor nits. Other than that, my comments deal with documentation improvements.

@@ -0,0 +1,318 @@
"""
Title: BERT for Text Extraction
fchollet (Member):

Let's go with "BERT (from HuggingFace Transformers) for Text Extraction" -- in the future we will have BERT examples that don't use HuggingFace.

apoorvnandan (Contributor, Author):

Oh okay. By the way, I am very interested in writing examples of BERT and GPT with just tf.keras. Maybe just the pretraining part, for example. We can discuss it over a separate issue if you think it would be a good addition.

Author: [Apoorv Nandan](https://twitter.com/NandanApoorv)
Date created: 2020/05/23
Last modified: 2020/05/23
Description: Fine tune pretrained BERT from HuggingFace on SQuAD.
fchollet (Member):

"HuggingFace Transformers"

return model


use_tpu = True
fchollet (Member):

Please add a paragraph of text explaining that this example should preferably be run on the Colab TPU runtime.


class ExactMatch(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        pred_start, pred_end = self.model.predict(x_eval)
fchollet (Member):

Rather than using x_eval from the outer scope, pass it to __init__ and set it as a callback attribute.

In general: data should be passed as an argument; functions / classes can be fetched from the outer scope.
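
A minimal sketch of the suggested refactor, with an illustrative docstring (the attribute names and wording are assumptions, not the final code in the PR):

from tensorflow import keras


class ExactMatch(keras.callbacks.Callback):
    """Computes the exact-match score at the end of each epoch.

    Predicts start and end token indexes for every sample in the evaluation
    set, maps them back to answer text, and counts how many predictions
    match a ground-truth answer exactly.
    """

    def __init__(self, x_eval, y_eval):
        super().__init__()
        self.x_eval = x_eval
        self.y_eval = y_eval

    def on_epoch_end(self, epoch, logs=None):
        pred_start, pred_end = self.model.predict(self.x_eval)
        # ... convert the predicted spans back to text and compare with y_eval.

The callback instance would then be created as ExactMatch(x_eval, y_eval) and passed through the callbacks argument of model.fit.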

return text


class ExactMatch(keras.callbacks.Callback):
fchollet (Member):

Please add a docstring with a quick explanation of how the callback works

print(f"{len(eval_squad_examples)} evaluation points created.")

"""
Create the Question Answering Model using BERT and Functional API
fchollet (Member):

Question-Answering
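
For context, a rough sketch of the Question-Answering model built with the Functional API on top of HuggingFace's TFBertModel; max_len, the learning rate, and the tuple-style encoder output are assumptions, not necessarily the final code in the PR:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from transformers import TFBertModel

max_len = 384  # Assumed maximum sequence length.


def create_model():
    # BERT expects token ids, token type ids, and an attention mask.
    input_ids = layers.Input(shape=(max_len,), dtype=tf.int32)
    token_type_ids = layers.Input(shape=(max_len,), dtype=tf.int32)
    attention_mask = layers.Input(shape=(max_len,), dtype=tf.int32)

    encoder = TFBertModel.from_pretrained("bert-base-uncased")
    sequence_output = encoder(
        input_ids, token_type_ids=token_type_ids, attention_mask=attention_mask
    )[0]

    # One probability distribution over tokens for the start of the answer
    # span, and one for the end.
    start_logits = layers.Flatten()(layers.Dense(1, use_bias=False)(sequence_output))
    end_logits = layers.Flatten()(layers.Dense(1, use_bias=False)(sequence_output))
    start_probs = layers.Activation(keras.activations.softmax)(start_logits)
    end_probs = layers.Activation(keras.activations.softmax)(end_logits)

    model = keras.Model(
        inputs=[input_ids, token_type_ids, attention_mask],
        outputs=[start_probs, end_probs],
    )
    loss = keras.losses.SparseCategoricalCrossentropy(from_logits=False)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=5e-5), loss=[loss, loss])
    return model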


use_tpu = True
if use_tpu:
    # create distribution strategy
fchollet (Member):

Nit: please capitalize comments
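
For reference, the usual Colab TPU setup that the use_tpu branch would wrap; this is a generic sketch of the TensorFlow TPU API (with create_model assumed to be defined as in the example), not necessarily the exact code in the PR:

import tensorflow as tf

use_tpu = True
if use_tpu:
    # Connect to the TPU cluster and create a distribution strategy.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.experimental.TPUStrategy(resolver)
    # Model building and compilation must happen inside the strategy scope.
    with strategy.scope():
        model = create_model()
else:
    model = create_model()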

@apoorvnandan (Contributor, Author)

Thanks for the comments. Making the required changes.

@fchollet (Member) left a comment

LGTM, thank you. Please add the generated files (or let me know if you want me to generate them instead).

@apoorvnandan (Contributor, Author) commented May 23, 2020

I'll be running the add_example command on my laptop. So, I'll change

use_tpu = False

and add some lines like

# Remove these lines to train on the entire data
x_train  = [_[:10,:] for _ in x_train]
y_train  = [_[:10,:] for _ in y_train]
model.fit(x_train, y_train, ...)

Or is there a better way?

@fchollet (Member) commented May 23, 2020

How long does it take to run the example on a V100? (as an approximate estimate)

If less than ~20 min on a V100, I can run it on my side and generate the files

@apoorvnandan (Contributor, Author)

1 epoch took 1 hour 12 min on Colab GPU (K80 I think?).

Anyhow, I copied everything over to Colab, and ran python autogen.py add_example ... with TPU runtime. Generated files have been added.
I set epochs=1 for this.

model.fit(
    x_train,
    y_train,
    epochs=1,
fchollet (Member):

Please add a comment here that the recommended number of epochs is 3, not 1. You don't need to regenerate the files; you can just edit the ipynb and md files directly to add the comment.
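
Something along these lines, for instance (the comment wording and the batch size are illustrative):

model.fit(
    x_train,
    y_train,
    # The recommended number of epochs is 3; 1 is used here only to keep the
    # generated run short.
    epochs=1,
    batch_size=64,  # Illustrative value.
)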

@fchollet (Member) left a comment

LGTM, thank you! This is a very nice script, it will be valuable.

@fchollet fchollet merged commit 3eedad1 into keras-team:master May 24, 2020