
Lasertagger Colab Notebook request #1

Open
daltonj opened this issue Nov 6, 2019 · 4 comments

daltonj commented Nov 6, 2019

The goal is to run LaserTagger in a Google Colab notebook, similar to the BERT fine-tuning notebooks.

A few issues involved (a rough setup sketch follows the list):

  • Requirements: install BERT and its dependencies.
  • Colab supports TPUs only with TF 1.x.
  • The default batch size of 256 results in an OOM error on Colab's K80 GPU.
  • Add support for training/export using TF Hub / Google Cloud.
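
A rough Colab setup sketch for the first two points (assuming the repo's requirements.txt covers the BERT dependency):

%tensorflow_version 1.x  # Colab magic: pin the runtime to TF 1.x for TPU support
!git clone https://github.com/google-research/lasertagger.git
%cd lasertagger
!pip install -r requirements.txt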

daltonj commented Nov 6, 2019

varepsilon commented:

Looks like one needs to store the data in a Google Cloud Storage bucket in order to be able to use Google's TPU.

E.g., for the BERT data (*):

import os
import tensorflow as tf  # TF 1.x

# Copy the pretrained BERT files from the local Colab disk into the bucket.
INPUT_DIR = 'gs://{}/bert/{}'.format(BUCKET, BERT_MODEL)
tf.gfile.MakeDirs(INPUT_DIR)

for f in tf.gfile.Glob('/content/{}/*'.format(BERT_MODEL)):
  tf.gfile.Copy(f, os.path.join(INPUT_DIR, f.split('/')[-1]))

%env BERT_BASE_DIR=$INPUT_DIR
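
For tf.gfile to reach the gs:// paths from Colab, the runtime also needs credentials for the bucket; a minimal sketch using Colab's auth helper:

from google.colab import auth
auth.authenticate_user()  # grant this runtime access to your GCS buckets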

And for the output data:

# Mirror the local output directory to the bucket, skipping subdirectories.
GS_OUTPUT_DIR = 'gs://{}/output'.format(BUCKET)
tf.gfile.MakeDirs(GS_OUTPUT_DIR)

for f in tf.gfile.Glob('/content/output/*'):
  if tf.gfile.Stat(f).is_directory:
    continue
  tf.gfile.Copy(f, os.path.join(GS_OUTPUT_DIR, f.split('/')[-1]))

%env OUTPUT_DIR=$GS_OUTPUT_DIR

(*) Assuming the data was previously stored locally, e.g.,

# Download and unpack the pretrained BERT checkpoint.
bert_url = 'https://storage.googleapis.com/bert_models/2018_10_18/' + BERT_MODEL + '.zip'
bert_zip = BERT_MODEL + '.zip'
!wget $bert_url
!unzip $bert_zip
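
These snippets assume BUCKET and BERT_MODEL were defined earlier in the notebook, e.g. (bucket name hypothetical):

BUCKET = 'my-lasertagger-bucket'        # hypothetical: your own GCS bucket
BERT_MODEL = 'uncased_L-12_H-768_A-12'  # BERT-Base, Uncased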


daltonj commented Nov 7, 2019

The notebook has been updated to support TPUs by writing all data to a Cloud Storage bucket.

Note: the prediction step (predict_main.py) doesn't have flags for using a TPU. It currently runs very slowly, roughly one example per second. Is this expected?

I1107 15:31:14.475278 139814862706560 predict_main.py:89] 0 examples processed, 0 converted to tf.Example.
I1107 15:33:06.323075 139814862706560 predict_main.py:89] 100 examples processed, 100 converted to tf.Example.


ekQ commented Nov 27, 2019

Hi, and sorry for the slow reply! Having a Colab would indeed be very useful, but at the moment I don't have time to create one. However, if you'd like to open a pull request, I'd be very happy to review it.

Regarding slow inference: this is indeed an issue, and the expected behavior when you run the code as is. Internally, we heavily parallelize inference, so it's not an issue in that setting. To make it faster, one should ideally increase the batch size (currently it's 1 [*]), which requires small code changes.
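
As a generic sketch (not the repo's actual API; predict_fn and its signature are hypothetical), the change amounts to feeding the predictor chunks of examples instead of single ones:

def predict_batched(predict_fn, sources, batch_size=32):
  # Run predict_fn over `sources` in chunks instead of one example at a time.
  results = []
  for i in range(0, len(sources), batch_size):
    results.extend(predict_fn(sources[i:i + batch_size]))
  return results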

A quicker fix is to use LaserTaggerFF by setting use_t2t_decoder to false in configs/lasertagger_config.json. This alone should make prediction about 40 times faster (at least on GPU). It may hurt accuracy slightly, but not radically, at least in our experiments.
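
For instance, a minimal sketch that flips this setting in place (leaving the rest of the config untouched):

import json

config_path = 'configs/lasertagger_config.json'
with open(config_path) as f:
  config = json.load(f)
config['use_t2t_decoder'] = False  # use the feedforward decoder (LaserTaggerFF)
with open(config_path, 'w') as f:
  json.dump(config, f, indent=2)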

[*] https://github.com/google-research/lasertagger/blob/master/predict_utils.py#L57
