Can we run this in Google Colab? #11

Closed
logfella opened this issue Nov 13, 2019 · 3 comments

@logfella

Hey, I would love to try this out, but I'm not very proficient with ML.

Is it possible to run the biggest T5 model in a Google Colab notebook? Did anyone set one up? Thanks!

@adarob
Collaborator

adarob commented Nov 13, 2019

It is possible to do inference on the 11B model in a Colab and to train/fine-tune the 3B param model using the free TPU. We will be releasing a notebook in the near future.

If you launch a larger TPU in Cloud, you can connect to it and train the 11B model via Colab.
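For what it's worth, a rough sketch of what that connection can look like from a notebook cell (the TPU name, zone, and project below are placeholders, and this is just one way to do it rather than necessarily what the upcoming notebook will use):

```python
import os
import tensorflow as tf  # TF 1.15-era API

# Free Colab TPU: the TPU runtime exposes its address via an environment variable.
if "COLAB_TPU_ADDR" in os.environ:
    tpu_address = "grpc://" + os.environ["COLAB_TPU_ADDR"]
else:
    # Cloud TPU created with `ctpu up`: resolve it by name instead.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
        tpu="my-tpu-name", zone="us-central1-b", project="my-project")
    tpu_address = resolver.get_master()

print("TPU address:", tpu_address)
# This grpc://... address is what the fine-tuning job is pointed at,
# e.g. via the `tpu` argument of the t5 model wrapper.
```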

@anatoly-khomenko

Hi @adarob,

I was trying to fine-tune the 11B model on Google Cloud and got an out-of-memory error on the TPU.
Is there anything I should change in the parameters?

Here is how I run the fine-tuning:

```sh
export PROJECT=projectname
export ZONE=us-central1-b
export BUCKET=gs://uniquebucketname
export TPU_NAME=t5-ex2
export DATA_DIR="${BUCKET}/t5-boolq-data-dir"
export MODEL_DIR="${BUCKET}/t5_boolq-small-model_dir"

ctpu up --name=$TPU_NAME --project=$PROJECT --zone=$ZONE \
  --tpu-size=v3-8 --tpu-only --tf-version=1.15.dev20190821

t5_mesh_transformer \
  --tpu="${TPU_NAME}" \
  --gcp_project="${PROJECT}" \
  --tpu_zone="${ZONE}" \
  --model_dir="${MODEL_DIR}" \
  --t5_tfds_data_dir="${DATA_DIR}" \
  --gin_file="dataset.gin" \
  --gin_param="utils.tpu_mesh_shape.model_parallelism = 1" \
  --gin_param="utils.tpu_mesh_shape.tpu_topology = '2x2'" \
  --gin_param="MIXTURE_NAME = 'super_glue_boolq_v102'" \
  --gin_file="gs://t5-data/pretrained_models/11B/operative_config.gin"
```

The complete stack trace is attached:

T5-11B-TPU-stack-trace.txt
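
One guess, not verified here: with `model_parallelism = 1`, each of the v3-8's eight cores has to hold a full copy of the 11B-parameter weights, which is far more than the 16 GB of HBM per core. Splitting the model across the cores is the usual counter-measure; a sketch using the Python wrapper follows (step counts are placeholders, and even with model parallelism a v3-8 may simply be too small to train the 11B checkpoint, which is presumably why a larger Cloud TPU slice was suggested above).

```python
# Sketch only: the same BoolQ fine-tune, with the model split across all eight
# cores of the v3-8 instead of replicated on each one. Whether the 11B model
# fits for training on a v3-8 even then is not something verified here.
import tensorflow as tf
import t5

# Resolve the grpc address of the existing "t5-ex2" node.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(
    tpu="t5-ex2", zone="us-central1-b", project="projectname")
tpu_address = resolver.get_master()

model = t5.models.MtfModel(
    model_dir="gs://uniquebucketname/t5_boolq-11B-model_dir",
    tpu=tpu_address,
    tpu_topology="2x2",           # v3-8
    model_parallelism=8,          # split the weights across the 8 cores
    batch_size=8,
    sequence_length={"inputs": 512, "targets": 512},
)

model.finetune(
    mixture_or_task_name="super_glue_boolq_v102",
    pretrained_model_dir="gs://t5-data/pretrained_models/11B",
    finetune_steps=25000,         # placeholder step count
)
```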

@alespeggio

> It is possible to do inference on the 11B model in a Colab and to train/fine-tune the 3B param model using the free TPU. We will be releasing a notebook in the near future.
>
> If you launch a larger TPU in Cloud, you can connect to it and train the 11B model via Colab.

Hi @adarob,

I'm trying to fine-tune the pre-trained T5-small model (60 million parameters) on a custom dataset in Google Colab (with the free TPU). However, even with an extremely small dataset, the notebook runs out of RAM (35 GB on Google Colab). You said that it is possible to fine-tune the 3B-parameter model using the free TPU, so I'm wondering if I'm doing something wrong.

The commands I run are listed below.
I install the package with:

```sh
pip install t5[gcp]
```

and then execute this command to fine-tune the model on my dataset:

```sh
t5_mesh_transformer \
  --model_dir="/content/small" \
  --gin_file="dataset.gin" \
  --gin_file="/content/small/operative_config.gin" \
  --gin_param="utils.run.train_dataset_fn = @t5.models.mesh_transformer.tsv_dataset_fn" \
  --gin_param="tsv_dataset_fn.filename = 'custom_dataset.tsv'" \
  --gin_file="learning_rate_schedules/constant_0_001.gin" \
  --gin_param="run.train_steps = 1010000"
```
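
Two things stand out, though both are assumptions rather than a confirmed diagnosis: the command above never passes a `--tpu` address, so the model is presumably being built on the Colab host VM itself (whose RAM is what fills up) rather than on the TPU, and the model directory points at `/content`, while TPU jobs generally need everything on GCS. A rough sketch of driving the same fine-tune from a notebook cell with the TPU attached explicitly and the checkpoints on GCS, assuming the `t5.models.MtfModel` wrapper and a Task already registered over the TSV data (every name, path, and size below is a placeholder):

```python
import os
import t5

# The Colab TPU runtime exposes the accelerator's address; without it the
# job runs on the host VM and uses its RAM instead of the TPU's memory.
TPU_ADDRESS = "grpc://" + os.environ["COLAB_TPU_ADDR"]

model = t5.models.MtfModel(
    model_dir="gs://my-bucket/t5-small-finetune",  # GCS rather than /content: the TPU cannot read the local disk
    tpu=TPU_ADDRESS,
    tpu_topology="2x2",
    model_parallelism=1,
    batch_size=64,
    sequence_length={"inputs": 128, "targets": 32},
    learning_rate_schedule=0.001,                  # matches constant_0_001.gin above
)

model.finetune(
    mixture_or_task_name="my_tsv_task",            # hypothetical Task registered over custom_dataset.tsv
    pretrained_model_dir="gs://t5-data/pretrained_models/small",
    finetune_steps=10000,                          # relative to the checkpoint; run.train_steps above is the absolute total
)
```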

Have others encountered the same issue?

Thank you!
