
How to set batch size of training? #21

Closed
desperadoola opened this issue Dec 8, 2019 · 5 comments

Comments

@desperadoola

When I try to change the batch size using --gin_param="sequences_per_batch=128" or --gin_param="tokens_per_batch=65536", the batch size always seems to stay at 32:

INFO:tensorflow:serialize_num_microbatches: tokens_per_microbatch_per_replica=2048 batch_dim=Dimension(name='batch', size=32) sequence_length={'inputs': 512, 'targets': 114} batch_per_replica=4 num_microbatches=1

@desperadoola
Author

I successfully set it using --gin_param="utils.run.batch_size=('tokens_per_batch', 65536)"
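For anyone puzzled by what the ('tokens_per_batch', 65536) form works out to: here is a minimal sketch of the arithmetic, assuming the batch size in sequences is just the token budget divided by the longest packed feature length. The helper name below is made up for illustration and is not part of the T5 or Mesh TensorFlow API.

```python
# Hypothetical helper illustrating how a ('tokens_per_batch', N) batch_size
# spec can translate into a batch size measured in sequences.
def sequences_per_batch(tokens_per_batch, sequence_length):
    # Assume the longest feature determines how many tokens one packed
    # sequence occupies.
    longest_feature = max(sequence_length.values())
    return tokens_per_batch // longest_feature

# With the sequence lengths from the log above:
print(sequences_per_batch(65536, {"inputs": 512, "targets": 114}))  # -> 128
```

That lines up with the 128 sequences per batch the original --gin_param was trying to request.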

@craffel
Collaborator

craffel commented Dec 8, 2019

Good work!

@craffel craffel closed this as completed Dec 8, 2019
@desperadoola
Author

Is there any instruction on how to set tokens_per_microbatch_per_replica?

@craffel
Collaborator

craffel commented Dec 25, 2019

You should only need to set tokens_per_microbatch_per_replica to something other than None if you want to use a batch size that is too large to fit in memory. Our training code will automatically split too-large batches into microbatches and accumulate gradients so that the full batch size is still computed.

@nshazeer FYI
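As an illustration of the mechanism described above (not the Mesh TensorFlow implementation itself), here is a minimal gradient-accumulation sketch in plain TensorFlow: each microbatch gets its own forward/backward pass, the gradients are summed, and a single optimizer step is applied, so the effective batch size stays the same as the full batch.

```python
import tensorflow as tf

def accumulated_train_step(model, loss_fn, optimizer, batch, num_microbatches):
    """One optimizer step over `batch`, computed as several smaller passes."""
    # Split every tensor in the batch into `num_microbatches` slices.
    microbatches = [
        tf.nest.map_structure(lambda t: t[i::num_microbatches], batch)
        for i in range(num_microbatches)
    ]
    accumulated = [tf.zeros_like(v) for v in model.trainable_variables]
    for micro in microbatches:
        with tf.GradientTape() as tape:
            predictions = model(micro["inputs"], training=True)
            # Scale so the summed gradients match a single full-batch step.
            loss = loss_fn(micro["targets"], predictions) / num_microbatches
        grads = tape.gradient(loss, model.trainable_variables)
        accumulated = [a + g for a, g in zip(accumulated, grads)]
    optimizer.apply_gradients(zip(accumulated, model.trainable_variables))
```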

@desperadoola
Author

Thanks, but I still can't figure out serialize_num_microbatches and mtf.tensor_dim_to_size_per_split in the Mesh TensorFlow transformer.

Is the default value tokens_per_microbatch_per_replica=2048 OK for different settings, for example, when we change the model size or use a different TPU?
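Not an authoritative answer, but the numbers in the log at the top of this thread are consistent with roughly the following arithmetic (a sketch of what serialize_num_microbatches appears to compute, not a copy of the Mesh TensorFlow source):

```python
def estimate_num_microbatches(batch_per_replica, sequence_length,
                              tokens_per_microbatch_per_replica=2048):
    # Assume the longest feature sets the tokens occupied by one sequence.
    tokens_per_sequence = max(sequence_length.values())
    # Sequences that fit into one microbatch on a single replica.
    microbatch_size = max(1, tokens_per_microbatch_per_replica // tokens_per_sequence)
    # Microbatches needed to cover the per-replica batch.
    return max(1, batch_per_replica // microbatch_size)

# Reproduces the log above: 2048 // 512 = 4 sequences per microbatch,
# and batch_per_replica=4, so a single microbatch is enough.
print(estimate_num_microbatches(4, {"inputs": 512, "targets": 114}))  # -> 1
```

On that reading, the 2048 default only starts to matter once the per-replica batch no longer fits in a single microbatch, which depends on the batch size, sequence length, and TPU topology in use.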

rodrigonogueira4 added a commit to castorini/pygaggle that referenced this issue Feb 4, 2021
The newer versions of the T5 library simply ignore `--gin_param="tokens_per_batch = 65536" \`:
google-research/text-to-text-transfer-transformer#21