During some of my training runs I noticed that memory consumption differs greatly as the sentence length increases within an epoch. For the short sentences at the beginning, TTS only consumed about 3 GB, but later, for the long example sentences, over 7.4 GB. I suspect the fixed batch size convention comes from tasks that deal with fixed-size tensors, like image classification. Given that batch size seems important for learning attention, it might be worth experimenting with dynamic batch sizes. That could probably double the batch size for medium-length sentences.
Any thoughts?
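To make the idea concrete, here is a minimal sketch of what a dynamic batch sampler could look like. This is not part of the TTS codebase; the function `dynamic_batches` and the `max_tokens` budget are hypothetical names. The idea is to cap the *padded tensor area* (batch size × longest sentence in the batch) rather than the number of samples, so memory stays roughly constant across the epoch:

```python
def dynamic_batches(lengths, max_tokens):
    """Group sample indices into batches whose approximate memory cost,
    batch_size * longest_item (the padded tensor area), never exceeds
    max_tokens. Sorting by length keeps padding waste low."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches, batch = [], []
    for idx in order:
        # Cost if we add this (now longest) item to the current batch.
        cost = (len(batch) + 1) * lengths[idx]
        if batch and cost > max_tokens:
            batches.append(batch)
            batch = []
        batch.append(idx)
    if batch:
        batches.append(batch)
    return batches

if __name__ == "__main__":
    lens = [30, 120, 45, 200, 60, 90, 150, 35]
    for b in dynamic_batches(lens, max_tokens=400):
        print(b, "cost =", len(b) * max(lens[i] for i in b))
```

With a budget of 400, short sentences end up in large batches and long ones in small batches, while every batch stays under the same memory ceiling. In a real setup this would plug into something like a PyTorch `BatchSampler`.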
@twerkmeister makes sense. My only concern is the learning rate. If the initial batch size and the final batch size in an epoch differ too much, it might destabilize the training. Anyhow, it is worth a try.