Reduction in memory requirements: Add SplitInitializer for separate initialization #4

Open
marhlder wants to merge 1 commit into master
Conversation

@marhlder commented Mar 25, 2018

This dramatically reduces memory requirements, as an extra copy of the concatenated weight tensor will no longer be kept for each timestep during backprop.
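(Not part of the original PR text.) A minimal sketch of the idea, assuming a TF 1.x initializer signature of `(shape, dtype, partition_info)`; the class and parameter names below are hypothetical and not necessarily the ones used in this PR:

```python
import tensorflow as tf

class SplitInitializer(object):
    """Hypothetical sketch (not the PR's actual code): build one concatenated
    kernel whose blocks along `axis` are initialized as if they were separate
    variables, so the graph never needs a per-timestep tf.concat."""

    def __init__(self, num_splits, base_initializer=None, axis=1):
        self.num_splits = num_splits
        # Default to the same initializer tf.get_variable would use.
        self.base_initializer = base_initializer or tf.glorot_uniform_initializer()
        self.axis = axis

    def __call__(self, shape, dtype=None, partition_info=None):
        # Initialize each block independently, then concatenate once at
        # variable-creation time instead of once per timestep.
        block_shape = list(shape)
        block_shape[self.axis] = block_shape[self.axis] // self.num_splits
        blocks = [self.base_initializer(block_shape, dtype)
                  for _ in range(self.num_splits)]
        return tf.concat(blocks, axis=self.axis)

# Usage sketch: a single variable replaces what used to be separate kernels.
# kernel = tf.get_variable(
#     "kernel", [input_depth + num_units, 4 * num_units],
#     initializer=SplitInitializer(num_splits=4))
```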

@marhlder marhlder changed the title add SplitInitializer for separate initialization Reduction in memory requirements: add SplitInitializer for separate initialization Mar 25, 2018
@marhlder marhlder changed the title Reduction in memory requirements: add SplitInitializer for separate initialization Reduction in memory requirements: Add SplitInitializer for separate initialization Mar 26, 2018
@hannw (Owner) commented Apr 30, 2018

Hi @marhlder, would you mind explaining a bit more how this works? Just from the code, I do not quite understand how it reduces the memory requirement, since in the original code the kernels are also concatenated. Specifically, why does memory consumption relate to the initializer? My understanding is that using dynamic_rnn prevents the copying from happening.

@marhlder (Author) commented May 3, 2018

@hannw Thanks for your response. The problem is that TensorFlow's default backpropagation code saves a copy of the concatenated weight tensor (the kernel) for each timestep in the original code, because the concatenation op becomes part of the graph and runs at every timestep. You are correct that dynamic_rnn won't make extra copies of the individual kernels, only of the concatenated result. With the provided custom initializer, the concatenation happens only once, so backpropagation no longer needs to keep this intermediate value for each timestep. This is not noticeable for networks with few units per layer and short sequences, but it becomes very noticeable once you turn up the heat, e.g. sequences of length 200+, nesting depth 3, and 512 units per layer. Try, for instance, comparing the memory consumption of the original implementation at nesting depth 3 with 3 layers of a regular stacked LSTM.
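(Added for illustration, not code from this repo.) A rough TF 1.x sketch of the difference being described: when the concat sits inside the cell's per-step computation it re-executes at every timestep and its output is saved for the backward pass, whereas with a one-time concatenation at initialization there is only a single kernel variable in the graph:

```python
import tensorflow as tf

num_units = 512
input_depth = 512

# Original style (sketch): two sub-kernels concatenated inside the cell, so
# the concat op runs at every timestep under dynamic_rnn and its output is
# one of the intermediates kept for backprop at each step.
kernel_a = tf.get_variable("kernel_a", [input_depth + num_units, 4 * num_units])
kernel_b = tf.get_variable("kernel_b", [input_depth + num_units, 4 * num_units])

def step_with_per_timestep_concat(inputs, h):
    concat_kernel = tf.concat([kernel_a, kernel_b], axis=1)  # re-runs each step
    return tf.matmul(tf.concat([inputs, h], axis=1), concat_kernel)

# PR's idea (sketch): one kernel variable whose *initial value* is built from
# separately initialized blocks; nothing is concatenated per step, so there is
# no per-step concatenated intermediate to store.
kernel = tf.get_variable("kernel", [input_depth + num_units, 8 * num_units])

def step_with_single_kernel(inputs, h):
    return tf.matmul(tf.concat([inputs, h], axis=1), kernel)
```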
