
100% original Adam that allows to train x2 or x3 bigger networks. #478

Closed
wants to merge 3 commits

Conversation

iperov commented Mar 13, 2019

- What I did

Added a 100% original Adam optimizer with a new option:

tf_cpu_mode: only for the TensorFlow backend
0 - default, no changes.
1 - allows training a x2 bigger network with the same VRAM, consuming system RAM.
2 - allows training a x3 bigger network with the same VRAM, consuming x2 the RAM and extra CPU power.

Batch size is a very important parameter for GAN networks, so by moving the optimizer's weights out of VRAM we can train with a larger batch size, sacrificing 10-20% of the time per iteration.
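A minimal sketch of the idea (not this PR's actual code; written in TF2-style eager Python, with the class name `CPUStateAdam` and its helpers purely illustrative, only the `tf_cpu_mode` parameter coming from this PR): the Adam moment accumulators `m` and `v` are created under `/cpu:0`, so the optimizer state occupies host RAM instead of VRAM.

```python
import contextlib
import tensorflow as tf

class CPUStateAdam:
    """Hypothetical Adam variant whose slot variables can live on the CPU."""

    def __init__(self, lr=1e-3, beta_1=0.9, beta_2=0.999, eps=1e-7, tf_cpu_mode=1):
        self.lr, self.b1, self.b2, self.eps = lr, beta_1, beta_2, eps
        self.tf_cpu_mode = tf_cpu_mode
        self.t = tf.Variable(0.0, trainable=False)   # step counter
        self.slots = {}                              # variable id -> (m, v)

    def _cpu_ctx(self):
        # tf_cpu_mode >= 1: pin optimizer state to host RAM instead of VRAM
        if self.tf_cpu_mode >= 1:
            return tf.device("/cpu:0")
        return contextlib.nullcontext()

    def _get_slots(self, var):
        if id(var) not in self.slots:
            with self._cpu_ctx():
                m = tf.Variable(tf.zeros_like(var), trainable=False)
                v = tf.Variable(tf.zeros_like(var), trainable=False)
            self.slots[id(var)] = (m, v)
        return self.slots[id(var)]

    def apply_gradients(self, grads_and_vars):
        self.t.assign_add(1.0)
        # bias-corrected step size, as in the original Adam paper
        lr_t = self.lr * tf.sqrt(1.0 - self.b2 ** self.t) / (1.0 - self.b1 ** self.t)
        for g, var in grads_and_vars:
            m, v = self._get_slots(var)
            # tf_cpu_mode >= 2: also run the update math on the CPU,
            # trading extra RAM and CPU time for even less VRAM
            ctx = self._cpu_ctx() if self.tf_cpu_mode >= 2 else contextlib.nullcontext()
            with ctx:
                m.assign(self.b1 * m + (1.0 - self.b1) * g)
                v.assign(self.b2 * v + (1.0 - self.b2) * tf.square(g))
                update = lr_t * m / (tf.sqrt(v) + self.eps)
            var.assign_sub(update)
```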

- How I did it

I discovered it accidentally.

- How you can verify it

Add tf_cpu_mode=1 to Adam and try a x2 bigger network.
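Continuing the sketch above (again, `CPUStateAdam` is a hypothetical stand-in for the PR's optimizer; only `tf_cpu_mode` itself comes from this PR), a toy training step to verify the placement could look like:

```python
import tensorflow as tf

# Dummy model and data, just to exercise the optimizer once.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
opt = CPUStateAdam(lr=1e-4, tf_cpu_mode=1)

x = tf.random.normal((32, 8))
y = tf.random.normal((32, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))
grads = tape.gradient(loss, model.trainable_variables)
opt.apply_gradients(zip(grads, model.trainable_variables))

# The Adam m/v slots now live in host RAM; the VRAM they would have used
# is what makes room for a roughly x2 bigger network or batch size.
```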

@RaphaelMeudec
Contributor

@iperov Hello, thanks for your PR! However, I don't think this should be a keras-contrib feature. A fix for the Adam optimizer should be directly integrated into Keras to avoid duplication between keras and keras-contrib. Please consider opening a PR on keras-team/keras!

@iperov
Author

iperov commented Mar 28, 2019

A fix for the Adam optimizer should be directly integrated into Keras

I don't think so, because Keras is a standard that other frameworks should implement.
Theano, PlaidML, and the others cannot implement placing and working with tensors on the CPU.

Closing, then.

iperov closed this Mar 28, 2019
@RaphaelMeudec
Contributor

Adam is already in Keras here.
Keras-contrib is just an extension of Keras, meant to test new features before eventually integrating them into Keras.
