
100% original Adam that allows to train x2 or x3 bigger networks. #478

Closed
wants to merge 3 commits

Conversation

iperov commented Mar 13, 2019

- What I did

Added a 100% original Adam optimizer with a new option:

tf_cpu_mode: only for the TensorFlow backend
0 - default, no changes.
1 - allows training a x2 bigger network with the same VRAM, consuming system RAM.
2 - allows training a x3 bigger network with the same VRAM, consuming x2 the RAM and extra CPU power.

Batch size is a very important parameter for GAN networks, so by moving the optimizer's weights out of VRAM we can train with a larger batch size, sacrificing 10-20% of the time per iteration.
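A minimal sketch of the idea (not this PR's actual code; written in TF2-style eager Python, with the class name `CPUStateAdam` and its helpers purely illustrative, only the `tf_cpu_mode` parameter coming from this PR): the Adam moment accumulators `m` and `v` are created under `/cpu:0`, so the optimizer state occupies host RAM instead of VRAM.

```python
import contextlib
import tensorflow as tf

class CPUStateAdam:
    """Hypothetical Adam variant whose slot variables can live on the CPU."""

    def __init__(self, lr=1e-3, beta_1=0.9, beta_2=0.999, eps=1e-7, tf_cpu_mode=1):
        self.lr, self.b1, self.b2, self.eps = lr, beta_1, beta_2, eps
        self.tf_cpu_mode = tf_cpu_mode
        self.t = tf.Variable(0.0, trainable=False)   # step counter
        self.slots = {}                              # variable id -> (m, v)

    def _cpu_ctx(self):
        # tf_cpu_mode >= 1: pin optimizer state to host RAM instead of VRAM
        if self.tf_cpu_mode >= 1:
            return tf.device("/cpu:0")
        return contextlib.nullcontext()

    def _get_slots(self, var):
        if id(var) not in self.slots:
            with self._cpu_ctx():
                m = tf.Variable(tf.zeros_like(var), trainable=False)
                v = tf.Variable(tf.zeros_like(var), trainable=False)
            self.slots[id(var)] = (m, v)
        return self.slots[id(var)]

    def apply_gradients(self, grads_and_vars):
        self.t.assign_add(1.0)
        # bias-corrected step size, as in the original Adam paper
        lr_t = self.lr * tf.sqrt(1.0 - self.b2 ** self.t) / (1.0 - self.b1 ** self.t)
        for g, var in grads_and_vars:
            m, v = self._get_slots(var)
            # tf_cpu_mode >= 2: also run the update math on the CPU,
            # trading extra RAM and CPU time for even less VRAM
            ctx = self._cpu_ctx() if self.tf_cpu_mode >= 2 else contextlib.nullcontext()
            with ctx:
                m.assign(self.b1 * m + (1.0 - self.b1) * g)
                v.assign(self.b2 * v + (1.0 - self.b2) * tf.square(g))
                update = lr_t * m / (tf.sqrt(v) + self.eps)
            var.assign_sub(update)
```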

- How I did it

I discovered it accidentally.

- How you can verify it

Add tf_cpu_mode=1 to Adam and try a x2 bigger network.
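Continuing the sketch above (again, `CPUStateAdam` is a hypothetical stand-in for the PR's optimizer; only `tf_cpu_mode` itself comes from this PR), a toy training step to verify the placement could look like:

```python
import tensorflow as tf

# Dummy model and data, just to exercise the optimizer once.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
opt = CPUStateAdam(lr=1e-4, tf_cpu_mode=1)

x = tf.random.normal((32, 8))
y = tf.random.normal((32, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))
grads = tape.gradient(loss, model.trainable_variables)
opt.apply_gradients(zip(grads, model.trainable_variables))

# The Adam m/v slots now live in host RAM; the VRAM they would have used
# is what makes room for a roughly x2 bigger network or batch size.
```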

@RaphaelMeudec
Contributor

@iperov Hello, thanks for your PR! However, I don't think this should be a keras-contrib feature. A fix for the Adam optimizer should be directly integrated into Keras to avoid duplication between keras and keras-contrib. Please consider opening a PR on keras-team/keras!

@iperov
Author

iperov commented Mar 28, 2019

A fix for the Adam optimizer should be directly integrated into Keras

I don't think so, because Keras is a standard that other frameworks should implement.
Theano, PlaidML, and the others cannot implement placing and working with tensors on the CPU.

Closing, then.

iperov closed this Mar 28, 2019
@RaphaelMeudec
Contributor

Adam is already in Keras here.
Keras-contrib is just an extension of Keras, meant to test new features before eventually integrating them into Keras.
