
[WIP] fixing memory issue in RNN #112

Merged
merged 10 commits into from
Jan 2, 2017

Conversation

jermainewang
Member

@jermainewang jermainewang commented Dec 19, 2016

Main reason for the RNN memory issue:

  • We are not computing gradients in backward (reverse) order. This leaves many intermediate values held alive in the tape.

Changes:

  • tape.get_gradient now takes a tuple of arrays as input and returns a tuple of arrays as results.
    • This could be simplified further: core.grad is self-contained (all inputs and outputs are specified through its interface), so tape.get_gradient could take no arguments at all, since every argument required for the gradient must already be a root node of the tape.
  • tape.get_gradient now traverses the tape in reverse order, from the last node back to the root node. The forward pass can be skipped because it was already performed when generating grad_records. A minimal sketch of this reverse-order traversal is given after this list.
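For illustration only, here is a minimal sketch of the reverse-order traversal idea. The names TapeRecord, Tape.get_gradient, grad_func, and the string node ids are hypothetical and do not reflect minpy's actual classes or signatures; the point is that walking the records last-to-first lets each intermediate gradient be released as soon as it has been propagated, instead of keeping every intermediate value alive.

```python
class TapeRecord:
    """One recorded op: maps the output node's gradient to its inputs' gradients."""
    def __init__(self, inputs, output, grad_func):
        self.inputs = inputs        # ids of input nodes
        self.output = output        # id of the output node
        self.grad_func = grad_func  # callable: output grad -> tuple of input grads


class Tape:
    def __init__(self):
        self.records = []           # appended in forward (execution) order

    def get_gradient(self, root_ids, root_grads):
        # Seed the gradients of the root (loss) nodes.
        grads = dict(zip(root_ids, root_grads))
        # Walk the tape backwards: last recorded op first.
        for rec in reversed(self.records):
            out_grad = grads.pop(rec.output, None)   # drop it once consumed
            if out_grad is None:
                continue                             # node not on the gradient path
            for node_id, g in zip(rec.inputs, rec.grad_func(out_grad)):
                # Accumulate outside the tape so the += itself is not re-recorded.
                grads[node_id] = g if node_id not in grads else grads[node_id] + g
        return grads


# Example: record y = x * 2, z = y + 3, then ask for dz/dx.
tape = Tape()
tape.records.append(TapeRecord(inputs=('x',), output='y', grad_func=lambda g: (g * 2,)))
tape.records.append(TapeRecord(inputs=('y',), output='z', grad_func=lambda g: (g,)))
print(tape.get_gradient(('z',), (1.0,)))   # {'x': 2.0}
```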

@hotpxl

1. Tape autograd logic.
2. Use a generated id rather than the builtin id function for the tape (a sketch of the motivation follows this commit message).
3. Fix bug in tape: the gradient += accumulation was itself being recorded in the tape.
Conflicts:
	minpy/primitive.py
	minpy/tape.py
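As a hedged illustration of point 2 above (not minpy's actual code; TapeNode and _next_tape_id are made-up names): Python's builtin id() returns an object's address in CPython, and that address can be reused once the object is garbage-collected, so two distinct tape nodes could end up with the same key. A counter-based generated id never repeats within a process.

```python
import itertools

# Counter-based ids are unique for the lifetime of the process, unlike id(),
# whose values (object addresses in CPython) may be reused after an object
# is garbage-collected and could therefore collide between tape nodes.
_next_tape_id = itertools.count()

class TapeNode:
    def __init__(self, value):
        self.value = value
        self.tape_id = next(_next_tape_id)   # generated id, never reused

# Demonstration of id() reuse: the second temporary object may be placed at
# the same address (and thus get the same id) once the first one is freed.
first = id(object())
second = id(object())
print(first == second)   # often True on CPython, illustrating the hazard
```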
@jermainewang
Member Author

Everything on the minpy side should be fixed in this PR now. @lryta @ZihengJiang @hotpxl, please have a look. @HrWangChengdu, would you mind helping me rerun all the examples to check whether they still work?

@jermainewang jermainewang merged commit bfe1328 into master Jan 2, 2017
@jermainewang jermainewang deleted the rnn-memory branch January 2, 2017 15:42