
Optimize tf.tidy() and tf.keep(). #1621

Merged
merged 3 commits into master from keep
Mar 12, 2019

Conversation

@nsthorat (Contributor) commented Mar 12, 2019

We do this by:

  • Removing global tf.keep() tracking. Instead, we add a "kept" bit to tensors and implicitly remove kept tensors from the tracking mechanisms: when we are about to track a tensor in a parent scope, we skip it if its kept bit is set.
  • Adding an integer ID to every scope. track() sets the tensor's scope ID, and when we check whether a tensor belongs to the current scope while tracking in a parent, we simply compare that ID to the current scope ID (see the sketch below).
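
Roughly, the new bookkeeping looks like the following minimal TypeScript sketch; the names (`TrackedTensor`, `ScopeState`, `EngineSketch`) are illustrative assumptions, not the actual tfjs-core internals:

```ts
// Minimal sketch of the kept-bit and scope-ID mechanisms described above.
interface TrackedTensor {
  kept: boolean;    // set by keep(); replaces the old global keep registry
  scopeId: number;  // the scope this tensor was last tracked in
}

interface ScopeState {
  id: number;
  tracked: TrackedTensor[];
}

class EngineSketch {
  private nextScopeId = 0;
  private scopeStack: ScopeState[] = [{id: this.nextScopeId++, tracked: []}];

  private get activeScope(): ScopeState {
    return this.scopeStack[this.scopeStack.length - 1];
  }

  startScope(): void {
    this.scopeStack.push({id: this.nextScopeId++, tracked: []});
  }

  track(t: TrackedTensor): void {
    if (t.kept) return;  // kept tensors are implicitly untracked
    t.scopeId = this.activeScope.id;
    this.activeScope.tracked.push(t);
  }

  keep(t: TrackedTensor): void {
    t.kept = true;  // O(1) bit flip; no global keep set to maintain
  }

  endScope(results: TrackedTensor[]): void {
    const ending = this.scopeStack.pop()!;
    for (const t of ending.tracked) {
      // A cheap integer comparison decides ownership; kept tensors and
      // result tensors are never disposed.
      if (!t.kept && t.scopeId === ending.id && !results.includes(t)) {
        // dispose(t) would release the tensor's memory here
      }
    }
    // Hand surviving results to the parent scope; track() itself skips
    // any tensor that carries the kept bit.
    results.forEach(r => this.track(r));
  }
}
```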

No unit tests were added since we already have pretty serious coverage of memory behavior and this is an internal optimization.
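
For reference, standard `tf.tidy()` / `tf.keep()` usage whose behavior is unchanged by this PR:

```ts
import * as tf from '@tensorflow/tfjs-core';

const {result, saved} = tf.tidy(() => {
  const a = tf.scalar(2);
  const b = a.square();             // disposed when the tidy ends
  const saved = tf.keep(b.add(1));  // kept bit set; survives the tidy
  return {result: b.mul(saved), saved};  // returned tensors also survive
});
```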



@nsthorat changed the title from "Optimize tf.keep()." to "Optimize tf.tidy() and tf.keep()." Mar 12, 2019
@nsthorat requested review from dsmilkov and caisq March 12, 2019 20:35
@dsmilkov (Contributor) left a comment

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @caisq and @dsmilkov)

@nsthorat nsthorat merged commit 3ed9016 into master Mar 12, 2019
@nsthorat nsthorat deleted the keep branch March 12, 2019 20:58
dsmilkov added a commit that referenced this pull request Mar 13, 2019
Improve memory management of tensors during training. Op authors now explicitly save the intermediate tensors needed for the backward pass. This allows the engine to optimally dispose of memory during the forward pass and keep only the tensors needed for the backward pass.

Based on the [layers benchmark](https://github.com/tensorflow/tfjs-layers/tree/v1.0.0/integration_tests/benchmarks), this change along with #1621 led to:

## 2-3x memory reduction
**Before**
![before-mem](https://user-images.githubusercontent.com/2294279/54299829-13b9db80-4592-11e9-97a4-04a95012b5c4.png)

**After**
![after-mem](https://user-images.githubusercontent.com/2294279/54299854-203e3400-4592-11e9-8887-62165d0bfbbb.png)


## 1.5-1.7x improvement in fit() for GRU and LSTM ops.

**Before** 
![before](https://user-images.githubusercontent.com/2294279/54299715-d5242100-4591-11e9-8a4c-b5944e991f57.png)

**After**
![after](https://user-images.githubusercontent.com/2294279/54299724-d9e8d500-4591-11e9-8ea9-e340e10a41ce.png)

 

- When an op author writes the forward pass of an op, they are given a `save` function that lets them save inputs or intermediate tensors for reuse in the backward pass.
- Before this change, the `save` function was a no-op placeholder for when we would decide to optimize disposal of tensors during training.
- However, `save` being a no-op caused a bug for existing users who rely on it (e.g. Magenta.js).
- After this change, the `save` function makes a shallow copy of the tensor and keeps it until the backward pass is done.
- `save` used to take an array of tensors. Now it takes a `NamedTensorMap`, which improves code readability and reduces the chance of off-by-one index bugs (see the sketch after this list).
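
Roughly, the pattern described above, as a hedged sketch; the exact `save` and gradient signatures here are assumptions based on these bullets, not the verbatim tfjs-core API:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Hypothetical shape for illustration; tfjs-core exports its own version.
type NamedTensorMap = {[name: string]: tf.Tensor};

// Forward pass: the op author names exactly what backprop will need.
// Everything else is now safe for the engine to dispose eagerly.
const forward = (x: tf.Tensor, save: (t: NamedTensorMap) => void) => {
  save({x});  // shallow-copied and retained until the backward pass runs
  return x.square();
};

// Backward pass: saved tensors are looked up by name rather than by
// positional index, which is what removes the off-by-one risk.
const backward = (dy: tf.Tensor, saved: NamedTensorMap) =>
    dy.mul(saved.x.mul(2));  // d(x^2)/dx = 2x

// Minimal stand-in for the engine's bookkeeping around one kernel call.
function runKernelSketch(x: tf.Tensor): tf.Tensor {
  const savedTensors: NamedTensorMap = {};
  const y = forward(x, t => Object.assign(savedTensors, t));
  // Later, backprop calls backward(dy, savedTensors), and only then does
  // the engine release the entries in savedTensors.
  return y;
}
```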

Fixes tensorflow/tfjs#1320
