This repository has been archived by the owner on Oct 17, 2021. It is now read-only.

Add new method dispose() to tf.layers.Layer and tf.Model #282

Merged
merged 5 commits into tensorflow:master on Aug 7, 2018

Conversation

caisq
Contributor

@caisq caisq commented Aug 6, 2018


FEATURE

This allows a layer or a model to release the memory (e.g., WebGL
textures) allocated for its weights.

Usage example:

```js
const model = tf.sequential();
model.add(tf.layers.dense({
  units: 10, activation: 'relu', inputShape: [20]}));
model.add(tf.layers.dense({units: 3, activation: 'softmax'}));
model.summary();

model.decRef();
// This disposes the four weights that belong to the model's layers.
```

We use reference counting because layers may be shared among multiple
models. Also, a model may be nested under other models.

When the reference counter of a layer or model decreases to zero, its
weights will be disposed. After that, the layer or model cannot be used
in `apply`, `predict`, `evaluate`, `fit` or `save` calls anymore.
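For the shared-layer case, here is a minimal sketch (written against the `dispose()` name the method is renamed to later in this PR; `tf.input`/`tf.model` are the standard functional-model API):

```js
// Two functional models sharing one dense layer.
const shared = tf.layers.dense({units: 4});
const input1 = tf.input({shape: [8]});
const input2 = tf.input({shape: [8]});
const modelA = tf.model({inputs: input1, outputs: shared.apply(input1)});
const modelB = tf.model({inputs: input2, outputs: shared.apply(input2)});

modelA.dispose();
// `shared` is still referenced by modelB, so its weights stay allocated.

modelB.dispose();
// The reference count reaches zero; the shared weights are now released.
```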

Fixes: tensorflow/tfjs#533
@caisq caisq changed the title from "Add decRef to tf.layer.Layer and tf.Model" to "Add decRef to tf.layers.Layer and tf.Model" on Aug 6, 2018
@nsthorat

nsthorat commented Aug 6, 2018 via email

Contributor Author

@caisq caisq left a comment


Thanks for the comment, @nsthorat! I considered `dispose`. I think the downside is that it is potentially misleading. For example, when a Layer is shared among multiple Model instances, calling the method may not actually dispose the layer's weights. Similarly, when a sequential model is nested in another sequential model, calling the method will not immediately dispose the weights. This is different from the situation with core tensors and variables, for which calling `dispose` is guaranteed to free the memory, IIUC.
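To make the nested-Sequential case concrete, a sketch of the intended semantics (again using the `dispose()` name that was eventually adopted):

```js
const inner = tf.sequential();
inner.add(tf.layers.dense({units: 5, activation: 'relu', inputShape: [4]}));

const outer = tf.sequential();
outer.add(inner);                        // `outer` now references `inner`.
outer.add(tf.layers.dense({units: 1}));

inner.dispose();
// Nothing is freed yet: `outer` still holds a reference to `inner`,
// so the inner model's weights remain allocated.

outer.dispose();
// All reference counts drop to zero and the weights are disposed.
```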

So I would prefer either

  • keep the `decRef()` name, with the understanding that this is a pretty niche method that only a small group of hard-core clients of ours will deal with
  • name it something more consistent with `dispose`, but more truthful, e.g., `tryDispose()`.

Let me know what you think.


@nsthorat

nsthorat commented Aug 6, 2018 via email

Contributor Author

@caisq caisq left a comment


That sounds a little potentially misleading :p Looking at the doc string at https://js.tensorflow.org/api/0.12.0/#dispose, it doesn't mention the decRef behavior, which I think it should. Maybe let's discuss in person what to do with this method's name.


@nsthorat

nsthorat commented Aug 6, 2018 via email

@caisq caisq changed the title from "Add decRef to tf.layers.Layer and tf.Model" to "Add new method dispose() to tf.layers.Layer and tf.Model" on Aug 6, 2018
Contributor Author

@caisq caisq left a comment


After offline discussion, I'm renaming `decRef` to `dispose`. Rationale:

  1. Consistent with the `dispose` method naming in core; a less scary-looking name.
  2. The special cases of shared Layers and nested Sequential models are relatively rare.
  3. Those special cases are now covered in the doc string.

PTAL.



@nsthorat nsthorat left a comment


Code LGTM -- quick question: where is the actual `model.dispose()`? Is that in `container.ts`?


@caisq
Contributor Author

caisq commented Aug 6, 2018

@nsthorat To answer your question, it is in `container.ts`, under the `Container` class, which is inherited by `Model` and `Sequential`.
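Roughly, the structure looks like this (an illustrative sketch of the class hierarchy and the ref counting, not the actual tfjs-layers code):

```js
class Layer {
  constructor() {
    this.refCount = 1;   // bumped whenever the layer is reused or nested
    this.weights = [];   // LayerVariable instances in the real code
  }
  dispose() {
    if (--this.refCount === 0) {
      this.weights.forEach(w => w.dispose());  // frees WebGL textures etc.
    }
  }
}

// src/engine/container.ts: Container extends Layer and overrides dispose()
// to cascade the ref-count decrement down to its constituent layers.
class Container extends Layer {
  constructor() {
    super();
    this.layers = [];
  }
  dispose() {
    if (--this.refCount === 0) {
      this.layers.forEach(layer => layer.dispose());
    }
  }
}

class Model extends Container {}   // inherits dispose() from Container
class Sequential extends Model {}  // likewise
```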


@nsthorat nsthorat left a comment




src/engine/container.ts, line 641 at r2 (raw file):

  /**
   * Attempt to dispose a Container's weights.

instead of saying "Container" can you say "Model"? This will show better in the docs (Container is an implementation detail).

Also, this first line should just say something like "Disposes a model, freeing up used memory that is not shared with another model." since that's exactly what it's doing. The fact that another model may share weights doesn't reduce that fact that the first model is actually disposed.

Then after that we can talk about the refcounting stuff
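Applied to the snippet above, the suggested doc string might start roughly like this (a sketch assembled from this review comment and the PR description, not the exact text that was committed):

```js
/**
 * Disposes a model, freeing up used memory that is not shared with
 * another model.
 *
 * After this call, the model can no longer be used in `apply()`,
 * `predict()`, `evaluate()`, `fit()` or `save()` calls.
 *
 * Internally, weights are reference-counted: if a Layer is shared by
 * several models, or a model is nested inside another model, the weights
 * are only released once the last model or layer referencing them is
 * disposed.
 */
```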

Contributor Author

@caisq caisq left a comment




src/engine/container.ts, line 641 at r2 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

instead of saying "Container" can you say "Model"? This will show better in the docs (Container is an implementation detail).

Also, this first line should just say something like "Disposes a model, freeing up used memory that is not shared with another model." since that's exactly what it's doing. The fact that another model may share weights doesn't reduce that fact that the first model is actually disposed.

Then after that we can talk about the refcounting stuff

Good point. Done.

@caisq caisq merged commit 202127d into tensorflow:master Aug 7, 2018
Contributor

@dsmilkov dsmilkov left a comment


Hi, sorry I just saw this. This solution is OK, but it adds memory-management complexity in layers.

An alternative (a low-maintenance approach) would have been to have `model.clone()`, which calls `layer.clone()`, which calls `v.clone()` for each variable `v` of a given layer. Then calling `dispose()` on a model would just call `dispose()` on its layers, which would call `dispose()` on the variables used by those layers. This way you don't have to do any ref counting, since all of the ref counts are an implementation detail handled in tfjs-core via the `clone()` / `dispose()` mechanism. This way you also have a symmetry of `clone()` <---> `dispose()`.
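A sketch of that alternative, to make the cascade concrete. The `clone()` methods on layers/models are hypothetical (they do not exist); only `Tensor.clone()`/`dispose()` exist in tfjs-core, which is where the actual memory ref counting would live:

```js
// Hypothetical layer handle: each model that uses a layer would hold its
// own clone, and the backing memory is shared (and ref-counted) in core.
class LayerHandle {
  constructor(weights) {
    this.weights = weights;   // tf.Tensor handles to the layer's variables
  }
  clone() {
    // tf.Tensor.clone() returns a new handle that shares the same backing
    // data; tfjs-core only frees the data when every handle is disposed.
    return new LayerHandle(this.weights.map(w => w.clone()));
  }
  dispose() {
    this.weights.forEach(w => w.dispose());
  }
}
```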


@caisq
Contributor Author

caisq commented Aug 10, 2018

Thanks for the comment, @dsmilkov. Let's discuss this in person when we meet.

@nsthorat

nsthorat commented Aug 13, 2018

Just spoke to Daniel offline -- as far as I understand, that approach unfortunately won't work because the layer itself is shared: multiple models can update the same weights.
