Fix graph executor to not dispose output tensors #7505
Conversation
```diff
 }
-if (liveUntilNodes == null) {
+if (isControlFlow(node) || liveUntilNodes == null) {
```
I see you moved the disposability check to `liveUntilNodes` (this makes sense, since those are the ones we're disposing). Do we need this `isControlFlow(node)` check? If I'm not mistaken, it means we'll never dispose any nodes that live until a control flow node.
Is this to make sure we always have a control flow node's inputs available (e.g. in the case of a while loop)?
> Is this to make sure we always have a control flow node's inputs available (e.g. in the case of a while)?
Yes. That is the behavior of the old algorithm. The branch here checks whether the current node allows us to dispose its inputs, while the checks in the loop decide whether each input itself should be disposed. See https://github.com/tensorflow/tfjs/pull/7505/files#diff-44e0a825cd7c6f31c03d9333db5dc21a8937d66a5b28d719c699863eef96ad8dR329
But the behavior has changed: the old function never allowed disposing the inputs of an output node, regardless of whether it was control flow or not, while this PR allows it. I don't see any current or future problems with that.
> Do we need this isControlFlow(node) check?
No, because a graph with control flow should not be executed synchronously. But I'm keeping the check in case the upstream graph builder has some special cases.
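The two-level check discussed here can be sketched in a simplified form. Note that `GraphNode`, `shouldDisposeAfter`, and the shapes below are illustrative stand-ins, not the actual tfjs `GraphExecutor` types:

```typescript
// Simplified stand-in for a graph node; not the actual tfjs type.
interface GraphNode {
  name: string;
  isControlFlow: boolean;
}

// `liveUntilNodes` lists the nodes whose produced tensors live exactly
// until `current` finishes; null means nothing is scheduled for disposal.
function shouldDisposeAfter(
    current: GraphNode, liveUntilNodes: GraphNode[]|null): GraphNode[] {
  // Outer branch: does the current node allow disposing its inputs at all?
  // A control-flow node (e.g. a while loop) may re-read its inputs, so we
  // keep everything alive that lives until it.
  if (current.isControlFlow || liveUntilNodes == null) {
    return [];
  }
  // Inner checks: should each individual tensor be disposed? Only one such
  // per-tensor condition is modeled here, as an illustration.
  return liveUntilNodes.filter(n => !n.isControlFlow);
}
```

The outer branch corresponds to the `isControlFlow(node) || liveUntilNodes == null` guard in the diff above, and the filter stands in for the per-tensor checks inside the loop.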
Co-authored-by: Matthew Soulanille <matthew@soulanille.net>
LGTM
Thank you for fixing this bug. Given that it was not caught by our e2e model tests, can we add an e2e test for MobileNet graph model conversion to prevent future breakage?
Sure. I'm looking into some ways to add golden model tests. Actually, our existing standard models are not sufficient to catch this bug either. I did manual tests on our model set with the local benchmark before our release, and all of them worked. This bug only occurs when the output nodes are not leaves or parents of outputs in the graph. So we may need to run golden model e2e tests and also randomly inspect intermediate tensors.
Btw, isn't that somewhat typical for feature vectors?
Yes. But it turns out that all our (converted) models for testing and benchmarking have an additional "Identity" op node after each model output node, which makes the output nodes always leaves. It's something we need to fix on our side to make the tests more robust.
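To illustrate why the appended "Identity" nodes hide the bug: the disposal error only triggers when an output node still has downstream consumers, i.e. is not a leaf. A hypothetical check (not a tfjs API) over a simplified node shape:

```typescript
// Simplified node shape for illustration; not the actual tfjs type.
interface OpNode {
  name: string;
  children: OpNode[];
}

// The bug only triggers when a requested output node still has consumers.
// Converted test models append an "Identity" after each output, so the
// node actually marked as the output is always a leaf and never hits this.
function hasNonLeafOutput(outputNodes: OpNode[]): boolean {
  return outputNodes.some(node => node.children.length > 0);
}
```

A feature-vector output that feeds a classification head would make `hasNonLeafOutput` return true, which is exactly the case the existing model set never exercised.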
Fix #7504
Added a new test case for this and verified that the old executor fails the test.
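The shape of such a regression check can be sketched as follows. This is an illustrative toy, not the actual test added in the PR: it models disposal tracking with a fake tensor and asserts that requested outputs survive a cleanup pass.

```typescript
// Fake tensor that records whether it was disposed (illustrative only).
class FakeTensor {
  isDisposed = false;
  constructor(readonly name: string) {}
  dispose(): void {
    this.isDisposed = true;
  }
}

// Toy cleanup pass: dispose every intermediate tensor, but never a
// requested output, even when the output's node has downstream consumers.
function cleanup(
    tensors: FakeTensor[], outputs: Set<string>): FakeTensor[] {
  for (const t of tensors) {
    if (!outputs.has(t.name)) {
      t.dispose();
    }
  }
  return tensors.filter(t => outputs.has(t.name));
}
```

A test would then execute the graph, run the cleanup, and assert `isDisposed === false` on every returned output tensor.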