Split NonMaxSuppressionV3 and NonMaxSuppresionV5 to two APIs #2557

lina128 · 2019-12-14T01:25:11Z

Because the C implementation of V3 has 1 output while V5 has 3 outputs, we have to define separate APIs for them. Otherwise, models that use NonMaxSuppressionV5 may break, because a subsequent op may need to access the second or third output.

Changes made:

Add one more API for V5 in backend.ts
Add one more API for V5 in image_ops.ts
Add corresponding implementations for CPU and GPU.
Add corresponding tests.

Note:

@kangyizhang identified this issue because the Node tests broke, expecting a different output shape. After this change, I tested the Node code locally and it passed.
The Python API is also split into two APIs (non_max_suppression and non_max_suppression_with_scores). The tfjs naming follows the Python API naming found here: https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression
This implementation naturally includes V4. Will add V4 API in another PR. V4 will also require a different API because the output is also different.
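
To illustrate why the two versions need separate signatures, here is a minimal sketch in plain TypeScript. It is not the tfjs-core implementation: the names `nmsV3Sketch` and `nmsV5Sketch` are hypothetical, boxes are assumed to be `[y1, x1, y2, x2]`, and soft-NMS rescoring is omitted, so the V5-style function only shows the shape of its three outputs.

```typescript
// Simplified sketch; not the tfjs-core implementation.
type Box = [number, number, number, number];  // assumed layout: [y1, x1, y2, x2]

// Intersection-over-union of two axis-aligned boxes.
function iou(a: Box, b: Box): number {
  const areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const areaB = (b[2] - b[0]) * (b[3] - b[1]);
  if (areaA <= 0 || areaB <= 0) {
    return 0;
  }
  const h = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
  const w = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
  const intersection = h * w;
  return intersection / (areaA + areaB - intersection);
}

// V3-style signature: a single output (the selected box indices).
function nmsV3Sketch(
    boxes: Box[], scores: number[], maxOutputSize: number,
    iouThreshold: number, scoreThreshold: number): number[] {
  // Visit candidates above the score threshold in descending score order.
  const candidates = scores.map((_, i) => i)
                         .filter(i => scores[i] > scoreThreshold)
                         .sort((i, j) => scores[j] - scores[i]);
  const selected: number[] = [];
  for (const i of candidates) {
    if (selected.length >= maxOutputSize) {
      break;
    }
    // Keep a box only if it does not overlap an already selected box too much.
    if (selected.every(j => iou(boxes[i], boxes[j]) <= iouThreshold)) {
      selected.push(i);
    }
  }
  return selected;
}

// V5-style signature: three positional outputs
// (indices, scores of the selected boxes, number of valid outputs).
// Soft-NMS rescoring is intentionally left out of this sketch.
function nmsV5Sketch(
    boxes: Box[], scores: number[], maxOutputSize: number,
    iouThreshold: number, scoreThreshold: number): [number[], number[], number] {
  const indices =
      nmsV3Sketch(boxes, scores, maxOutputSize, iouThreshold, scoreThreshold);
  const selectedScores = indices.map(i => scores[i]);
  return [indices, selectedScores, indices.length];
}
```

A caller that was written against the one-output form cannot read the second or third output, which is the breakage described above.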

pyu10055

thanks for fixing this issue.

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @lina128, and @pyu10055)

tfjs-core/src/backends/backend.ts, line 600 at r1 (raw file):

      boxes: Tensor2D, scores: Tensor1D, maxOutputSize: number,
      iouThreshold: number, scoreThreshold?: number,
      softNmsSigma?: number): [Tensor1D, Tensor1D, Scalar] {

It might be better to define an interface for the output; otherwise, users might not know the meaning of each tensor.

tfjs-core/src/backends/non_max_suppression_impl.ts, line 34 at r1 (raw file):

}

export function nonMaxSuppressionV3(

It might be good for the method names to be consistent with the external APIs.

dsmilkov

Really nice work!! One blocking question about the user-facing https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_with_scores having 2 outputs instead of 3.

Reviewed 6 of 6 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @lina128, and @pyu10055)

tfjs-core/src/backends/backend.ts, line 597 at r1 (raw file):

  }

  nonMaxSuppressionWithScore(

Going forward we decided to add any new kernels in a modular way in order to avoid the ever growing library size. This means that instead of adding a new method to the Backend interface, we add the new kernels in standalone files and call registerKernel(). See src/backends/webgl/square.ts and src/backends/cpu/square.ts for examples of modularized Square. You don't have to modularize in this PR, but I think we should modularize before we release new tfjs-core. This way, after we release new tfjs-core, tfjs-node just needs to call registerKernel with its own implementation.
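
As a rough illustration of the modular idea, the sketch below uses a tiny stand-in registry. Everything in it is hypothetical: the local `registerKernel`, `runKernel`, and `KernelConfig` only mimic the pattern and do not reproduce the tfjs-core API, and throwing on duplicate registration is a choice made for this example.

```typescript
// Hypothetical stand-in for a kernel registry; illustrative names only.
interface KernelConfig {
  kernelName: string;
  backendName: string;
  kernelFunc: (inputs: number[]) => number[];
}

const kernelRegistry = new Map<string, KernelConfig>();

function makeKey(kernelName: string, backendName: string): string {
  return `${backendName}_${kernelName}`;
}

// Each kernel lives in its own file and registers itself, so the backend
// interface does not grow with every new kernel.
function registerKernel(config: KernelConfig): void {
  const key = makeKey(config.kernelName, config.backendName);
  if (kernelRegistry.has(key)) {
    throw new Error(
        `Kernel '${config.kernelName}' is already registered for ` +
        `backend '${config.backendName}'`);
  }
  kernelRegistry.set(key, config);
}

// Ops look the kernel up by name instead of calling a backend method.
function runKernel(
    kernelName: string, backendName: string, inputs: number[]): number[] {
  const config = kernelRegistry.get(makeKey(kernelName, backendName));
  if (config === undefined) {
    throw new Error(
        `Kernel '${kernelName}' not registered for backend '${backendName}'`);
  }
  return config.kernelFunc(inputs);
}

// What a standalone "square" kernel file for a CPU backend might do.
registerKernel({
  kernelName: 'Square',
  backendName: 'cpu',
  kernelFunc: xs => xs.map(x => x * x),
});

// A separate package can supply its own implementation for its backend.
registerKernel({
  kernelName: 'Square',
  backendName: 'node',
  kernelFunc: xs => xs.map(x => Math.pow(x, 2)),
});
```

With this shape, adding a kernel for another backend is a matter of one more registration call rather than a change to a shared interface.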

tfjs-core/src/backends/backend.ts, line 600 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

It might be better to define an interface for the output; otherwise, users might not know the meaning of each tensor.

Array of tensors is ok here since this is a kernel (internal api) and the inference and backprop infra expects tensor|tensor[] as a result of a kernel, analogous to TF C++ kernels which return array of tensors.

tfjs-core/src/ops/image_ops.ts, line 239 at r1 (raw file):

 *     - A 1D tensor with the corresponding scores for each selected box.
 *     - A number representing the number of valid elements in the selected
 *       box indices.

The user-facing https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_with_scores has 2 outputs instead of 3. Let's match that API at the user-facing level.

nsthorat

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @lina128, and @pyu10055)

tfjs-core/src/backends/backend.ts, line 597 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Going forward we decided to add any new kernels in a modular way in order to avoid the ever growing library size. This means that instead of adding a new method to the Backend interface, we add the new kernels in standalone files and call registerKernel(). See src/backends/webgl/square.ts and src/backends/cpu/square.ts for examples of modularized Square. You don't have to modularize in this PR, but I think we should modularize before we release new tfjs-core. This way, after we release new tfjs-core, tfjs-node just needs to call registerKernel with its own implementation.

I'll lightly push towards doing it in this PR, of course still optional! :) Na, if the way this gets done is confusing let us know and we can sync with you on a GVC!

tfjs-core/src/ops/image_ops.ts, line 239 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

The user-facing https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_with_scores has 2 outputs instead of 3. Let's match that API at the user-facing level.

Also, I think we should return an object with named return values. It's fine to keep the kernel output as an array.
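
A small sketch of that split, with the kernel returning a positional array and the user-facing op wrapping it in a named object. The kernel here is a hypothetical stub (`fakeNmsKernel`) operating on plain number arrays rather than tensors, and it only filters by score, so it shows the wrapping and nothing else.

```typescript
// Kernel-level result: positional, [indices, scores].
type KernelOutput = number[][];

// Op-level result: named fields, so callers need not remember the order.
interface NamedNmsResult {
  selectedIndices: number[];
  selectedScores: number[];
}

// Hypothetical kernel stub: keeps every entry above the score threshold.
function fakeNmsKernel(scores: number[], scoreThreshold: number): KernelOutput {
  const indices =
      scores.map((_, i) => i).filter(i => scores[i] > scoreThreshold);
  return [indices, indices.map(i => scores[i])];
}

// User-facing wrapper: converts the positional array into a named object.
function nmsOpSketch(scores: number[], scoreThreshold: number): NamedNmsResult {
  const result = fakeNmsKernel(scores, scoreThreshold);
  return {selectedIndices: result[0], selectedScores: result[1]};
}

// Callers can destructure by name instead of by position.
const {selectedIndices, selectedScores} = nmsOpSketch([0.9, 0.1, 0.5], 0.3);
```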

tfjs-core/src/ops/image_ops_test.ts, line 251 at r1 (raw file):

      const scoreThreshold = 0;
      const softNmsSigma = 0.5;
      const indices = await tf.image.nonMaxSuppressionWithScoreAsync(

Can you also add a memory test here? Since we don't wrap async methods in "op", we have to guard against tensor leaks manually (e.g. make sure that tf.memory().numTensors before and after the call increases only by 1, the output).
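
The before/after accounting could look roughly like the sketch below. It uses a hypothetical `TensorTracker` class in place of a real engine's bookkeeping and runs synchronously for brevity; in an actual test the same comparison would wrap the awaited call. The example functions return two outputs, so the expected increase here is 2.

```typescript
// Hypothetical tracker standing in for a real engine's tensor bookkeeping.
class TensorTracker {
  private live = new Set<number>();
  private nextId = 0;

  create(): number {
    const id = this.nextId++;
    this.live.add(id);
    return id;
  }

  dispose(id: number): void {
    this.live.delete(id);
  }

  get numTensors(): number {
    return this.live.size;
  }
}

const tracker = new TensorTracker();

// Allocates three tensors but returns only two; the third one leaks.
function leakyImpl(): number[] {
  const indices = tracker.create();
  const scores = tracker.create();
  tracker.create();  // unused third output, never disposed
  return [indices, scores];
}

// Same work, but the unused third tensor is disposed before returning.
function fixedImpl(): number[] {
  const indices = tracker.create();
  const scores = tracker.create();
  const unused = tracker.create();
  tracker.dispose(unused);
  return [indices, scores];
}

// Returns how many live tensors a call added.
function tensorsAdded(fn: () => number[]): number {
  const before = tracker.numTensors;
  fn();
  return tracker.numTensors - before;
}

const leakyDelta = tensorsAdded(leakyImpl);  // one more than the outputs
const fixedDelta = tensorsAdded(fixedImpl);  // exactly the outputs
```

A test would then assert that the delta equals the number of returned outputs and fail on anything larger.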

lina128 · 2019-12-16T23:11:54Z

Hi Ping, thanks for the review! I changed the output to use NamedTensorMap. As for method names, I kept them in sync with the WASM kernel name (https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-wasm/src/kernels/NonMaxSuppressionV3.ts) and the TF C++ op name (https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/non-max-suppression-v5).
I don't have a strong preference on whether internal kernel names should match the external API names, as long as we are consistent within our library. For now, maybe it makes sense to keep our kernel names in sync with the C++ kernel names and our external API names in sync with the Python API names? That makes it easier to gauge feature parity at each level and to trace each function back to its reference.

lina128 · 2019-12-16T23:21:05Z

Ah, really nice catch! Thank you, Daniel! I changed the output to two tensors. The numValidOutputs output is only needed for V4; for V5, the flag that pads the result to maxOutputSize is always false, so this output is unnecessary. I made sure the unused tensor is disposed. I also rewrote the new op using the modular pattern. That design is really smart: it gives more freedom in naming the function than an interface method does, and explicitly registering ops makes them more readable and easier to manage. Thank you for letting me know about this new pattern!

lina128 · 2019-12-16T23:25:04Z

Hi Nikhil, thank you for your review! I changed the op to the modular pattern and changed the output to use NamedTensorMap. Thanks also for asking for a memory test. I added it, and it did reveal a leaked tensor: the generic implementation produces three tensors, whereas V5 needs only two. I disposed of the unused tensor, and the memory test now passes.

lina128 · 2019-12-16T23:26:18Z

Hi @pyu10055 @dsmilkov @nsthorat, thanks for your great comments! I changed the new op to the modular pattern and the output to a named object, and added a test for memory leaks. Please re-review the PR. Thank you!

pyu10055

Reviewable status: complete! 1 of 1 approvals obtained (waiting on @dsmilkov, @lina128, @nsthorat, and @pyu10055)

tfjs-core/src/backends/non_max_suppression_impl.ts, line 34 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

Hi Ping, thanks for the review! Changed to use NamedTensorMap for the output. As for method names, I keep it in sync with wasm method name: https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-wasm/src/kernels/NonMaxSuppressionV3.ts and TF C++ version: https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/non-max-suppression-v5 I don't have a strong preference on whether the internal kernel name should be consistent with external APIs, as long as we have some consistency within our library. Maybe it makes sense for now, to keep our kernel names to be in sync with C version's kernel names, and keep our external API names to be in sync with Python version's API names? This helps us identify amount of feature parity at each level and help tracing reference.

This name sounds good. Since we are going to register kernels for these ops, it is better to use the TF kernel name.

tfjs-core/src/backends/webgl/non_max_suppression_with_score.ts, line 38 at r3 (raw file):

registerKernel({
  kernelName: 'NonMaxSuppressionWithScore',

It would be better to use the TF op name.

lina128 · 2019-12-17T21:07:06Z

Agreed. Done.

dsmilkov

Really nice work! I left one small API suggestion. Otherwise LGTM!

Reviewed 5 of 10 files at r2, 5 of 5 files at r4.
Reviewable status: complete! 2 of 1 approvals obtained (waiting on @lina128, @nsthorat, and @pyu10055)

tfjs-core/src/ops/image_ops.ts, line 239 at r1 (raw file):

Previously, lina128 (Na Li) wrote…

Ah, really nice catch! Thank you Daniel! Changed output to 2. The numValidOutputs is only for V4, and for V5, the flag to pad the result to the maxOutputSize is always false, therefore, no need for this output. I made sure the unused tensor is disposed. Also changed to write the new Op using the modular pattern, that design is really smart, it makes it more flexible to name function than the interface method. Also explicitly registering Ops makes it more readable and more flexible to manage Ops. Thank you for letting me know this new pattern!

Awesome!

tfjs-core/src/ops/image_ops.ts, line 264 at r4 (raw file):

                     attrs) as Tensor[];

  return {selectedIndices: result[0], selectedScores: result[1]};

tiny API suggestion: indices and scores to make it shorter since "selected" is implicit given what this method does.

lina128 · 2019-12-17T23:23:03Z

Agreed. I used this naming mainly to keep in sync with Python API's naming: https://www.tensorflow.org/api_docs/python/tf/image/non_max_suppression_with_scores?version=stable

lina128 requested review from dsmilkov and pyu10055 December 14, 2019 01:25

googlebot added the cla: yes label Dec 14, 2019

pyu10055 requested changes Dec 15, 2019

dsmilkov suggested changes Dec 16, 2019

nsthorat reviewed Dec 16, 2019

lina128 force-pushed the core branch from d7259d3 to 2456b0f Compare December 16, 2019 22:47

pyu10055 approved these changes Dec 17, 2019

lina128 force-pushed the core branch from 5d4b34b to 6dd9afd Compare December 17, 2019 21:06

dsmilkov approved these changes Dec 17, 2019

lina128 added 5 commits December 19, 2019 12:24

Split NonMaxSuppressionV3 and NonMaxSuppresionV5 to two APIs

b29187b

Change output to named object instead of array; Use modularized pattern to create new op; Fix output to two instead of three

817de6b

Add memory leak test for async method.

6700aaf

Fix lint

ac933fa

Use same kernal name as TF C++ version

026409b

lina128 force-pushed the core branch from c3b228f to 026409b Compare December 19, 2019 20:25

Merge branch 'master' into core

2bc4b8e

lina128 merged commit e5a0c84 into tensorflow:master Dec 19, 2019
