Skip to content

Conversation

@lina128
Copy link
Collaborator

@lina128 lina128 commented Jun 16, 2020

Implement op, kernels and mapping for dilation2d. This PR fixes issue #3039

This PR has below changes:

  1. Implement op dilation2d in tfjs-core.
  2. Implement cpu kernel and node kernel. cpu implementation is for implementation reference. node implementation is for validating with tensorflow c++ results.
  3. Add mapping in tfjs-converter.

This change is Reviewable

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 26 of 29 files at r1, 5 of 5 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)


tfjs-backend-cpu/src/kernels/Dilation2D.ts, line 66 at r2 (raw file):

        util.makeZerosNestedTypedArray(outShape, x.dtype) as number[][][][];

    for (let b = 0; b < batchSize; ++b) {

mind adding some comments on a high level what is following loop trying to achieve?
same for the gradient loops.


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

      node.attrParams['strides'] = createNumericArrayAttr([1, 1, 1, 1]);
      node.attrParams['pad'] = createStrAttr('same');
      node.attrParams['dataFormat'] = createStrAttr('NHWC');

dataFormat is not a param in the configuration.


tfjs-core/src/jasmine_util.ts, line 118 at r2 (raw file):

  // Account for --grep flag passed to karma by saving the existing specFilter.
  const grepFilter = env.specFilter;

Does --grep filter still work on the node tests?


tfjs-core/src/util.ts, line 654 at r2 (raw file):

    shape: number[], dtype: D) {
  const size = shape.reduce((prev, curr) => prev * curr, 1);
  if (dtype == null || dtype === 'float32' || dtype === 'complex64') {

complex64 contains two parts (real, imag), should this method support that complext64 dtype?


tfjs-core/src/ops/conv_util.ts, line 109 at r2 (raw file):

  // outDepth, it assumes input's depth.
  // Input shape: [batch, height, width, inChannels]
  const inputDepth = inputShape[3];

inputChannels?


tfjs-core/src/ops/dilation2d.ts, line 72 at r2 (raw file):

  if ($x.rank === 3) {
    x4D = reshape($x, [1, $x.shape[0], $x.shape[1], $x.shape[2]]);

[1, ...$x.shape]


tfjs-node/src/index.ts, line 73 at r2 (raw file):

  return new NodeJSKernelBackend(bindings as TFJSBinding, pjson.name);
}, 3 /* priority */);
import './register_all_kernels';

Add comment on why moved the import here?

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, thank you!

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the review, Ping!

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)


tfjs-backend-cpu/src/kernels/Dilation2D.ts, line 66 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

mind adding some comments on a high level what is following loop trying to achieve?
same for the gradient loops.

Done.


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

dataFormat is not a param in the configuration.

Nice catch. Removed.


tfjs-core/src/jasmine_util.ts, line 118 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

Does --grep filter still work on the node tests?

Yes. Nothing is changed here except adding some doccomment.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @pyu10055, and @tafsiri)


tfjs-core/src/util.ts, line 654 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

complex64 contains two parts (real, imag), should this method support that complext64 dtype?

Nice catch, removed.


tfjs-core/src/ops/conv_util.ts, line 109 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

inputChannels?

Done.


tfjs-core/src/ops/dilation2d.ts, line 72 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

[1, ...$x.shape]

ts linter will throw error:
Argument of type '[number, ...number[]]' is not assignable to parameter of type '[number, number, number, number]'.
Type '[number, ...number[]]' is missing the following properties from type '[number, number, number, number]': 1, 2, 3 ts(2345)

Casting may work, but is not more elegant that just write it out.


tfjs-node/src/index.ts, line 73 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

Add comment on why moved the import here?

Done. Kernel registration have to be after backend registration.

@lina128 lina128 requested a review from pyu10055 June 17, 2020 21:31
Copy link
Contributor

@tafsiri tafsiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 29 files at r1, 4 of 5 files at r2, 14 of 14 files at r3.
Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)


tfjs-backend-cpu/run_tests.ts, line 40 at r3 (raw file):

const runner = new jasmineCtor();
runner.loadConfig({spec_files: [cpuTests, coreTests], random: false});

Is this comment no longer accurate? Just wondering about its removal


tfjs-backend-cpu/src/kernels/Dilation2D.ts, line 19 at r3 (raw file):

import {backend_util, Dilation2D, Dilation2DAttrs, Dilation2DInputs, TypedArray, util} from '@tensorflow/tfjs-core';
import {KernelConfig} from '@tensorflow/tfjs-core';

add this to the import above.


tfjs-backend-cpu/src/kernels/Dilation2DBackpropFilter.ts, line 117 at r3 (raw file):

    const dataId = cpuBackend.write(
        util.toTypedArray(gradients, x.dtype, false /* debug mode */),

Not a comment on this PR, but it seems unfortunate that the debug mode param (which should probably be optional is in the middle). From a code ergonomics perspective it seems that it would be better at the end (i'm guessing that in the vast majority of calls this will be false).


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, lina128 (Na Li) wrote…

Nice catch. Removed.

Just curious about this (not knowing the background), why would it not be in the configuration if it is an attribute of the kernel? Is it because it is always a specific fixed value that it doesn't get written to model.json files?


tfjs-core/src/ops/conv_util.ts, line 107 at r3 (raw file):

  // `computerConv2DInfo` require filterShape to be in the dimension of:
  // `[filterHeight, filterWidth, depth, outDepth]`, dilation2d doesn't have
  // outDepth, it assumes input's depth.

could you clarify 'it assumes input's depth'. Does this mean the same as "for dilation2d outDepth equals the depth of the input"?


tfjs-core/src/ops/dilation2d.ts, line 76 at r3 (raw file):

  }

  util.assert(

I think this assert should be moved above the reshape and should assert that the rank of x (not x4d) is either 3 or 4 (as that is what the op can take and it will handle the reshape to 4d internally). After that we shouldn't need to assert our reshape op was correct.


tfjs-core/src/ops/dilation2d.ts, line 80 at r3 (raw file):

      () => `Error in dilation2d: input must be rank 4, but got rank ` +
          `${x4D.rank}.`);
  util.assert(

I'd move this and the assert below above the reshape as well (do all the input validation upfront).


tfjs-node/src/index.ts, line 19 at r3 (raw file):

// Import all kernels.
import './kernels/all_kernels';

This and registerAllKernels have the same effect, though they use different styles. Only one of them should be kept. In node tree shaking is not a goal so the side effects of registering a kernel in the file its defined does not matter as much. However it we want to stay consistent with the pattern in other backend then the kernels referenced in this file should be updated to the 'export-a-kernelConfig' style.


tfjs-node/src/register_all_kernels.ts, line 1 at r3 (raw file):

/**

See comment above on all_kernels vs register_all_kernels.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed review, Yannick!

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @pyu10055, and @tafsiri)


tfjs-backend-cpu/run_tests.ts, line 40 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

Is this comment no longer accurate? Just wondering about its removal

I added a doccomment in jasmin_util, so that we don't need to add this comment in every backends.


tfjs-backend-cpu/src/kernels/Dilation2D.ts, line 19 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

add this to the import above.

Done.


tfjs-backend-cpu/src/kernels/Dilation2DBackpropFilter.ts, line 117 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

Not a comment on this PR, but it seems unfortunate that the debug mode param (which should probably be optional is in the middle). From a code ergonomics perspective it seems that it would be better at the end (i'm guessing that in the vast majority of calls this will be false).

Agree. I encountered similar problem several times. Wondering should we consider using named parameters for anything optional. Or even enforce named parameters for all APIs.


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

Just curious about this (not knowing the background), why would it not be in the configuration if it is an attribute of the kernel? Is it because it is always a specific fixed value that it doesn't get written to model.json files?

Exactly. The python API has the dataFormat in the parameter: https://www.tensorflow.org/api_docs/python/tf/nn/dilation2d
And hardcode it to NHWC and pass to c++ API (https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/dilation2-d), which doesn't have dataFormat.
Python API writes, only NHWC is supported, leaving the door open to support NCHW in the future. This way, by having dataFormat in the API, the python API is future-proofed.


tfjs-core/src/ops/conv_util.ts, line 107 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

could you clarify 'it assumes input's depth'. Does this mean the same as "for dilation2d outDepth equals the depth of the input"?

Yes. And revised wording.


tfjs-core/src/ops/dilation2d.ts, line 76 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

I think this assert should be moved above the reshape and should assert that the rank of x (not x4d) is either 3 or 4 (as that is what the op can take and it will handle the reshape to 4d internally). After that we shouldn't need to assert our reshape op was correct.

Good idea. Moved above.


tfjs-core/src/ops/dilation2d.ts, line 80 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

I'd move this and the assert below above the reshape as well (do all the input validation upfront).

Done.


tfjs-node/src/index.ts, line 73 at r2 (raw file):

Previously, lina128 (Na Li) wrote…

Done. Kernel registration have to be after backend registration.

Moved to the top in another PR, along with removing import all_kernels: #3476


tfjs-node/src/index.ts, line 19 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

This and registerAllKernels have the same effect, though they use different styles. Only one of them should be kept. In node tree shaking is not a goal so the side effects of registering a kernel in the file its defined does not matter as much. However it we want to stay consistent with the pattern in other backend then the kernels referenced in this file should be updated to the 'export-a-kernelConfig' style.

Done in another PR, thank you for the suggestion! #3476


tfjs-node/src/register_all_kernels.ts, line 1 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

See comment above on all_kernels vs register_all_kernels.

Moved everything in all_kernels to this file in this PR, will merge that one first: #3476

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

Just curious about this (not knowing the background), why would it not be in the configuration if it is an attribute of the kernel? Is it because it is always a specific fixed value that it doesn't get written to model.json files?

The TF op does not have the field as part of the op definition. I assume it only support a single data format (NHWC)


tfjs-core/src/util.ts, line 635 at r3 (raw file):

export function makeZerosTypedArray<D extends DataType>(
    size: number, dtype: D): DataTypeMap[D] {
  if (dtype == null || dtype === 'float32' || dtype === 'complex64') {

Will this code also have the same issue as below?

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @annxingyuan, @pyu10055, and @tafsiri)


tfjs-core/src/util.ts, line 635 at r3 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

Will this code also have the same issue as below?

Discussed offline, will remove in a separate PR.

Copy link
Contributor

@tafsiri tafsiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo resolving whether data_format is in ops.pbtxt for Dilation2d (so one possible change requested).

Reviewed 4 of 4 files at r4, 2 of 3 files at r5.
Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @lina128, @pyu10055, and @tafsiri)


tfjs-backend-cpu/src/kernels/Dilation2DBackpropFilter.ts, line 117 at r3 (raw file):

Previously, lina128 (Na Li) wrote…

Agree. I encountered similar problem several times. Wondering should we consider using named parameters for anything optional. Or even enforce named parameters for all APIs.

could you make an issue to change the ordering of this param to move it to the end? We should do it before we implement a bunch more kernels.

Named params would definitely be nice in a number of places, but don't match the convention we mostly used. Though I should say at a high level I often thinks it reflects an opportunity to refactor to clearer methods of mechanism (e.g. in this case should it use an ENV flag for debug mode rather than an argument etc). I even wonder why this method needed a debug mode as part of its API (as opposed to tests).


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

The TF op does not have the field as part of the op definition. I assume it only support a single data format (NHWC)

If the TF op doesn't have it then I think our attribute interface shouldn't have it. If the TF op ever adds it to the kernel definition they would probably do a Dilation2Dv2 at which point we should add that and its clear how the two differ. You can leave it in the public JavaScript api though. I'll leave a comment below on the interface definition i am talking about.


tfjs-core/src/kernel_names.ts, line 200 at r5 (raw file):

  strides: [number, number]|number;
  pad: 'valid'|'same'|number;
  dataFormat: 'NHWC';

Is this in ops.pbtxt for this kernel? If not then we should remove it here. If it is then we can keep it.

Copy link
Collaborator Author

@lina128 lina128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 1 approvals obtained (waiting on @annxingyuan, @pyu10055, and @tafsiri)


tfjs-backend-cpu/src/kernels/Dilation2DBackpropFilter.ts, line 117 at r3 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

could you make an issue to change the ordering of this param to move it to the end? We should do it before we implement a bunch more kernels.

Named params would definitely be nice in a number of places, but don't match the convention we mostly used. Though I should say at a high level I often thinks it reflects an opportunity to refactor to clearer methods of mechanism (e.g. in this case should it use an ENV flag for debug mode rather than an argument etc). I even wonder why this method needed a debug mode as part of its API (as opposed to tests).

For toTypedArray, debugMode is the last param, added an issue to consider removing it in favor of using env().getBool() directly. #3489


tfjs-converter/src/operations/executors/convolution_executor_test.ts, line 445 at r2 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

If the TF op doesn't have it then I think our attribute interface shouldn't have it. If the TF op ever adds it to the kernel definition they would probably do a Dilation2Dv2 at which point we should add that and its clear how the two differ. You can leave it in the public JavaScript api though. I'll leave a comment below on the interface definition i am talking about.

Done.


tfjs-core/src/kernel_names.ts, line 200 at r5 (raw file):

Previously, tafsiri (Yannick Assogba) wrote…

Is this in ops.pbtxt for this kernel? If not then we should remove it here. If it is then we can keep it.

Removed.

@lina128 lina128 merged commit 96e9c1f into tensorflow:master Jun 23, 2020
@lina128 lina128 deleted the temp4 branch June 23, 2020 11:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants