Skip to content
This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Add cropAndResize op #1252

Merged
merged 10 commits into from
Aug 30, 2018
Merged

Add cropAndResize op #1252

merged 10 commits into from
Aug 30, 2018

Conversation

AviKndr
Copy link
Contributor

@AviKndr AviKndr commented Aug 27, 2018

Description

FEATURE

Adds cropAndResize op.
See: tensorflow/tfjs#624


For repository owners only:

Please remember to apply all applicable tags to your pull request.
Tags: FEATURE, BREAKING, BUG, PERF, DEV, DOC, SECURITY

For more info see: https://github.com/tensorflow/tfjs/blob/master/DEVELOPMENT.md


This change is Reviewable

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, the PR looks great, couple minor comments.

Reviewed 2 of 6 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @AviKndr)


src/kernels/webgl/crop_and_resize_gpu.ts, line 35 at r1 (raw file):

    const methodId = method === 'bilinear' ? 1 : 0;

    const [xHeightFloat, xWidthFloat] =

instead of calling xHeightFloat, and yHeightFloat, ...
might be more clear to call them inputHeightFloat, outputHeightFloat


src/kernels/webgl/crop_and_resize_gpu.ts, line 42 at r1 (raw file):

    const [heightRatio, heightScale, inY] = cropHeight > 1 ?
      [
        `${xHeightFloat}/${yHeightFloat}`,

will${xHeightFloat/yHeightFloat} eliminate an extra division during runtime?
similar for the x below.


src/kernels/webgl/crop_and_resize_gpu.ts, line 93 at r1 (raw file):

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/crop_and_resize_op.cc

will this cause error if the extrapolationValue is set as a float (i.e. 0.5) ?


src/ops/image_ops.ts, line 249 at r1 (raw file):

 *     defaults to bilinear, which specifies the sampling method for resizing
 * @param extrapolationValue A threshold for deciding when to remove boxes based
 *     on score. Defaults to -inf, which means any score is accepted.

default to 0?


src/ops/image_ops.ts, line 282 at r1 (raw file):

util.assert(
    cropSize.length === 2,
    `Error in cropAndResize: new shape must 2D, but got shape ` +

instead of call it new shape, it will be more clear to say cropSize.
'got shape' does not match to the value cropSize.


src/ops/image_ops.ts, line 289 at r1 (raw file):

util.assert(
    cropSize[0] >= 1 && cropSize[1] >= 1,
    `size must be atleast [1,1], but was ${cropSize}`);

cropSize instead of just size


src/ops/image_ops_test.ts, line 243 at r1 (raw file):

    const image:tf.Tensor4D = tf.tensor4d([1,2,3,4],[1,2,2,1]);
    const boxes:tf.Tensor2D = tf.tensor2d([1,1,0,0],[1,4]);
    // const boxInd:tf.Tensor1D = tf.tensor1d([0],'int32');

remove the commented out line? also in other tests.

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @AviKndr)


src/kernels/backend_cpu.ts, line 2260 at r1 (raw file):

        0;
      const widthScale = (cropWidth > 1) ?
        (y2 - y1) * (imageWidth - 1) / (cropWidth - 1) :

x2 - x1?

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work. Took a pass. Very few comments. Thanks!

Reviewed 6 of 6 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @AviKndr)


src/kernels/backend_cpu.ts, line 2272 at r1 (raw file):

          for (let x = 0; x < cropWidth; x++) {
            for (let c = 0; c < numChannels; c++) {
              output.set(extrapolationValue, b, y, x, c);

Thanks for using flat indexing for the inputs. Can you use flat Indexing for the output as well? I.e. use output.values[flatIndx] = someValue instead of output.set(...). Here and elsewhere.


src/kernels/webgl/crop_and_resize_gpu.ts, line 83 at r1 (raw file):

        // get image in batch index
        int bInd = int(floor(getBoxInd(b)));

use round(getBoxInd(b)) (more stable with limited precision)


src/kernels/webgl/crop_and_resize_gpu.ts, line 93 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/crop_and_resize_op.cc

will this cause error if the extrapolationValue is set as a float (i.e. 0.5) ?

Good catch! To be safe, you can wrap it in float(), i.e. setOutput(float(${extrapolationValue})) and let webgl figure it out. You have to do it here and few lines below.


src/ops/image_ops.ts, line 233 at r1 (raw file):

/**
 * Extracts crops from the input image tensor and resizes them using bilinear
 * sampling or nearest neighbor sampling (possibl with aspect ratio change)

typo: possibly


src/ops/image_ops.ts, line 235 at r1 (raw file):

 * sampling or nearest neighbor sampling (possibl with aspect ratio change)
 * to a common output size specified by crop_size;
 *

use period instead of semi-colon at the end of sentence.


src/ops/image_ops.ts, line 237 at r1 (raw file):

 *
 * @param image 4d tensor of shape `[batch,imageHeight,imageWidth, depth]`.
 *     where image_height and image_width must be positive, specifying the

imageHeight and imageWidth (remove the underscore)


src/ops/image_ops.ts, line 268 at r1 (raw file):

const numBoxes = $boxes.shape[0];
util.assert(

indent this method by 2 spaces. clang-format and vscode can auto-format this for you. See https://github.com/tensorflow/tfjs-core/blob/master/DEVELOPMENT.md#development for details.


src/ops/image_ops_test.ts, line 303 at r1 (raw file):

    const image:tf.Tensor4D = tf.tensor4d([1,2,3,4,5,6,7,8,9],[1,3,3,1]);
    const boxes:tf.Tensor2D = tf.tensor2d([0,0,1,1,0,0,0.5,0.5],[2,4]);
    // const boxInd:tf.Tensor1D = tf.tensor1d([0,0],'int32');

curious why you commented out boxInd with int32?


src/ops/image_ops_test.ts, line 364 at r1 (raw file):

    expect(output.shape).toEqual([0,3,3,1]);
    expectArraysClose(output,[]);
  });

add a unit test where the batchSize is 2, and have different boxes for the two images.

Copy link
Contributor Author

@AviKndr AviKndr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There we go.

Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055 and @AviKndr)


src/kernels/backend_cpu.ts, line 2260 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

x2 - x1?

Good catch. I realized that I never tested with boxes of non-square shape. Added 2 corresponding tests.


src/kernels/backend_cpu.ts, line 2272 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Thanks for using flat indexing for the inputs. Can you use flat Indexing for the output as well? I.e. use output.values[flatIndx] = someValue instead of output.set(...). Here and elsewhere.

Done.


src/kernels/webgl/crop_and_resize_gpu.ts, line 35 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

instead of calling xHeightFloat, and yHeightFloat, ...
might be more clear to call them inputHeightFloat, outputHeightFloat

Done.


src/kernels/webgl/crop_and_resize_gpu.ts, line 42 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

will${xHeightFloat/yHeightFloat} eliminate an extra division during runtime?
similar for the x below.

Done.


src/kernels/webgl/crop_and_resize_gpu.ts, line 83 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

use round(getBoxInd(b)) (more stable with limited precision)

Done.


src/kernels/webgl/crop_and_resize_gpu.ts, line 93 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Good catch! To be safe, you can wrap it in float(), i.e. setOutput(float(${extrapolationValue})) and let webgl figure it out. You have to do it here and few lines below.

Fixed and Tested. I just computed the string using Number.isInteger() since I wanted to optimize for the shader program. Is casting to a float within the shader program preferred?


src/ops/image_ops.ts, line 233 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

typo: possibly

Done.


src/ops/image_ops.ts, line 235 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

use period instead of semi-colon at the end of sentence.

Done.


src/ops/image_ops.ts, line 237 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

imageHeight and imageWidth (remove the underscore)

Done.


src/ops/image_ops.ts, line 249 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

default to 0?

Done.


src/ops/image_ops.ts, line 268 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

indent this method by 2 spaces. clang-format and vscode can auto-format this for you. See https://github.com/tensorflow/tfjs-core/blob/master/DEVELOPMENT.md#development for details.

Done.


src/ops/image_ops.ts, line 282 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

instead of call it new shape, it will be more clear to say cropSize.
'got shape' does not match to the value cropSize.

Done.


src/ops/image_ops.ts, line 289 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

cropSize instead of just size

Done.


src/ops/image_ops_test.ts, line 243 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

remove the commented out line? also in other tests.

Done.


src/ops/image_ops_test.ts, line 303 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

curious why you commented out boxInd with int32?

I was originally confused why the shader program was compiling getBoxInd to return a float but realized that this was the case for all tensors regardless of dtype. I also guarenteed in cropAndResize_ that the tensor would be int32.


src/ops/image_ops_test.ts, line 364 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

add a unit test where the batchSize is 2, and have different boxes for the two images.

Done.

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great on my side (left 1 tiny comment as Number.isInteger). Will wait for @pyu10055 to approve as well before merging. Really nice work!

Reviewed 2 of 4 files at r2.
Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055 and @AviKndr)


src/kernels/webgl/crop_and_resize_gpu.ts, line 93 at r1 (raw file):

Previously, AviKndr (Avinash Kondareddy) wrote…

Fixed and Tested. I just computed the string using Number.isInteger() since I wanted to optimize for the shader program. Is casting to a float within the shader program preferred?

Shader compilers are so optimized that float(4) gets pre-computed and marked as constant, effectively becoming a no-op. I would just wrap it in float() and not worry about doing the Number.isInteger test.


src/ops/image_ops.ts, line 263 at r2 (raw file):

  const $image = convertToTensor(image, 'image', 'cropAndResize', 'float32');
  const $boxes = convertToTensor(boxes, 'boxes', 'cropAndResize', 'float32');
  const $boxInd = convertToTensor(boxInd, 'boxInd', 'cropAndResize', 'int32');

Just a note: convertToTensor() won't necessarily cast to int32 if the input is already a tensor, but this reminds me that we should do that.


src/ops/image_ops_test.ts, line 303 at r1 (raw file):

Previously, AviKndr (Avinash Kondareddy) wrote…

I was originally confused why the shader program was compiling getBoxInd to return a float but realized that this was the case for all tensors regardless of dtype. I also guarenteed in cropAndResize_ that the tensor would be int32.

Yeah, in the shader world, we only sample floats, regardless of the dtype. This was much simpler to build.

Copy link
Contributor Author

@AviKndr AviKndr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055 and @AviKndr)


src/kernels/webgl/crop_and_resize_gpu.ts, line 93 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Shader compilers are so optimized that float(4) gets pre-computed and marked as constant, effectively becoming a no-op. I would just wrap it in float() and not worry about doing the Number.isInteger test.

That makes sense, I forgot about compiler optimizations. Made the necessary changes to the gpu implementation.


src/ops/image_ops.ts, line 263 at r2 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Just a note: convertToTensor() won't necessarily cast to int32 if the input is already a tensor, but this reminds me that we should do that.

Thanks, I added a dtype assertion in the op and changed the tests to use boxInd of 'int32' to stick to the reference implementation


src/ops/image_ops_test.ts, line 303 at r1 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Yeah, in the shader world, we only sample floats, regardless of the dtype. This was much simpler to build.

Thanks, that clears things up for me.

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r3.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055)

Copy link
Collaborator

@pyu10055 pyu10055 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nicely done, thanks!

Reviewed 6 of 6 files at r3.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained

@dsmilkov dsmilkov merged commit e448859 into tensorflow:master Aug 30, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants