This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Support LogSoftmax op #1342

Merged: 2 commits merged into tensorflow:master on Nov 8, 2018

Conversation

@yhwang (Contributor) commented Oct 25, 2018

Although LogSoftmax is log(Softmax), the implementation is not simply that.
Add the LogSoftmax op and use the implementation from TensorFlow.

Fixes issue #684
Signed-off-by: Yihong Wang yh.wang@ibm.com

Description


For repository owners only:

Please remember to apply all applicable tags to your pull request.
Tags: FEATURE, BREAKING, BUG, PERF, DEV, DOC, SECURITY

For more info see: https://github.com/tensorflow/tfjs/blob/master/DEVELOPMENT.md



@yhwang (Contributor, Author) commented Oct 25, 2018

I used the log_softmax implementation in TensorFlow here and the gradient function here.

I haven't added test cases yet. Any suggestions for the test cases? Can I duplicate the test cases for softmax and update the expected results accordingly?

I put logSoftmax() directly alongside softmax() in softmax.ts. Should I separate them?

Edit:
I added some test cases. However, all of them cover the log softmax function; there is no test case for the gradient function yet.

@pyu10055 (Collaborator) left a comment


Thank you for the contribution. At a high level, it is preferred to implement the op on the GPU as an independent op, meaning the logic should not rely on other ops, mainly for performance reasons.

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov)

@yhwang (Contributor, Author) commented Nov 5, 2018

@pyu10055 thanks for the comments. I followed the softmax implementation and came up with this PR. logSoftmax needs the exp().sum().log() chain of operations, the same as softmax in tfjs-core. My guess is that the original softmax implementation doesn't move those calculations into WebGL because of the sum() calculation in the middle, so I didn't try to move everything to WebGL either, but that's just my guess. Also, the LogSoftmax implementation in TensorFlow here seems to have no specialization for GPU. Any suggestions on moving this to WebGL to improve performance?
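
A minimal sketch of the numerically stable formulation described above, built from existing tfjs-core chained ops. The name logSoftmaxSketch is hypothetical and this is not the exact PR code:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Sketch only: numerically stable log-softmax composed from existing
// tfjs-core ops (the actual PR code lives in src/ops/softmax.ts).
function logSoftmaxSketch(logits: tf.Tensor, axis = -1): tf.Tensor {
  if (axis === -1) {
    axis = logits.rank - 1;
  }
  const keepDims = true;
  // Subtract the per-row max so exp() cannot overflow.
  const xMax = logits.max(axis, keepDims);
  const shifted = logits.sub(xMax);
  // log_softmax(x) = (x - max) - log(sum(exp(x - max)))
  return shifted.sub(shifted.exp().sum(axis, keepDims).log());
}

// Example: logSoftmaxSketch(tf.tensor1d([1, 2, 3])) ≈ [-2.4076, -1.4076, -0.4076]
```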

@pyu10055 (Collaborator) left a comment


Got it, I did not realize our softmax does not have its own GPU implementation. Thanks for the explanation.

Reviewed 3 of 3 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @yhwang and @dsmilkov)


src/ops/softmax.ts, line 102 at r1 (raw file):

    axis = $logits.rank - 1;
  }
  if (axis !== $logits.rank - 1) {

can you explain why non-last dim axis is not supported? thanks.


src/ops/softmax.ts, line 113 at r1 (raw file):

    const shifted = logits.sub(xMax);
    const value =
        shifted.toFloat().sub(shifted.exp().sum(axis, keepDims).log()) as T;

do we need to cast the shifted to float?


src/ops/softmax.ts, line 115 at r1 (raw file):

        shifted.toFloat().sub(shifted.exp().sum(axis, keepDims).log()) as T;

    const gradFunc = (dy: T) => {

do you have tf implementation of the grad function for reference? thanks.

@yhwang (Contributor, Author) left a comment


Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055 and @dsmilkov)


src/ops/softmax.ts, line 102 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

can you explain why non-last dim axis is not supported? thanks.

Basically, log_softmax and softmax need to be performed against the last dimension. In order to support a non-last-dim axis, we need a transpose to swap the axis to the last dimension at the beginning and swap it back at the end. The original softmax in tfjs-core doesn't support that, so I followed that convention and did not support it either. But I do see that TF supports it at the Python-level API here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py#L1687-L1704
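
For reference, a hypothetical sketch (not part of this PR) of the transpose-swap pattern described above, assuming the tf.logSoftmax op this PR adds:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Hypothetical helper, not part of this PR: support an arbitrary axis by
// swapping it with the last dimension, applying logSoftmax, then swapping back.
function logSoftmaxAnyAxis(logits: tf.Tensor, axis: number): tf.Tensor {
  const lastDim = logits.rank - 1;
  if (axis === -1 || axis === lastDim) {
    return tf.logSoftmax(logits);
  }
  // Permutation that swaps `axis` with the last dimension.
  const perm = Array.from({length: logits.rank}, (_, i) => i);
  [perm[axis], perm[lastDim]] = [perm[lastDim], perm[axis]];
  // A single swap is its own inverse, so the same perm restores the layout.
  return tf.logSoftmax(logits.transpose(perm)).transpose(perm);
}
```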


src/ops/softmax.ts, line 113 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

do we need to cast the shifted to float?

This also follows the softmax implementation above (line 61).


src/ops/softmax.ts, line 115 at r1 (raw file):

Previously, pyu10055 (Ping Yu) wrote…

do you have tf implementation of the grad function for reference? thanks.

here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_grad.py#L247-L262
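
That reference amounts to dx = dy - exp(log_softmax(x)) * sum(dy, axis, keepDims). A hedged sketch of it in tfjs-core terms, with illustrative names rather than the exact PR code:

```ts
import * as tf from '@tensorflow/tfjs-core';

// Sketch of the gradient from the TF reference above:
//   dx = dy - exp(log_softmax(x)) * sum(dy, axis, keepDims)
// `value` is the saved forward output (the log-softmax values).
function logSoftmaxGradSketch(
    dy: tf.Tensor, value: tf.Tensor, axis: number): tf.Tensor {
  const keepDims = true;
  const softmax = value.exp();  // exp(log_softmax) recovers the softmax
  return dy.sub(dy.sum(axis, keepDims).mul(softmax));
}
```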

@yhwang (Contributor, Author) left a comment


BTW, I found the TF GPU implementation here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/softmax_op_gpu.cu.cc#L135-L191

Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055 and @dsmilkov)

@nsthorat (Contributor) left a comment


This looks great. Ping is right that we should be doing this in a kernel, but it looks like we don't do that for softmax. One small comment about the test, then I will merge :)

Reviewed 1 of 3 files at r1.
Reviewable status: 0 of 1 approvals obtained (waiting on @pyu10055, @yhwang, and @dsmilkov)


src/ops/softmax.ts, line 102 at r1 (raw file):

Previously, yhwang (Yihong Wang) wrote…

Basically, log_softmax and softmax need to be performed against the last dimension. In order to support a non-last-dim axis, we need a transpose to swap the axis to the last dimension at the beginning and swap it back at the end. The original softmax in tfjs-core doesn't support that, so I followed that convention and did not support it either. But I do see that TF supports it at the Python-level API here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/nn_ops.py#L1687-L1704

This is what we do for softmax, so this is fine.


src/ops/softmax_test.ts, line 169 at r1 (raw file):

    expect(f).toThrowError();
  });

can you add a unit test for the gradient?

@yhwang (Contributor, Author) left a comment


Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055, @nsthorat, and @dsmilkov)


src/ops/softmax_test.ts, line 169 at r1 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

can you add a unit test for the gradient?

Thanks for the review. I just added a gradient test case with hard-coded expected results. Any suggestions on adding gradient test cases?
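
For context, a sketch of what such a gradient test could look like in the existing Jasmine style. The expected values are hand-computed from dx = dy - softmax(x) * sum(dy) with dy = ones; this is illustrative, not the exact PR test:

```ts
import * as tf from '@tensorflow/tfjs-core';

it('gradient of 1D logSoftmax (illustrative sketch)', () => {
  const x = tf.tensor1d([1, 2, 3]);
  const dy = tf.tensor1d([1, 1, 1]);
  const dx = tf.grad((t: tf.Tensor) => tf.logSoftmax(t))(x, dy);
  // softmax([1, 2, 3]) ≈ [0.0900, 0.2447, 0.6652] and sum(dy) = 3, so
  // dx = dy - 3 * softmax ≈ [0.7299, 0.2658, -0.9957]
  expect(dx.shape).toEqual(x.shape);
  tf.test_util.expectArraysClose(dx.dataSync(), [0.7299, 0.2658, -0.9957]);
});
```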

@yhwang (Contributor, Author) left a comment


I saw CI fail on a test case here:
matmul_test.ts:310
It should not be related to this PR.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @pyu10055, @nsthorat, and @dsmilkov)

Although LogSoftmax is log(Softmax), the implementation is not simply that.
Add the LogSoftmax op and use the implementation from TensorFlow.

Signed-off-by: Yihong Wang <yh.wang@ibm.com>
@pyu10055 pyu10055 merged commit ff27d84 into tensorflow:master Nov 8, 2018
@nsthorat (Contributor) commented Nov 8, 2018

@pyu10055 let's wait until the questions are all addressed before merging next time :)

@yhwang yhwang deleted the add-op-log-softmax branch November 8, 2018 17:32
@yhwang (Contributor, Author) commented Nov 8, 2018

@nsthorat @pyu10055 thanks for the reviews again. One question: may I also work on tfjs-converter to put logSoftmax into the normalization ops?

@nsthorat (Contributor) commented Nov 8, 2018

@yhwang go for it!

Development

Successfully merging this pull request may close these issues.

ValueError: Unsupported Ops in the model before optimization LogSoftmax