Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Fix softmax_cross_entropy #8254

Closed
wants to merge 4 commits into from
Closed

Conversation

kevinthesun
Copy link
Contributor

@kevinthesun kevinthesun commented Oct 13, 2017

Description

Fix softmax_cross_entropy operator "Not enough argument to call operator softmax_cross_entropy" issue #6874

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • For user-facing API changes, API doc string has been updated.
  • To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Fix "Not enough argument to call operator softmax_cross_entropy" issue for softmax_cross_entropy operator.
  • Change output to be shape (batch_size,), which is consistent with gluon softmax loss.

@piiswrong
Copy link
Contributor

softmax_cross_entropy's output currently has shape (1,)
It should behave like gluon.loss.SoftmaxCrossEntropyLoss instead.
Since this is currently broken and no one is using it, we should take the chance to fix it

@kevinthesun
Copy link
Contributor Author

Got it. I'll try to fix it.

@piiswrong
Copy link
Contributor

closing due to inactive

@piiswrong piiswrong closed this Dec 12, 2017
@eric-haibin-lin
Copy link
Member

This issue breaks existing examples #6874
We should fix this

@eric-haibin-lin
Copy link
Member

@kevinthesun you can remove the sumall_except_dim at https://github.com/apache/incubator-mxnet/blob/master/src/operator/loss_binary_op-inl.h#L80 to make sure it doesn't return scalar result.
Also update the infer shape function

@Roshrini
Copy link
Member

+1 Currently https://github.com/apache/incubator-mxnet/tree/master/example/model-parallel/lstm example breaks because of this issue.

@kevinthesun
Copy link
Contributor Author

@eric-haibin-lin Will work on this today.

@kevinthesun
Copy link
Contributor Author

@ehsanmok
Copy link
Contributor

Any update for this fix?

@eric-haibin-lin
Copy link
Member

@kevinthesun any update?

@kevinthesun
Copy link
Contributor Author

Already fixed.

@piiswrong piiswrong closed this Jan 31, 2018
@CircleXing001
Copy link

@kevinthesun why it reproduce

@tao-sun
Copy link

tao-sun commented Jun 22, 2018

I use docker python:1.1.0_gpu_cuda8 and the same error is reproduced.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants