
Op [reduce_sum_bp] execution failed bug #8360

Closed
longzhendong opened this issue Nov 6, 2019 · 9 comments · Fixed by KonduitAI/deeplearning4j#35

@longzhendong (Contributor) commented Nov 6, 2019

#8233

## Version Information

Relevant versions:

  • Deeplearning4j/ND4J version: 1.0.0-beta5
  • Platform: Windows 10
  • CUDA: not used (CPU backend)

Code:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ops.impl.layers.ExternalErrorsFunction;
import org.nd4j.linalg.factory.Nd4j;

public static void main(String[] args) {
    SameDiff sd = SameDiff.create();
    SDVariable label = sd.var("label", DataType.FLOAT, new int[] { 3, 3, 4 });

    // logSumExp over dimension 1: [3, 3, 4] -> [3, 4]
    SDVariable m = sd.math().logSumExp(label, 1);
    sd.setLossVariables(m);

    // Gradients for m are to be supplied from outside SameDiff
    ExternalErrorsFunction fn = sd.f().externalErrors(m);
    fn.outputVariable();
    System.out.println(sd.summary());

    if (!sd.hasGradientFunction()) {
        sd.createGradFunction("label");
    }
    sd.assignArray(Nd4j.rand(new int[] { 3, 3, 4 }), label);
    sd.execBackwards(null);    // fails with the exception below
}
```

Exception:

```
Mismatched shape: [2, 3, 4, 4, 1, 8192, 1, 99]
Shape requested: : {1, 1, 1}
o.n.l.c.n.o.NativeOpExecutioner - Failed to execute op reduce_sum_bp. Attempted to execute with 2 inputs, 1 outputs, 1 targs, 0 bargs and 0 iargs. Inputs: [(FLOAT,[3,3,4],c), (FLOAT,[3,4],c)]. Outputs: [(FLOAT,[3,3,4],c)]. tArgs: [0.0]. iArgs: -. bArgs: -. Op own name: "reduce_sum_bp_1". Input var names: [label, divide]. Output var names: [reduce_sum_bp_1] - Please see above message (printed out from c++) for a possible cause of error.
Exception in thread "main" java.lang.RuntimeException: Op [reduce_sum_bp] execution failed
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1710)
    at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputsHelper(InferenceSession.java:505)
    at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:119)
    at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:56)
    at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:335)
    at org.nd4j.autodiff.samediff.SameDiff.directExecHelper(SameDiff.java:3181)
    at org.nd4j.autodiff.samediff.SameDiff.execBackwards(SameDiff.java:4693)
    at org.nd4j.autodiff.samediff.SameDiff.execBackwards(SameDiff.java:4629)
    at org.nd4j.autodiff.samediff.SameDiff.execBackwards(SameDiff.java:4587)
    at org.nd4j.autodiff.samediff.SameDiff.execBackwards(SameDiff.java:4596)
    at org.deeplearning4j.examples.LN6.main(LN6.java:66)
Caused by: java.lang.RuntimeException: NDArray::reshapei: bad input shape!
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:2006)
    at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1700)
    ... 10 more
```
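
Note that the shapes in the op dump itself are consistent (input [3,3,4], upstream gradient [3,4], output [3,3,4]), while the internally requested shape {1, 1, 1} is what a full keep-dims reduction of a rank-3 array would produce, as if the reduction dimension had been dropped somewhere. As a minimal sketch, assuming only standard INDArray reshape/broadcast semantics, the gradient that reduce_sum_bp should produce here is just the upstream gradient broadcast back along dimension 1:

```java
// Sketch of the expected reduce_sum_bp result for sum over dimension 1
// (illustration only, not the library implementation).
// Imports: org.nd4j.linalg.api.ndarray.INDArray, org.nd4j.linalg.factory.Nd4j
INDArray gradOut = Nd4j.rand(new int[] { 3, 4 });   // dL/d(sum), shape [3, 4]
INDArray gradIn  = gradOut.reshape(3, 1, 4)         // keep-dims shape [3, 1, 4]
                          .broadcast(3, 3, 4);      // broadcast to input shape [3, 3, 4]
```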

@longzhendong (Contributor, Author) commented Nov 6, 2019

@AlexDBlack

@AlexDBlack (Contributor) commented Nov 6, 2019

What are you trying to accomplish here? i.e., what is your goal?

@longzhendong (Contributor, Author) commented Nov 6, 2019

@AlexDBlack The derivative with respect to the input of logSumExp, used for implementing a CRF layer.
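
For reference, the derivative of logSumExp with respect to its input is the softmax of the input along the reduced dimension. A minimal sketch of that expected gradient using plain ND4J ops (assuming the standard Transforms.exp and keep-dims sum API):

```java
// d(logSumExp(x, dim=1))/dx = softmax(x) along dimension 1.
// Imports: org.nd4j.linalg.api.ndarray.INDArray, org.nd4j.linalg.factory.Nd4j,
//          org.nd4j.linalg.ops.transforms.Transforms
INDArray x    = Nd4j.rand(new int[] { 3, 3, 4 });
INDArray expX = Transforms.exp(x);                  // e^x, shape [3, 3, 4]
INDArray z    = expX.sum(true, 1);                  // keep-dims sum, shape [3, 1, 4]
INDArray dLse = expX.div(z.broadcast(3, 3, 4));     // softmax over dim 1 == dLSE/dx
```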

@AlexDBlack (Contributor) commented Nov 6, 2019

But you're using an ExternalErrorsFunction... That's only used when you have gradients that are not coming from SameDiff...

So to be clear, your "full" network is something along the lines of:
SameDiff(label -> logsumexp) -> NotSameDiff(CRF)?
And you have a gradient calculation that you want to pass in to SameDiff from the CRF implementation?
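
For illustration, a minimal sketch of the pattern ExternalErrorsFunction is intended for, assuming the beta5 API (fn.updateVariable and SDVariable.getVarName; the random [3, 4] array stands in for whatever gradient the external CRF code computes):

```java
// Same imports as the reproducer above, plus org.nd4j.linalg.api.ndarray.INDArray
SameDiff sd = SameDiff.create();
SDVariable in = sd.var("in", DataType.FLOAT, new int[] { 3, 3, 4 });
SDVariable m = sd.math().logSumExp(in, 1);               // the SameDiff part of the network

// Declare that the gradient for m will be supplied from outside SameDiff
ExternalErrorsFunction fn = sd.f().externalErrors(m);

INDArray externalGrad = Nd4j.rand(new int[] { 3, 4 });   // stand-in for dL/dm from the CRF code
fn.updateVariable(m.getVarName(), externalGrad);         // hand the external gradient to SameDiff
sd.assignArray(Nd4j.rand(new int[] { 3, 3, 4 }), in);
sd.execBackwards(null);                                  // sd.grad("in") is now defined
```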

@longzhendong (Contributor, Author) commented Nov 6, 2019

@AlexDBlack "Label" is just a variable name, and the problem here is to compute the gradient to report an error

AlexDBlack self-assigned this Nov 7, 2019
AlexDBlack added this to the 1.0.0-beta6 milestone Nov 7, 2019
@raver119 (Contributor) commented Nov 7, 2019

mmm, this op definitely needs a C++ implementation. cc @shugeo

@AlexDBlack (Contributor) commented Nov 7, 2019

Turns out the cause was a trivial mistake in a constructor:
https://github.com/KonduitAI/deeplearning4j/pull/35/files

@AlexDBlack (Contributor) commented Nov 7, 2019

@raver119 LogSumExp is on the list here btw: #8099
