libnd4j: strided_slice_bp bug #8342

Closed · longzhendong opened this issue Nov 3, 2019 · 7 comments
@longzhendong (Contributor) commented Nov 3, 2019

Relevant versions:

  • Deeplearning4j version: 1.0.0-beta5
  • Platform information: Windows 10

import org.nd4j.autodiff.listeners.Operation;
import org.nd4j.autodiff.samediff.SDIndex;
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class LN4 {

    public static void main(String[] args) {
        SameDiff sd = SameDiff.create();

        SDVariable input = sd.var("input", DataType.FLOAT, 1, 2);
        SDVariable label = sd.var("label", DataType.FLOAT, 1, 2);

        INDArray inputArr = Nd4j.linspace(2, 3, 2).reshape(1, 2);   // [[2, 3]]
        INDArray labelArr = Nd4j.linspace(1, 4, 2).reshape(1, 2);   // [[1, 4]]

        // Slice out the two columns of input (strided_slice under the hood)...
        SDVariable a = input.get(SDIndex.all(), SDIndex.point(0));
        SDVariable b = input.get(SDIndex.all(), SDIndex.point(1));

        // ...then stack them back together, so c is just a rearrangement of input
        SDVariable c = sd.stack("stack", 1, a, b);

        // Squared-error loss
        SDVariable m = sd.math().pow(c.sub(label), 2);
        sd.setLossVariables(m);

        System.out.println(sd.summary());

        // Run backprop repeatedly: the gradients should be identical every iteration
        for (int i = 0; i < 5; i++) {
            sd.associateArrayWithVariable(inputArr, input);
            sd.associateArrayWithVariable(labelArr, label);
            sd.execBackwards(null, Operation.INFERENCE);
            System.out.println(input.getGradient().getArr());
            System.out.println(c.getGradient().getArr());
            System.out.println("==============");
            sd.clearPlaceholders(true);
            sd.clearOpInputs();
        }
    }
}

Result:

[[ 0, 0]]
[[ 2.0000, -2.0000]]

[[ 2.0000, 2.0000]]
[[ 2.0000, -2.0000]]

[[ 2.0000, 2.0000]]
[[ 2.0000, -2.0000]]

[[ 0, 0]]
[[ 2.0000, -2.0000]]

[[ -2.0000, -2.0000]]
[[ 2.0000, -2.0000]]

@longzhendong (Contributor, Author) commented Nov 3, 2019

@AlexDBlack The gradients of c and input should be the same!
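
For reference, with inputArr = [[2, 3]] and labelArr = [[1, 4]] above, c equals [[2, 3]], so dL/dc = 2 * (c - label) = [[2, -2]], which matches the printed c gradient on every iteration. Since slice-then-stack is just an identity rearrangement of input, dL/dinput should also be [[2, -2]] every time, rather than flipping between [[0, 0]], [[2, 2]], and [[-2, -2]].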

longzhendong changed the title from "When you compute differentials with samediff, you get a different result each time" to "When you compute differentials with samediff, it get a different result each time" on Nov 3, 2019
longzhendong changed the title from "When you compute differentials with samediff, it get a different result each time" to "When it compute differentials with samediff, it get a different result each time" on Nov 3, 2019
@longzhendong (Contributor, Author) commented Nov 4, 2019

@AlexDBlack Do you have time to look at this issue?

AlexDBlack self-assigned this on Nov 4, 2019
AlexDBlack added the C++ label on Nov 4, 2019
@AlexDBlack (Contributor) commented Nov 4, 2019

Thanks for reporting.

This is a bug in strided slice backprop - reproducible with the following test case:
https://gist.github.com/AlexDBlack/a2e9f87e46dc20dbc5eadcab39a8a993

I have checked the array shapes and iArgs; those look reasonable to me, so it is likely a bug in the implementation.
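
For context, a conceptual sketch (my own illustration, not the actual libnd4j code) of what strided slice backprop is expected to do: scatter the output gradient back into a zero-initialized array of the input's shape, leaving all unsliced positions at zero.

// Illustrative ND4J-level expectation for strided_slice backprop
// (uses org.nd4j.linalg.indexing.NDArrayIndex): the epsilon for the
// sliced output is written into a zero array with the input's shape.
INDArray inputGrad = Nd4j.zeros(DataType.FLOAT, 1, 2);
INDArray dLdA = Nd4j.createFromArray(new float[]{2.0f});  // gradient for column 0
inputGrad.get(NDArrayIndex.all(), NDArrayIndex.point(0)).assign(dLdA);
// inputGrad is now [[2, 0]], deterministic no matter how often it runs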

@AlexDBlack AlexDBlack added the Bug label Nov 4, 2019
@AlexDBlack AlexDBlack changed the title When it compute differentials with samediff, it get a different result each time libnd4j: strided_slice_bp bug Nov 4, 2019
@AlexDBlack AlexDBlack added this to the 1.0.0-beta6 milestone Nov 4, 2019
@longzhendong (Contributor, Author) commented Nov 4, 2019

@AlexDBlack I want to implement BPTT with SameDiff, which slices the input into time steps and then stacks them into a new tensor to calculate the gradient. Can I calculate the gradient of the loss with respect to the input?

@AlexDBlack (Contributor) commented Nov 4, 2019

@longzhendong On the current master (and snapshots, as soon as they are back up) you can use the SameDiff.calculateGradients method and specify that you want the input gradient returned; something like the sketch below.
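
A minimal sketch of that, assuming the master calculateGradients(Map, String...) signature and continuing the variable names from the example above:

// Request dL/dinput directly by name. Sketch only; assumes
// calculateGradients(placeholderValues, varNames...) returns a map of
// variable name -> gradient array. No placeholders are needed here,
// since "input" and "label" already have arrays associated.
Map<String, INDArray> grads = sd.calculateGradients(Collections.emptyMap(), "input");
INDArray inputGrad = grads.get("input");
System.out.println(inputGrad);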

Another way to implement TBPTT would be to have placeholders for both the initial RNN state and the normal input: split the series externally, using INDArray.get, instead of internally using SDVariable.get.
Your outputs are then the activations and the RNN state at the last time step (which you can store and feed in as the state placeholder for the next step); see the sketch after this paragraph.
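
A hedged sketch of that external splitting (the names, shapes, and the [miniBatch, nIn, time] layout here are all illustrative assumptions):

// Split a [miniBatch, nIn, totalTime] series into fixed-length TBPTT
// segments externally with INDArray.get; each segment, plus the stored
// RNN state, would be fed in as placeholder values for one step.
int tbpttLength = 20;
INDArray features = Nd4j.rand(DataType.FLOAT, 32, 10, 100);   // [mb, nIn, time]
INDArray lastState = Nd4j.zeros(DataType.FLOAT, 32, 64);      // [mb, nHidden]

for (long start = 0; start < features.size(2); start += tbpttLength) {
    long end = Math.min(start + tbpttLength, features.size(2));
    INDArray segment = features.get(NDArrayIndex.all(), NDArrayIndex.all(),
            NDArrayIndex.interval(start, end));
    // Run one forward/backward step with segment and lastState as the
    // placeholder values, then read the final time-step state back into
    // lastState for the next segment.
}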

AlexDBlack removed their assignment on Nov 4, 2019
shugeo mentioned this issue on Nov 5, 2019
@AlexDBlack (Contributor) commented Nov 6, 2019

Confirmed fixed, and merged to eclipse master as well.
Thanks for reporting, @longzhendong, and for the fix, @shugeo

AlexDBlack closed this on Nov 6, 2019