DL4J: SameDiffOutputLayer doInit() method not called if first operation is fit #7785

Closed
longzhendong opened this issue May 27, 2019 · 12 comments

@longzhendong commented May 27, 2019

Version Information


  • Deeplearning4j 1.0.0-beta4
  • Platform: Windows 10

When I implemented a custom output layer with SameDiffOutputLayer, it threw a NullPointerException. I guess that backpropGradient is being called before sameDiff has been initialized.

The exception stack is as follows:

Exception in thread "main" java.lang.NullPointerException
at org.deeplearning4j.nn.layers.samediff.SameDiffOutputLayer.backpropGradient(SameDiffOutputLayer.java:138)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1898)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2684)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2627)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:160)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:1675)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1596)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1583)
at org.deeplearning4j.examples.samediff.dl4j.Ex1BasicSameDiffOutputLayerExample.main(Ex1BasicSameDiffOutputLayerExample.java:80)

@AlexDBlack (Contributor) commented May 27, 2019

Sorry, there's not enough information here to help.
The original example runs OK.
Can you share your code for the custom output layer?

@longzhendong (Author) commented May 27, 2019

@AlexDBlack
public class MinimalSameDiffOutputLayer extends SameDiffOutputLayer {

    private static final long serialVersionUID = 1L;

    private WeightInit weightInit;
    private int nIn;
    private int nOut;
    private Activation activation;

    public MinimalSameDiffOutputLayer(int nIn, int nOut, Activation activation, WeightInit weightInit) {
        this.nIn = nIn;
        this.nOut = nOut;
        this.activation = activation;
        this.weightInit = weightInit;
    }

    @Override
    public SDVariable defineLayer(SameDiff sameDiff, SDVariable layerInput, SDVariable labels,
                                  Map<String, SDVariable> paramTable) {
        SDVariable weights = paramTable.get(DefaultParamInitializer.WEIGHT_KEY);
        SDVariable bias = paramTable.get(DefaultParamInitializer.BIAS_KEY);
        SDVariable input = sameDiff.var("input");
        SDVariable label = sameDiff.var("labels");
        SDVariable mmul = sameDiff.mmul("mmul", input, weights);
        SDVariable z = mmul.add("z", bias);
        SDVariable a = activation.asSameDiff("out", sameDiff, z);

        return label.mul(sameDiff.math().log(a)).sum("output", 1);
    }

    @Override
    public String activationsVertexName() {
        return "output";
    }

    @Override
    public void defineParameters(SDLayerParams params) {
        params.addWeightParam(DefaultParamInitializer.WEIGHT_KEY, nIn, nOut);
        params.addBiasParam(DefaultParamInitializer.BIAS_KEY, 1, nOut);
    }

    @Override
    public void initializeParameters(Map<String, INDArray> params) {
        params.get(DefaultParamInitializer.BIAS_KEY).assign(0);
        initWeights(nIn, nOut, weightInit, params.get(DefaultParamInitializer.WEIGHT_KEY));
    }

    @Override
    public InputType getOutputType(int layerIndex, InputType inputType) {
        return InputType.feedForward(nOut);
    }

    public int getNIn() {
        return nIn;
    }

    public void setNIn(int nIn) {
        this.nIn = nIn;
    }

    public int getNOut() {
        return nOut;
    }

    public void setNOut(int nOut) {
        this.nOut = nOut;
    }

    public Activation getActivation() {
        return activation;
    }

    public void setActivation(Activation activation) {
        this.activation = activation;
    }

}

@longzhendong (Author) commented May 27, 2019

@AlexDBlack I see that for SameDiffLayer, activate is called first, so doInit gets called first; but for SameDiffOutputLayer, backpropGradient is called first, before sameDiff has been initialized.

@longzhendong (Author) commented May 27, 2019

@AlexDBlack Can you give me an example of implementing a custom layer with SameDiffOutputLayer? Thank you!

@longzhendong (Author) commented May 27, 2019

@AlexDBlack I modified your GitHub example to add a custom SameDiffOutputLayer; the code is as follows:
public static void main(String[] args) throws Exception {

    int networkNumInputs = 28*28;       //For MNIST - 28x28 pixels
    int networkNumOutputs = 10;         //For MNIST - 10 classes
    int layerSize = 128;                //128 units for the SameDiff layers

    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Adam(1e-1))
        .seed(12345)
        .list()
        //Add two custom SameDiff dense layers:
        .layer(new MinimalSameDiffDense(networkNumInputs, layerSize, Activation.TANH, WeightInit.XAVIER))
        .layer(new MinimalSameDiffDense(layerSize, layerSize, Activation.TANH, WeightInit.XAVIER))
        //Add the custom SameDiff output layer:
        .layer(new MinimalSameDiffOutputLayer(layerSize, 10, Activation.SOFTMAX, WeightInit.XAVIER))
        .build();

    MultiLayerNetwork net = new MultiLayerNetwork(conf);
    net.init();
    net.setListeners(new ScoreIterationListener(50));

    System.out.println(net.summary());

    //Train and evaluate the network with the custom SameDiff layer
    //Note that training and evaluation is the same as with built-in layers
    DataSetIterator train = new MnistDataSetIterator(20, true, 12345);
    net.fit(train, 1);  //Train for 1 epoch

    DataSetIterator test = new MnistDataSetIterator(20, false, 12345);
    Evaluation e = net.evaluate(test);
    System.out.println(e.stats());

    //Also: validate correctness of the network/layer
   // validateLayer();
}
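
The order of operations above seems to be what triggers the NullPointerException; a minimal sketch of that call order, reusing the conf and train from this main():

    MultiLayerNetwork net = new MultiLayerNetwork(conf);   //conf ends with the custom MinimalSameDiffOutputLayer
    net.init();                                             //parameters are allocated, but the layer's doInit() has not run yet
    net.fit(train, 1);                                      //fit is the first operation -> backpropGradient() is reached before doInit() -> NPE
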
@AlexDBlack (Contributor) commented May 28, 2019

OK, so there's definitely a bug here.
Here's a workaround:

net.output(Nd4j.rand(DataType.FLOAT, 1, 784));		//***Call this before fit to ensure layer gets initialized***
net.fit(train, 1);  //Train for 1 epoch
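
The output call runs a forward pass first, which triggers doInit() for the output layer, so by the time fit() executes the layer's SameDiff instance already exists.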

Second: there are two bugs in your defineLayer method.

  • Don't define your own input/labels variables
  • You forgot the negative on your log likelihood

Instead, use this:

    @Override
    public SDVariable defineLayer(SameDiff sameDiff, SDVariable layerInput, SDVariable labels,
                                  Map<String, SDVariable> paramTable) {
        SDVariable weights = paramTable.get(DefaultParamInitializer.WEIGHT_KEY);
        SDVariable bias = paramTable.get(DefaultParamInitializer.BIAS_KEY);
        SDVariable mmul = sameDiff.mmul("mmul", layerInput, weights);
        SDVariable z = mmul.add("z", bias);
        SDVariable a = activation.asSameDiff("out", sameDiff, z);

        return labels.mul(sameDiff.math().log(a).neg()).sum("output", 1);
    }
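
Compared to the original version, this uses the layerInput and labels arguments that DL4J passes in instead of creating new sameDiff.var placeholders, and adds neg() so the result is the negative log likelihood.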

@AlexDBlack changed the title from "SameDiffOutputLayer,Implement a custom layer,NullPointerException" to "DL4J: SameDiffOutputLayer doInit() method not called if first operation is fit" on May 28, 2019

@longzhendong (Author) commented May 28, 2019

@AlexDBlack Thank you very much!

AlexDBlack added a commit that referenced this issue May 28, 2019
Multiple fixes (#7793)
* #7786 SharedTrainingMaster environment variable fix

* #7785 SameDiffOutputLayer - doInit fix

* #7779 DL4J resources (pretrained models etc) URL: use https

* #7778 Remove DynamicCustomOp.sameDiffBuilder

* Revert #7779

* #7779 Use https resources address that works correctly (certificates match hostname)

* #7754 BaseNDArray.castTo - no-op if already correct type

* Handful of fixes for ND4J sessions tests

* Handful of fixes for ND4J sessions tests

* #7730 Webjars dependencies - lock down versions

@longzhendong (Author) commented May 28, 2019

@AlexDBlack Evaluation fails for the same code when the training set and the test set use different minibatch sizes.
The code is in the following screenshot:
[screenshot of the code, not reproduced here]
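
The exact code is only in the screenshot; a rough sketch of the setup being described (the batch sizes here are illustrative, not taken from the screenshot):

    DataSetIterator train = new MnistDataSetIterator(20, true, 12345);    //minibatch size 20 for training
    net.fit(train, 1);

    DataSetIterator test = new MnistDataSetIterator(32, false, 12345);    //different minibatch size for evaluation
    Evaluation e = net.evaluate(test);                                     //fails with "Op [multiply] execution failed"
    System.out.println(e.stats());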

The exception is as follows:

Exception in thread "main" java.lang.RuntimeException: Op [multiply] execution failed
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1645)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:449)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:37)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:250)
at org.nd4j.autodiff.samediff.SameDiff.exec(SameDiff.java:3932)
at org.nd4j.autodiff.samediff.SameDiff.execSingle(SameDiff.java:3901)
at org.nd4j.autodiff.samediff.SameDiff.execAndEndResult(SameDiff.java:3193)
at org.deeplearning4j.nn.layers.samediff.SameDiffOutputLayer.activateHelper(SameDiffOutputLayer.java:110)
at org.deeplearning4j.nn.layers.samediff.SameDiffOutputLayer.activate(SameDiffOutputLayer.java:84)
at org.deeplearning4j.nn.layers.AbstractLayer.activate(AbstractLayer.java:257)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.outputOfLayerDetached(MultiLayerNetwork.java:1273)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.doEvaluationHelper(MultiLayerNetwork.java:3348)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.doEvaluation(MultiLayerNetwork.java:3300)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3493)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3403)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.evaluate(MultiLayerNetwork.java:3235)
at org.deeplearning4j.examples.samediff.dl4j.Ex1BasicSameDiffOutputLayerExample.main(Ex1BasicSameDiffOutputLayerExample.java:101)
Caused by: java.lang.RuntimeException: Op validation failed
at org.nd4j.nativeblas.Nd4jCpu$NativeOps.execCustomOp(Native Method)
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:2045)
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1637)

@Charele commented May 28, 2019

@longzhendong
Can this program run to completion if you use the same batchSize?

@longzhendong (Author) commented May 29, 2019

@Charele Yes, the program runs to completion when the batch sizes are the same.

@longzhendong (Author) commented May 29, 2019

@AlexDBlack
In a multithreaded environment, there is also a thread-safety issue, because multiple threads may call doInit at the same time.

@AlexDBlack (Contributor) commented Jun 4, 2019

@AlexDBlack
In a multithreaded environment, there is also a thread-safety issue, because multiple threads may call doInit at the same time.

That's not correct, for two reasons.
First, you shouldn't be fitting a net from multiple threads simultaneously.
Second, it's synchronized anyway, so no race condition is possible: https://github.com/deeplearning4j/deeplearning4j/blob/c85d9629da93fa5b8c54fc632f890066bf742aa4/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/multilayer/MultiLayerNetwork.java#L1603

Anyway, fix has been merged to master a while ago.

@AlexDBlack closed this Jun 4, 2019
