
EmbeddingSequenceLayer not working with BertIterator #9481

Closed
thomas-trendsoft opened this issue Oct 11, 2021 · 1 comment · Fixed by #9493

Issue Description

I am trying to train a model with an EmbeddingSequenceLayer fed by a BertIterator. I expected the model to fit, but fit() instead throws a DataType exception; see the additional information below.

Version Information


  • Deeplearning4j version = M1.1
  • Platform information = MacOS
  • CUDA = no

Additional Information

Code:

public static void main(String[] args) throws IOException, InterruptedException {
    int sentlen = 100;

    LabeledSentenceProvider provider = ... some provider ...

    BertWordPieceTokenizerFactory tokenizerFactory =
            new BertWordPieceTokenizerFactory(new File("./vocab.txt"), false, true, Charsets.UTF_8);

    BertIterator iter = new BertIterator.Builder()
            .tokenizer(tokenizerFactory)
            .lengthHandling(LengthHandling.FIXED_LENGTH, sentlen)
            .minibatchSize(10)
            .sentenceProvider(provider)
            .task(Task.UNSUPERVISED)
            .vocabMap(tokenizerFactory.getVocab())
            .featureArrays(BertIterator.FeatureArrays.INDICES_MASK)
            .masker(new BertMaskedLMMasker(new Random(12345), 0.2, 0.5, 0.5))
            .unsupervisedLabelFormat(BertIterator.UnsupervisedLabelFormat.RANK3_NCL)
            .maskToken("[MASK]")
            .build();

    int vocabsize = tokenizerFactory.getVocab().size();

    // note: this preprocessor map is created but never attached to the builder
    HashMap<String, InputPreProcessor> preproc = new HashMap<>();
    preproc.put("output", new RnnToFeedForwardPreProcessor(RNNFormat.NCW));

    ComputationGraphConfiguration.GraphBuilder builder = new NeuralNetConfiguration.Builder()
            .seed(42345)
            .l2(0.0001)
            .weightInit(WeightInit.XAVIER)
            .updater(new Adam(0.0015))
            .graphBuilder();

    builder.setInputTypes(InputType.recurrent(vocabsize, sentlen, RNNFormat.NCW));
    builder.addInputs("token");

    System.out.println("VOCAB size: " + vocabsize);

    // token embedding layer
    builder.addLayer("emb", new EmbeddingSequenceLayer.Builder()
            .nIn(vocabsize)
            .nOut(756)
            .build(), "token");

    // try a single transformer block first

    // multi-head self attention
    builder.addLayer("attention1", new SelfAttentionLayer.Builder()
            .nIn(756)
            .nOut(756)
            .nHeads(2)
            .projectInput(true)
            .build(), "emb");

    // feed forward
    builder.addLayer("ffint1", new LSTM.Builder()
            .nOut(756)
            .build(), "attention1");

    // output layer
    builder.addLayer("output", new RnnOutputLayer.Builder()
            .nOut(vocabsize)
            .dataFormat(RNNFormat.NCW)
            .activation(Activation.SOFTMAX)
            .build(), "ffint1");

    builder.setOutputs("output");

    ComputationGraph model = new ComputationGraph(builder.build());

    model.fit(iter);
}


StackTrace:

Exception in thread "main" java.lang.IllegalArgumentException: Op.X must have same data type as Op.Y: X.datatype=FLOAT, Y.datatype=INT
	at org.nd4j.common.base.Preconditions.throwEx(Preconditions.java:633)
	at org.nd4j.common.base.Preconditions.checkArgument(Preconditions.java:134)
	at org.nd4j.linalg.api.ops.BaseBroadcastOp.validateDataTypes(BaseBroadcastOp.java:200)
	at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:889)
	at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:879)
	at org.nd4j.linalg.factory.Broadcast.mul(Broadcast.java:149)
	at org.deeplearning4j.nn.layers.feedforward.embedding.EmbeddingSequenceLayer.backpropGradient(EmbeddingSequenceLayer.java:64)
	at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
	at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2772)
	at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1381)
	at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1341)
	at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
	at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
	at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
	at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1165)
	at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1115)
	at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1082)
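For context, the failing op is the broadcast multiply in EmbeddingSequenceLayer.backpropGradient between the FLOAT gradient array and the INT mask array coming out of the BertIterator. The same mismatch can be reproduced in plain ND4J (a minimal sketch; the shapes here are made up for illustration):

    import org.nd4j.linalg.api.buffer.DataType;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    // FLOAT array, standing in for the backprop gradient
    INDArray grad = Nd4j.rand(DataType.FLOAT, 2, 3);
    // INT array, standing in for the int[]-backed mask
    INDArray mask = Nd4j.createFromArray(new int[][]{{1, 0, 1}, {1, 1, 0}});

    // grad.mul(mask) fails with "Op.X must have same data type as Op.Y";
    // casting the mask to the gradient's type makes the multiply work:
    INDArray masked = grad.mul(mask.castTo(DataType.FLOAT));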

Contributing

Sorry, no fix from me, just a big thanks for your great work.

treo (Member) commented Oct 11, 2021

https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-nlp-parent/deeplearning4j-nlp/src/main/java/org/deeplearning4j/iterator/BertIterator.java#L206 uses int[] to create the mask arrays.

This looks like it was missed when we added support for keeping the original array type on creation instead of forcing all tensors to be either float or double.
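
Until the iterator itself is fixed, one possible user-side workaround (a sketch, assuming BertIterator applies a preprocessor set through MultiDataSetIterator.setPreProcessor, as MultiDataSetIterator implementations generally do, and using the iter from the code above) is to cast the mask arrays to FLOAT before training:

    import org.nd4j.linalg.api.buffer.DataType;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.dataset.api.MultiDataSetPreProcessor;

    // Cast integer feature masks to FLOAT so the backward-pass broadcast sees matching types
    MultiDataSetPreProcessor castMasks = mds -> {
        INDArray[] masks = mds.getFeaturesMaskArrays();
        if (masks == null) return;
        for (int i = 0; i < masks.length; i++) {
            if (masks[i] != null && masks[i].dataType() != DataType.FLOAT) {
                masks[i] = masks[i].castTo(DataType.FLOAT);
            }
        }
        mds.setFeaturesMaskArrays(masks);
    };
    iter.setPreProcessor(castMasks); // set before calling model.fit(iter)

Depending on the network, the label mask arrays may need the same treatment.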
