
Error in PickNegLogSoftmax #72

Closed
vered1986 opened this issue Sep 8, 2016 · 15 comments
Assignees
Labels
minor bug Bugs that aren't too bad, only concern documentation, or have easy work-arounds

Comments

@vered1986
Contributor

I'm trying to use pickneglogsoftmax_batch (using gpycnn, v2) and get the following error:

python: /home/vered/cnn/cnn/nodes.cc:1380: void cnn::PickNegLogSoftmax::forward_dev_impl(const MyDevice&, const std::vector<const cnn::Tensor*>&, cnn::Tensor&) const [with MyDevice = cnn::Device_GPU]: Assertion `pvals->size() == fx.d.batch_elems()' failed.

With the following code:

loss = pickneglogsoftmax_batch(batch_predictions, batch_labels)

where the shape of batch_predictions is (batch_size * output_dim, 1). Other shapes (e.g. a batch_size x output_dim matrix or its transpose) raise a dimension error.

Thanks in advance.

@neubig
Contributor

neubig commented Sep 9, 2016

Actually, (batch_size * output_dim) is the wrong shape here. Each Tensor object has a special "batch" dimension, so it needs to be of size ([output_dim, 1], batch_size), where batch_size corresponds to this batch dimension. Creating lookup inputs in this shape is supported through lookup_batch, as in minibatch.py. If you actually want to create non-lookup input with minibatches, this is supported in the C++ version, but perhaps not in the Python version yet? Right, @yoavg?
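[Editor's note: the semantics described above can be sketched outside dynet. The following is an illustrative pure-Python reference for what a batched pick-negative-log-softmax computes, assuming each batch element is represented as a plain list of class scores; it is not dynet's API, which takes a single tensor with the batch dimension.]

```python
import math

def pick_neg_log_softmax_batch(scores, labels):
    """For each batch element (a list of class scores), return
    -log softmax(scores)[label].

    Illustrative sketch only: dynet's real pickneglogsoftmax_batch
    operates on one tensor whose batch dimension holds the elements."""
    losses = []
    for col, label in zip(scores, labels):
        m = max(col)  # shift by the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in col))
        losses.append(log_z - col[label])
    return losses

# batch_size=3 elements with output_dim=5 scores each, one label per element
batch_scores = [[0.1, 0.2, 0.3, 0.4, 0.5]] * 3
print(pick_neg_log_softmax_batch(batch_scores, [1, 2, 1]))
```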

@vered1986
Contributor Author

Thanks!

I did originally call lookup_batch, but then fed these vectors into LSTMs, concatenated their outputs, and used that as the input of another hidden layer, with some reshapes along the way. So it's either missing pycnn support or wrong usage on my side.

Anyway, I'm currently calling pickneglogsoftmax for each instance and summing the results, which works well, though I suppose it's slower.

@neubig
Contributor

neubig commented Sep 9, 2016

If that's the case, this shouldn't cause an error. If you can create a minimal example that reproduces the error I can take a look and try to debug.

@vered1986
Contributor Author

Here is an example code that produces an error:

from pycnn import *

m = Model()
lp = m.add_lookup_parameters((100, 10))

# Add parameters
W = parameter(m.add_parameters((5, 10)))
b = parameter(m.add_parameters((5)))
builder = LSTMBuilder(1, 10, 10, m)

# Batch lookup (for every timestamp of the LSTM)
batch = [(1, 2, 3, 4, 5), (5, 4, 3, 2, 1), (1, 1, 2, 2, 3)]
minibatch_size = 3
seq_length = 5
inputs = [lookup_batch(lp, [batch[i][t] for i in range(minibatch_size)]) for t in range(seq_length)]

# Apply some transformations
lstm_outputs = builder.initial_state().transduce(inputs)[-1]
lstm_outputs = transpose(reshape(lstm_outputs, (10, minibatch_size))) # reshape to (minibatch_size, 10)
output = colwise_add(W * transpose(lstm_outputs), b)

print 'output.npvalue().shape =', output.npvalue().shape, '(batch_size=3, output dimension=5)'

# Using pickneglogsoftmax_batch (fails with "what():  Bad input dimensions in PickNegLogSoftmax: [{5,3}]")
labels = [1, 2, 3]
print pickneglogsoftmax_batch(output, labels).npvalue()

The output size seems to be wrong after the operations I applied, so pickneglogsoftmax_batch raises an error (and again, if I reshape output to (15, 1) I get the first error I mentioned).

Thanks!

@neubig neubig added major bug Issues that silently cause incorrect results, break installation on common environments, etc. and removed need more info labels Sep 9, 2016
@neubig
Contributor

neubig commented Oct 11, 2016

OK, I've reproduced this and will try to fix it soon. Thanks for the easy-to-reproduce example.

@neubig neubig self-assigned this Oct 11, 2016
@neubig
Contributor

neubig commented Oct 12, 2016

Hmm, this is because of the reshape here. dynet treats the number of minibatches as a special dimension, so the result of the reshape should not be a normal (10 x minibatch_size) matrix, but a (10 x 1) matrix with the special "minibatch" dimension equal to minibatch_size. I'm not exactly sure how to do this in the python version, though...
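[Editor's note: one way to see the distinction is that the two shapes can share the same flat buffer, assuming column-major storage (dynet is backed by Eigen, which is column-major by default) and batch elements stored contiguously one after another. Under that assumption, the columns of a {10, minibatch_size} matrix and the batch elements of a {10}-vector with batch dimension minibatch_size are the same slices of memory; the fix only changes the dim metadata. A pure-Python sketch:]

```python
# Flat buffer of 30 values: a 10-dim output for each of 3 batch elements.
# Assumption: column-major matrix layout coincides with storing each
# batch element contiguously, as in dynet's Eigen-backed tensors.
dim, batch_size = 10, 3
flat = list(range(dim * batch_size))

# View 1: a {10, 3} matrix -- the columns of a single matrix.
matrix_cols = [flat[c * dim:(c + 1) * dim] for c in range(batch_size)]

# View 2: a {10} vector with batch dimension 3 -- independent batch elements.
batch_elems = [flat[b * dim:(b + 1) * dim] for b in range(batch_size)]

# Same numbers either way; only the dim metadata differs, which is why
# pickneglogsoftmax accepts one view and rejects the other.
assert matrix_cols == batch_elems
```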

@neubig neubig added minor bug Bugs that aren't too bad, only concern documentation, or have easy work-arounds and removed major bug Issues that silently cause incorrect results, break installation on common environments, etc. labels Oct 12, 2016
@yoavg
Contributor

yoavg commented Oct 12, 2016

How would you do it in the cpp version?

@neubig
Contributor

neubig commented Oct 12, 2016

Dim({10}, minibatchsize)

@yoavg
Contributor

yoavg commented Oct 12, 2016

Are Dim({10}) and Dim({10}, 1) equivalent? (batch size 1?)


@neubig
Contributor

neubig commented Oct 12, 2016

Yes.

@yoavg
Contributor

yoavg commented Oct 12, 2016

I pushed batch-size support to reshape (and to Dim in general).
The code now works with the following change:

from dynet import *

m = Model()
lp = m.add_lookup_parameters((100, 10))

# Add parameters
W = parameter(m.add_parameters((5, 10)))
b = parameter(m.add_parameters((5)))
builder = LSTMBuilder(1, 10, 10, m)

# Batch lookup (for every timestamp of the LSTM)
batch = [(1, 2, 3, 4, 5), (5, 4, 3, 2, 1), (1, 1, 2, 2, 3)]
minibatch_size = 3
seq_length = 5
inputs = [lookup_batch(lp, [batch[i][t] for i in range(minibatch_size)]) for t in range(seq_length)]

# Apply some transformations
lstm_outputs = builder.initial_state().transduce(inputs)[-1]
lstm_outputs = transpose(reshape(lstm_outputs, (10,), batch_size=minibatch_size)) # reshape to (minibatch_size, 10)
output = colwise_add(W * transpose(lstm_outputs), b)

print 'output.npvalue().shape =', output.npvalue().shape, '(batch_size=3, output dimension=5)'

# Using pickneglogsoftmax_batch (fails with "what():  Bad input dimensions in PickNegLogSoftmax: [{5,3}]")
labels = [1, 2, 3]
print pickneglogsoftmax_batch(output, labels).npvalue()

Still need to document it, though.

(also, it may still be buggy when converting to a numpy array; this may need a bit more work)

@vered1986
Contributor Author

Thank you! The example code seems to work now.

@vered1986
Contributor Author

vered1986 commented Oct 15, 2016

Sorry, I might have closed this bug too soon. Though there is no exception now, I can't get the loss to decrease between iterations (which previously worked well with my workaround). I printed the value of lstm_outputs after the reshape, and it is a zero matrix. This also happens in the example code.

I'm not sure whether it's a bug in the conversion to a numpy array (when calling print lstm_outputs.npvalue()) or these are the actual values of the matrix. In the latter case, it might be that I'm simply not doing the update as I should (I've made an "educated guess" based on several C++ examples). Here is an example code that reproduces this problem:

import numpy as np
from dynet import *

m = Model()
lp = m.add_lookup_parameters((100, 10))

# Add parameters
W = parameter(m.add_parameters((5, 10)))
builder = LSTMBuilder(1, 10, 10, m)

trainer = AdamTrainer(m)
num_epochs = 10
minibatch_size = 3
seq_length = 5
train_set = [(1, 1, 1, 1, 1), (2, 2, 2, 2, 2), (1, 1, 1, 1, 1)]
labels = [1, 2, 1]

for epoch in range(num_epochs):

    total_loss = 0.0

    # Use the entire small training set as a batch, just for the example
    batch_indices = np.random.permutation(len(train_set))

    # Batch lookup (for every timestamp of the LSTM)
    inputs = [lookup_batch(lp, [train_set[i][t] for i in range(minibatch_size)]) for t in range(seq_length)]

    # Apply some transformations
    lstm_outputs = builder.initial_state().transduce(inputs)[-1]
    lstm_outputs = reshape(lstm_outputs, (10,), batch_size=minibatch_size) # reshape to (minibatch_size, 10)
    output = W * lstm_outputs

    # Compute the loss for this batch
    curr_labels = [labels[i] for i in batch_indices]
    loss = pickneglogsoftmax_batch(output, curr_labels)

    sum_loss = sum_batches(loss)

    total_loss += sum_loss.value()
    sum_loss.backward()
    trainer.update()

    trainer.update_epoch()
    total_loss /= len(labels)
    print 'Epoch', (epoch + 1), '/', num_epochs, 'Loss =', total_loss

The output:

Epoch 1 / 10 Loss = 1.6094379425
Epoch 2 / 10 Loss = 1.6094379425
Epoch 3 / 10 Loss = 1.6094379425
Epoch 4 / 10 Loss = 1.6094379425
Epoch 5 / 10 Loss = 1.6094379425
Epoch 6 / 10 Loss = 1.6094379425
Epoch 7 / 10 Loss = 1.6094379425
Epoch 8 / 10 Loss = 1.6094379425
Epoch 9 / 10 Loss = 1.6094379425
Epoch 10 / 10 Loss = 1.6094379425

Thanks!
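[Editor's note: the reported constant is itself a clue. 1.6094379425 is ln(5) rounded to float32 precision, i.e. the negative log-probability of any one of 5 classes under a uniform softmax, which is exactly what zero logits (W times a zero lstm_outputs) would produce. A quick stdlib check, assuming only that the reported loss is a per-example -log softmax value returned in single precision:]

```python
import math
import struct

# -log softmax of any class under zero logits over 5 classes is log(5).
uniform_loss = math.log(5)

# Round-trip through float32, since dynet computes in single precision.
as_float32 = struct.unpack('f', struct.pack('f', uniform_loss))[0]
print(as_float32)  # matches the constant loss printed above
```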

@vered1986 vered1986 reopened this Oct 15, 2016
@yoavg
Contributor

yoavg commented Oct 16, 2016

This is likely related to #99.
@neubig, can you take a look?

@neubig
Contributor

neubig commented Oct 19, 2016

I've confirmed that this works now, due to the fix for #99.

@neubig neubig closed this as completed Oct 19, 2016