This repository has been archived by the owner on Aug 18, 2021. It is now read-only.

A bug in seq2seq translation #54

Closed
caozhen-alex opened this issue Aug 30, 2017 · 6 comments

Comments


caozhen-alex commented Aug 30, 2017

When we run the code testing the models, it raises this error:

RuntimeError: Expected argument self to have 1 dimension(s), but has 2 at /Users/soumith/miniconda2/conda-bld/pytorch_1502000696751/work/torch/csrc/generic/TensorMethods.cpp:23020

Details are shown below:

EncoderRNN (
  (embedding): Embedding(10, 10)
  (gru): GRU(10, 10, num_layers=2)
)
AttnDecoderRNN (
  (embedding): Embedding(10, 10)
  (gru): GRU(20, 10, num_layers=2, dropout=0.1)
  (out): Linear (20 -> 10)
  (attn): Attn (
    (attn): Linear (10 -> 10)
  )
)
---------------------------------------------------------
RuntimeError            Traceback (most recent call last)
<ipython-input-14-7c49add1a901> in <module>()
     22 
     23 for i in range(3):
---> 24     decoder_output, decoder_context, decoder_hidden, decoder_attn = decoder_test.forward(word_inputs[i], decoder_context, decoder_hidden, encoder_outputs)
     25     print(decoder_output.size(), decoder_hidden.size(), decoder_attn.size())
     26     decoder_attns[0, i] = decoder_attn.squeeze(0).cpu().data

<ipython-input-13-1e8710146be2> in forward(self, word_input, last_context, last_hidden, encoder_outputs)
     30 
     31         # Calculate attention from current RNN state and all encoder outputs; apply to encoder outputs
---> 32         attn_weights = self.attn(rnn_output.squeeze(0), encoder_outputs)
     33         context = attn_weights.bmm(encoder_outputs.transpose(0, 1)) # B x 1 x N
     34 

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

<ipython-input-12-f800dd294bc2> in forward(self, hidden, encoder_outputs)
     22         # Calculate energies for each encoder output
     23         for i in range(seq_len):
---> 24             attn_energies[i] = self.score(hidden, encoder_outputs[i])
     25 
     26         # Normalize energies to weights in range 0 to 1, resize to 1 x 1 x seq_len

<ipython-input-12-f800dd294bc2> in score(self, hidden, encoder_output)
     35         elif self.method == 'general':
     36             energy = self.attn(encoder_output)
---> 37             energy = hidden.dot(energy)
     38             return energy
     39 

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/autograd/variable.py in dot(self, other)
    629 
    630     def dot(self, other):
--> 631         return Dot.apply(self, other)
    632 
    633     def _addcop(self, op, args, inplace):

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/autograd/_functions/blas.py in forward(ctx, vector1, vector2)
    209         ctx.save_for_backward(vector1, vector2)
    210         ctx.sizes = (vector1.size(), vector2.size())
--> 211         return vector1.new((vector1.dot(vector2),))
    212 
    213     @staticmethod

RuntimeError: Expected argument self to have 1 dimension(s), but has 2 at /Users/soumith/miniconda2/conda-bld/pytorch_1502000696751/work/torch/csrc/generic/TensorMethods.cpp:23020
@czs0x55aa

PyTorch v0.2 removed implicit flattening for dot (pytorch/pytorch#2313). You can use the following instead:

torch.dot(hidden.view(-1), energy.view(-1))
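For illustration, a minimal sketch of the difference (shapes here are hypothetical stand-ins for the decoder state and attention energy):

```python
import torch

# In PyTorch >= 0.2, torch.dot requires both arguments to be 1-D.
hidden = torch.ones(1, 10)   # e.g. a 1 x hidden_size RNN state
energy = torch.ones(1, 10)   # e.g. a 1 x hidden_size attention energy

# hidden.dot(energy) raises a RuntimeError on these 2-D tensors;
# flattening both with .view(-1) restores the old behaviour:
score = torch.dot(hidden.view(-1), energy.view(-1))
print(score.item())  # 10.0
```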

@caozhen-alex
Author

@czs0x55aa Thank you very much for answering. But where should I make this change in this case (the seq2seq translation example)?

@czs0x55aa

You need to modify the score function in the attention model. You could use code like the following:

def score(self, hidden, encoder_output):
    if self.method == 'dot':
        energy = torch.dot(hidden.view(-1), encoder_output.view(-1))
    elif self.method == 'general':
        energy = self.attn(encoder_output)
        energy = torch.dot(hidden.view(-1), energy.view(-1))
    elif self.method == 'concat':
        energy = self.attn(torch.cat((hidden, encoder_output), 1))
        energy = torch.dot(self.v.view(-1), energy.view(-1))
    return energy

But this implementation will be much slower on the GPU. See #56.
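The slowness comes from launching one scalar dot product per encoder step. As a sketch (with hypothetical shapes, and `attn` standing in for the model's `self.attn` layer), the loop for the 'general' method can be replaced by a single matrix-vector product:

```python
import torch

# Hypothetical shapes matching the attention loop: seq_len x hidden_size
# encoder outputs and a 1 x hidden_size decoder state.
seq_len, hidden_size = 5, 10
encoder_outputs = torch.randn(seq_len, hidden_size)
hidden = torch.randn(1, hidden_size)
attn = torch.nn.Linear(hidden_size, hidden_size)  # stands in for self.attn

# Loop version: one scalar dot per encoder step (slow on GPU).
loop_energies = torch.stack(
    [torch.dot(hidden.view(-1), attn(encoder_outputs[i]).view(-1))
     for i in range(seq_len)])

# Batched version: one matrix-vector product replaces the whole loop.
batched_energies = attn(encoder_outputs) @ hidden.view(-1)

print(torch.allclose(loop_energies, batched_energies, atol=1e-5))  # True
```

Both versions compute the same seq_len energies; the batched one issues a single GPU kernel instead of seq_len tiny ones.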

@caozhen-alex
Author

@czs0x55aa Thank you very much! It seems to work. However, it raised another error.


---------------------------------------------------------
ValueError              Traceback (most recent call last)
in <module>()
      8 
      9 # Run the train function
---> 10 loss = train(input_variable, target_variable, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion)
     11 
     12 # Keep track of loss

in train(input_variable, target_variable, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion, max_length)
     32     for di in range(target_length):
     33         decoder_output, decoder_context, decoder_hidden, decoder_attention = decoder(decoder_input, decoder_context, decoder_hidden, encoder_outputs)
---> 34         loss += criterion(decoder_output[0], target_variable[di])
     35         decoder_input = target_variable[di] # Next target is next input
     36 

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    222         for hook in self._forward_pre_hooks.values():
    223             hook(self, input)
--> 224         result = self.forward(*input, **kwargs)
    225         for hook in self._forward_hooks.values():
    226             hook_result = hook(self, input, result)

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    130         _assert_no_grad(target)
    131         return F.nll_loss(input, target, self.weight, self.size_average,
--> 132                           self.ignore_index)
    133 
    134 

~/anaconda/envs/pytorch_nmt3.5/lib/python3.5/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index)
    674         return _functions.thnn.NLLLoss2d.apply(input, target, weight, size_average, ignore_index)
    675     else:
--> 676         raise ValueError('Expected 2 or 4 dimensions (got {})'.format(dim))
    677 
    678 

ValueError: Expected 2 or 4 dimensions (got 1)

Btw, I am a beginner at coding. How should I deal with this kind of error raised from library source code? Thank you for the help!

@czs0x55aa

czs0x55aa commented Sep 8, 2017

Sorry, I'm not facing this issue.
Maybe you should check the tensor sizes of decoder_output and target_variable.

@quoniammm

You should change
loss += criterion(decoder_output[0], target_variable[di])
to
loss += criterion(decoder_output, target_variable[di])
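The reason: NLLLoss expects a 2-D input of shape N x num_classes (plus a 1-D target of N class indices), and the decoder already emits a 1 x vocab_size tensor. Indexing with [0] drops it to 1-D, which triggers the ValueError. A minimal sketch with hypothetical shapes:

```python
import torch
import torch.nn as nn

criterion = nn.NLLLoss()
vocab_size = 10

# Stand-in for one decoder step: log-probabilities of shape 1 x vocab_size.
decoder_output = torch.log_softmax(torch.randn(1, vocab_size), dim=1)
target = torch.tensor([3])  # the target word index for this step

# decoder_output[0] would be 1-D (vocab_size,), which NLLLoss rejects;
# the 2-D 1 x vocab_size tensor is what it expects:
loss = criterion(decoder_output, target)
print(loss.dim())  # 0 (a scalar loss)
```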
