Error with Image captioning #60

Closed
mhsamavatian opened this issue Sep 11, 2017 · 5 comments
@mhsamavatian

When I run the evaluation command
python sample.py --image='png/example.png'

I get this error:

Traceback (most recent call last):
File "sample.py", line 97, in
main(args)
File "sample.py", line 61, in main
sampled_ids = decoder.sample(feature)
File "/users/PAS1273/osu8235/pytorch/pytorch-tutorial/tutorials/03-advanced/image_captioning/model.py", line 62, in sample
hiddens, states = self.lstm(inputs, states) # (batch_size, 1, hidden_size),
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 224, in call
result = self.forward(*input, **kwargs)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/nn/modules/rnn.py", line 162, in forward
output, hidden = func(input, self.all_weights, hx)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/nn/_functions/rnn.py", line 351, in forward
return func(input, *fargs, **fkwargs)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/autograd/function.py", line 284, in _do_forward
flat_output = super(NestedIOFunction, self)._do_forward(*flat_input)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/autograd/function.py", line 306, in forward
result = self.forward_extended(*nested_tensors)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/nn/_functions/rnn.py", line 293, in forward_extended
cudnn.rnn.forward(self, input, hx, weight, output, hy)
File "/users/PAS1273/osu8235/.local/lib/python2.7/site-packages/torch/backends/cudnn/rnn.py", line 208, in forward
'input must have 3 dimensions, got {}'.format(input.dim()))
RuntimeError: input must have 3 dimensions, got 2

@jysenj

jysenj commented Sep 13, 2017

I got this too. Have you fixed it?

@mhsamavatian
Author

mhsamavatian commented Sep 13, 2017

Modify the sample function in model.py as shown below.
I added `inputs = inputs.unsqueeze(1)` as the last line of the for loop and changed `sampled_ids = torch.cat(sampled_ids, 1)` to `sampled_ids = torch.cat(sampled_ids, 0)`.


```python
def sample(self, features, states=None):
    """Samples captions for given image features (Greedy search)."""
    sampled_ids = []
    inputs = features.unsqueeze(1)                        # (batch_size, 1, embed_size)
    for i in range(20):                                   # maximum sampling length
        hiddens, states = self.lstm(inputs, states)       # (batch_size, 1, hidden_size)
        outputs = self.linear(hiddens.squeeze(1))         # (batch_size, vocab_size)
        predicted = outputs.max(1)[1]                     # greedy: pick the most likely word id
        sampled_ids.append(predicted)
        inputs = self.embed(predicted)                    # (batch_size, embed_size)
        inputs = inputs.unsqueeze(1)                      # (batch_size, 1, embed_size)
    sampled_ids = torch.cat(sampled_ids, 0)
    return sampled_ids.squeeze()
```
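
For context, a minimal sketch of why the extra `unsqueeze(1)` is needed (assuming the decoder's LSTM is built with `batch_first=True`, as the shape comments in model.py suggest): `nn.LSTM` expects a 3-D input of shape `(batch_size, seq_len, input_size)`, but `self.embed(predicted)` returns a 2-D `(batch_size, embed_size)` tensor, which is what triggers "input must have 3 dimensions, got 2" on the second loop iteration. Illustrative sizes only, not the repo's actual hyperparameters:

```python
import torch
import torch.nn as nn

# Illustrative sizes (hypothetical, not the tutorial's real settings).
vocab_size, embed_size, hidden_size = 5000, 256, 256
embed = nn.Embedding(vocab_size, embed_size)
lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)

predicted = torch.LongTensor([3])        # hypothetical word id, shape (batch_size,) = (1,)
inputs = embed(predicted)                # (1, 256) -- 2-D: lstm(inputs) would raise the error above
inputs = inputs.unsqueeze(1)             # (1, 1, 256) -- (batch_size, seq_len=1, embed_size)
hiddens, states = lstm(inputs)           # works: hiddens has shape (1, 1, 256)
```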

@yunjey
Owner

yunjey commented Sep 28, 2017

@mhsamavatian Thanks, you are right. I updated the code :-)

@yunjey yunjey closed this as completed Sep 28, 2017
@WangWenshan

WangWenshan commented Dec 6, 2017

However, I got an error when I did that:
RuntimeError: input must have 3 dimensions, got 4
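
One possible cause, stated here as an assumption: in some older PyTorch versions `outputs.max(1)[1]` keeps the reduced dimension, so `predicted` has shape `(batch_size, 1)`; `self.embed(predicted)` is then already 3-D `(batch_size, 1, embed_size)`, and the extra `unsqueeze(1)` makes it 4-D. A version-agnostic sketch is to flatten the predicted ids before embedding:

```python
# Sketch only: force the predicted word ids to be 1-D regardless of PyTorch version,
# so embed(...).unsqueeze(1) always yields a 3-D (batch_size, 1, embed_size) tensor.
predicted = outputs.max(1)[1].view(-1)       # (batch_size,)
sampled_ids.append(predicted)
inputs = self.embed(predicted)               # (batch_size, embed_size)
inputs = inputs.unsqueeze(1)                 # (batch_size, 1, embed_size)
```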

@autogyro

Hi there,
I also met that problem, and then I added `inputs = inputs.unsqueeze(1)`:

```python
def sample(self, features, states=None):
    """Samples captions for given image features (Greedy search)."""
    sampled_ids = []
    inputs = features.unsqueeze(1)
    for i in range(20):                                      # maximum sampling length
        hiddens, states = self.lstm(inputs, states)          # (batch_size, 1, hidden_size)
        outputs = self.linear(hiddens.squeeze(1))            # (batch_size, vocab_size)
        predicted = outputs.max(1)[1]
        sampled_ids.append(predicted)
        inputs = self.embed(predicted)
        inputs = inputs.unsqueeze(1)
    sampled_ids = torch.cat(sampled_ids, 1)                  # (batch_size, 20)
    return sampled_ids.squeeze()
```

But I got the following:
File "D:\Dev\image_captioning\model.py", line 134, in sample

sampled_ids = torch.cat(sampled_ids, 1) # (batch_size, 20)

RuntimeError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
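
That last error is consistent with `predicted` being 1-D here: each appended tensor has shape `(batch_size,)`, which has no dimension 1 to concatenate along, hence "expected to be in range of [-1, 0]". Using `torch.cat(sampled_ids, 0)` as suggested above works for a single image; a sketch of an alternative that keeps the batch dimension is to stack along a new axis instead:

```python
# Sketch: stack the 1-D (batch_size,) id tensors along a new dimension
# to get a (batch_size, 20) tensor of sampled word ids.
sampled_ids = torch.stack(sampled_ids, 1)    # (batch_size, 20)
return sampled_ids
```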
