evaluate error on val #5

bidongqinxian · 2018-05-29T03:15:01Z

Hello, when I evaluate on the val dataset, the following error appears. What's wrong?

Traceback (most recent call last):
  File "eval.py", line 17, in <module>
    dest_dir=Path('/data'), batch_size=128)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 256, in generate_programs
    programs_pred = program_generator.reinforce_sample(questions_var)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 121, in reinforce_sample
    encoded = self.encoder(x)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 91, in encoder
    embed = self.encoder_embed(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/sparse.py", line 103, in forward
    self.scale_grad_by_freq, self.sparse
RuntimeError: save_for_backward can only save input or output tensors, but argument 0 doesn't satisfy this condition

And this is the part I changed in test-eval:

vocab_path = Path('data/vocab.json')
model_path = Path('models/clevr-reg-hres.pt')
tbd_net = load_tbd_net(model_path, load_vocab(vocab_path))

program_generator = load_program_generator(Path('models/program_generator.pt'))
generate_programs(Path('data/val_questions.h5'), program_generator, 
                  dest_dir=Path('/data'), batch_size=128)
                  
use_np_features = False
if use_np_features:
    features = np.load(str(Path('data/test/test_features.npy')), mmap_mode='r')
else:
    features = h5py.File(Path('data/val_features.h5'))['features']

question_np = np.load(Path('data/val_questions.npy'))
image_idx_np = np.load(Path('data/val_image_idxs.npy'))
programs_np = np.load(Path('data/val_programs.npy'))


davidmascharka · 2018-05-29T13:55:14Z

Could you let us know which version of PyTorch you're running and what version of our code you have?

Note: I updated your comment to use triple-backticks so the line breaks in your code blocks are displayed properly.

bidongqinxian · 2018-05-30T01:21:02Z

Thank you for your reply! I use PyTorch v0.3.0 and Python 3.5, and the code is the latest version you put on GitHub. I extracted the image and question features by following these instructions, and I got 2 h5 files for val.

davidmascharka · 2018-05-30T12:50:36Z

The latest version of master was rewritten to run on PyTorch 0.4, which unfortunately broke compatibility with v0.3 and earlier. If you run `git checkout tags/torch0.3`, the code should run. There's a note buried in our README, but it's a bit hard to find.

Alternatively, PyTorch 0.4 has many niceties that make upgrading worthwhile. The PyTorch API has also stabilized and is not expected to change dramatically.
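A quick way to catch this kind of mismatch before running the evaluation is to compare the installed version against the one the code targets. The sketch below is only an illustration: `parse_version` and `meets_minimum` are hypothetical helpers, not part of tbd-nets, and in practice the string would come from `torch.__version__`.

```python
import re


def parse_version(version):
    """Return the leading numeric components of a version string as ints.

    Suffixes such as "a0+abc123" or "+cu117" are ignored.
    """
    match = re.match(r"(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version, minimum=(0, 4)):
    """True if `version` is at least `minimum` (assumed here to be 0.4)."""
    return parse_version(version) >= minimum


# Hypothetical usage: with PyTorch installed, pass torch.__version__ instead.
for candidate in ("0.3.0", "0.4.0"):
    if meets_minimum(candidate):
        print(candidate, "-> master should run")
    else:
        print(candidate, "-> use the torch0.3 tag or upgrade")
```

Comparing tuples of integers rather than raw strings avoids the usual pitfall where "0.10" sorts before "0.4" lexicographically.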

bidongqinxian · 2018-06-01T15:05:58Z

I upgraded PyTorch to 0.4 and the old problem disappeared, but a new problem appeared:

Generating programs...
/home/dengwei/tbd-nets/utils/generate_programs.py:261: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  volatile=True)
Traceback (most recent call last):
  File "eval.py", line 17, in <module>
    dest_dir=Path('data/val'), batch_size=128)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 263, in generate_programs
    programs_pred = program_generator.reinforce_sample(questions_var)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 137, in reinforce_sample
    y[:, t][not_done] = cur_output_data[not_done]
RuntimeError: expand(torch.LongTensor{[128, 1]}, size=[128]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

So, is the batch size wrong? I changed the batch size, but the error still occurs. Could you give me some advice on how to solve it?

davidmascharka · 2018-06-01T16:40:47Z

Are you on the latest master? The line that's failing should actually be

y[:, t][not_done] = cur_output_data[not_done][0]

and it looks like your file is missing that [0] at the end. This should be line 132 of utils/generate_programs.py
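To see why the extra index matters, here is a rough pure-Python model of the rule the error message states: a tensor can only be expanded to a target size that has at least as many dimensions, with each dimension (aligned from the right) either equal or 1 in the source. `can_expand` is a hypothetical helper written for illustration, not PyTorch's implementation, and it ignores special cases such as `-1` sizes.

```python
def can_expand(source_shape, target_shape):
    """Simplified check of whether source_shape can be expanded to target_shape."""
    # The target must not have fewer dimensions than the source.
    if len(target_shape) < len(source_shape):
        return False
    # Compare dimensions from the right; a source dimension of 1 can stretch.
    for src, dst in zip(reversed(source_shape), reversed(target_shape)):
        if src != dst and src != 1:
            return False
    return True


# Shapes taken from the error message: a [128, 1] source and a [128] target.
print(can_expand((128, 1), (128,)))  # False: 2 dimensions into 1
# Indexing a [128, 1] tensor with [0] leaves shape [1], one dimension fewer.
print(can_expand((1,), (128,)))      # True
```

The first call reproduces the situation in the traceback, and the second shows that a right-hand side with one dimension fewer passes the check.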

davidmascharka · 2018-06-04T12:40:16Z

@bidongqinxian did this fix the issue?

bidongqinxian · 2018-06-04T13:29:35Z

Yes, I had downloaded an earlier master, and I have now updated to the latest master. Unfortunately, another error came up:

Traceback (most recent call last):
  File "eval.py", line 72, in <module>
    outputs = tbd_net(feats, programs)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/module_net.py", line 195, in forward
    output = module(feat_input, output)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/modules.py", line 92, in forward
    attended_feats = torch.mul(feats, attn.repeat(1, self.dim, 1, 1))
RuntimeError: The size of tensor a (128) must match the size of tensor b (16384) at non-singleton dimension 1

Did I miss something? Thanks for your advice. @davidmascharka

davidmascharka · 2018-06-04T13:40:25Z

Since that fixes the original issue here I'm going to close this.

Without seeing your eval.py file I really can't say where the error is coming from. It looks like something is getting reshaped incorrectly or the batch is malformed. Feel free to open another issue and post that code if you're unable to diagnose from there.
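One possible reading of the numbers in that traceback, offered as a guess rather than a diagnosis: 16384 is 128 × 128, which is what `attn.repeat(1, self.dim, 1, 1)` would produce along dimension 1 if `attn` already had 128 channels instead of 1. The sketch below just does that shape arithmetic. `repeat_shape` is a hypothetical helper, the value 128 for `self.dim` is an assumption, and the batch size and 14×14 spatial size are placeholders.

```python
def repeat_shape(shape, repeats):
    """Shape after tiling each dimension by the matching repeat count.

    Simplified: assumes one repeat count per existing dimension.
    """
    if len(shape) != len(repeats):
        raise ValueError("expected one repeat count per dimension")
    return tuple(size * count for size, count in zip(shape, repeats))


dim = 128                    # assumed value of self.dim
repeats = (1, dim, 1, 1)     # mirrors attn.repeat(1, self.dim, 1, 1)

# A single-channel attention map tiles to 128 channels.
print(repeat_shape((2, 1, 14, 14), repeats))    # (2, 128, 14, 14)
# An input that already has 128 channels tiles to 16384 instead.
print(repeat_shape((2, 128, 14, 14), repeats))  # (2, 16384, 14, 14)
```

If that reading is right, printing the shape of whatever is passed in as the attention input just before the failing module would confirm or rule it out.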

davidmascharka closed this as completed Jun 4, 2018
