evaluate error on val #5

Closed
bidongqinxian opened this issue May 29, 2018 · 8 comments

@bidongqinxian

bidongqinxian commented May 29, 2018

Hello, when I evaluate on the val dataset, the following error appears. What's wrong?

Traceback (most recent call last):
  File "eval.py", line 17, in <module>
    dest_dir=Path('/data'), batch_size=128)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 256, in generate_programs
    programs_pred = program_generator.reinforce_sample(questions_var)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 121, in reinforce_sample
    encoded = self.encoder(x)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 91, in encoder
    embed = self.encoder_embed(x)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/sparse.py", line 103, in forward
    self.scale_grad_by_freq, self.sparse
RuntimeError: save_for_backward can only save input or output tensors, but argument 0 doesn't satisfy this condition

And this is the part I changed in test-eval:

vocab_path = Path('data/vocab.json')
model_path = Path('models/clevr-reg-hres.pt')
tbd_net = load_tbd_net(model_path, load_vocab(vocab_path))

program_generator = load_program_generator(Path('models/program_generator.pt'))
generate_programs(Path('data/val_questions.h5'), program_generator, 
                  dest_dir=Path('/data'), batch_size=128)
                  
use_np_features = False
if use_np_features:
    features = np.load(str(Path('data/test/test_features.npy')), mmap_mode='r')
else:
    features = h5py.File(Path('data/val_features.h5'))['features']

question_np = np.load(Path('data/val_questions.npy'))
image_idx_np = np.load(Path('data/val_image_idxs.npy'))
programs_np = np.load(Path('data/val_programs.npy'))
@davidmascharka
Owner

Could you let us know which version of PyTorch you're running and what version of our code you have?

Note: I updated your comment to use triple-backticks so the line breaks in your code blocks are displayed properly.

@bidongqinxian
Author

Thank you for your reply! I use PyTorch v0.3.0 and Python 3.5, and the code is the latest version you put on GitHub. I extracted the image and question features by following these instructions, and I got 2 h5 files for val.

@davidmascharka
Owner

davidmascharka commented May 30, 2018

The latest version of master was rewritten to run in PyTorch 0.4, which unfortunately broke compatibility with v0.3 and earlier. If you git checkout tags/torch0.3 then the code should run. There's a note buried in our README but it's a bit hard to find.

Alternatively, upgrading to PyTorch 0.4 has many niceties that are worth the upgrade. The PyTorch API has also stabilized and is not expected to change dramatically.
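
The version split described above can be sketched as a small helper. This is only an illustration: `recommended_checkout` is a hypothetical name, not part of the repository, and it assumes the version string has the usual `major.minor...` form that `torch.__version__` reports.

```python
def recommended_checkout(torch_version: str) -> str:
    """Return the git ref suggested in this thread for a PyTorch version string."""
    # Keep only the leading "major.minor" part, e.g. "0.3.0.post4" -> (0, 3).
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    # master targets PyTorch 0.4; earlier versions need the torch0.3 tag.
    return "master" if (major, minor) >= (0, 4) else "tags/torch0.3"
```

In practice the argument would be `torch.__version__`; for `"0.3.0"` the helper returns `"tags/torch0.3"`, which is the ref to pass to `git checkout`.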

@bidongqinxian
Author

I upgraded PyTorch to 0.4 and the old problem disappeared, but a new problem came up:

Generating programs...
/home/dengwei/tbd-nets/utils/generate_programs.py:261: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  volatile=True)
Traceback (most recent call last):
  File "eval.py", line 17, in <module>
    dest_dir=Path('data/val'), batch_size=128)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 263, in generate_programs
    programs_pred = program_generator.reinforce_sample(questions_var)
  File "/home/dengwei/tbd-nets/utils/generate_programs.py", line 137, in reinforce_sample
    y[:, t][not_done] = cur_output_data[not_done]
RuntimeError: expand(torch.LongTensor{[128, 1]}, size=[128]): the number of sizes provided (1) must be greater or equal to the number of dimensions in the tensor (2)

So, is the batch size not right? I changed the batch size, but the error still exists. Could you give me some advice on how to solve it?

@davidmascharka
Owner

Are you on the latest master? That line that's failing should actually be

y[:, t][not_done] = cur_output_data[not_done][0]

and it looks like your file is missing that [0] at the end. This should be line 132 of utils/generate_programs.py
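
The rule quoted in the `expand` error message can be modelled in a few lines of plain Python. This is a simplified sketch of the shape check only, not PyTorch's implementation, and `can_expand` is a hypothetical helper; it ignores special cases such as a `-1` size.

```python
from typing import Sequence


def can_expand(src: Sequence[int], target: Sequence[int]) -> bool:
    """Simplified check of whether shape `src` can be expanded to shape `target`."""
    # The target must have at least as many dimensions as the source.
    if len(target) < len(src):
        return False
    # Compare dimensions from the trailing end; a source size of 1 may stretch.
    for s, t in zip(reversed(src), reversed(target)):
        if s != t and s != 1:
            return False
    return True
```

Here `can_expand((128, 1), (128,))` is `False` because the target has one dimension and the source has two, which matches the `[128, 1]` versus `[128]` shapes in the traceback. Changing the batch size does not help, since the mismatch is in the number of dimensions rather than in their sizes.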

@davidmascharka
Owner

@bidongqinxian did this fix the issue?

@bidongqinxian
Author

bidongqinxian commented Jun 4, 2018

Yes, I had downloaded a former master, and I have now updated to the latest master. Unfortunately, another error came up:

Traceback (most recent call last):
  File "eval.py", line 72, in <module>
    outputs = tbd_net(feats, programs)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/module_net.py", line 195, in forward
    output = module(feat_input, output)
  File "/home/dengwei/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/dengwei/tbd-nets/tbd/modules.py", line 92, in forward
    attended_feats = torch.mul(feats, attn.repeat(1, self.dim, 1, 1))
RuntimeError: The size of tensor a (128) must match the size of tensor b (16384) at non-singleton dimension 1

Did I miss something? Thanks for your advice. @davidmascharka

@davidmascharka
Owner

Since that fixes the original issue here I'm going to close this.

Without seeing your eval.py file I really can't say where the error is coming from. It looks like something is getting reshaped incorrectly or the batch is malformed. Feel free to open another issue and post that code if you're unable to diagnose from there.
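
One hint in the numbers: 16384 = 128 × 128. Since `repeat` multiplies each dimension's size by the corresponding repeat count, a rough shape calculation may help narrow down where the reshaping goes wrong. The sketch below is plain Python; `repeated_shape` is a hypothetical helper, and it assumes `self.dim` is 128 and uses a made-up batch size of 4 and spatial size of 14, none of which the traceback confirms.

```python
from typing import Sequence, Tuple


def repeated_shape(shape: Sequence[int], repeats: Sequence[int]) -> Tuple[int, ...]:
    """Shape produced by tiling a tensor of `shape` `repeats[i]` times along dim i."""
    if len(shape) != len(repeats):
        raise ValueError("this sketch only handles equal-length shape and repeats")
    return tuple(s * r for s, r in zip(shape, repeats))


# A single-channel attention map tiles to 128 channels as intended.
expected = repeated_shape((4, 1, 14, 14), (1, 128, 1, 1))    # (4, 128, 14, 14)
# An attention map that already has 128 entries in dimension 1 tiles to 16384.
suspect = repeated_shape((4, 128, 14, 14), (1, 128, 1, 1))   # (4, 16384, 14, 14)
```

Under those assumptions, the error would be consistent with `attn` arriving with 128 entries in dimension 1 instead of 1, so printing `feats.shape` and `attn.shape` just before the failing line would be a reasonable first check.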
