
Error on tensor sizes when inputting actual image size #4

Open
pasalvetti opened this issue Dec 23, 2020 · 6 comments


@pasalvetti

Hello,

First of all, congrats for this amazing work, and thank you for sharing it.

When using the --image_size attribute and specifying the actual size of my input frames (720x964) so that they are not cropped, I get the following error:
Sizes of tensors must match except in dimension 3. Got 120 and 121 (The offending index is 0)
I tried numerous values and got this kind of error almost systematically. After a lot of trial and error, it seems that values matching the pattern 16p x 32q (p and q integers) work.
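
For reference, a minimal sketch of the resizing workaround this suggests (the helper snap_to_multiple is just an illustrative name), assuming frames are loaded with OpenCV as HxWxC arrays:

import cv2

def snap_to_multiple(n, multiple=32):
    # Round a dimension down to the nearest multiple, e.g. 964 -> 960.
    return (n // multiple) * multiple

frame = cv2.imread("frame.png")
h, w = frame.shape[:2]
# cv2.resize takes (width, height); snapping both sides to a multiple of 32
# satisfies the 16p x 32q pattern observed above.
frame = cv2.resize(frame, (snap_to_multiple(w), snap_to_multiple(h)))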

Do you have any idea what could be causing this error, or whether I am doing something wrong?

Thanks a lot.

NB: there is a small typo in README.md in the Test section: image-size instead of image_size.

@RachelBlin

Hello,

As the previous comment says, thank you very much for sharing your work.

I also tested your work on my own frames (500x500) and got a similar error:
RuntimeError: Sizes of tensors must match except in dimension 2. Got 62 and 63 (The offending index is 0)

After checking the code, I think it might come from the WarpNet class (class WarpNet(nn.Module), def __init__(self, batch_size) in models/NonlocalNet.py), where the features to be concatenated are computed. The comment above each of the four layer definitions indicates an expected output of 44*44 (probably the features' dimensions for the default frame size), obtained with two upsamplings and one downsampling. The problem might be that the downsampling has to deal with features of odd dimensions at some point and truncates and/or rounds them, causing a dimension mismatch between the four returned features.

As an illustration, if the inputs to the four layers are of shape:

torch.Size([1, 128, 125, 125])
torch.Size([1, 256, 62, 62])
torch.Size([1, 512, 31, 31])
torch.Size([1, 512, 15, 15])

The features will be of shape:

torch.Size([1, 64, 63, 63]) # downsampling 125*125 by 2 returns 63*63
torch.Size([1, 64, 62, 62]) # keeping 62*62
torch.Size([1, 64, 62, 62]) # upsampling 31*31 by 2 returns 62*62
torch.Size([1, 64, 60, 60]) # upsampling 15*15 by 4 returns 60*60
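
The mismatch can be reproduced in isolation. Here is a minimal sketch (not the repository's actual layers, just an assumed pooling/upsampling pair) showing how rounding on odd dimensions makes the concatenation fail:

import torch
import torch.nn.functional as F

x = torch.randn(1, 64, 125, 125)
y = torch.randn(1, 64, 31, 31)

down = F.avg_pool2d(x, kernel_size=2, ceil_mode=True)  # 125 -> 63 (rounded up)
up = F.interpolate(y, scale_factor=2, mode="nearest")  # 31 -> 62

# 63 != 62, so concatenating along the channel dimension raises
# "Sizes of tensors must match except in dimension 1".
out = torch.cat([down, up], dim=1)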

However, I don't know how to correct that issue yet. I was wondering whether @pasalvetti has any updates since opening the issue?

Thanks a lot.

@hrdunn

hrdunn commented Jun 14, 2021

@RachelBlin @pasalvetti I have run into this issue as well. Has either of you managed to overcome this?

@RachelBlin

Hi @hrdunn, unfortunately no, I gave up on the code and used another method. The only solution I found was resizing the input images so that their dimensions are divisible by 2^4.
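
If resizing is undesirable, an alternative sketch (pad_to_multiple and model are illustrative names, not from the repository) is to zero-pad the input so both sides divide by 2^4 and crop the network output back afterwards:

import torch
import torch.nn.functional as F

def pad_to_multiple(t, multiple=16):
    # Zero-pad an NCHW tensor on the right/bottom so H and W divide by `multiple`.
    h, w = t.shape[-2:]
    return F.pad(t, (0, (-w) % multiple, 0, (-h) % multiple)), (h, w)

frame = torch.randn(1, 3, 500, 500)      # a 500x500 frame like the one above
padded, (h, w) = pad_to_multiple(frame)  # -> shape 1x3x512x512
# After inference, crop the output back: out = model(padded)[..., :h, :w]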

@hrdunn

hrdunn commented Jun 16, 2021

@RachelBlin Interesting. I wonder if it has to do with the model being trained on a specific image size. @zhangmozhe, is this the case? Would we need to retrain the model to output higher resolutions? Also, do you know if I could run inference on a TPU with the current code?

@semel1

semel1 commented Jul 23, 2021

Can't manage to get "image_size" to work. I tried --image_size 216,384 as stated in the help (-h), "--image_size IMAGE_SIZE the image size, eg. [216,384]", but it throws an error: "test.py: error: argument --image_size: invalid int value: '216,384'". Can anybody please explain the meaning of that option and how to use it properly? Thanks in advance for any help you are able to provide.

@krishnacck

parser.add_argument("--image_size", type=int, default=[216 * 6, 384 * 6], help="the image size, eg. [216,384]")

The above change worked for me: I multiplied the default image size by an even number.
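
Note that the definition above uses type=int with a list default, so argparse cannot parse "216,384" from the command line, which explains the error @semel1 hit. A sketch of one possible fix (parse_size is just an illustrative name):

def parse_size(s):
    # Accept "216,384" on the command line and return [216, 384].
    return [int(v) for v in s.split(",")]

parser.add_argument("--image_size", type=parse_size, default=[216, 384],
                    help="the image size, eg. 216,384")

Alternatively, nargs=2 with type=int would accept --image_size 216 384 as two separate values.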
