
Questions about image size #26

Closed
YokkaBear opened this issue Nov 19, 2020 · 2 comments

Comments

@YokkaBear

Hi @elliottwu, sorry to bother you again, but I have two questions about the image size setting:

  1. Will increasing the input image_size improve the reconstruction quality? I trained the unsup3d model on another dataset but did not get satisfactory reconstruction results, so I wonder whether a larger input image_size would fix the problem.

  2. I tried increasing the input image_size by setting image_size in the data loader to 128 (twice the original image_size=64), but I ran into the following error:

RuntimeError: The size of tensor a (128) must match the size of tensor b (4224) at non-singleton dimension 0

After checking, I found the two tensors are canon_normal and canon_light_d.view(-1,1,1,3) in the forward pass; an element-wise multiplication is applied to them, but their first dimensions do not match:

torch.Size([128, 128, 128, 3])
torch.Size([4224, 1, 1, 3])

So I wonder whether you have encountered this kind of error and how you solved it. Thank you very much; I look forward to your response.
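For reference, the mismatch is just PyTorch's broadcasting rule; here is a minimal standalone sketch with the shapes copied from the error above (the variable names mirror the issue text only and are not the project's actual forward code):

import torch

# Shapes copied from the error message above (illustrative only).
canon_normal = torch.randn(128, 128, 128, 3)
canon_light_d = torch.randn(4224, 3)

# Broadcasting aligns trailing dims: 3 vs 3 and 128 vs 1 are fine,
# but 128 vs 4224 at dimension 0 cannot be broadcast.
canon_normal * canon_light_d.view(-1, 1, 1, 3)
# RuntimeError: The size of tensor a (128) must match the size of tensor b (4224)
# at non-singleton dimension 0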

@ALLinLLM

ALLinLLM commented Feb 8, 2021

I met this problem too when I tried to train the model with 128x128 input.

I addressed it by adding an extra conv layer to the Encoder in unsup3d/network.py:

import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, cin, cout, nf=64, activation=nn.Tanh, input_size=64):
        super(Encoder, self).__init__()
        network = [
            nn.Conv2d(cin, nf, kernel_size=4, stride=2, padding=1, bias=False),  # 64x64 -> 32x32 (128x128 -> 64x64)
            nn.ReLU(inplace=True),
            nn.Conv2d(nf, nf*2, kernel_size=4, stride=2, padding=1, bias=False),  # 32x32 -> 16x16 (64x64 -> 32x32)
            nn.ReLU(inplace=True),
            nn.Conv2d(nf*2, nf*4, kernel_size=4, stride=2, padding=1, bias=False),  # 16x16 -> 8x8 (32x32 -> 16x16)
            nn.ReLU(inplace=True),
            nn.Conv2d(nf*4, nf*8, kernel_size=4, stride=2, padding=1, bias=False),  # 8x8 -> 4x4 (16x16 -> 8x8)
            nn.ReLU(inplace=True),
            ]
        if input_size == 128:
            # extra downsampling layer so a 128x128 input also reaches 4x4 at this point
            network.extend([
                nn.Conv2d(nf*8, nf*8, kernel_size=4, stride=2, padding=1, bias=False),  # 8x8 -> 4x4
                nn.ReLU(inplace=True)])
        network.extend([
            nn.Conv2d(nf*8, nf*8, kernel_size=4, stride=1, padding=0, bias=False),  # 4x4 -> 1x1
            nn.ReLU(inplace=True),
            nn.Conv2d(nf*8, cout, kernel_size=1, stride=1, padding=0, bias=False)
        ])
        if activation is not None:
            network += [activation()]
        self.network = nn.Sequential(*network)

    def forward(self, input):
        # 64x64 input -> (B, cout, 1, 1); without the extra layer, a 128x128 input
        # would give (B, cout, 5, 5), which breaks the downstream reshapes
        return self.network(input).reshape(input.size(0), -1)
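As a quick sanity check of the patched encoder (a minimal sketch, assuming the class above is defined; cin, cout, and the batch size are just illustrative):

import torch

# Both input sizes should now collapse to a 1x1 spatial map before the final reshape.
enc64 = Encoder(cin=3, cout=4, input_size=64)
enc128 = Encoder(cin=3, cout=4, input_size=128)

print(enc64(torch.randn(2, 3, 64, 64)).shape)     # torch.Size([2, 4])
print(enc128(torch.randn(2, 3, 128, 128)).shape)  # torch.Size([2, 4])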

BTW, training with 128x128 input needs a lot of GPU memory; I set the batch size to 8 on my RTX 2080 Ti.

Also, the result is not as good as with the 64x64 input; you can refer to issue #9, where the author explains the reason.

@YokkaBear
Author

@vegetable09 Thank you for your reply; I will refer to this solution if needed.
