
Some comments about code of PartialConv2d #28

Open · shaibagon opened this issue Nov 2, 2020 · 4 comments

shaibagon commented Nov 2, 2020

Hi,
I really like your work - thank you for sharing the code.

I have some remarks on the code of PartialConv2d:

1. Using buffers instead of simple "member tensors":

The class attribute self.weight_maskUpdater is defined as a "plain" tensor:

if self.multi_channel:
    self.weight_maskUpdater = torch.ones(self.out_channels, self.in_channels, self.kernel_size[0], self.kernel_size[1])
else:
    self.weight_maskUpdater = torch.ones(1, 1, self.kernel_size[0], self.kernel_size[1])

As a result, when the model is transferred to a GPU, or its data type changes, you need to explicitly check for this and convert the tensor:

if self.weight_maskUpdater.type() != input.type():
    self.weight_maskUpdater = self.weight_maskUpdater.to(input)

A more elegant way is to use non-persistent buffers:

        if self.multi_channel:
            self.register_buffer(name='weight_maskUpdater', persistent=False,
                                 tensor=torch.ones(self.out_channels, self.in_channels,
                                                   self.kernel_size[0], self.kernel_size[1]))
        else:
            self.register_buffer(name='weight_maskUpdater', persistent=False,
                                 tensor=torch.ones(1, 1, self.kernel_size[0], self.kernel_size[1]))

This way self.weight_maskUpdater is moved by any .to(...) / .cuda() / .cpu() invoked on the model hosting this layer, making the condition on line 49 redundant.
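For illustration, here is a minimal standalone sketch (the MaskUpdaterDemo class is made up for this example, it is not part of the repo; the persistent flag requires PyTorch >= 1.5) showing that a non-persistent buffer follows the hosting module across dtypes/devices and stays out of the state_dict:

    import torch
    import torch.nn as nn

    class MaskUpdaterDemo(nn.Module):
        def __init__(self):
            super().__init__()
            # Non-persistent buffer: follows .to()/.cuda()/.cpu() on the module,
            # but is excluded from the state_dict.
            self.register_buffer('weight_maskUpdater', torch.ones(1, 1, 3, 3), persistent=False)

    m = MaskUpdaterDemo()
    m.to(torch.float64)  # the buffer's dtype follows the module
    assert m.weight_maskUpdater.dtype == torch.float64
    assert 'weight_maskUpdater' not in m.state_dict()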

2. Use of torch.ones_like

Instead of torch.ones(...).to(...):

if self.multi_channel:
    mask = torch.ones(input.data.shape[0], input.data.shape[1], input.data.shape[2], input.data.shape[3]).to(input)
else:
    mask = torch.ones(1, 1, input.data.shape[2], input.data.shape[3]).to(input)

You can use torch.ones_like, which is simpler and easier to read:

                    if self.multi_channel:
                        mask = torch.ones_like(input)
                    else:
                        mask = torch.ones_like(input[:1, :1, ...])
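As a quick sanity check (standalone snippet, assuming a 4D NCHW input), both forms agree on shape, dtype and device:

    import torch

    input = torch.randn(2, 3, 8, 8, dtype=torch.float64)

    # ones_like inherits shape, dtype and device from its argument.
    mask = torch.ones_like(input[:1, :1, ...])
    assert mask.shape == (1, 1, 8, 8)
    assert mask.dtype == input.dtype and mask.device == input.device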

3. No need to import Variable

You import Variable, but never use it.

from torch.autograd import Variable

BTW, all these comments apply to the code of PartialConv3d as well.

@liuguilin1225 (Contributor)

Thanks for your great suggestions. Will update them accordingly after testing.

ivanstepanovftw commented Mar 3, 2024

torch.ones_like(input[:1, :1, ...])

Does input[:1, :1, ...] take a slice of the tensor without making a copy?

@shaibagon (Author)

@ivanstepanovftw I don't think it makes a copy. However, since it is only used as an argument for torch.ones_like, the values of the tensor are not used at all; only the shape, dtype and device matter.
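A quick standalone check (not from the repo) that basic slicing returns a view sharing the original storage:

    import torch

    x = torch.randn(2, 3, 8, 8)
    view = x[:1, :1, ...]

    # Basic slicing returns a view: both tensors start at the same memory address.
    assert view.data_ptr() == x.data_ptr()

    # Writes through the view are visible in the original tensor.
    view.zero_()
    assert (x[0, 0] == 0).all()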

@ivanstepanovftw

Also, I think this operation is useless:

output = torch.mul(output, self.update_mask)

because the input will be multiplied by the mask anyway here:
raw_out = super(PartialConv2d, self).forward(torch.mul(input, mask) if mask_in is not None else input)
