This repository has been archived by the owner on Nov 29, 2023. It is now read-only.

See if anti-aliased CNN helps #27

Open
jacobbieker opened this issue Jun 29, 2021 · 3 comments
Comments

@jacobbieker
Member

https://github.com/adobe/antialiased-cnns

Since normal convolutional, pooling, etc. layers ignore the Nyquist sampling theorem when they downsample, they can be very sensitive to small shifts in the input. This adds an extra low-pass (blur) layer before downsampling that can fix that. In previous work I've done, it gave an improvement of a few percent on the task, so it might help here.
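To make the aliasing point concrete, here is a minimal 1D sketch in plain Python. It is a toy illustration and not the linked library's implementation: the helper names `subsample` and `blur`, the `[0.25, 0.5, 0.25]` kernel, and the circular padding are all assumptions chosen for the demo.

```python
def subsample(signal, stride=2):
    """Keep every `stride`-th sample, as a strided layer does spatially."""
    return signal[::stride]


def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filter with a small binomial kernel (circular padding assumed)."""
    n = len(signal)
    half = len(kernel) // 2
    return [
        sum(k * signal[(i + j - half) % n] for j, k in enumerate(kernel))
        for i in range(n)
    ]


# Highest-frequency pattern the grid can hold, and the same pattern shifted by one sample
signal = [0.0, 1.0] * 4
shifted = signal[1:] + signal[:1]

# Plain striding: a one-sample shift flips the output completely
plain = subsample(signal)
plain_shifted = subsample(shifted)

# Blur first, then stride: both inputs now give the same output
smooth = subsample(blur(signal))
smooth_shifted = subsample(blur(shifted))

print(plain, plain_shifted)    # all zeros vs. all ones
print(smooth, smooth_shifted)  # identical outputs
```

The alternating pattern is the worst case for stride 2. Real images are less extreme, but the same effect is what makes strided layers shift-sensitive.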

@jacobbieker jacobbieker added the enhancement New feature or request label Jun 29, 2021
@jacobbieker jacobbieker self-assigned this Jun 29, 2021
@jacobbieker jacobbieker added this to To do in ML Research via automation Jun 29, 2021
@tcapelle

tcapelle commented Jun 29, 2021

I am curious about this, please tell us when you've tried it. I have found that CoordConvs consistently help to get better results; they act as positional embeddings.

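# Assumes fastai/fastcore, which the `Module` and `@delegates` usage suggests
import torch
from torch import nn
from fastcore.meta import delegates
from fastai.torch_core import Module  # like nn.Module, but no super().__init__() call needed

class AddCoords(Module):
    "Append normalized x/y coordinate channels (and optionally a radius channel) to an image batch"
    def __init__(self, with_r=False):
        self.with_r = with_r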

    def forward(self, input_tensor):
        """
        Args:
            input_tensor: shape(batch, channel, x_dim, y_dim)
        """
        batch_size, _, x_dim, y_dim = input_tensor.size()

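        # Index grids, both of shape (1, y_dim, x_dim) at this point
        xx_channel = torch.arange(x_dim).repeat(1, y_dim, 1)
        yy_channel = torch.arange(y_dim).repeat(1, x_dim, 1).transpose(1, 2)

        # Normalize indices to [0, 1] (assumes x_dim > 1 and y_dim > 1)
        xx_channel = xx_channel.float() / (x_dim - 1)
        yy_channel = yy_channel.float() / (y_dim - 1)

        # Rescale to [-1, 1]
        xx_channel = xx_channel * 2 - 1
        yy_channel = yy_channel * 2 - 1

        # Expand to (batch, 1, x_dim, y_dim) so they can be concatenated as channels
        xx_channel = xx_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)
        yy_channel = yy_channel.repeat(batch_size, 1, 1, 1).transpose(2, 3)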

        ret = torch.cat([
            input_tensor,
            xx_channel.type_as(input_tensor),
            yy_channel.type_as(input_tensor)], dim=1)

        if self.with_r:
            # Radial channel: distance of each position from the point (0.5, 0.5)
            xx = xx_channel.type_as(input_tensor)
            yy = yy_channel.type_as(input_tensor)
            rr = torch.sqrt(torch.pow(xx - 0.5, 2) + torch.pow(yy - 0.5, 2))
            ret = torch.cat([ret, rr], dim=1)

        return ret

and the conv:

@delegates(nn.Conv2d)
class CoordConv(Module):
    "Like a 2D Conv but with coordinates"
    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        self.addcoords = AddCoords(with_r=True)
        # Three extra input channels: x, y and the radius (with_r=True)
        in_size = in_channels + 3
        self.conv = nn.Conv2d(in_size, out_channels, kernel_size, **kwargs)

    def forward(self, x):
        ret = self.addcoords(x)
        ret = self.conv(ret)
        return ret
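For anyone who wants to see what the extra channels contain without fastai, here is a minimal plain-PyTorch sketch. The function name `coord_channels` is hypothetical, it omits the radius channel, and it builds the grids with `torch.linspace` and broadcasting instead of `repeat`/`transpose`, so treat it as an illustration of the idea, not a drop-in replacement for the classes above.

```python
import torch
from torch import nn


def coord_channels(x: torch.Tensor) -> torch.Tensor:
    """Return x with two extra channels holding positions scaled to [-1, 1]."""
    b, _, h, w = x.shape
    # Positions along each spatial axis, on the same dtype/device as the input
    rows = torch.linspace(-1.0, 1.0, h, dtype=x.dtype, device=x.device)
    cols = torch.linspace(-1.0, 1.0, w, dtype=x.dtype, device=x.device)
    # Broadcast each axis over the other to get (batch, 1, h, w) grids
    xx = rows.view(1, 1, h, 1).expand(b, 1, h, w)
    yy = cols.view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([x, xx, yy], dim=1)


x = torch.zeros(2, 3, 4, 5)   # (batch, channels, x_dim, y_dim)
out = coord_channels(x)       # two coordinate channels appended
conv = nn.Conv2d(out.shape[1], 8, kernel_size=3, padding=1)
y = conv(out)

print(out.shape, y.shape)
print(out[0, 4, 0])           # positions along the last axis
```

The convolution that follows only needs its input channel count bumped by the number of appended channels, which is what `in_channels + 3` does in `CoordConv` when the radius is included.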

@jacobbieker
Member Author

Oh interesting! Yeah, I'll update this issue when I try it out to see if it makes a difference. I've still been working out a few kinks in the data pipeline, but it seems good to go now. I'll also try benchmarking models with your approach, to see how it works with all three.

@jacobbieker
Member Author

This library also seems like a cool way to easily try out BlurPool, and other things with Torch models: https://github.com/mosaicml/composer
