Number of parameter changing when changing dilation #25

Closed
kinalmehta opened this issue Jan 27, 2021 · 5 comments

@kinalmehta

I noticed that the number of parameters changes when I change the dilation value. This is not the case with the standard torch.nn.Conv2d module. Is there a specific reason this happens in e2cnn.nn.R2Conv?

If this behaviour is expected, could you please point me to the right resource?

Environment:

Python=3.7.9
torch=1.7.1
e2cnn=0.1.5

Code to reproduce issue


import torch
import torch.nn as nn
import torch.nn.functional as F

import numpy as np
import math

import e2cnn
import e2cnn.nn as enn
from e2cnn.nn import init
from e2cnn import gspaces   


class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        N = 8
        self.gspace = gspaces.Rot2dOnR2(N)
        self.in_type = enn.FieldType(self.gspace, [self.gspace.trivial_repr] * 3)
        self.out_type = enn.FieldType(self.gspace, [self.gspace.regular_repr] * 16)
        self.layer = enn.R2Conv(self.in_type, self.out_type, 3,
                      stride=1,
                      padding=1,
                      dilation=1,
                      bias=True,
                      )
        self.invariant = enn.GroupPooling(self.out_type)
    def forward(self, x):
        x = enn.GeometricTensor(x, self.in_type)
        out = self.layer(x)
        out = self.invariant(out)
        out = out.tensor
        return out

class ModelDilated(nn.Module):
    def __init__(self):
        super(ModelDilated, self).__init__()
        N = 8
        self.gspace = gspaces.Rot2dOnR2(N)
        self.in_type = enn.FieldType(self.gspace, [self.gspace.trivial_repr] * 3)
        self.out_type = enn.FieldType(self.gspace, [self.gspace.regular_repr] * 16)
        self.layer = enn.R2Conv(self.in_type, self.out_type, 3,
                      stride=1,
                      padding=2,
                      dilation=2,
                      bias=True,
                      )
        self.invariant = enn.GroupPooling(self.out_type)
    def forward(self, x):
        x = enn.GeometricTensor(x, self.in_type)
        out = self.layer(x)
        out = self.invariant(out)
        out = out.tensor
        return out


if __name__=="__main__":
    m = Model()
    md = ModelDilated()
    ip = torch.randn(1,3,100,100)
    op1 = m(ip)
    op2 = md(ip)

    totalParams = sum(p.numel() for p in m.parameters())
    totalParams2 = sum(p.numel() for p in md.parameters())
    print(totalParams, totalParams2)
    print(op1.shape, op2.shape)

Output

304 400
torch.Size([1, 16, 100, 100]) torch.Size([1, 16, 100, 100])

@Gabri95
Collaborator

Gabri95 commented Jan 28, 2021

Hi @kinalmehta

Thanks for your question.

The behaviour is not totally unexpected.
Unfortunately, using dilated filters is not super trivial when using steerable filters, since dilated filters can produce angular aliasing issues (due to their sparsity).

As a short answer, you can try passing the argument frequencies_cutoff=1. when you use dilation=2 to obtain a similar number of parameters.

Here is the long answer:

First of all, recall what a basis for the steerable filters looks like. See Figure 2 in https://arxiv.org/pdf/1711.07289.pdf

The equivariance property only constrains the angular part of the filters, not the radial one.
Therefore, we split the radial part into a number of independent rings.
In a normal (dense) filter, larger rings are sampled on a larger number of cells of the filter.
This allows one to also consider higher frequencies for the angular component of the largest rings.

The perfect trade off for the number of frequencies to use in each ring is hard to estimate theoretically.
What we did, instead, was to manually search for combinations that contain sufficiently high frequencies without introducing too much aliasing.

The default parameters in R2Conv use our manually tuned trade-off, which works quite well for dense filters, but is not tuned for sparse filters like dilated ones.
This means that a 3x3 filter with dilation 2 covers the same extent as a 5x5 filter, and you will sample high frequencies as if that 5x5 filter were dense.

This is the reason why dilated filters have more parameters.
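To make the sparsity concrete, here is a small stdlib-only sketch that counts how many grid cells sit at each distance from the filter centre, for a dense 5x5 filter and for a 3x3 filter with dilation 2. The helper `cells_per_radius` is hypothetical and is not part of e2cnn; it only illustrates the geometry described above.

```python
import math
from collections import Counter


def cells_per_radius(kernel_size, dilation=1):
    """Count sampled grid cells at each distance from the filter centre.

    Hypothetical helper for illustration; assumes an odd kernel size.
    """
    half = (kernel_size - 1) // 2
    offsets = [i * dilation for i in range(-half, half + 1)]
    counts = Counter()
    for y in offsets:
        for x in offsets:
            counts[round(math.hypot(x, y), 3)] += 1
    return dict(sorted(counts.items()))


dense_5x5 = cells_per_radius(5)
dilated_3x3 = cells_per_radius(3, dilation=2)

print("dense 5x5:  ", dense_5x5)
print("dilated 3x3:", dilated_3x3)
```

Both filters span the same 5x5 extent, but the dense one has 25 sampled cells and the dilated one only 9, so far fewer samples are available around the radius-2 ring to resolve a high angular frequency.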

I would actually recommend trying to use a stronger frequency cut-off when using dilated filters.
You can tune this with the parameter frequencies_cutoff in R2Conv.

Have a look at this answer, where I gave a slightly more detailed explanation of the parameters you need to tune: #18 (comment)
The part relevant to you starts at the sentence "A steerable filter is in general split in multiple rings,.....".

In your case, when using dilation, the set of rings is computed as if the filter were not dilated, and then the rings are scaled by the dilation; see this line.
For instance, a 3x3 filter with dilation 2 has two rings at radii 0. and 2. (the center of the filter and a ring that passes through the cell in position (2, 2) of the 5x5 grid).
The default policy associates a maximum frequency of 0 at radius 0 and 3 at radius 2; see this line (where r is the radius and you can ignore the max_radius in this case).
The idea of this "policy" is that on radius r you can generally sample frequencies up to 2*r (with some correction for the largest rings since they can partially fall outside the grid), but it assumes dense filters such that larger rings are sampled on more cells.
I would recommend using at most frequency 2 for the outer ring of radius 2. This should also give you the same number of parameters as the dense 3x3 filter.
You can do so by passing the argument frequencies_cutoff=1., which is interpreted as allowing max frequency 1. * r = r at radius r.
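The arithmetic above can be checked with a short stdlib-only sketch. Everything in it is an assumption made for illustration and none of it is e2cnn code: `default_policy` is a paraphrase of the default policy chosen to match the values quoted in this thread (0 at radius 0, 3 at radius 2, at most 2*r otherwise), and the basis count assumes one basis element for frequency 0 and two for each of the frequencies 1 to 3 per trivial-to-regular field pair, plus one bias per output field. Those counts are inferred from the 304 and 400 reported in the issue.

```python
def ring_radii(kernel_size, dilation=1):
    """Rings of the undilated filter, scaled by the dilation (odd kernel sizes)."""
    n_rings = (kernel_size + 1) // 2
    return [float(i * dilation) for i in range(n_rings)]


def default_policy(kernel_size):
    """Assumed paraphrase of the default policy; not taken from e2cnn."""
    undilated_max_radius = kernel_size // 2

    def policy(r):
        if r == 0:
            return 0
        if r == undilated_max_radius:
            return min(2 * r, 2)
        return min(2 * r, 2 * r - (r + 1) % 2)

    return policy


def linear_policy(cutoff):
    """A float cutoff c is read as: max frequency c * r at radius r."""
    return lambda r: cutoff * r


def count_params(kernel_size, dilation, policy, n_in=3, n_out=16):
    """Assumed count: 1 element for frequency 0, 2 for each frequency 1..3."""
    per_pair = sum(1 + 2 * int(policy(r)) for r in ring_radii(kernel_size, dilation))
    return n_in * n_out * per_pair + n_out  # plus one bias per output field


dense = count_params(3, 1, default_policy(3))
dilated_default = count_params(3, 2, default_policy(3))
dilated_cutoff = count_params(3, 2, linear_policy(1.0))
print(dense, dilated_default, dilated_cutoff)  # 304 400 304
```

In the reproduction script from the first post, the third case corresponds to adding frequencies_cutoff=1. to the R2Conv call in ModelDilated.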

Does this make sense for you?

Gabriele

@kinalmehta
Author

Hi @Gabri95 ,

Thanks for such a detailed answer.
The solution worked.

My understanding of steerable convolutions is a bit weak, but the linked answer gave me a decent overview of why the number of parameters differs between the two cases.

I use dilated convolutions during evaluation, while the model is trained with the (max-pool + non-dilated) version.
Do you think this will adversely affect the predictions?

Thanks again
Kinal

@Gabri95
Collaborator

Gabri95 commented Feb 6, 2021

hi @kinalmehta

I am happy it was useful :)

This is hard to tell a priori.
Using pooling (especially max-pooling) in general introduces aliasing issues which break equivariance (even translation equivariance).
Still, in deep learning we usually use deep networks with max-pooling and get great results, so I don't expect any significant additional adverse effect compared to a conventional CNN.
Actually, the fact that the steerable filters are bandlimited and rather smooth should help and make downsampling rather stable.
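The claim that strided max-pooling breaks even translation equivariance can be seen on a toy 1D signal. This is a minimal stdlib-only sketch with hypothetical helpers (`max_pool_1d`, `shift`); it does not involve e2cnn and only illustrates the aliasing effect mentioned above.

```python
def max_pool_1d(signal, size=2):
    """Non-overlapping max-pooling with window and stride equal to `size`."""
    return [max(signal[i:i + size]) for i in range(0, len(signal), size)]


def shift(signal, k):
    """Circular shift to the right by k positions."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


x = [0, 3, 1, 0, 2, 0, 0, 0]
pooled = max_pool_1d(x)

# Shifting the input by the stride (2) shifts the pooled output by 1.
shift_by_stride_ok = max_pool_1d(shift(x, 2)) == shift(pooled, 1)

# Shifting the input by 1 gives an output that is not a shift of `pooled` at all.
pooled_after_shift_1 = max_pool_1d(shift(x, 1))
shift_by_one_ok = any(
    pooled_after_shift_1 == shift(pooled, k) for k in range(len(pooled))
)

print(pooled, pooled_after_shift_1)
print(shift_by_stride_ok, shift_by_one_ok)
```

Equivariance survives only for shifts that are multiples of the stride; a one-pixel shift changes which values share a pooling window, so the pooled output changes in content and not just in position.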

However, you will probably observe some more noise when checking explicitly the rotation equivariance of the model.

In any case, you can always try to experiment a bit with different bandlimiting of the filters to find a better trade-off for the smoothness of the filters (which reduces the equivariance error).

If you find some interesting result, I'd be curious to hear about it so, please, let me know :)

Best,
Gabriele

@purse1996

I want to use R2Conv in atrous spatial pyramid pooling (ASPP), with dilations 12, 24 and 36, but the results are very poor. The code is as follows. Could you give some suggestions?

# Called as conv3x3(in_dim, reduction_dim, dilation=r, padding=r) for r = 12, 24, 36

def conv3x3(inplanes, out_planes, stride=1, padding=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    in_type = FIELD_TYPE['regular'](gspace, inplanes)
    out_type = FIELD_TYPE['regular'](gspace, out_planes)
    return enn.R2Conv(in_type, out_type, 3,
                      stride=stride,
                      padding=padding,
                      groups=groups,
                      bias=False,
                      dilation=dilation,
                      sigma=None,
                      frequencies_cutoff=lambda r: 3 * r,
                      initialize=False)

@Gabri95
Collaborator

Gabri95 commented Mar 26, 2021

Hi @purse1996

I think you may have some issue with the frequencies cutoff.

If you use a 3x3 filter with dilation D, the outer pixels will have radius D.
Your frequency cutoff policy allows frequencies up to 3*D to be sampled there. However, such a dilated filter is very sparse.
In particular, the orbit of a pixel will be sampled at most on 4 locations, so I'd recommend not using frequencies higher than 2.
You could use frequencies_cutoff = lambda r: min(r, 2) such that

  • in the central pixel you have max frequency = 0
  • on other pixels you have max frequency = 2
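As a quick illustration of the difference between the two policies, the sketch below evaluates both callables at the two ring radii of a 3x3 filter (0 and D) for the dilations used in the ASPP setup. The function names are hypothetical; only the two formulas come from this thread.

```python
def original_policy(r):
    """Policy from the question: frequencies up to 3 * r at radius r."""
    return 3 * r


def suggested_policy(r):
    """Policy suggested above: frequencies up to r, capped at 2."""
    return min(r, 2)


max_frequencies = {}
for dilation in (12, 24, 36):
    radii = [0.0, float(dilation)]  # centre pixel and outer ring of a 3x3 filter
    max_frequencies[dilation] = {
        "original": [original_policy(r) for r in radii],
        "suggested": [suggested_policy(r) for r in radii],
    }
    print(dilation, max_frequencies[dilation])
```

With the original policy the outer ring is allowed frequencies up to 36, 72 and 108, while the suggested one caps it at 2 regardless of the dilation.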

However, keep in mind that your filter is still very sparse, which also means that it will most likely not be very stable under continuous rotations (but it should still be equivariant to 90 degree ones).
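The reason for capping the frequency at 2 can be checked numerically. The sketch below assumes the outer ring is sampled only at the four axis-aligned neighbours (angles 0, 90, 180 and 270 degrees) and compares angular harmonics on those four points. It is a stdlib-only illustration, not e2cnn code.

```python
import math

# Angles of the four sampled locations on the ring (assumption: axis-aligned cells).
ANGLES = [i * math.pi / 2 for i in range(4)]


def sample(freq, fn=math.cos):
    """Sample the harmonic fn(freq * theta) at the four ring locations."""
    return [fn(freq * a) for a in ANGLES]


def same(a, b):
    """Compare two sample lists up to floating-point error."""
    return all(math.isclose(x, y, abs_tol=1e-9) for x, y in zip(a, b))


# On 4 points, frequency 3 cannot be told apart from frequency 1.
cos3_aliases_cos1 = same(sample(3), sample(1))
sin3_aliases_minus_sin1 = same(sample(3, math.sin), [-v for v in sample(1, math.sin)])

# Frequency 2 is still distinguishable from frequency 0.
cos2_differs_from_cos0 = not same(sample(2), sample(0))

print(cos3_aliases_cos1, sin3_aliases_minus_sin1, cos2_differs_from_cos0)
```

Any frequency above 2 therefore adds parameters without adding distinguishable filters on such a ring, which is consistent with the recommendation above.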
Does this help?

Gabriele

@Gabri95 Gabri95 closed this as completed Apr 22, 2021