potential improvement for CNN size=None (or bug?) #23

Closed · brando90 opened this issue Oct 28, 2021 · 2 comments

brando90 (Contributor) commented Oct 28, 2021

I noticed that for `size=None` you assume an activation is a neuron.

It is possible to have this instead:

            # - convolution layer activations: [M, C, H, W]
            if size is None:
                # current behavior (no downsampling): [M, C, H, W] -> [M, C, H*W],
                # i.e. each activation in the spatial dimension is an (effective) neuron
                # flatten(start_dim=2) flattens dims 2 through -1 (the end)
                self_tensor = self_tensor.flatten(start_dim=2, end_dim=-1).contiguous()
                other_tensor = other_tensor.flatten(start_dim=2, end_dim=-1).contiguous()

                # proposed improvement: flatten further, [M, C, H*W] -> [M, C*H*W]
                # (note start_dim=1, so each image becomes one long activation vector)
                self_tensor = self_tensor.flatten(start_dim=1, end_dim=-1).contiguous()
                other_tensor = other_tensor.flatten(start_dim=1, end_dim=-1).contiguous()
                return self.cca_function(self_tensor, other_tensor).item()

Original paper: [screenshot: Screen Shot 2021-10-28 at 12 42 40 PM]

I am aware that you later compare them by looping through each data point, which is not exactly equivalent to the above, though that is a small nuance. That approach assumes C is the effective size of the data set and that each activation in the spatial dimension is a filter. But usually a neuron (vector) is considered to have one entry per data point, so the matrices fed to CCA are usually [M, C*H*W] or [M*H*W, C]. So I'm unsure why using the filter dimension as the effective data-set size for CCA is justified.

I will go with [M*H*W, C], since I think defining a neuron per filter makes more sense, and treating each patch seen by a filter as a data point also makes more sense. Due to the nature of CCA, I think this is fine to apply even across layers. If you want to know why, I'm happy to copy-paste that section of the background of my paper here.

See: [screenshot: Screen Shot 2021-10-28 at 12 48 47 PM]
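
For concreteness, here is a minimal sketch of the [M*H*W, C] flattening I have in mind (this is not anatome's API; the `cca` call at the end is a placeholder for any CCA function that takes two [num_samples, num_neurons] matrices):

    import torch

    def flatten_conv_acts(acts: torch.Tensor) -> torch.Tensor:
        # [M, C, H, W] -> [M*H*W, C]: one neuron per filter,
        # one "data point" per spatial position a filter sees
        M, C, H, W = acts.shape
        return acts.permute(0, 2, 3, 1).reshape(M * H * W, C)

    x = torch.randn(8, 64, 7, 7)  # toy activations, hypothetical shapes
    y = torch.randn(8, 64, 7, 7)
    x_flat = flatten_conv_acts(x)  # [8*7*7, 64] = [392, 64]
    y_flat = flatten_conv_acts(y)
    # score = cca(x_flat, y_flat)  # plug in your preferred CCA here

Note the effective sample count becomes M*H*W, which can also help CCA when M alone is small relative to C.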

Thanks for your great library and feedback!

moskomule (Owner) commented

Thank you for your comment.
I did not intend to compare different layers, but maybe I should add an option.
The reason I used looping was to avoid an OOM problem I faced at that time.
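
Roughly, the looping idea is of this shape (a simplified sketch, not the exact anatome code; `cca_fn` stands in for whatever CCA routine is plugged in):

    import torch

    def looped_similarity(x: torch.Tensor, y: torch.Tensor, cca_fn) -> float:
        # x, y: [M, C, H, W] conv activations; cca_fn: any function mapping
        # two 2-D matrices to a scalar tensor (an assumption in this sketch)
        x = x.flatten(start_dim=2)  # [M, C, H*W]
        y = y.flatten(start_dim=2)
        # hold one [C, H*W] pair in memory at a time instead of the whole batch
        scores = [cca_fn(xi, yi) for xi, yi in zip(x, y)]
        return torch.stack(scores).mean().item()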

brando90 (Author) commented Oct 28, 2021

Makes sense. I was worried about that in my current implementation, too.

I added options in my fork to compare via filter, via your original implementation, and (TODO) via activation:

https://github.com/brando90/ultimate-anatome/blob/40b91003fed50b06c10cef9639193e8c31dc3802/anatome/similarity.py#L428

Feel free to copy-paste it, ask questions, etc. Hope it helps!
