I noticed for size=None you assume an activation is a neuron.
It is possible to have this instead:
# - convolution layer activations: [M, C, H, W]
if size is None:
    # - no downsampling: [M, C, H, W] -> [M, C, H*W]
    # each position in the spatial dimension is treated as an (effective) neuron,
    # so C plays the role of the (effective) data size
    # flatten(start_dim=2) flattens dims 2 through -1 (the end)
    self_tensor = self_tensor.flatten(start_dim=2, end_dim=-1).contiguous()
    other_tensor = other_tensor.flatten(start_dim=2, end_dim=-1).contiguous()
    # improvement: flatten the remaining dims too, [M, C, H*W] -> [M, C*H*W],
    # so each channel-spatial activation is a neuron over the M data points
    self_tensor = self_tensor.flatten(start_dim=1, end_dim=-1).contiguous()
    other_tensor = other_tensor.flatten(start_dim=1, end_dim=-1).contiguous()
    return self.cca_function(self_tensor, other_tensor).item()
Original paper
I am aware that you later compare them by looping through each data point, which is not exactly equivalent to the above, though that is a small nuance. That approach assumes C is the effective size of the data set and that each activation in the spatial dimension is a filter. But usually a neuron (vector) is considered to have a size given by the data set, so the shape is usually [M, CHW] or [MHW, C]. So I am unsure why using the number of filters C as the effective size of the data set for CCA is justified.
I will go with [MHW, C], since defining a neuron per filter makes more sense to me, as does treating each patch seen by a filter as a data point. Due to the nature of CCA, I think this is fine to apply even across layers. If you want to know why, I'm happy to paste the relevant part of the background section of my paper here.
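For concreteness, here is a minimal sketch of the [MHW, C] reshaping I have in mind (my own illustration, not code from this library), assuming activations of shape [M, C, H, W]:

import torch

# toy shapes: M data points, C filters, H x W spatial grid
M, C, H, W = 8, 16, 4, 4
acts1 = torch.randn(M, C, H, W)  # e.g. activations from layer/model 1
acts2 = torch.randn(M, C, H, W)  # e.g. activations from layer/model 2

# [M, C, H, W] -> [M, H, W, C] -> [M*H*W, C]:
# each filter is one neuron, and every spatial location of every image
# counts as a data point, so the effective data set size is M*H*W
x = acts1.permute(0, 2, 3, 1).reshape(-1, C)
y = acts2.permute(0, 2, 3, 1).reshape(-1, C)
# x and y now have shape [M*H*W, C] and can be passed to any CCA routine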
Thanks for your great library and feedback!
Thank you for your comment.
I did not intend to compare different layers, but maybe I should add an option.
The reason I used looping was to avoid an OOM problem I faced at that time.
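In case it clarifies the trade-off, here is a rough sketch of that looping idea (a hedged illustration, not the library's actual code; cca_per_datapoint and the cca_function argument are placeholder names). Each data point contributes only a [C, H*W] matrix, so the full [M, C*H*W] matrix is never materialized:

import torch

def cca_per_datapoint(x, y, cca_function):
    # x, y: [M, C, H, W] -> [M, C, H*W]
    x = x.flatten(start_dim=2)
    y = y.flatten(start_dim=2)
    # loop over the M data points; each [C, H*W] slice is treated as
    # C observations of H*W variables, which keeps peak memory small
    scores = [cca_function(xm, ym) for xm, ym in zip(x, y)]
    return torch.stack(scores).mean().item()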