Hi!

I'm trying to understand your TNT implementation, and one thing that got me a bit confused is why there are two parameters, `patch_tokens` and `patch_pos_emb`, which seem to serve the same purpose: encoding patch position. Isn't one of them redundant?
```python
self.patch_tokens = nn.Parameter(torch.randn(num_patch_tokens + 1, patch_dim))
self.patch_pos_emb = nn.Parameter(torch.randn(num_patch_tokens + 1, patch_dim))
...
# broadcast the learned patch tokens across the batch
patches = repeat(self.patch_tokens[:(n + 1)], 'n d -> b n d', b=b)
# add the learned positional embeddings on top, element-wise
patches += rearrange(self.patch_pos_emb[:(n + 1)], 'n d -> () n d')
```
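To make the question concrete, here's a small self-contained sketch (hypothetical sizes, imports added by me). Since the two tensors are only ever consumed through this element-wise sum, a single parameter initialized to their sum seems to produce identical activations:

```python
import torch
import torch.nn as nn
from einops import repeat, rearrange

# hypothetical sizes, just for the demonstration
num_patch_tokens, patch_dim, b, n = 16, 64, 2, 16

patch_tokens = nn.Parameter(torch.randn(num_patch_tokens + 1, patch_dim))
patch_pos_emb = nn.Parameter(torch.randn(num_patch_tokens + 1, patch_dim))

# two-parameter version, mirroring the excerpt above
patches = repeat(patch_tokens[:(n + 1)], 'n d -> b n d', b=b)
patches = patches + rearrange(patch_pos_emb[:(n + 1)], 'n d -> () n d')

# one-parameter version: fold the sum into a single tensor
combined = nn.Parameter((patch_tokens + patch_pos_emb).detach())
patches_single = repeat(combined[:(n + 1)], 'n d -> b n d', b=b)

assert torch.allclose(patches, patches_single)
```

(They could of course still behave differently during training, e.g. if either tensor were regularized or used separately somewhere else, so I may be missing something.)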