Hello Menhao,
I have some questions about the differences between the code and the paper.
The paper says 'Equation (5) is the similarity between the i-th pixel and the j-th rows of M', so external attention computes attention between pixels. In the code, however, I think Conv1d can only attend across the channels of a single pixel, so the pixels in F never attend to the pixels in M.
The paper also says 'In fact, we find that a small _S_, e.g. 64, works well in experiments.', but in the code it is d that is set to 64, not S.
M is a set of "pattern" feature vectors (S vectors in total), not the feature vectors of pixels.
Conv1d.weight can be viewed as M, with #out channels = S and #in channels = d.
Therefore the result of Conv1d is the dot product between each pixel's feature vector and each row of M (shape of M = S × d = out × in, out channels ↔ rows).
k in the code = S in the paper, c in the code = d in the paper.
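A minimal sketch of the point above, assuming a Conv1d with kernel size 1 and no bias (the sizes `d`, `S`, `N` are hypothetical and chosen only for illustration; this is not the repository's code). It checks that the convolution output equals the matrix product of M with the pixel features:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes: d = feature channels, S = rows of M, N = number of pixels.
d, S, N = 8, 4, 5

# Pixel features F in Conv1d layout (batch, d, N): one column per pixel.
F = torch.randn(1, d, N)

# A kernel-size-1 Conv1d has a weight of shape (S, d, 1), i.e. out x in x 1.
conv = nn.Conv1d(d, S, kernel_size=1, bias=False)

# Read that weight as the external memory M with shape (S, d).
M = conv.weight.squeeze(-1)

attn_conv = conv(F)        # shape (1, S, N)
attn_matmul = M @ F[0]     # shape (S, N); entry (j, i) = dot(M[j], F[:, i])

# Both routes give the similarity of every pixel with every row of M.
same = torch.allclose(attn_conv[0], attn_matmul, atol=1e-5)
print(same)
```

Each pixel is compared with all S rows of M rather than with other pixels, which is why the out-channel count plays the role of S.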
Wait, why are you asking in English? The author is Chinese too.