Absolute Position Encoding: Why are the two tensors not interleaved? #1925

davinca · 2023-04-28T10:01:07Z

Line 453 in bafdc1b

signal = tf.concat([tf.sin(scaled_time), tf.cos(scaled_time)], axis=1)

In the original paper, the position embedding interleaves the two, like this: [..., sin i, cos i, ...]

martinpopel · 2023-04-28T10:22:37Z

See #177 and #1591 (and #1677).

davinca · 2023-04-28T11:37:58Z

These are just different orderings of the same set of channels, so in theory the two layouts have the same effect.
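To make the "same channels, different order" point concrete, here is a minimal pure-Python sketch. It is not the library code: the helper names (`scaled_time`, `concat_signal`, `interleaved_signal`, `interleave_permutation`) and the timescale defaults are assumptions chosen for illustration. It builds both layouts for one position and checks that the interleaved layout is a fixed permutation of the concatenated one.

```python
import math


def scaled_time(position, num_timescales, min_timescale=1.0, max_timescale=1.0e4):
    """Position multiplied by a geometric sequence of inverse timescales.

    The exact timescale formula is an assumption made for this sketch.
    """
    log_inc = math.log(max_timescale / min_timescale) / max(num_timescales - 1, 1)
    return [position * min_timescale * math.exp(-i * log_inc)
            for i in range(num_timescales)]


def concat_signal(position, channels):
    """Concatenated layout: [sin_0, ..., sin_{h-1}, cos_0, ..., cos_{h-1}]."""
    st = scaled_time(position, channels // 2)
    return [math.sin(t) for t in st] + [math.cos(t) for t in st]


def interleaved_signal(position, channels):
    """Interleaved layout: [sin_0, cos_0, sin_1, cos_1, ...]."""
    out = []
    for t in scaled_time(position, channels // 2):
        out.extend([math.sin(t), math.cos(t)])
    return out


def interleave_permutation(channels):
    """Indices p such that interleaved[k] == concat[p[k]] for every k."""
    half = channels // 2
    perm = []
    for i in range(half):
        perm.extend([i, half + i])
    return perm


channels = 8
perm = interleave_permutation(channels)
for pos in range(10):
    concat = concat_signal(pos, channels)
    inter = interleaved_signal(pos, channels)
    # Reordering the concatenated channels reproduces the interleaved layout.
    assert inter == [concat[p] for p in perm]
```

Since the permutation is the same for every position, any learned linear layer that consumes the signal can absorb it by reordering its input weights, which is why the two layouts should behave the same in a model trained from scratch. A checkpoint trained with one layout would not be interchangeable with code that uses the other.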

davinca closed this as completed Apr 28, 2023
