In the "Fully Packed Layout (THD)" under Case 3 on this page, I noticed the following description:
```
Q = aabb
dimension = [B = 2, H = 1, S = 8, D = 64]
stride    = [S × H × D = 512, D = 64, H × D = 64, 1]
```
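For context, here is how I am reading those dimensions and strides: a minimal Python sketch of plain strided addressing (the names `dims`, `strides`, and `flat_offset` are mine for illustration only, not from the API). With these strides, batch 1 starts at offset 512, exactly where it would sit in a fully padded tensor, which is what prompted my question.

```python
# Minimal sketch, assuming ordinary strided addressing over the quoted
# [B, H, S, D] shape (B=2, H=1, S=8, D=64). Names are illustrative.
dims    = dict(B=2, H=1, S=8, D=64)
strides = dict(B=512, H=64, S=64, D=1)   # [S*H*D, D, H*D, 1]

def flat_offset(b, h, s, d):
    """Element offset under plain strided addressing (no ragged offsets)."""
    return b * strides["B"] + h * strides["H"] + s * strides["S"] + d * strides["D"]

# The first element of batch 1 sits at offset 512, the same place it would
# occupy in a fully padded [B, H, S, D] tensor.
assert flat_offset(b=1, h=0, s=0, d=0) == 512
```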
What confuses me is that, despite using ragged_tensors, the dimensions still appear the same as they would be without ragged_tensors.
From my understanding, ragged_tensors should offer two key benefits:
1. Improved memory access efficiency (due to a more compact data arrangement).
2. Memory savings: when sequences within a batch have varying lengths, ragged tensors allow a more compact memory layout, e.g. Q = aabbb instead of Q[b=0] = aa000000, Q[b=1] = bbb00000 (see the sketch further below).
However, in this case, the dimensions are still given as [B, H, S, D], which seems to suggest that the purpose of using ragged_tensors here is purely to improve memory access efficiency, without any memory savings.
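To make the second point concrete, here is a small NumPy sketch of the memory saving I expected from a packed (THD) layout; `cu_seqlens` is just my own name for the per-sequence offsets and may not match the actual API.

```python
import numpy as np

B, H, D = 2, 1, 64
seq_lens = [2, 3]          # "aa" and "bbb"
S_max = 8

# Padded layout: every sequence reserves S_max rows -> B * S_max * H * D elements.
padded = np.zeros((B, H, S_max, D), dtype=np.float16)
padded_elems = padded.size                      # 2 * 1 * 8 * 64 = 1024

# Packed (THD) layout: sequences are concatenated back to back and located
# via cumulative-sequence-length offsets -> sum(seq_lens) * H * D elements.
cu_seqlens = np.cumsum([0] + seq_lens)          # [0, 2, 5]
packed = np.zeros((cu_seqlens[-1], H, D), dtype=np.float16)
packed_elems = packed.size                      # 5 * 1 * 64 = 320

print(padded_elems, packed_elems)               # 1024 vs 320 -> the saving I expected
```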
Could you kindly clarify whether my understanding is correct?
In the "Fully Packed Layout (THD)" under Case 3 on this page, I noticed the following description:
What confuses me is that, despite using ragged_tensors, the dimensions still appear the same as they would be without ragged_tensors.
From my understanding, ragged_tensors should offer two key benefits:
Q=aabbb
instead ofQ[b=0]=aa000000, Q[b=1]=BBB00000
).However, in this case, the dimensions are still given as [B, H, S, D], which seems to suggest that the purpose of using ragged_tensors here is purely to improve memory access efficiency, without any memory savings.
Could you kindly clarify whether my understanding is correct?
The text was updated successfully, but these errors were encountered: