Regarding the Confusion about Ragged Tensors in the Documentation #135

Open
yhyang201 opened this issue Mar 12, 2025 · 0 comments

In the "Fully Packed Layout (THD)" under Case 3 on this page, I noticed the following description:

`Q = aabb`  
`dimension = [B = 2, H = 1, S = 8, D = 64]`  
`stride = [S × H × D = 512, D = 64, H × D = 64, 1]`
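For reference, these strides are exactly what a dense, contiguous tensor stored in BSHD order reports when viewed as [B, H, S, D]. Below is a minimal sketch to check the numbers; PyTorch is used purely for illustration here and is my own assumption, not necessarily what the docs use:

```python
import torch

# Minimal sketch (PyTorch assumed only for illustration): a dense tensor
# laid out in BSHD storage order, then viewed as [B, H, S, D], reproduces
# the strides quoted above.
B, H, S, D = 2, 1, 8, 64
q = torch.empty(B, S, H, D).permute(0, 2, 1, 3)  # storage BSHD, view BHSD

print(q.shape)     # torch.Size([2, 1, 8, 64])
print(q.stride())  # (512, 64, 64, 1) == (S*H*D, D, H*D, 1)
```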

What confuses me is that, despite using ragged_tensors, the dimensions appear exactly the same as they would without ragged_tensors.
From my understanding, ragged_tensors should offer two key benefits:

  1. Improved memory access efficiency (due to more compact data arrangement).
  2. Memory savings (when sequences within a batch have varying lengths, ragged_tensors provide a more compact memory layout, as shown by the example Q=aabbb instead of Q[b=0]=aa000000, Q[b=1]=bbb00000; see the sketch after my question below).

However, in this case, the dimensions are still given as [B, H, S, D], which seems to suggest that the purpose of using ragged_tensors here is purely to improve memory access efficiency, without any memory savings.
Could you kindly clarify whether my understanding is correct?
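To make the second point concrete, here is a minimal sketch of what I mean by memory savings. PyTorch and the names `seq_lens`/`offsets` are my own assumptions for illustration, not the library's actual API:

```python
import torch

# Hypothetical illustration of the question above, using the example
# sequence lengths [2, 3] with max S = 8, H = 1, D = 64.
seq_lens = torch.tensor([2, 3])
B, H, S, D = 2, 1, 8, 64

# Padded layout: every sequence occupies S slots regardless of its length.
padded = torch.zeros(B, H, S, D)          # 2 * 1 * 8 * 64 = 1024 elements

# Packed (THD-style) layout: sequences stored back to back, addressed via
# cumulative offsets (the "ragged offsets" / cu_seqlens idea, as I
# understand it).
total = int(seq_lens.sum())               # 2 + 3 = 5 tokens
packed = torch.zeros(total, H, D)         # 5 * 1 * 64 = 320 elements
offsets = torch.cat([torch.zeros(1, dtype=torch.long),
                     seq_lens.cumsum(0)])  # tensor([0, 2, 5])

print(padded.numel(), packed.numel())     # 1024 320
```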
