
Regarding the Confusion about Ragged Tensors in the Documentation #135

Closed
@yhyang201

Description

In the "Fully Packed Layout (THD)" under Case 3 on this page, I noticed the following description:

```
Q = aabb
dimension = [B = 2, H = 1, S = 8, D = 64]
stride = [S × H × D = 512, D = 64, H × D = 64, 1]
```
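
To make sure we are talking about the same thing, here is a small NumPy sketch of my mental model (the variable names and the offset computation are my own illustration, not taken from the docs):

```python
import numpy as np

# Nominal (padded) dimensions from the docs example above.
B, H, S, D = 2, 1, 8, 64
seq_lens = [2, 2]                      # "aa" and "bb"
total_tokens = sum(seq_lens)           # 4 tokens actually present

# Padded layout: B * S * H * D elements laid out as (B, S, H, D) in memory,
# then viewed as [B, H, S, D] -- this reproduces the strides quoted above.
padded = np.zeros((B, S, H, D), dtype=np.float16).transpose(0, 2, 1, 3)
elem_strides = [s // padded.itemsize for s in padded.strides]
assert elem_strides == [512, 64, 64, 1]   # [S*H*D, D, H*D, 1]

# Packed layout: only sum(seq_lens) * H * D elements are allocated.
packed = np.zeros((total_tokens, H, D), dtype=np.float16)

# Per-batch start offsets into the packed buffer (prefix sums of the
# sequence lengths, in elements) -- my understanding of what a ragged
# offset tensor encodes.
ragged_offset = np.cumsum([0] + seq_lens) * H * D   # [0, 128, 256]
```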

What confuses me is that, even though ragged_tensors are used, the dimensions still appear the same as they would without them.
From my understanding, ragged_tensors should offer two key benefits:

  1. Improved memory-access efficiency (due to a more compact data arrangement).
  2. Memory savings: when sequences within a batch have varying lengths, ragged_tensors allow a more compact layout, e.g. Q = aabbb instead of Q[b=0] = aa000000, Q[b=1] = bbb00000 (see the sketch after this list).
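
To put a number on benefit 2, here is a quick sketch using the Q = aabbb example (my own arithmetic, not from the docs):

```python
# Padded vs. packed element counts for seq_lens = [2, 3] ("aa" and "bbb").
B, H, S, D = 2, 1, 8, 64
seq_lens = [2, 3]

padded_elems = B * S * H * D           # 1024: aa000000 / bbb00000
packed_elems = sum(seq_lens) * H * D   # 320:  aabbb

print(padded_elems, packed_elems)      # 1024 320 -> ~3.2x fewer elements
```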

However, in this case the dimensions are still given as [B, H, S, D], which seems to suggest that ragged_tensors are used here purely to improve memory-access efficiency, without any memory savings.
Could you kindly clarify whether my understanding is correct?
