Regarding the Confusion about Ragged Tensors in the Documentation

In the "Fully Packed Layout (THD)" under Case 3 on [this page](https://docs.nvidia.com/deeplearning/cudnn/frontend/latest/operations/Attention.html#supported-tensor-layouts), I noticed the following description: 
 
```
Q = aabb
dimension = [B = 2, H = 1, S = 8, D = 64]
stride = [S × H × D = 512, D = 64, H × D = 64, 1]
```
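
To make my reading of these numbers concrete, here is a small sketch that reproduces the quoted strides and computes where batch 1 would start if the tensor were addressed purely by strides. The helper names `thd_strides` and `element_offset` are my own and are not part of cuDNN.

```python
def thd_strides(h, s, d):
    """Strides in [B, H, S, D] order, following the formula quoted above."""
    return [s * h * d, d, h * d, 1]


def element_offset(index, strides):
    """Flat element offset of a (b, h, s, d) index under plain strided addressing."""
    return sum(i * st for i, st in zip(index, strides))


strides = thd_strides(h=1, s=8, d=64)
print(strides)  # [512, 64, 64, 1]

# First element of batch 1 under plain strided addressing.
batch1_start = element_offset((1, 0, 0, 0), strides)
print(batch1_start)  # 512
```

With strides alone, batch 1 starts 512 elements in, which is exactly where it would start in a padded layout with S = 8. That is the part that confuses me.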

What confuses me is that, despite using ragged_tensors, the dimensions still appear the same as they would be without ragged_tensors.  
From my understanding, ragged_tensors should offer two key benefits:  
1. Improved memory access efficiency (due to more compact data arrangement).  
2. Memory savings (when sequences within a batch have varying lengths, ragged_tensors provide a more compact memory layout, as shown by the example `Q=aabbb` instead of `Q[b=0]=aa000000, Q[b=1]=bbb00000`).
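
As a rough sketch of the memory saving I have in mind, the snippet below compares element counts for the padded and packed forms of the `aa` / `bbb` example. The sequence lengths and the cumulative-sum offsets are my own assumptions about how a packed layout would be described. They are not taken from the cuDNN documentation.

```python
from itertools import accumulate

H, S, D = 1, 8, 64
seq_lens = [2, 3]  # assumed actual lengths: "aa" and "bbb"

# Padded layout: every sequence occupies S token slots.
padded_elems = len(seq_lens) * S * H * D

# Packed layout: only the real tokens are stored, back to back.
packed_elems = sum(seq_lens) * H * D

# Assumed per-sequence start offsets (in tokens, then in elements).
token_offsets = [0] + list(accumulate(seq_lens))
elem_offsets = [t * H * D for t in token_offsets]

print(padded_elems, packed_elems)  # 1024 320
print(token_offsets)               # [0, 2, 5]
print(elem_offsets)                # [0, 128, 320]
```

If the packed buffer really holds only 320 elements, I would expect the description of the tensor to reflect that somewhere other than the [B, H, S, D] dimensions.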

However, in this case, the dimensions are still given as [B, H, S, D], which seems to suggest that the purpose of using ragged_tensors here is purely to improve memory access efficiency, without any memory savings. 
Could you kindly clarify whether my understanding is correct?  

Regarding the Confusion about Ragged Tensors in the Documentation #135
