Batching 3D data can be tricky because samples often have heterogeneous sizes.
For instance, point clouds can have different numbers of points, so we can't always just concatenate the tensors along a batch axis.
Kaolin supports different batching strategies:
Exact batching is the logical representation for homogeneous data.
For instance, if you sample the same number of points from each mesh in a batch, you get a single tensor of shape :math:`(\text{batch_size}, \text{number_of_points}, 3)`.
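
As a minimal sketch of when exact batching applies, the hypothetical helper ``exact_batch_shape`` below (plain Python, not part of Kaolin) checks that every shape is identical and returns the shape a stacked batch would have. The batch of 4 point clouds with 1000 points each is an assumed example:

.. code-block:: python

    def exact_batch_shape(shapes):
        """Return the batched shape for identically shaped tensors."""
        first = tuple(shapes[0])
        if any(tuple(shape) != first for shape in shapes):
            raise ValueError("exact batching needs identical shapes")
        # The batch axis is prepended to the common shape.
        return (len(shapes),) + first

    # Four point clouds, each sampled with 1000 points in 3D.
    batch_shape = exact_batch_shape([(1000, 3)] * 4)
    print(batch_shape)  # (4, 1000, 3)

If the shapes differ, the helper raises, which is the case the padded and packed strategies are meant for.
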
With padded batching, heterogeneous tensors are padded to identical dimensions with a constant value so that they can be concatenated along a batch axis. This is similar to padding images of different shapes before batching them.

.. note::

    The last dimension must always be of the size of the element, e.g. 3 for 3D points (element of point clouds) or 1 for a grayscale pixel (element of grayscale textures).

For instance, for two textures :math:`T_0` and :math:`T_1` of shape :math:`(32, 32, 3)` and :math:`(64, 16, 3)`, the batched tensor will be of shape :math:`(2, \max(32, 64), \max(32, 16), 3) = (2, 64, 32, 3)` and the padding value will be 0. :math:`T_0` will be padded by 32 on the 1st axis, while :math:`T_1` will be padded by 16 on the 2nd axis.
You can also enforce a specific maximum shape (for instance to keep memory consumption fixed or to benefit from optimizations such as cuDNN algorithm selection).
For instance, you can force :math:`T_0` and :math:`T_1` to be batched with a maximum shape of :math:`(128, 128)`. The batched tensor will then be of shape :math:`(2, 128, 128, 3)`: :math:`T_0` will be padded by 96 on both the 1st and 2nd axes, and :math:`T_1` will be padded by 64 on the 1st axis and by 112 on the 2nd axis.
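
The shape arithmetic above can be checked with a short sketch. The helper ``padded_batch_shape`` is hypothetical and written in plain Python; it only reproduces the shape computation described here and is not Kaolin's implementation:

.. code-block:: python

    def padded_batch_shape(shapes, max_shape=None):
        """Return the padded batch shape and the padding added to each tensor.

        ``shapes`` holds full tensor shapes whose last entry is the element size.
        ``max_shape`` optionally fixes the padded size of every axis but the last.
        """
        elem = shapes[0][-1]
        if any(shape[-1] != elem for shape in shapes):
            raise ValueError("all tensors must share the same last dimension")
        spatial = [tuple(shape[:-1]) for shape in shapes]
        # Largest size found along each axis (last dimension excluded).
        largest = tuple(max(axis) for axis in zip(*spatial))
        if max_shape is None:
            max_shape = largest
        elif any(m < l for m, l in zip(max_shape, largest)):
            raise ValueError("max_shape is smaller than one of the tensors")
        # Per-tensor padding along each axis.
        padding = [tuple(m - d for m, d in zip(max_shape, sp)) for sp in spatial]
        return (len(shapes),) + tuple(max_shape) + (elem,), padding

    shapes = [(32, 32, 3), (64, 16, 3)]
    print(padded_batch_shape(shapes))
    # ((2, 64, 32, 3), [(32, 0), (0, 16)])
    print(padded_batch_shape(shapes, max_shape=(128, 128)))
    # ((2, 128, 128, 3), [(96, 96), (64, 112)])
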
For more information on how to do padded batching, check :func:`kaolin.ops.batch.list_to_padded`.

- :attr:`shape_per_tensor`: 2D :class:`torch.LongTensor` stores the shape of each sub-tensor except the last dimension in the padded tensor. E.g., in the example above :attr:`shape_per_tensor` would be ``torch.LongTensor([[32, 32], [64, 16]])``. Refer to :func:`kaolin.ops.batch.get_shape_per_tensor` for more information.
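
To illustrate why the per-tensor shapes are kept alongside a padded batch, here is a minimal sketch that pads two tiny "textures" and then crops the padding away again. It uses nested Python lists as a stand-in for tensors, and ``pad_2d`` and ``unpad_2d`` are hypothetical helpers, not Kaolin functions:

.. code-block:: python

    def pad_2d(tensor, height, width, padding_value=0):
        """Pad a nested-list tensor of shape (h, w, c) to (height, width, c)."""
        channels = len(tensor[0][0])
        # Pad every existing row on the right (2nd axis).
        rows = [
            [list(elem) for elem in row]
            + [[padding_value] * channels for _ in range(width - len(row))]
            for row in tensor
        ]
        # Append full rows of padding at the bottom (1st axis).
        rows += [
            [[padding_value] * channels for _ in range(width)]
            for _ in range(height - len(rows))
        ]
        return rows

    def unpad_2d(padded, shape):
        """Crop a padded (height, width, c) tensor back to (h, w, c)."""
        h, w = shape
        return [row[:w] for row in padded[:h]]

    t0 = [[[1], [2]]]    # shape (1, 2, 1)
    t1 = [[[3]], [[4]]]  # shape (2, 1, 1)
    tensors = [t0, t1]

    # Shape of each sub-tensor, last dimension excluded.
    shape_per_tensor = [[len(t), len(t[0])] for t in tensors]
    max_h = max(shape[0] for shape in shape_per_tensor)
    max_w = max(shape[1] for shape in shape_per_tensor)

    padded = [pad_2d(t, max_h, max_w) for t in tensors]
    recovered = [unpad_2d(p, s) for p, s in zip(padded, shape_per_tensor)]
    print(padded[0])             # [[[1], [2]], [[0], [0]]]
    print(recovered == tensors)  # True

Without the stored shapes, padding values could not be told apart from real zeros in the data.
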
With packed batching, heterogeneous tensors are reshaped to 2D :math:`(-1, \text{last_dimension})` and concatenated along the first axis. This is similar to packed sentences in NLP.

.. note::

    The last dimension must always be of the size of the element, e.g. 3 for 3D points (element of point clouds) or 1 for a grayscale pixel (element of grayscale textures).

For instance, for two textures :math:`T_0` and :math:`T_1` of shape :math:`(32, 32, 3)` and :math:`(64, 16, 3)`, the batched tensor will be of shape :math:`(32 * 32 + 64 * 16, 3)`. :math:`T_0` will be reshaped to :math:`(32 * 32, 3)` and :math:`T_1` will be reshaped to :math:`(64 * 16, 3)` before being concatenated on the first axis.
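
The packed shape follows directly from the element counts. The sketch below uses a hypothetical plain-Python helper, ``packed_shape``, that only reproduces this arithmetic and is not part of Kaolin:

.. code-block:: python

    from math import prod

    def packed_shape(shapes):
        """Return the packed 2D shape and the row count of each sub-tensor."""
        elem = shapes[0][-1]
        if any(shape[-1] != elem for shape in shapes):
            raise ValueError("all tensors must share the same last dimension")
        # Each tensor is flattened to (product of leading dims, elem).
        rows = [prod(shape[:-1]) for shape in shapes]
        return (sum(rows), elem), rows

    shape, rows = packed_shape([(32, 32, 3), (64, 16, 3)])
    print(shape)  # (2048, 3)
    print(rows)   # [1024, 1024]
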
For more information on how to do packed batching, check :func:`kaolin.ops.batch.list_to_packed`.

- :attr:`shape_per_tensor`: 2D :class:`torch.LongTensor` stores the shape of each sub-tensor except the last dimension in the packed tensor. E.g., in the example above :attr:`shape_per_tensor` would be ``torch.LongTensor([[32, 32], [64, 16]])``. Refer to :func:`kaolin.ops.batch.get_shape_per_tensor` for more information.
- :attr:`first_idx`: 1D :class:`torch.LongTensor` stores the first index of each subtensor and the last index + 1 on the first axis in the packed tensor. E.g., in the example above :attr:`first_idx` would be ``torch.LongTensor([0, 1024, 2048])``. This attribute is used to delimit each subtensor within the packed tensor, for instance to slice or index it. Refer to :func:`kaolin.ops.batch.get_first_idx` for more information.
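
As a minimal sketch of how these two attributes relate, the offsets can be derived as a cumulative sum of the per-tensor element counts and then used to slice the packed data. The helper ``first_idx_from_shapes`` is hypothetical, and a nested Python list stands in for the packed tensor:

.. code-block:: python

    from itertools import accumulate
    from math import prod

    def first_idx_from_shapes(shape_per_tensor):
        """Return the start offset of each sub-tensor plus the final end offset."""
        numel = [prod(shape) for shape in shape_per_tensor]
        return [0] + list(accumulate(numel))

    shape_per_tensor = [[32, 32], [64, 16]]
    first_idx = first_idx_from_shapes(shape_per_tensor)
    print(first_idx)  # [0, 1024, 2048]

    # Stand-in for a packed tensor of shape (2048, 3): row i holds [i, i, i].
    packed = [[i, i, i] for i in range(first_idx[-1])]

    # Consecutive offsets delimit each sub-tensor on the first axis.
    subtensors = [
        packed[start:end]
        for start, end in zip(first_idx[:-1], first_idx[1:])
    ]
    print(len(subtensors[0]), len(subtensors[1]))  # 1024 1024
    print(subtensors[1][0])                        # [1024, 1024, 1024]

Each slice could then be reshaped back to its original shape using the matching row of ``shape_per_tensor``.
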
.. automodule:: kaolin.ops.batch
    :platform: Windows-x86_64, Linux-x86_64
    :members:
    :undoc-members:
    :show-inheritance: