Special axes

Albert Zeyer edited this page Nov 9, 2021 · 5 revisions

The special axes of Data are time_dim_axis and feature_dim_axis, which carry special meaning for some layers and operations in RETURNN.

(batch_dim_axis is special as well, but it is not really ambiguous, so it is not covered much here.)

(Related: see issue #586 on whether time_dim_axis and feature_dim_axis should be removed.)

Some operations or layers operate on spatial axes, which are defined as:

[
  axis
  for axis in range(self.batch_ndim)
  if axis != self.batch_dim_axis
  and (axis != self.feature_dim_axis or
       axis == self.time_dim_axis or
       self.batch_shape[axis] is None)]

This returns all axes except the feature dim axis and the batch dim axis. In addition, if the feature dim axis is the same as the time dim axis, or if the feature dim is dynamic (its entry in batch_shape is None), that axis is included as well.
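To make the rule concrete, here is a minimal, self-contained sketch that applies the same list comprehension to a few axis layouts. AxesInfo is a hypothetical stand-in that only carries the axis-related attributes used above; it is not the RETURNN Data class, and the example shapes are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AxesInfo:
    """Hypothetical stand-in for the axis attributes of Data (not the RETURNN class)."""
    batch_shape: Tuple[Optional[int], ...]  # shape including batch dim; None marks a dynamic size
    batch_dim_axis: Optional[int]
    time_dim_axis: Optional[int]
    feature_dim_axis: Optional[int]

    @property
    def batch_ndim(self) -> int:
        return len(self.batch_shape)

    def spatial_axes(self) -> List[int]:
        # Same condition as in the definition above.
        return [
            axis
            for axis in range(self.batch_ndim)
            if axis != self.batch_dim_axis
            and (axis != self.feature_dim_axis
                 or axis == self.time_dim_axis
                 or self.batch_shape[axis] is None)]


# (batch, time, feature) with dynamic time and a static feature dim of 40.
btf = AxesInfo(batch_shape=(None, None, 40),
               batch_dim_axis=0, time_dim_axis=1, feature_dim_axis=2)

# (batch, height, width, channels) image-like input without a time axis.
img = AxesInfo(batch_shape=(None, 32, 32, 3),
               batch_dim_axis=0, time_dim_axis=None, feature_dim_axis=3)

# Feature axis coincides with the time axis: it still counts as spatial.
same = AxesInfo(batch_shape=(None, None),
                batch_dim_axis=0, time_dim_axis=1, feature_dim_axis=1)

# Dynamic feature axis: it is included as well.
dyn = AxesInfo(batch_shape=(None, 7, None),
               batch_dim_axis=0, time_dim_axis=1, feature_dim_axis=2)

for name, info in [("btf", btf), ("img", img), ("same", same), ("dyn", dyn)]:
    print(name, info.spatial_axes())
```

In the first two cases only the non-batch, non-feature axes are returned; the last two show the exceptions, where the feature axis itself is part of the result.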

A list of layers which make use of special axes or spatial axes:

  • All layers deriving from _ConcatInputLayer, which concatenate multiple inputs in the feature dimension, thus using feature_dim_axis (only used when multiple inputs are actually passed, otherwise ignored).
  • LinearLayer operates on the feature dim, using feature_dim_axis, or operates on sparse tensors. It accepts a tensor with any number of dimensions, as long as it has a feature dim or is sparse (with a finite number of classes).
  • RecLayer operates on the feature dim (using feature_dim_axis) and iterates over the time dim (time_dim_axis). Most built-in units (e.g. LSTM) expect a 3D input (batch, time, feature, in any order). A rec subnet can operate on anything as long as it has a time dim.
  • ConvLayer, TransposedConvLayer, PoolLayer operate on spatial axes and on the feature dim.
  • GenericAttentionLayer makes use of the time dim and feature dim of the base.
  • SoftmaxOverSpatialLayer uses time_dim_axis by default, although you can also explicitly specify the axis.
  • Some layers (e.g. CumsumLayer) use the time_dim_axis of their input to determine whether they run inside a recurrent loop. This is a somewhat special meaning of the (default) argument axis="T".
  • ... (this list is incomplete)
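As a rough illustration of how some of the layers listed above appear in a config, here is a hypothetical network fragment in the usual RETURNN network-dict style. The layer names ("encoder", "energy", "att_weights", "cum_weights") and all sizes are made up for this sketch, and the comments only restate the behavior described in the list.

```python
# Hypothetical network fragment; layer names and sizes are illustrative only.
network = {
    # RecLayer: iterates over time_dim_axis, operates on feature_dim_axis.
    "encoder": {"class": "rec", "unit": "lstm", "n_out": 512, "from": "data"},
    # LinearLayer: operates on feature_dim_axis of its input.
    "energy": {"class": "linear", "activation": None, "n_out": 1, "from": "encoder"},
    # SoftmaxOverSpatialLayer: would use time_dim_axis by default;
    # here the axis is specified explicitly instead.
    "att_weights": {"class": "softmax_over_spatial", "axis": "T", "from": "energy"},
    # CumsumLayer: axis="T" is the default, written out here for clarity.
    "cum_weights": {"class": "cumsum", "axis": "T", "from": "att_weights"},
}
```

Naming the axis explicitly, as in the last two layers, avoids relying on whichever axis the input happens to mark as time_dim_axis.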