
Releases: desh2608/pytorch-tdnn

v1.1.0

18 Dec 22:29

The following changes have been made:

  1. The semi-orthogonal loss is now computed as the Frobenius norm of P, where P = torch.mm(M, M.T), instead of the Frobenius norm of (P - α²I). This makes the reported loss consistent with Kaldi.

  2. The forward() function in the TDNNF class now takes semi_ortho_step as an argument instead of training. This lets the calling function decide whether or not to take the step towards semi-orthogonality (see the sketch after this list).

  3. The initialization of the TDNN layer now takes a bias argument, which specifies whether or not to use a bias in the Conv1D layer. When the TDNN is used as the SemiOrthogonalConv layer inside TDNNF, we set bias = False so that the matrix factorization holds correctly.
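
For illustration, here is a minimal sketch of the new calling convention. The tensor shape and the every-4-steps interval are assumptions for this example (the interval mirrors Kaldi's usual schedule), not part of the release:

import torch
from pytorch_tdnn.tdnnf import TDNNF as TDNNFLayer

tdnnf = TDNNFLayer(
  512, # input dim
  512, # output dim
  256, # bottleneck dim
  1,   # time stride
)

x = torch.randn(8, 512, 100)  # assumed (batch, features, time) layout

for step in range(1, 101):
    # The caller now decides when to take the semi-orthogonality step,
    # e.g. once every 4 updates, as is conventional in Kaldi.
    y = tdnnf(x, semi_ortho_step=(step % 4 == 0))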

First release

15 Dec 22:37

This release contains basic versions of the TDNN and TDNN-F layers, with some constraints on the contexts.

Using the TDNN layer

from pytorch_tdnn.tdnn import TDNN as TDNNLayer

tdnn = TDNNLayer(
  512, # input dim
  512, # output dim
  [-3,0,3], # context
)

Note: The context list should follow these constraints:

  • The length of the list should be 2 or an odd number.
  • If the length is 2, it should be of the form [-1,1] or [-3,3], but not
    [-1,3], for example.
  • If the length is an odd number, the values should be evenly spaced with a 0 in
    the middle. For example, [-3,0,3] is allowed, but [-3,-1,0,1,3] is not.
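
A forward pass might then look like this; the (batch, features, time) input layout is an assumption based on the underlying Conv1D:

import torch

x = torch.randn(8, 512, 100)  # assumed (batch, input dim, time) layout
y = tdnn(x)                   # time dimension may shrink at the edges, depending on padding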

Using the TDNNF layer

from pytorch_tdnn.tdnnf import TDNNF as TDNNFLayer

tdnnf = TDNNFLayer(
  512, # input dim
  512, # output dim
  256, # bottleneck dim
  1, # time stride
)

Note: The time stride should be greater than or equal to 0. For example, a time
stride of 1 uses a context of [-1,1] for each stage of splicing.
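
For completeness, a forward pass might look as follows. In this first release the flag is named training (renamed to semi_ortho_step in v1.1.0 above); the input layout is again an assumption:

import torch

x = torch.randn(8, 512, 100)  # assumed (batch, input dim, time) layout
y = tdnnf(x, training=True)   # training controls the semi-orthogonality step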

Credits