This release contains the following files:
- The frozen WavLM encoder, taken from the original Microsoft repository (https://github.com/microsoft/unilm/tree/master/wavlm). Please see the WavLM paper (https://arxiv.org/abs/2110.13900) by its original authors for more details.
- The best TransFusion model described in our paper (462k updates, which is 3.7M steps given our gradient accumulation setup).
- The letter vocabulary used to convert indices to characters and vice versa.
- The full original checkpoint of the best model described in the paper (in case you wish to fine-tune from this model). Because it is too large, it is split into multiple parts (`mp_rank_00_model_states.pt.part*`). Please download each part (all the `.pt.part*` files) and concatenate them with `cat mp_rank_00_model_states.pt.part* > mp_rank_00_model_states.pt`. This is the full checkpoint associated with `transfusion_462k_slim.pt`. The full concatenated checkpoint has md5 checksum `b7465223b1c688cc6c723872f7ada54c`.
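If `cat` is not available (for example on Windows), the reassembly and checksum check can be sketched in Python. This is a minimal sketch, not part of the release: `join_parts` and `md5_of` are hypothetical helper names, and it assumes the part suffixes sort lexicographically in the order the file was split, which is the same order a shell glob would use.

```python
import glob
import hashlib
import shutil

# Checksum quoted in the release notes for the concatenated checkpoint.
EXPECTED_MD5 = "b7465223b1c688cc6c723872f7ada54c"


def join_parts(pattern, out_path):
    """Concatenate all files matching `pattern` into `out_path`.

    Assumption: sorting the part names lexicographically reproduces
    the original split order.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the copy so a large part is never held in memory.
                shutil.copyfileobj(src, out)
    return parts


def md5_of(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

To use it, call `join_parts("mp_rank_00_model_states.pt.part*", "mp_rank_00_model_states.pt")` in the download directory, then compare `md5_of("mp_rank_00_model_states.pt")` against `EXPECTED_MD5`. A mismatch usually means a part is missing or was only partially downloaded.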