This recipe replaces ivectors used in the v1 recipe with embeddings extracted
 from a deep neural network.  In the scripts, we refer to these embeddings as
 "xvectors."  The recipe is closely based on the following paper: but uses a wideband
 rather than narrowband MFCC config.

 In addition to the VoxCeleb datasets used for training and evaluation (see
 ../README.txt), we also use the following datasets for augmentation.