Derpimort/VGGVox-PyTorch

VGGVox-PyTorch

A PyTorch implementation of VGGVox for speaker identification on the VoxCeleb1 dataset.

Train

pip install -r requirements.txt
python3 train.py --dir ./data/
Specify the data directory with --dir.

Notes

  • 81.79% Top-1 and 93.17% Top-5 test-set accuracy, which is pretty satisfactory. See results.txt for details.
  • Training on a V100 takes 4 minutes per epoch.
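For readers unfamiliar with the metrics above, here is a minimal sketch of how Top-1 and Top-5 accuracy are commonly computed from a matrix of per-class scores. It uses NumPy to stay framework-agnostic; the function name `topk_accuracy` and the toy scores are illustrative assumptions, not code from this repository.

```python
import numpy as np


def topk_accuracy(scores, labels, ks=(1, 5)):
    """Return {k: percentage of rows whose true label is among the k highest scores}."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Class indices per row, ordered from highest to lowest score.
    ranked = np.argsort(-scores, axis=1)
    results = {}
    for k in ks:
        # A row counts as a hit if the true label appears in its first k ranks.
        hits = (ranked[:, :k] == labels[:, None]).any(axis=1)
        results[k] = float(100.0 * hits.mean())
    return results


# Toy example: 3 utterances scored against 6 hypothetical speakers.
scores = np.array([
    [0.1, 0.9, 0.0, 0.0, 0.0, 0.0],  # true speaker 1 is ranked first
    [0.5, 0.4, 0.3, 0.2, 0.1, 0.6],  # true speaker 0 is ranked second
    [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # true speaker 5 is ranked last
])
labels = np.array([1, 0, 5])
acc = topk_accuracy(scores, labels)
print(acc)
```

In this toy case one of three utterances is a Top-1 hit and two of three are Top-5 hits. With a PyTorch model, the same ranking could be obtained from the logits with `torch.topk`.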

Model

What I've done so far:

  • Preprocess all the data exactly as in the authors' MATLAB code. Checked and verified online in MATLAB.
  • Use random 3 s cropped segments for training.
  • Copy all hyperparameters (LR, optimizer params, batch size) from the authors' net.
  • Stabilize PyTorch's BatchNorm and test variants. This improved results by a small percentage.
  • Try one-sided spectrogram input, as mentioned on the authors' GitHub.
  • Port the authors' network from MATLAB and train it. The MATLAB model has a 1300-dimensional output; I will test it later.
  • Copy weights from the MATLAB network and test.
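Two of the steps above, the random 3 s crop and the one-sided spectrogram input, can be sketched as follows. This is an illustration only, not the repository's preprocessing: the 16 kHz sample rate, 25 ms Hamming window, 10 ms hop, 512-point FFT, and both function names are assumptions, and any normalization the original MATLAB code applies is left out.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sample rate in Hz
CROP_SECONDS = 3      # from the list above: random 3 s training segments
WIN = 400             # assumed 25 ms window at 16 kHz
HOP = 160             # assumed 10 ms hop at 16 kHz
N_FFT = 512           # assumed FFT size


def random_crop(wave, rng, seconds=CROP_SECONDS, sr=SAMPLE_RATE):
    """Pick a random contiguous segment of `seconds` seconds from a 1-D waveform."""
    n = seconds * sr
    if len(wave) < n:
        raise ValueError("utterance is shorter than the crop length")
    start = rng.integers(0, len(wave) - n + 1)  # upper bound is exclusive
    return wave[start:start + n]


def onesided_spectrogram(wave):
    """Magnitude spectrogram keeping only the non-negative frequency bins."""
    window = np.hamming(WIN)
    n_frames = 1 + (len(wave) - WIN) // HOP
    frames = np.stack(
        [wave[i * HOP:i * HOP + WIN] * window for i in range(n_frames)]
    )
    # rfft zero-pads each frame to N_FFT and returns N_FFT // 2 + 1 bins.
    return np.abs(np.fft.rfft(frames, n=N_FFT, axis=1)).T


rng = np.random.default_rng(0)
wave = rng.standard_normal(5 * SAMPLE_RATE)  # stand-in for a 5 s utterance
segment = random_crop(wave, rng)
spec = onesided_spectrogram(segment)
print(segment.shape, spec.shape)
```

With these assumed settings a 3 s crop holds 48000 samples and yields a spectrogram of 257 frequency bins by 298 frames. Because the crop is re-drawn on every call, each epoch sees a different segment of the same utterance.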

References and Citations:

@InProceedings{Nagrani17,
 author       = "Nagrani, A. and Chung, J.~S. and Zisserman, A.",
 title        = "VoxCeleb: a large-scale speaker identification dataset",
 booktitle    = "INTERSPEECH",
 year         = "2017",
}


@InProceedings{Chung18,
 author       = "Chung, J.~S. and Nagrani, A. and Zisserman, A.",
 title        = "VoxCeleb2: Deep Speaker Recognition",
 booktitle    = "INTERSPEECH",
 year         = "2018",
}
