Vietnam-Celeb: a large-scale dataset for Vietnamese speaker recognition

Contact email: thanh.pv.ds@gmail.com

Paper URL: https://www.isca-speech.org/archive/interspeech_2023/pham23b_interspeech.html

Citation:

@inproceedings{pham23b_interspeech,
  author={Viet Thanh Pham and Xuan Thai Hoa Nguyen and Vu Hoang and Thi Thu Trang Nguyen},
  title={{Vietnam-Celeb: a large-scale dataset for Vietnamese speaker recognition}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={1918--1922},
  doi={10.21437/Interspeech.2023-1989}
}

This is the repository for the anonymous submission of the Vietnam-Celeb dataset at Interspeech 2023.
This repository includes the dataset, split into four parts.

To extract the four parts, run the following two commands:

zip -F vietnam-celeb-part.zip --out full-dataset.zip
unzip full-dataset.zip

The data folder contains the utterances of every speaker in the dataset. Each speaker has its own folder, named after that speaker's ID.
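As a quick sanity check after extraction, the per-speaker layout described above can be walked to count utterances. This is only a minimal sketch: the function name `count_utterances` is hypothetical, and it assumes every regular file directly inside a speaker folder is one utterance, which the README does not state explicitly.

```python
from pathlib import Path


def count_utterances(data_root):
    """Map each speaker ID (folder name) to the number of files in its folder.

    Assumption: the layout is <data_root>/<speaker_id>/<utterance file>,
    with utterance files stored directly inside the speaker folder.
    """
    counts = {}
    for speaker_dir in sorted(Path(data_root).iterdir()):
        if speaker_dir.is_dir():
            # Count only regular files; nested folders are ignored.
            counts[speaker_dir.name] = sum(
                1 for entry in speaker_dir.iterdir() if entry.is_file()
            )
    return counts
```

Calling `count_utterances("data")` from the directory where the archive was unpacked should then give one entry per speaker folder, assuming the extracted folder is named `data`.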
Three text files correspond to the splits of the dataset, as discussed in the anonymous submission to Interspeech 2023:
- vietnam-celeb-t.txt: list of utterances in the Vietnam-Celeb training set
- vietnam-celeb-e.txt: pairs of utterances in the Vietnam-Celeb-E test set
- vietnam-celeb-h.txt: pairs of utterances in the Vietnam-Celeb-H test set
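The exact line format of the pair files is not described here, so the following is only a sketch under a stated assumption: each non-empty line holds whitespace-separated fields whose last two entries are the two utterance paths, optionally preceded by a target/non-target label. The names `parse_trial_line` and `read_trials` are hypothetical; check a few lines of the real files before relying on this.

```python
def parse_trial_line(line):
    """Split one trial line into (label, utterance_a, utterance_b).

    Assumption (not confirmed by the README): fields are separated by
    whitespace, the last two fields are utterance paths, and a leading
    field, if present, is the trial label. The label is None when the
    line has only two fields.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least two fields, got: {line!r}")
    label = fields[0] if len(fields) >= 3 else None
    return label, fields[-2], fields[-1]


def read_trials(path):
    """Read every non-empty line of a pair file into a list of tuples."""
    with open(path, encoding="utf-8") as handle:
        return [parse_trial_line(line) for line in handle if line.strip()]
```

With that assumption, `read_trials("vietnam-celeb-e.txt")` would return the Vietnam-Celeb-E pairs as a list of tuples ready to be scored.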
We also include a TSV file containing information about every speaker in the dataset, with the following attributes:
- speaker_id: ID of the speaker
- gender: gender of the speaker
- dialect: Vietnamese dialect of the speaker
- source: the source from which the speaker's utterances were crawled
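The speaker metadata can be loaded with Python's standard `csv` module. The sketch below assumes the TSV file starts with a header row using the attribute names listed above; the file name is not given here, so it is left as a parameter, and the helper names `load_speakers` and `count_by` are hypothetical.

```python
import csv
from collections import Counter


def load_speakers(tsv_path):
    """Read the speaker TSV into a list of dicts, one per speaker.

    Assumption: the first row is a header containing speaker_id, gender,
    dialect and source, and fields are tab-separated.
    """
    with open(tsv_path, encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by(rows, attribute):
    """Count speakers per value of one attribute, e.g. 'gender' or 'dialect'."""
    return Counter(row[attribute] for row in rows)
```

For example, `count_by(load_speakers(path), "dialect")` gives the number of speakers per dialect, which is a simple way to inspect how balanced the dataset is.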
