
PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification

Several people have successfully reproduced the results (and some achieved even better ones). No specific hyper-parameters or random seed are required; just keep the value of nega_frame low (see config.json).

DDP is not yet supported for training; an update is planned. Single-GPU training works well.
An example run: python train_PHOLID.py --json /home/tony/PHOLID_config_file.json
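
The config schema is defined by config.json in this repo, so the sketch below is only illustrative: apart from nega_frame, every key name is a hypothetical placeholder.

import json

# Hedged sketch of writing a minimal PHOLID config. Only "nega_frame" is
# documented above; the other keys are hypothetical placeholders -- consult
# the repo's config.json for the real schema.
config = {
    "train_data": "/path/to/train_list.txt",  # hypothetical key: training data list (format below)
    "nega_frame": 2,                          # keep this value low, as noted above
}
with open("PHOLID_config_file.json", "w") as f:
    json.dump(config, f, indent=4)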

The training data txt file should look like:

data0_path.npy(space)language_index0(space)T'_0
data1_path.npy language_index1 T'_1
data2_path.npy language_index2 T'_2
data3_path.npy language_index3 T'_3

where T'_0 is the sequence length of the data after reshaping from T x D to T' x 20 x D; 20 denotes the number of frames in each embedding. A sketch of producing one list entry follows. You may also prepare your own script and use only this model config.
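
The hedged sketch below assumes features of shape T x D stored as .npy and drops leftover frames (T mod 20); the file name train_list.txt is illustrative, and whether the reshape to T' x 20 x D happens offline or inside the data loader is left to the repo's own scripts.

import numpy as np

# Compute T' for one utterance and append its line to the data list.
# Leftover frames (T mod 20) are assumed to be dropped.
feat = np.load("data0_path.npy")   # shape: (T, D)
T, D = feat.shape
T_prime = T // 20                  # 20 frames per embedding
with open("train_list.txt", "a") as f:
    f.write(f"data0_path.npy 0 {T_prime}\n")  # path, language_index, T'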

If you find this paper useful, please cite:

@inproceedings{liu22e_interspeech,
author={Hexin Liu and Leibny Paola {Garcia Perera} and Andy Khong and Suzy Styles and Sanjeev Khudanpur},
title={{PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification}},
year=2022,
booktitle={Proc. Interspeech 2022},
pages={2233--2237},
doi={10.21437/Interspeech.2022-354}
}
