get error #7
Comments
meldataset.py:

```python
# coding: utf-8
import os
import torch
from g2pM import G2pM
import logging

class MelDataset(torch.utils.data.Dataset):
    ...

class Collater(object):
    ...

def build_dataloader(path_list, ...):
    ...
```
Hi, what does your dict table for Mandarin look like?
Thank you! BTW, if the input to G2pM is pinyin, it seems the output is also pinyin. How will it be changed to phonemes in the dict?
`word_index_dict.txt`
How do I change meldataset.py? I still get the error. @Kristopher-Chen
It seems there is something wrong with the speaker label in your train list...
I changed it like this: `def _load_tensor(self, data):`
I didn't use g2p, only turned the text into an array, like this: `['zhi', 'ye', 'lian', 'sai', ...]`. Running `/home/mike/anaconda3/envs/asr/bin/python /home/mike/PycharmProjects/AuxiliaryASR/train.py` prints `nan nan nan nan`.
I believe this error says one of the lines in your `train_list.txt` has an empty speaker-id field.
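For reference, the crash happens when the speaker-id field parsed from a training line is empty. A minimal sketch of the field-splitting step (the helper name is mine; the repo's `_load_tensor` also loads the waveform):

```python
def parse_train_line(line):
    # A train_list.txt line looks like "wave_path|text|speaker_id".
    # Strip the trailing newline first, or the last field becomes "0\n";
    # a blank or truncated line here is what raises int('') errors.
    wave_path, text, speaker_id = line.strip().split('|')
    return wave_path, text, int(speaker_id)
```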
I believe something is wrong with your labels. The loss should not be NaN, and the WER should not be this high after 170 epochs of training. Can you discuss it with @Charlottecuc, because it looks like she could train on this dataset with no problem? It also looks like you have created too many tokens (420), and they aren't actually phonemes but syllables.
@MMMMichaelzhang The WER should be lower than 0.2 after about 20 epochs. Could you print your final text tensors in meldataset.py? The text tensor and the corresponding indices should match the sentences in your training data file. You can check whether there is something wrong in your preprocessing steps. Besides, it seems that your word.dict file is not correct. The dict file should cover all possible Mandarin phonemes (e.g. …).
However, if you add all possible Mandarin pinyin syllables, there will be too many tokens to learn. So a good choice is to split each pinyin syllable into phonemes.
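Splitting a toned pinyin syllable into phonemes can be done with a prefix match on the standard initials; a sketch (the initials list and function name are mine, not from the repo):

```python
# Two-letter initials must be checked before their one-letter prefixes.
_INITIALS = ('zh', 'ch', 'sh', 'b', 'p', 'm', 'f', 'd', 't', 'n', 'l',
             'g', 'k', 'h', 'j', 'q', 'x', 'r', 'z', 'c', 's', 'y', 'w')

def split_pinyin(syllable):
    """Split a toned syllable into (initial, final), e.g. 'zhong1' -> ('zh', 'ong1')."""
    for init in _INITIALS:
        if syllable.startswith(init) and len(syllable) > len(init):
            return init, syllable[len(init):]
    return '', syllable  # zero-initial syllables such as 'an1' or 'er2'
```

This keeps the token set to a few dozen initials and finals instead of 400+ whole syllables.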
@MMMMichaelzhang, is there a tool to convert pinyin to phonemes?
Thanks for your reply, it helps a lot. I am trying to set it up again. @Charlottecuc @yl4579
@Kristopher-Chen |
My train loss became negative; I don't know why. @yl4579 @Charlottecuc
@MMMMichaelzhang This is expected, see #4 |
@yl4579 It seems something is not ideal with the eval loss, and, though the accuracy is quite high, the WER is almost 45% in my case. I used the dict with tones (1-5).
@Kristopher-Chen For some reason, your model overfits very badly because your evaluation loss starts to increase after the 40th epoch; you may want to add more data or use data augmentation. An ideal training curve should look like the reply above yours.
@MMMMichaelzhang how many hours of data did you use? |
about 20 hours @Kristopher-Chen |
It seems too little data is being used... LibriTTS includes over 500 hours of data.
More training data is needed. I used around 400 hours of data and the WER can reach about 0.08 after epoch 80. |
Yes, thank you! I'm trying to use more training data. BTW, which open-source dataset are you using?
@Charlottecuc did you add a space between each pinyin syllable? Like this: `['b', 'iao1', ' ', 'g', 'an1', ' ', 'f', 'ang2', ' ', 'q', 'i3', ' ', 'b', 'i4', ' ', 'r', 'an2', ' ', 't', 'iao2', ' ', 'zh', 'eng3', ' ', 'sh', 'iii4', ' ', 'ch', 'ang3', ' ', 'zh', 'an4', ' ', 'l', 've4']`
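A token list like that, with a literal `' '` token at each word boundary, could be produced roughly as follows (function name and initials list are my own sketch, assuming space-separated toned pinyin input):

```python
_INITIALS = ('zh', 'ch', 'sh', 'b', 'p', 'm', 'f', 'd', 't', 'n', 'l',
             'g', 'k', 'h', 'j', 'q', 'x', 'r', 'z', 'c', 's', 'y', 'w')

def pinyin_to_tokens(sentence):
    # Emit [initial, final] per syllable and a ' ' token between syllables,
    # so the model sees word boundaries explicitly.
    tokens = []
    for i, syl in enumerate(sentence.split()):
        if i > 0:
            tokens.append(' ')
        init = next((p for p in _INITIALS if syl.startswith(p)), '')
        if init and len(syl) > len(init):
            tokens.append(init)
            tokens.append(syl[len(init):])
        else:
            tokens.append(syl)  # zero-initial syllable
    return tokens
```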
```
[train]:  24%|██▍       | 16/66 [00:04<00:15, 3.20it/s]
Traceback (most recent call last):
  File "/home/mike/PycharmProjects/AuxiliaryASR/train.py", line 116, in <module>
    main()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/mike/PycharmProjects/AuxiliaryASR/train.py", line 98, in main
    train_results = trainer._train_epoch()
  File "/home/mike/PycharmProjects/AuxiliaryASR/trainer.py", line 186, in _train_epoch
    for train_steps_per_epoch, batch in enumerate(tqdm(self.train_dataloader, desc="[train]"), 1):
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mike/anaconda3/envs/asr/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/mike/PycharmProjects/AuxiliaryASR/meldataset.py", line 60, in __getitem__
    wave, text_tensor, speaker_id = self._load_tensor(data)
  File "/home/mike/PycharmProjects/AuxiliaryASR/meldataset.py", line 78, in _load_tensor
    speaker_id = int(speaker_id)
ValueError: invalid literal for int() with base 10: ''
```
My train_list:

```
/media/mike/yys/data_asr/SSB00800056.wav|wo men can jia guo xu duo zhong da huo dong de biao yan|0
/media/mike/yys/data_asr/SSB00050001.wav|guang zhou nv da xue sheng deng shan shi lian si tian jing fang zhao dao yi si nv shi|0
/media/mike/yys/data_asr/SSB00050002.wav|zhun zhong ke xue gui lv de yao qiu|0
/media/mike/yys/data_asr/SSB00050003.wav|qi lu wu ren shou piao|0
..
```
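Since the traceback ends in `int('')` on the speaker id, a quick way to find the offending entries is to scan the list for malformed lines before training; a sketch (checker name is mine):

```python
def check_train_list(lines):
    """Return (line_number, line) pairs that would crash _load_tensor:
    wrong field count, or a speaker id that is not an integer
    (e.g. an empty field or a trailing blank line)."""
    bad = []
    for i, line in enumerate(lines, 1):
        parts = line.rstrip('\n').split('|')
        if len(parts) != 3 or not parts[2].strip().lstrip('-').isdigit():
            bad.append((i, line))
    return bad
```

Running it over the file (`check_train_list(open('Data/train_list.txt'))`) prints nothing for a clean list; a trailing empty line or a missing `|0` field shows up immediately.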