pinyin preprocess problem #8

Closed
windowxiaoming opened this issue Sep 2, 2022 · 2 comments

Comments

windowxiaoming commented Sep 2, 2022

005804 你当#1我傻啊#3?脑子#1那么大#2怎么#1塞进去#4
ni3 dang1 wo2 sha3 a5 nao3 zi5 na4 me5 da4 zen3 me5 sai1 jin4 qu4

txt_struct=[['', ['']], ['你', ['n', 'i3']], ['当', ['d', 'ang1']], ['我', ['uo3']], ['傻', ['sh', 'a3']], ['啊', ['a', '?', 'n', 'ao3']], ['?', ['z', 'i']], ['脑', ['n', 'a4']], ['子', ['m', 'e']], ['那', ['d', 'a4']], ['么', ['z', 'en3']], ['大', ['m', 'e']], ['怎', ['s', 'ai1']], ['么', ['j', 'in4']], ['塞', ['q', 'v4', '?']], ['进', []], ['去', []], ['?', []], ['', ['']]]

ph_gb_word=['', 'n_i3', 'd_ang1', 'uo3', 'sh_a3', 'a_?n_ao3', 'z_i', 'n_a4', 'm_e', 'd_a4', 'z_en3', 'm_e', 's_ai1', 'j_in4', 'q_v4?', '', '', '', '']

What is 'a_?_n_ao3'?

In the mfa_dict, entries such as ch_a1_d_ou1 and a_?_n_ao3 appear.

yerfor (Owner) commented Sep 4, 2022

Hi there, this is a bug in data_gen/tts/txt_processors/zh.py, and I have fixed it in the latest commit. Specifically, the previous code ignored "?" and "!" when separating words, so the phonemes on either side of such a mark ended up in the same entry. I made the small modification below; see lines 95-96 in data_gen/tts/txt_processors/zh.py for details.

        # elif ph in [',', '.']:
        elif ph in [',', '.', '?', '!']:
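
To illustrate why that one-line change matters, here is a minimal sketch of the grouping step. It is not the actual code in zh.py: the function `group_phonemes`, the use of `'#'` as a boundary token between characters, and the sample stream are all assumptions made for illustration. It only shows how a punctuation mark that is not treated as a separator can glue neighbouring phonemes together and shift every later entry.

```python
def group_phonemes(chars, ph_stream, punctuation):
    """Assign each phoneme token to the character it belongs to.

    Hypothetical simplification: '#' marks a boundary between two
    characters, and a token in `punctuation` closes the current
    character, takes its own slot, and opens the next character.
    """
    groups = [[c, []] for c in chars]
    i = 0
    for ph in ph_stream:
        if ph == '#':
            i += 1                      # move on to the next character
        elif ph in punctuation:
            i += 1                      # close the current character
            groups[i][1].append(ph)     # the mark fills its own slot
            i += 1                      # next phoneme starts a new character
        else:
            groups[i][1].append(ph)     # ordinary phoneme
    return groups


chars = ['啊', '?', '脑', '子']
stream = ['a', '?', 'n', 'ao3', '#', 'z', 'i']

# Old behaviour: '?' is not a separator, so it is appended like a phoneme.
old = group_phonemes(chars, stream, {',', '.'})
# New behaviour: '?' and '!' are separators as well.
new = group_phonemes(chars, stream, {',', '.', '?', '!'})

print('_'.join(old[0][1]))  # a_?_n_ao3
print(new)
```

With the old punctuation set, the first entry collects `a`, `?`, `n`, `ao3` and the `?` slot receives the phonemes of 子, which is the same kind of shift seen in the `txt_struct` posted above. With the extended set, each character gets its own phonemes.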

windowxiaoming (Author) commented

@yerfor Thank you for your reply. I will try running the modification.

@yerfor yerfor closed this as completed Sep 16, 2022