Custom text pronunciation #1175
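Thanks for the suggestion. Originally I only wanted the generated speech to sound more colloquial, so I did not consider polyphonic characters. Anyone who needs that only has to make a small change to clean_text_inf.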
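Thanks @uone-1. Building on your idea, I wrote an improved way of handling polyphonic characters that should meet your requirements, @juntaosun.
Usage (an audio demo is at the end):
他说他数<tone as=shu3>数</tone>不好,所以我就<tone as='jiāo'>教</tone>他怎么<tone as="shu4">数</tone>数。
I switched to a tone tag instead of keeping the {} braces because braces already serve another purpose in our project 😄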
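A minimal sketch of how tags like the ones above could be pulled out with the standard re module. TONE_TAG and extract_tone_tags are hypothetical names used only for illustration; they are not the helpers posted in this thread, and the index returned here is zero-based.

```python
import re

# Hypothetical pattern: accepts as=shu3, as='jiāo' and as="shu4"
TONE_TAG = re.compile(r"""<tone\s+as\s*=\s*["']?([^"'>\s]+)["']?\s*>(.*?)</tone>""")


def extract_tone_tags(text):
    """Return the text without tags plus (char_index, reading) pairs."""
    parts = []   # plain-text fragments, in order
    marks = []   # (zero-based index of the tagged character, requested reading)
    length = 0   # length of the plain text built so far
    last = 0     # end of the previous match in the raw text
    for m in TONE_TAG.finditer(text):
        before = text[last:m.start()]
        parts.append(before)
        length += len(before)
        marks.append((length, m.group(1)))
        parts.append(m.group(2))
        length += len(m.group(2))
        last = m.end()
    parts.append(text[last:])
    return "".join(parts), marks


plain, marks = extract_tone_tags(
    "他说他数<tone as=shu3>数</tone>不好,所以我就<tone as='jiāo'>教</tone>他怎么<tone as=\"shu4\">数</tone>数。"
)
print(plain)
print(marks)
```

Capturing the reading in its own group avoids having to strip the tag syntax afterwards.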
import re

from pypinyin import pinyin, Style


def clean_text_inf(text, language):
    # Extract the custom polyphone readings and strip the <tone> tags
    text, tone_data_list = find_custom_tone1(text)
    phones, word2ph, norm_text = clean_text(text, language)
    # If custom readings were supplied, patch the corresponding phones
    # @see https://github.com/RVC-Boss/GPT-SoVITS/issues/1175
    if len(tone_data_list) > 0:
        revise_custom_tone(phones, word2ph, tone_data_list)
    phones = cleaned_text_to_sequence(phones)
    return phones, word2ph, norm_text

def find_custom_tone1(text: str):
    """
    Find and extract the custom polyphone readings marked up in the text.
    """
    tone_list = []
    txts = []
    # Match tone tags such as <tone as=shu4>数</tone>, <tone as="shu3">数</tone> or <tone as="shù">数</tone>
    ptn1 = re.compile(r"<tone.*?>(.*?)</tone>")
    # Strip everything in the tag except the reading itself
    ptn2 = re.compile(r"(</?tone)|(as)|([>\"'\s=])")
    offset = 0
    for match in ptn1.finditer(text):
        # Text before the tone tag; spaces are dropped here so that pos
        # stays consistent with the space-free text returned below
        pre = text[offset:match.start()].replace(" ", "")
        txts.append(pre)
        # The single polyphonic character inside the tag
        tone_text = match.group(1)
        txts.append(tone_text)
        # Extract the reading; both Style.TONE and Style.TONE3 spellings are accepted
        tone = match.group(0)
        tone = re.sub(ptn2, "", tone)
        tone = tone.replace(tone_text, "")
        # Number of characters up to and including the tagged character
        pos = sum(len(s) for s in txts)
        offset = match.end()
        init, final = get_initial_final(tone_text, tone)
        tone_list.append([tone, init, final, pos])
    # Do not forget the text that may follow the last tone tag
    if offset < len(text):
        txts.append(text[offset:])
    text = "".join(txts)
    text = text.replace(" ", "")  # remove spaces
    return text, tone_list

def get_initial_final(wd, py):
    """
    Match the custom reading against the character's known readings and return
    the initial and final. Modeled on _get_initials_finals in text/chinese.py.
    """
    # Candidate initials
    initials = pinyin(wd, heteronym=True, neutral_tone_with_five=True, style=Style.INITIALS)[0]
    # Candidate finals in both tone styles
    finals_tone1 = pinyin(wd, heteronym=True, neutral_tone_with_five=True, style=Style.FINALS_TONE)[0]
    finals_tone3 = pinyin(wd, heteronym=True, neutral_tone_with_five=True, style=Style.FINALS_TONE3)[0]
    # The caller may pass either the TONE or the TONE3 style, so compare the
    # reading against every initial + final combination in both styles
    for init in initials:
        # TONE style (tone mark on the vowel)
        for final in finals_tone1:
            compose = init + final
            if compose == py:
                return init, finals_tone3[finals_tone1.index(final)]
        # TONE3 style (tone digit at the end)
        for final in finals_tone3:
            compose = init + final
            if compose == py:
                return init, final
    # No match: return empty strings so the model's default reading is kept
    return "", ""

def revise_custom_tone(phones, word2ph, tone_data_list):
    """
    Patch the phones of the characters that carry a custom reading.
    """
    for tone, init, final, pos in tone_data_list:
        if init == "" and final == "":
            # The pinyin could not be matched, so keep the model's default reading
            continue
        # Index just past the phones of the tagged character
        wd_pos = sum(word2ph[:pos])
        org_init = phones[wd_pos - 2]
        org_final = phones[wd_pos - 1]
        phones[wd_pos - 2] = init
        phones[wd_pos - 1] = final
        print(f"[+] Reading changed: {org_init}{org_final} => {tone}")

Example: 这个字的读音是<tone as="jué">角</tone>色,而不是<tone as="jiao3">角</tone>色 (the character is read jué in 角色, not jiao3)
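The index arithmetic in revise_custom_tone can be traced on toy data. This is only a sketch: phone_span is a hypothetical helper, and the phones and word2ph values below are made up, assuming two phones (initial and final) per Chinese character and one per punctuation mark.

```python
def phone_span(word2ph, char_index):
    """Return the (start, end) slice of phones for the character at char_index."""
    start = sum(word2ph[:char_index])
    return start, start + word2ph[char_index]


# Made-up data for the text "角色,": 角 -> j iao3, 色 -> s e4, punctuation -> ","
phones = ["j", "iao3", "s", "e4", ","]
word2ph = [2, 2, 1]

# Overwrite the reading of the first character (zero-based index 0)
start, end = phone_span(word2ph, 0)
phones[start:end] = ["j", "ve2"]
print(start, end, phones)
```

Summing word2ph up to a character gives the offset of its first phone, which is why the posted code can address the initial and final at wd_pos - 2 and wd_pos - 1 once pos counts the tagged character itself.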
I am using the 8.27 version of the web UI. With parallel inference this has no effect and the tone tag is not recognized. Could you fix it?
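For the latest version with parallel inference, could you post how to modify the files? The code above is too old. I tried it without success, and I do not know how to adapt it myself.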
Modify the clean_text_inf function in TextPreprocessor.py, add from pypinyin import pinyin, Style to the same file, and paste the new functions get_initial_final and revise_custom_tone at the end of it.
Thanks. How can I define my own list of readings? For example, I just want 狗 to be pronounced mao1.
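This can be done by rewriting the clean_text_inf function in api.py and adding a find_custom_tone function.
Usage: append {tone} directly after the character whose reading should change.
Input: "我今天吃的很{2}飽。"
The reading of "很" then changes from the original third tone to the second tone.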
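The find_custom_tone function itself is not shown in this thread. The following is only a guess at how the {n} syntax could be parsed, using hypothetical names (BRACE_TONE, find_brace_tones, apply_tone) and assuming finals carry a trailing tone digit such as en3.

```python
import re

# Hypothetical marker: a tone digit 1-5 in braces right after a character
BRACE_TONE = re.compile(r"\{([1-5])\}")


def find_brace_tones(text):
    """Strip {n} markers; return the clean text and {char_index: tone} overrides."""
    overrides = {}
    out = []
    last = 0
    for m in BRACE_TONE.finditer(text):
        out.append(text[last:m.start()])
        clean_len = sum(len(s) for s in out)
        if clean_len > 0:
            # The marker applies to the character immediately before it
            overrides[clean_len - 1] = m.group(1)
        last = m.end()
    out.append(text[last:])
    return "".join(out), overrides


def apply_tone(final, tone):
    """Swap the trailing tone digit of a final, e.g. 'en3' -> 'en2'."""
    return final[:-1] + tone if final[-1:].isdigit() else final


clean, overrides = find_brace_tones("我今天吃的很{2}飽。")
print(clean, overrides, apply_tone("en3", "2"))
```

Unlike the tone tag approach, this only changes the tone of the default reading and cannot select a different syllable.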