Full-width/half-width character conversion #185
Comments
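Sorry jinchuan, I missed the email and only just saw this. Yes, parameters such as full_to_half cannot be set on the Normalizer installed via pip (changing them requires recompiling the FSTs, which the Python package cannot do for now; I will think about a way to support it later). The current workaround is shown in README section 1.2 (Advanced Usage): https://github.com/wenet-e2e/WeTextProcessing?tab=readme-ov-file#12-advanced-usage
git clone https://github.com/wenet-e2e/WeTextProcessing.git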
cd WeTextProcessing
pip install -r requirements.txt
# `overwrite_cache` will rebuild all rules according to
# your modifications on tn/chinese/rules/xx.py (itn/chinese/rules/xx.py).
# After rebuild, you can find new far files at `$PWD/tn` and `$PWD/itn`.
python -m tn --text "你好。" --overwrite_cache --full_to_half false
The steps above recompile the FSTs. The new FSTs can be found in the PATH_TO_GIT_CLONED_WETEXTPROCESSING/tn folder; then point cache_dir at that folder when using the Python package:
# tn usage
>>> from tn.chinese.normalizer import Normalizer
>>> normalizer = Normalizer(cache_dir="PATH_TO_GIT_CLONED_WETEXTPROCESSING/tn")
>>> normalizer.normalize("你好。")
Thanks!!!
Hi, the latest release, 1.0.0, adds:
from tn.chinese.normalizer import Normalizer
normalizer = Normalizer(full_to_half=False, overwrite_cache=True)
print(normalizer.normalize("你好。"))
normalizer = Normalizer(full_to_half=True, overwrite_cache=True)
print(normalizer.normalize("你好。"))
Details: https://github.com/wenet-e2e/WeTextProcessing/releases/tag/1.0.0
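For readers unsure what a full-width to half-width conversion does to text, here is a minimal plain-Python sketch of the idea. It is not how WeTextProcessing implements full_to_half (the library compiles FST rules, and its own mapping table may cover more or different characters, including Chinese punctuation). The function name `full_to_half_sketch` is hypothetical, and the sketch assumes only the Unicode Fullwidth Forms range U+FF01 to U+FF5E plus the ideographic space U+3000.

```python
# Illustrative sketch only: NOT the FST-based rule that WeTextProcessing compiles.
# Assumption: we map only the Unicode "Fullwidth Forms" range and the ideographic space.

FULLWIDTH_START = 0xFF01   # fullwidth exclamation mark
FULLWIDTH_END = 0xFF5E     # fullwidth tilde
ASCII_OFFSET = 0xFEE0      # distance between a fullwidth form and its ASCII counterpart
IDEOGRAPHIC_SPACE = 0x3000


def full_to_half_sketch(text: str) -> str:
    """Return text with fullwidth ASCII variants replaced by their halfwidth forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == IDEOGRAPHIC_SPACE:
            out.append(" ")
        elif FULLWIDTH_START <= code <= FULLWIDTH_END:
            out.append(chr(code - ASCII_OFFSET))
        else:
            # Everything else, including CJK characters, is left untouched.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(full_to_half_sketch("ABC,123!"))
    print(full_to_half_sketch("你好。"))
```

In this sketch the ideographic full stop (U+3002) lies outside the mapped range, so it passes through unchanged. Whether the library's own rules convert it depends on its mapping table, which is why the full_to_half setting matters for keeping Chinese punctuation.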
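Hello, and thanks for open-sourcing this :)
When using this toolkit, I would like to skip full-width/half-width character conversion and keep Chinese punctuation, but setting the relevant parameters does not seem to achieve this:
Am I using it incorrectly? Thanks!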