Word segmentation error #120

Open
wencan opened this issue Oct 15, 2023 · 0 comments

wencan commented Oct 15, 2023

thulac.thulac().cut('本书由百度官方出品,百度公司CTO王海峰博士作序,张钹院士、李未院士、百度集团副总裁吴甜联袂推荐。')
Output:
[['本书', 'r'], ['由', 'p'], ['百', 'm'], ['度', 'q'], ['官方', 'n'], ['出', 'v'], ['品', 'g'], [',', 'w'], ['百', 'm'], ['度', 'q'], ['公司', 'n'], ['CTO', 'nz'], ['王海峰', 'np'], ['博士', 'n'], ['作', 'v'], ['序', 'n'], [',', 'w'], ['张钹', 'np'], ['院士', 'n'], ['、', 'w'], ['李未', 'np'], ['院士', 'n'], ['、', 'w'], ['百', 'm'], ['度', 'q'], ['集团', 'n'], ['副', 'a'], ['总裁', 'n'], ['吴甜', 'np'], ['联袂', 'd'], ['推荐', 'v'], ['。', 'w']]


thulac.thulac(seg_only=True).cut('本书由百度官方出品,百度公司CTO王海峰博士作序,张钹院士、李未院士、百度集团副总裁吴甜联袂推荐。')
Output:
[['本书', ''], ['由', ''], ['百度', ''], ['官方', ''], ['出品', ''], [',', ''], ['百度', ''], ['公司', ''], ['CTO', ''], ['王海峰', ''], ['博士', ''], ['作', ''], ['序', ''], [',', ''], ['张钹', ''], ['院士', ''], ['、', ''], ['李未', ''], ['院士', ''], ['、', ''], ['百度', ''], ['集团', ''], ['副', ''], ['总裁', ''], ['吴甜联袂', ''], ['推荐', ''], ['。', '']]
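
To make the discrepancy between the two outputs above easier to see, here is a minimal sketch that aligns the two token lists by character offset and lists the spans where their boundaries disagree. The helper `diff_segmentations` and the variable names are hypothetical and not part of thulac; the token lists are copied from the two outputs above with the tags dropped.

```python
def diff_segmentations(a, b):
    """Return the spans where two segmentations of the same text disagree.

    Each result item is a pair (tokens_from_a, tokens_from_b) that covers
    the same characters but is split differently.
    """
    if "".join(a) != "".join(b):
        raise ValueError("both token lists must spell out the same text")
    diffs = []
    i = j = 0
    while i < len(a) and j < len(b):
        group_a, group_b = [a[i]], [b[j]]
        len_a, len_b = len(a[i]), len(b[j])
        i += 1
        j += 1
        # Extend whichever side is shorter until both groups end
        # at the same character offset.
        while len_a != len_b:
            if len_a < len_b:
                group_a.append(a[i])
                len_a += len(a[i])
                i += 1
            else:
                group_b.append(b[j])
                len_b += len(b[j])
                j += 1
        # One token on each side means the boundaries agree here.
        if len(group_a) > 1 or len(group_b) > 1:
            diffs.append((group_a, group_b))
    return diffs


# Tokens from thulac.thulac().cut(...), tags dropped.
tagged_words = [
    "本书", "由", "百", "度", "官方", "出", "品", ",", "百", "度", "公司",
    "CTO", "王海峰", "博士", "作", "序", ",", "张钹", "院士", "、", "李未",
    "院士", "、", "百", "度", "集团", "副", "总裁", "吴甜", "联袂", "推荐", "。",
]
# Tokens from thulac.thulac(seg_only=True).cut(...).
seg_only_words = [
    "本书", "由", "百度", "官方", "出品", ",", "百度", "公司", "CTO",
    "王海峰", "博士", "作", "序", ",", "张钹", "院士", "、", "李未", "院士",
    "、", "百度", "集团", "副", "总裁", "吴甜联袂", "推荐", "。",
]

diffs = diff_segmentations(tagged_words, seg_only_words)
for left, right in diffs:
    print(" / ".join(left), "vs", " / ".join(right))
```

On these two lists the sketch reports five differing spans: the default run splits 百度 into 百 / 度 three times and 出品 into 出 / 品 once, while the `seg_only=True` run joins 吴甜 and 联袂 into the single token 吴甜联袂.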
