exclude word fragments from FREQ in posseg.cut #248

Merged 1 commit on Apr 2, 2015

@wangbin (Contributor) commented Apr 2, 2015

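It looks like posseg.cut was not updated when the dictionary was converted from a Trie to a Prefix Map, so some segmentation results are now inconsistent. For example: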

>>> from jieba import posseg
>>> list(posseg.cut("伊藤洋华堂总府店"))

The result is:

[伊/ns, 藤/nr, 洋华堂/n, 总府/n, 店/n]

The result before the conversion was:

[伊藤/nr, 洋华堂/n, 总府/n, 店/n]
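To illustrate the kind of mismatch the title describes, here is a minimal sketch of a prefix map in which word fragments are stored alongside real words. It is not jieba's actual code: `build_prefix_dict`, `is_word_naive`, `is_word`, and the toy frequencies are all hypothetical, and the assumption that fragments are stored with frequency 0 is made for illustration only.

```python
def build_prefix_dict(word_freqs):
    """Build a prefix map: every real word keeps its frequency, and every
    proper prefix (fragment) of a word is stored with frequency 0."""
    prefix_dict = {}
    for word, freq in word_freqs.items():
        prefix_dict[word] = freq
        for end in range(1, len(word)):
            fragment = word[:end]
            # Do not overwrite a real word that happens to be a prefix.
            if fragment not in prefix_dict:
                prefix_dict[fragment] = 0
    return prefix_dict


def is_word_naive(prefix_dict, text):
    # Membership test that worked for a trie of whole words, but in a
    # prefix map it also accepts fragments.
    return text in prefix_dict


def is_word(prefix_dict, text):
    # Exclude fragments: only entries with a positive frequency are words.
    return prefix_dict.get(text, 0) > 0


# Toy dictionary (hypothetical frequencies).
toy_freqs = {"伊藤": 5, "洋华堂": 3}
prefix_dict = build_prefix_dict(toy_freqs)

print(is_word_naive(prefix_dict, "洋华"))  # fragment accepted by the naive check
print(is_word(prefix_dict, "洋华"))        # fragment rejected
print(is_word(prefix_dict, "伊藤"))        # real word accepted
```

Under these assumptions, any code path that still tests plain membership treats fragments as dictionary words, which is one way segmentation could diverge from the pre-conversion behaviour.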

fxsjy added a commit that referenced this pull request Apr 2, 2015
exclude word fragments from FREQ in posseg.cut
@fxsjy fxsjy merged commit 753c1be into fxsjy:master Apr 2, 2015
fxsjy added a commit that referenced this pull request Apr 10, 2015